Real-Time Object Detection Meets DINOv3

arXiv 2025

Shihua Huang^*1 Yongjie Hou^*1,2 Longfei Liu^*1 Xuanlong Yu¹ Xi Shen^†1

¹ Intellindust AI Lab | ² Xiamen University
^* Equal Contribution ^† Corresponding

arXiv Code

Abstract

Benefiting from the simplicity and effectiveness of Dense O2O and MAL, DEIM has become the mainstream training framework for real-time DETRs, significantly outperforming the YOLO series. In this work, we extend it with DINOv3 features, resulting in DEIMv2. DEIMv2 spans eight model sizes from X to Atto, covering GPU, edge, and mobile deployment. For the X, L, M, and S variants, we adopt DINOv3-pretrained / distilled backbones and introduce a Spatial Tuning Adapter (STA), which efficiently converts DINOv3’s single-scale output into multi-scale features and complements strong semantics with fine-grained details to enhance detection. For ultra-lightweight models (Nano, Pico, Femto, and Atto), we employ HGNetv2 with depth and width pruning to meet strict resource budgets. Together with a simplified decoder and an upgraded Dense O2O, this unified design enables DEIMv2 to achieve a superior performance–cost trade-off across diverse scenarios, establishing new state-of-the-art results. Notably, our largest model, DEIMv2-X, achieves 57.8 AP with only 50.3M parameters, surpassing prior X-scale models that require over 60M parameters for just 56.5 AP. On the compact side, DEIMv2-S is the first sub-10M model (9.71M) to exceed the 50 AP milestone on COCO, reaching 50.9 AP. Even the ultra-lightweight DEIMv2-Pico, with just 1.5M parameters, delivers 38.5 AP—matching YOLOv10-Nano (2.3M) with $\sim$50\% fewer parameters. Code and pretrained weights are available at: https://github.com/Intellindust-AI-Lab/DEIMv2

Method

Results

Perf. v.s. Params.

Perf. v.s. FLOPs.

Resources

arXiv

Code

BibTeX

If you find this work useful, please cite:

@article{huang2025deimv2,
  title={Real-Time Object Detection Meets DINOv3},
  author={Huang, Shihua and Hou, Yongjie and Liu, Longfei and Yu, Xuanlong and Shen, Xi},
  journal={arXiv},
  year={2025}
}