CVPR 2026

A Closer Look at Cross-Domain Few-Shot Object Detection: Fine-Tuning Matters and Parallel Decoder Helps

Xuanlong Yu¹, Youyang Sha¹, Longfei Liu¹, Xi Shen¹, Di Yang²

¹Intellindust AI Lab  ·  ²Suzhou Institute for Advanced Research, USTC

Motivation

Cross-domain few-shot object detection (CD-FSOD) suffers from unstable optimization and weak generalization due to scarce samples and significant domain shifts. This work revisits fine-tuning of pretrained detectors with lightweight yet effective designs, aiming for stable gains under strict low-shot constraints.

TL;DR

Main Method

Hybrid Ensemble Decoder (HED)

  • Replaces a fully sequential decoder with a hybrid sequential-parallel structure.
  • In the parallel stage, object queries inherit the previous outputs, while denoising queries are partially re-initialized.
  • Increases query diversity and improves detection robustness under domain shift.
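The query flow described above can be illustrated with a toy sketch. This is not the released implementation: the function names (`hybrid_decode`, `reinit_partially`), the `ratio` parameter, the use of Gaussian noise for re-initialization, and the choice to let every parallel branch start from the same sequential output are all assumptions made for illustration. Queries are plain lists of floats and "layers" are arbitrary callables.

```python
import random
from typing import Callable, List, Tuple

Query = List[float]
Layer = Callable[[List[Query]], List[Query]]


def reinit_partially(dn: List[Query], ratio: float, rng: random.Random) -> List[Query]:
    """Copy the denoising queries, replacing round(ratio * len(dn)) of them with fresh noise."""
    k = round(ratio * len(dn))
    chosen = set(rng.sample(range(len(dn)), k))
    return [
        [rng.gauss(0.0, 1.0) for _ in q] if i in chosen else list(q)
        for i, q in enumerate(dn)
    ]


def hybrid_decode(
    obj: List[Query],
    dn: List[Query],
    seq_layers: List[Layer],
    par_layers: List[Layer],
    ratio: float = 0.5,   # assumed fraction of denoising queries to re-initialize
    seed: int = 0,
) -> List[Tuple[List[Query], List[Query]]]:
    """Run a sequential stage, then several parallel branches (hypothetical layout)."""
    rng = random.Random(seed)
    n_obj = len(obj)

    # Sequential stage: each layer refines the output of the previous one.
    queries = [list(q) for q in obj + dn]
    for layer in seq_layers:
        queries = layer(queries)
    obj_out, dn_out = queries[:n_obj], queries[n_obj:]

    # Parallel stage: object queries are inherited as-is, while each branch
    # receives its own partially re-initialized set of denoising queries.
    outputs = []
    for layer in par_layers:
        branch_dn = reinit_partially(dn_out, ratio, rng)
        out = layer([list(q) for q in obj_out] + branch_dn)
        outputs.append((out[:n_obj], out[n_obj:]))
    return outputs
```

Each element of the returned list is one branch's (object queries, denoising queries) pair. How the branch outputs are combined for training or inference is not specified on this page, so the sketch leaves that step out.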

Progressive Fine-Tuning

  • Uses plateau-aware learning rate scheduling, without task-specific hyper-parameter search.
  • Freezes the encoder at the beginning, then unfreezes it after the first optimization plateau.
  • Improves training stability and transferability across domains.
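The schedule above can be sketched as a small framework-independent controller. The class name `ProgressiveSchedule`, its default values, the use of a monitored loss (lower is better), and the decision to keep the learning rate unchanged at the first plateau and decay it only at later ones are assumptions for illustration, not details confirmed by this page.

```python
from dataclasses import dataclass


@dataclass
class ProgressiveSchedule:
    """Toy plateau-aware controller (hypothetical names and defaults)."""

    lr: float = 1e-4           # assumed starting learning rate
    factor: float = 0.1        # assumed decay applied at later plateaus
    patience: int = 3          # steps without improvement that count as a plateau
    min_delta: float = 1e-4    # smallest decrease that counts as improvement
    encoder_frozen: bool = True
    best: float = float("inf")
    bad_steps: int = 0
    plateaus: int = 0

    def step(self, loss: float) -> None:
        """Update the schedule from one monitored loss value (lower is better)."""
        if loss < self.best - self.min_delta:
            self.best = loss
            self.bad_steps = 0
            return

        self.bad_steps += 1
        if self.bad_steps < self.patience:
            return

        # A plateau was detected.
        self.bad_steps = 0
        self.plateaus += 1
        if self.encoder_frozen:
            # First plateau: unfreeze the encoder (assumption: LR is kept).
            self.encoder_frozen = False
        else:
            # Later plateaus: decay the learning rate.
            self.lr *= self.factor


sched = ProgressiveSchedule(patience=2)
for loss in [1.0, 0.8, 0.8, 0.8, 0.7, 0.7, 0.7]:
    sched.step(loss)
    # In a real training loop, sched.encoder_frozen would control whether the
    # encoder parameters receive gradients, and sched.lr would be copied to
    # the optimizer.
```

Because the only inputs are the monitored loss and a patience value, a controller of this kind needs no per-dataset tuning of milestones, which matches the stated goal of avoiding task-specific hyper-parameter search.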

Pipeline Dynamic Demo

Experimental Results

Resources

Code: github.com/Intellindust-AI-Lab/FT-FSOD

Intellindust AI Lab: github.com/Intellindust-AI-Lab
