FT-FSOD | CVPR 2026

Motivation

Cross-domain few-shot object detection (CD-FSOD) suffers from unstable optimization and weak generalization due to scarce samples and significant domain shifts. This work revisits fine-tuning of pretrained detectors with lightweight yet effective designs, aiming for stable gains under strict low-shot constraints.

TL;DR

Introduces a practical fine-tuning recipe for CD-FSOD that trains stably in low-shot settings.
Proposes a hybrid ensemble decoder to improve query diversity and domain robustness.
Achieves state-of-the-art (SOTA) performance on the CD-FSOD, ODinW-13, and RF100-VL few-shot object detection benchmarks.
Provides an out-of-distribution (OOD) analysis on a CD-Mixed test set to evaluate the confidence behavior of the fine-tuned models.

Main Method

Hybrid Ensemble Decoder (HED)

Replaces a fully sequential decoder with a hybrid sequential-parallel structure.
In the parallel stage, object queries inherit the outputs of the preceding stage, while denoising queries are partially re-initialized.
Increases query diversity and improves detection robustness under domain shift.
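
The data flow described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the released implementation: decoder layers are stood in for by arbitrary callables, and the names `hybrid_decode`, `reinit_partially`, `fresh_fn`, and `ratio` are hypothetical. How the parallel branches are merged into final predictions is not specified here, so the sketch simply returns one output per branch.

```python
import random
from typing import Callable, List

Query = List[float]
Layer = Callable[[List[Query]], List[Query]]  # stand-in for one decoder layer


def reinit_partially(dn_queries: List[Query], fresh_fn: Callable[[int], Query],
                     ratio: float, rng: random.Random) -> List[Query]:
    """Replace a random `ratio` of the denoising queries with fresh ones."""
    n_reinit = int(round(ratio * len(dn_queries)))
    chosen = set(rng.sample(range(len(dn_queries)), n_reinit))
    return [fresh_fn(len(q)) if i in chosen else list(q)
            for i, q in enumerate(dn_queries)]


def hybrid_decode(obj_q: List[Query], dn_q: List[Query],
                  seq_layers: List[Layer], par_layers: List[Layer],
                  fresh_fn: Callable[[int], Query],
                  ratio: float = 0.5, seed: int = 0) -> List[List[Query]]:
    """Run a sequential stage, then a parallel stage; return object outputs per branch."""
    rng = random.Random(seed)
    n_obj = len(obj_q)

    # Sequential stage: each layer refines the output of the previous layer.
    queries = [list(q) for q in obj_q] + [list(q) for q in dn_q]
    for layer in seq_layers:
        queries = layer(queries)
    obj_out, dn_out = queries[:n_obj], queries[n_obj:]

    # Parallel stage: every branch inherits the same object queries, but sees
    # its own partially re-initialized set of denoising queries.
    branches = []
    for layer in par_layers:
        dn_in = reinit_partially(dn_out, fresh_fn, ratio, rng)
        out = layer([list(q) for q in obj_out] + dn_in)
        branches.append(out[:n_obj])
    return branches
```

The `ratio` default of 0.5 is an arbitrary placeholder. In a real detector the layers would be attention blocks in which object and denoising queries may interact subject to the model's attention masking; the toy callables only make the routing of queries visible.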

Progressive Fine-Tuning

Uses plateau-aware learning-rate scheduling with no task-specific hyperparameter search.
Freezes the encoder at the beginning, then unfreezes it after the first optimization plateau.
Improves training stability and transferability across domains.
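
One way to picture the schedule is a small controller that watches a validation metric and reacts to plateaus. The sketch below is an assumption-laden illustration rather than the authors' code: the class name `ProgressiveSchedule`, the `patience`, `decay`, and `min_delta` values, and the choice to decay the learning rate at every plateau (including the first, when the encoder is also unfrozen) are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ProgressiveSchedule:
    lr: float = 1e-4             # assumed initial learning rate
    decay: float = 0.1           # assumed LR multiplier applied at each plateau
    patience: int = 2            # assumed evaluations without improvement = plateau
    min_delta: float = 0.0       # assumed minimum gain that counts as improvement
    encoder_frozen: bool = True  # the encoder starts frozen
    best: float = float("-inf")
    stale: int = 0
    plateaus: int = 0

    def step(self, metric: float) -> bool:
        """Record one validation metric (higher is better); return True on a plateau."""
        if metric > self.best + self.min_delta:
            self.best = metric
            self.stale = 0
            return False
        self.stale += 1
        if self.stale < self.patience:
            return False
        # Plateau detected: reset the counter and adapt the training setup.
        self.stale = 0
        self.plateaus += 1
        self.encoder_frozen = False  # takes effect at the first plateau
        self.lr *= self.decay
        return True
```

In a training loop, `step` would be called after each evaluation; when it returns `True`, the caller would push `lr` into the optimizer and, if `encoder_frozen` has just flipped, enable gradients for the encoder parameters.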

Pipeline Dynamic Demo

Experimental Results

CD-FSOD detailed benchmark table: **CD-FSOD benchmark** across six diverse sub-datasets (*ArTaxOr, Clipart1k, DIOR, DeepFish, NEU-DET, and UODD*).

ODinW-13 benchmark table: **ODinW-13 few-shot benchmark** with 0/1/3/5/10-shot settings.

RF100-VL benchmark table: **RF100-VL few-shot (10-shot) benchmark** with 100 sub-datasets.

Robustness under OOD samples: **OOD robustness analysis** on the proposed **CD-Mixed test set**.

Resources

Code: github.com/Intellindust-AI-Lab/FT-FSOD

Intellindust AI Lab: github.com/Intellindust-AI-Lab