feature/c-radio #3

Merged
isaac merged 25 commits from feature/c-radio into main 2026-05-15 18:10:15 +00:00
No description provided.
Replace DINOv2 + SigLIP2 two-model pipeline with nvidia/C-RADIOv4-H:
- CRadioExtractor: transformers.AutoModel + CLIPImageProcessor
- 1024-dim features, aspect-preserving resize (not center-crop)
- Streaming centroid: O(1) memory, accumulates running mean
- OOM retry at half batch (min batch 4)
- --resolution default 1024 (paper's optimal zero-shot resolution)
- --auto-curate --top N --dest PATH: copy top N to folder
- Removed ScoringHead (trained on pseudo-scores, no new signal)
- Removed train-head, fine-tune, test-frozen commands
- Removed DINO/SigLIP models, CombinedFeatureExtractor
- 10 tests passing
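
A minimal sketch of the streaming-centroid idea from the list above, assuming features arrive as batches of equal-length float vectors. The class name `StreamingCentroid` and its methods are hypothetical and are not taken from this PR. Only the running mean and a count are kept, so memory does not grow with the number of images (it is proportional to the feature dimension only).

```python
from typing import Iterable, List, Sequence


class StreamingCentroid:
    """Running mean of feature vectors (hypothetical helper, not the PR's code)."""

    def __init__(self, dim: int) -> None:
        self.mean: List[float] = [0.0] * dim
        self.count = 0

    def update(self, batch: Iterable[Sequence[float]]) -> None:
        """Fold one batch of vectors into the running mean."""
        for vec in batch:
            if len(vec) != len(self.mean):
                raise ValueError("feature dimension mismatch")
            self.count += 1
            # Incremental mean: m_n = m_{n-1} + (x_n - m_{n-1}) / n
            for i, x in enumerate(vec):
                self.mean[i] += (x - self.mean[i]) / self.count


# Two batches are consumed; no per-image features are retained afterwards.
centroid = StreamingCentroid(dim=2)
centroid.update([[1.0, 2.0], [3.0, 4.0]])
centroid.update([[5.0, 6.0]])
```

The incremental form avoids holding a large running sum, which is one reason to prefer it over sum-then-divide on very large datasets.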

Net: -800 lines of code
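
For the `--auto-curate --top N --dest PATH` behaviour listed above, a rough sketch of the copy step might look like the following. The function name `auto_curate` and the `scores` mapping are assumptions for illustration; the PR does not show how scores are stored or how name collisions are handled.

```python
import shutil
from pathlib import Path
from typing import Dict, List


def auto_curate(scores: Dict[Path, float], top: int, dest: Path) -> List[Path]:
    """Copy the `top` highest-scoring files into `dest` and return the new paths.

    Hypothetical helper: files that share a name overwrite each other here,
    which a real implementation would probably want to guard against.
    """
    dest.mkdir(parents=True, exist_ok=True)
    # Highest score first, then keep only the first `top` entries.
    ranked = sorted(scores, key=lambda path: scores[path], reverse=True)[:top]
    copied: List[Path] = []
    for src in ranked:
        target = dest / src.name
        shutil.copy2(src, target)  # copy2 also preserves file timestamps
        copied.append(target)
    return copied
```

Copying rather than moving leaves the source dataset untouched, which matches the "copy top N to folder" wording.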
C-RADIOv4-H replaces DINOv2 + SigLIP2 dual-model approach.
- CRadioExtractor: nvidia/C-RADIOv4-H at 1024x1024, 1024-dim output
- Streaming centroid: O(1) memory, arbitrary dataset size
- OOM retry: halves batch until success, no silent drops
- Centroid-only scoring (ScoringHead removed: it added no new signal)
- auto-curate: copy top-N scored images to dest folder
- --resolution flag: 224-2048, default 1024
- 3 auto-fixed review issues incorporated
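
The OOM retry described above could be sketched as follows. This is an illustration under assumptions, not the PR's code: `run_with_oom_retry` and `run_batch` are hypothetical names, and `MemoryError` stands in for whatever out-of-memory exception the real extractor catches. The floor of 4 comes from the earlier commit message in this PR.

```python
from typing import Callable, List, Sequence, Type, TypeVar

T = TypeVar("T")
R = TypeVar("R")

MIN_BATCH = 4  # floor taken from "min batch 4" in the earlier commit message


def run_with_oom_retry(
    items: Sequence[T],
    run_batch: Callable[[Sequence[T]], List[R]],
    batch_size: int,
    oom_error: Type[BaseException] = MemoryError,  # stand-in exception type
) -> List[R]:
    """Process `items` in batches, halving the batch size on out-of-memory."""
    results: List[R] = []
    start = 0
    while start < len(items):
        chunk = items[start:start + batch_size]
        try:
            results.extend(run_batch(chunk))
        except oom_error:
            if batch_size <= MIN_BATCH:
                raise  # surface the failure instead of silently dropping items
            batch_size = max(MIN_BATCH, batch_size // 2)
            continue  # retry the same position with the smaller batch
        start += len(chunk)
    return results
```

Because `start` only advances after a successful batch, every item is either processed or the error propagates, so nothing is skipped quietly.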

Co-Authored-By: Claude <noreply@anthropic.com>
C-RADIOv4-H produces a 2560-dim summary (2 teachers × 1280-dim each).
We now slice out[0][:, :1280] to get a single teacher representation
for centroid scoring. Also updates FEATURE_DIM from 1024 to 1280.
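
To illustrate the slice described above, here is a small list-based sketch of keeping the first 1280 of 2560 summary dimensions and scoring against a centroid. Two things are assumptions: the helper names are hypothetical, and cosine similarity is used only as a plausible scoring metric, since the commit message does not name one. In the actual code the slice is the tensor expression `out[0][:, :1280]`.

```python
import math
from typing import List, Sequence

TEACHER_DIM = 1280             # per-teacher width stated in the commit message
SUMMARY_DIM = 2 * TEACHER_DIM  # 2560-dim summary from two teachers


def first_teacher(summary: Sequence[Sequence[float]]) -> List[List[float]]:
    """List-based equivalent of summary[:, :1280]: keep the first teacher's slice."""
    for row in summary:
        if len(row) != SUMMARY_DIM:
            raise ValueError(f"expected {SUMMARY_DIM} dims, got {len(row)}")
    return [list(row[:TEACHER_DIM]) for row in summary]


def cosine_to_centroid(vec: Sequence[float], centroid: Sequence[float]) -> float:
    """Cosine similarity between one feature vector and the centroid (assumed metric)."""
    dot = sum(a * b for a, b in zip(vec, centroid))
    norm = math.sqrt(sum(a * a for a in vec)) * math.sqrt(sum(b * b for b in centroid))
    return dot / norm if norm else 0.0
```

Validating the input width before slicing makes a silent mismatch, like the earlier 1024 versus 1280 discrepancy, fail loudly instead.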
isaac merged commit 44a4adef92 into main 2026-05-15 18:10:15 +00:00
Reference
isaac/isaac-image-scoring!3