Repository

<aside>

|██████████|██████████|██████████|██████████|██████████|███░░░░░░░|░░░░░░░|░░░░░░░

</aside>

Paper

understanding research from code >>> (mostly) understanding research from paper

<aside>

9/67: |█████████░|░░░░░░░░░░|░░░░░░░░░░|░░░░░░░░░░|░░░░░░░░░░|░░░░░░░░░░|░░░░░░░|

</aside>

Overview & Main techniques

Major Contributions

Gram Anchoring (training phase): addresses dense feature maps degrading during long training schedules
- overview
post-hoc strategies
fixing the gradual decrease in feature performance over long training (visualized in a patch similarity map)
main model: 7B ViT variant with axial RoPE
lastly: high-res post-processing phase & distillation (single teacher, multiple students)
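
Sketch for the Gram Anchoring item above: a minimal, dependency-free version, assuming the loss is the squared Frobenius distance between the patch-similarity (Gram) matrices of L2-normalized student features and those of a "Gram teacher". The function names are hypothetical, and the repository may average or scale the loss differently.

```python
import math

def l2_normalize(rows):
    """L2-normalize each patch feature vector (one row per patch)."""
    out = []
    for row in rows:
        norm = math.sqrt(sum(v * v for v in row)) or 1.0  # avoid division by zero
        out.append([v / norm for v in row])
    return out

def gram(rows):
    """Patch-by-patch cosine-similarity (Gram) matrix."""
    x = l2_normalize(rows)
    return [[sum(a * b for a, b in zip(xi, xj)) for xj in x] for xi in x]

def gram_loss(student_patches, teacher_patches):
    """Squared Frobenius distance between the two Gram matrices (assumed form)."""
    gs, gt = gram(student_patches), gram(teacher_patches)
    return sum((a - b) ** 2 for rs, rt in zip(gs, gt) for a, b in zip(rs, rt))
```

The loss only compares similarities between patches, so the student features can still change (e.g. in scale) as long as the patch similarity structure stays close to the teacher's.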

Related work

previous SSL approaches for vision models:

extracting supervisory signals from parts of an image & predicting other parts
patch re-ordering
inpainting
re-colorization
…

Dataset

Learning objective

Cool techniques

Notes:

contrastive loss (e.g. Siamese, InfoNCE, …)
implementation to-dos
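
Sketch for the contrastive-loss item above: InfoNCE for a single query, written as cross-entropy over temperature-scaled cosine similarities. The function name, argument names, and the default temperature are assumptions for illustration, not taken from the paper or repository.

```python
import math

def info_nce(query, keys, positive_index, temperature=0.1):
    """InfoNCE loss for one query against a list of keys (one is the positive).

    Assumes all vectors are non-zero.
    """
    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        norm_a = math.sqrt(sum(x * x for x in a))
        norm_b = math.sqrt(sum(y * y for y in b))
        return dot / (norm_a * norm_b)

    logits = [cosine(query, k) / temperature for k in keys]
    m = max(logits)  # subtract the max for a numerically stable log-sum-exp
    log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
    # -log( exp(logit_pos) / sum_j exp(logit_j) )
    return log_denominator - logits[positive_index]
```

The loss is near zero when the query matches the positive key and differs from the negatives, and equals log(number of keys) when all keys look identical to the query.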

Translation differences

to go back to
project notes
files overview

!!! Sinkhorn in the iBOT patch loss: all-reduce on batch size with no guard
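
Sketch of the guard pattern for the note above: a dependency-free Sinkhorn-Knopp normalization in which every cross-process sum is optional. `reduce_sum` is a hypothetical stand-in for a collective such as `torch.distributed.all_reduce`, which would typically sit behind a `torch.distributed.is_initialized()` check. This illustrates the idea only and is not the repository's code.

```python
import math

def sinkhorn_knopp(scores, temperature=0.1, n_iterations=3, reduce_sum=None):
    """Sinkhorn-Knopp on teacher scores (rows: samples, columns: prototypes).

    reduce_sum: optional callable that sums a value across processes
    (hypothetical); when None, the code runs as a single process.
    """
    def total(value):
        # Guard: only reduce across processes when a reducer is available.
        return reduce_sum(value) if reduce_sum is not None else value

    n_protos = len(scores[0])
    n_total = total(len(scores))  # global number of samples to assign

    q = [[math.exp(s / temperature) for s in row] for row in scores]
    z = total(sum(sum(row) for row in q))
    q = [[v / z for v in row] for row in q]

    for _ in range(n_iterations):
        # Each prototype receives total mass 1 / n_protos over all samples.
        col_sums = [total(sum(row[k] for row in q)) for k in range(n_protos)]
        q = [[row[k] / col_sums[k] / n_protos for k in range(n_protos)] for row in q]
        # Each sample carries total mass 1 / n_total.
        normalized = []
        for row in q:
            row_sum = sum(row)
            normalized.append([v / row_sum / n_total for v in row])
        q = normalized

    # Rescale so every sample's assignment sums to 1.
    return [[v * n_total for v in row] for row in q]
```

Without the guard, the batch-size reduction would be attempted even in a single-process run (e.g. debugging on one GPU), where no process group exists.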