Self-Supervised Learning from Images with JEPA (2023)


This paper demonstrates an approach for learning highly semantic image representations without relying on hand-crafted data-augmentations. We introduce the Image-based Joint-Embedding Predictive Architecture (I-JEPA), a non-generative approach for self-supervised learning from images. The idea behind I-JEPA is simple: from a single context block, predict the representations of various target blocks in the same image. A core design choice to guide I-JEPA towards producing semantic representations is the masking strategy; specifically, it is crucial to (a) sample target blocks with sufficiently large scale (semantic), and to (b) use a sufficiently informative (spatially distributed) context block. Empirically, when combined with Vision Transformers, we find I-JEPA to be highly scalable. For instance, we train a ViT-Huge/14 on ImageNet using 16 A100 GPUs in under 72 hours to achieve strong downstream performance across a wide range of tasks, from linear classification to object counting and depth prediction.
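The masking strategy described above can be sketched on a grid of image patches. This is a minimal sketch, not the authors' implementation: the function names, the 16x16 patch grid, the number of target blocks, and the scale and aspect-ratio ranges are illustrative assumptions. So is the choice to remove target patches from the context, which keeps the prediction task from being trivial.

```python
import math
import random


def sample_block(grid, scale_range, aspect_range, rng):
    """Sample one rectangular block; return its (row, col) patch indices as a set."""
    scale = rng.uniform(*scale_range)    # fraction of the image the block covers
    aspect = rng.uniform(*aspect_range)  # height / width ratio
    area = scale * grid * grid
    # Round to whole patches and clamp so the block fits inside the grid.
    h = max(1, min(grid, round(math.sqrt(area * aspect))))
    w = max(1, min(grid, round(math.sqrt(area / aspect))))
    top = rng.randint(0, grid - h)
    left = rng.randint(0, grid - w)
    return {(r, c) for r in range(top, top + h) for c in range(left, left + w)}


def sample_masks(grid=16, num_targets=4,
                 target_scale=(0.15, 0.2),    # assumed: "sufficiently large" targets
                 target_aspect=(0.75, 1.5),   # assumed range
                 context_scale=(0.85, 1.0),   # assumed: large, spread-out context
                 seed=None):
    """Return (context, targets): one context patch set and a list of target patch sets."""
    rng = random.Random(seed)
    targets = [sample_block(grid, target_scale, target_aspect, rng)
               for _ in range(num_targets)]
    context = sample_block(grid, context_scale, (1.0, 1.0), rng)
    # Drop any patch that belongs to a target so the context cannot "see" it.
    for target in targets:
        context -= target
    return context, targets


if __name__ == "__main__":
    context, targets = sample_masks(seed=0)
    print("context patches:", len(context))
    print("target block sizes:", [len(t) for t in targets])
```

In a full training loop, an encoder would embed only the context patches, and a predictor would be trained to match the representations of each target block rather than its pixels; that part is omitted here.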

Paper: Self-Supervised Learning from Images with a Joint-Embedding Predictive Architecture, by Mahmoud Assran and 7 other authors.
