Faster convergence for diffusion models
Representation Alignment for Generation: Training Diffusion Transformers Is Easier Than You Think
Specifically, we introduce REPresentation Alignment (REPA), a simple regularization technique built on recent diffusion transformer architectures. REPA aligns the hidden states that the diffusion transformer computes from noisy inputs with clean image representations obtained from external, pretrained visual encoders. This, in turn, allows the later layers of the diffusion transformer to focus on capturing high-frequency details on top of the aligned representations, further improving generation performance. We also examine the scalability of REPA by varying the pretrained encoders and diffusion transformer model sizes, showing that aligning with better visual representations leads to improved generation and linear probing results.
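In code, this kind of alignment regularizer reduces to a small projection head on intermediate hidden states plus a patch-wise similarity loss against frozen encoder features. The following PyTorch sketch illustrates the idea under assumed shapes and names; the `REPARegularizer` class, the MLP projection head, and the `lam` weight are illustrative stand-ins, not the paper's exact implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class REPARegularizer(nn.Module):
    """Align intermediate diffusion-transformer hidden states with
    features from a frozen pretrained visual encoder (REPA-style)."""

    def __init__(self, hidden_dim: int, encoder_dim: int):
        super().__init__()
        # Small MLP that projects diffusion hidden states into the
        # pretrained encoder's feature space.
        self.proj = nn.Sequential(
            nn.Linear(hidden_dim, hidden_dim),
            nn.SiLU(),
            nn.Linear(hidden_dim, encoder_dim),
        )

    def forward(self, hidden_states: torch.Tensor,
                encoder_feats: torch.Tensor) -> torch.Tensor:
        # hidden_states: (B, N, hidden_dim), taken from an intermediate
        #   transformer layer while denoising the *noisy* input.
        # encoder_feats: (B, N, encoder_dim), patch features of the
        #   *clean* image from a frozen encoder (e.g. DINOv2).
        pred = self.proj(hidden_states)
        # Negative mean patch-wise cosine similarity: minimizing this
        # pushes the projected hidden states toward the encoder features.
        return -F.cosine_similarity(pred, encoder_feats, dim=-1).mean()

# Usage sketch: add the alignment term to the usual denoising objective.
repa = REPARegularizer(hidden_dim=1152, encoder_dim=768)
h = torch.randn(4, 256, 1152)       # stand-in for DiT hidden states
feats = torch.randn(4, 256, 768)    # stand-in for frozen encoder features
diffusion_loss = torch.tensor(0.0)  # placeholder for the diffusion loss
lam = 0.5                           # alignment weight (assumed value)
loss = diffusion_loss + lam * repa(h, feats)
```

In training, the regularizer would simply be added to the diffusion (or flow-matching) loss with the external encoder kept frozen, so gradients only shape the diffusion transformer and the projection head.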