Lotus: Diffusion-Based Visual Foundation Model for High-Quality Dense Prediction

Leveraging the visual priors of pre-trained text-to-image diffusion models offers a promising solution to enhance zero-shot generalization in dense prediction tasks. However, existing methods often uncritically use the original diffusion formulation, which may not be optimal due to the fundamental differences between dense prediction and image generation. Based on these insights, we introduce Lotus, a diffusion-based visual foundation model with a simple yet effective adaptation protocol for dense prediction.
