
Lotus: Diffusion-Based Visual Foundation Model for High-Quality Dense Prediction



Leveraging the visual priors of pre-trained text-to-image diffusion models offers a promising solution to enhance zero-shot generalization in dense prediction tasks. However, existing methods often uncritically use the original diffusion formulation, which may not be optimal due to the fundamental differences between dense prediction and image generation. Based on these insights, we introduce Lotus, a diffusion-based visual foundation model with a simple yet effective adaptation protocol for dense prediction.

