PaliGemma 2: Powerful Vision-Language Models, Simple Fine-Tuning


Explore PaliGemma 2, which offers scalable performance with multiple model sizes and resolutions, and is designed as a drop-in replacement for existing PaliGemma users.

This past May, we launched PaliGemma, the first vision-language model in the Gemma family, taking a significant step toward making class-leading visual AI more accessible. PaliGemma 2 is designed as a drop-in replacement for it, offering a range of model sizes and immediate performance gains on most tasks without major code modifications. Early innovations using PaliGemma, such as ColPali's advances in visual document retrieval, Roboflow's fine-tuning techniques, and progress in real-time object tracking, demonstrate the expanding potential of the Gemmaverse.