PaliGemma 2: Powerful Vision-Language Models, Simple Fine-Tuning

Explore PaliGemma 2, which offers scalable performance across multiple model sizes and resolutions and is designed as a drop-in replacement for the original PaliGemma.

This past May, we launched PaliGemma, the first vision-language model in the Gemma family, taking a significant step toward making class-leading visual AI more accessible. PaliGemma 2 is designed as a drop-in replacement for it, offering a range of model sizes with immediate performance gains on most tasks without major code modifications. Early innovations using PaliGemma, such as ColPali's advancements in visual document retrieval, RoboFlow's fine-tuning techniques, and progress in real-time object tracking, demonstrate the expanding potential of the Gemmaverse.
