PaliGemma: Open-Source Multimodal Model by Google


PaliGemma is a vision-language model (VLM) developed and released by Google. Learn how to use it.

Google’s decision to open-source a highly capable multimodal model that can be fine-tuned on custom data is a major breakthrough for open-source AI. PaliGemma lets you build custom multimodal models that you can self-host in the cloud and potentially on larger edge devices such as NVIDIA Jetson boards. If you have a unique problem that closed models have not seen, and never will see because of their proprietary nature, PaliGemma is a great entry point for building custom AI solutions.
