Get the latest tech news

New fully open source vision encoder OpenVision arrives to improve on OpenAI’s Clip, Google’s SigLIP


A vision encoder is a necessary component for allowing many leading LLMs to be able to work with images uploaded by users.

Ablation studies — where components of a machine learning model are selectively removed to identify their importance or lack thereof to its functioning — further confirm the benefits of this approach, with the largest performance gains observed in high-resolution, detail-sensitive tasks like OCR and chart-based visual question answering. OpenVision’s fully open and modular approach to vision encoder development has strategic implications for enterprise teams working across AI engineering, orchestration, data infrastructure, and security. For engineers overseeing LLM development and deployment, OpenVision offers a plug-and-play solution for integrating high-performing vision capabilities without depending on opaque, third-party APIs or restricted model licenses.

Get the Android app

Or read this on Venture Beat

Read more on:

Photo of Google

Google

Photo of OpenAI

OpenAI

Photo of Clip

Clip

Related news:

News photo

Google is using Gemini to keep latest online scammers in check

News photo

Google launches new initiative to back startups building AI

News photo

OpenAI's Stargate project struggling to get off the ground, due to tariffs