Patronus AI’s Judge-Image wants to keep AI honest — and Etsy is already using it


Patronus AI launches the first multimodal LLM-as-a-Judge for evaluating AI systems that process images, with Etsy already implementing the technology to validate product image captions across its marketplace.

“We tended to see that there was a slighter preference toward egocentricity with GPT-4V, whereas we saw that Gemini was less biased in those ways and had more of an equitable approach to being able to judge different kinds of input-output pairs,” Kannappan explained.

The company’s roadmap aligns with what Kannappan describes as its “research vision towards scalable oversight”: developing evaluation mechanisms that can keep pace with increasingly sophisticated AI systems.

As businesses race to deploy AI systems that can interpret images, extract text from documents, and generate visual content, the risk of inaccuracies, hallucinations, and biases grows.

Read the full article on VentureBeat.

