Get the latest tech news
Patronus AI’s Judge-Image wants to keep AI honest — and Etsy is already using it
Patronus AI launches the first multimodal LLM-as-a-Judge for evaluating AI systems that process images, with Etsy already implementing the technology to validate product image captions across its marketplace.
“We tended to see that there was a slighter preference toward egocentricity with GPT-4V, whereas we saw that Gemini was less biased in those ways and had more of an equitable approach to being able to judge different kinds of input-output pairs,” Kannappan explained. This roadmap aligns with what Kannappan describes as the company’s “research vision towards scalable oversight” — developing evaluation mechanisms that can keep pace with increasingly sophisticated AI systems. As businesses race to deploy AI systems that can interpret images, extract text from documents, and generate visual content, the risk of inaccuracies, hallucinations and biases grows.
Or read this on Venture Beat