Show HN: Qwen-2.5-32B is now the best open source OCR model


OCR Benchmark: the getomni-ai/benchmark repository on GitHub.

The evaluation dataset and methodologies are all open source, and we encourage expanding this benchmark to cover additional providers. Note that this scoring method heavily penalizes accurate text that does not conform to the exact layout of the ground truth data.

| Model Provider | Models | OCR | JSON Extraction | Required ENV Variables |
| --- | --- | --- | --- | --- |
| Anthropic | claude-3-5-sonnet-20241022 | ✅ | ✅ | ANTHROPIC_API_KEY |
| OpenAI | gpt-4o | ✅ | ✅ | OPENAI_API_KEY |
| Gemini | gemini-2.0-flash-001, gemini-1.5-pro, gemini-1.5-flash | ✅ | ✅ | GOOGLE_GENERATIVE_AI_API_KEY |
| Mistral | mistral-ocr | ✅ | ❌ | MISTRAL_API_KEY |
| OmniAI | omniai | ✅ | ✅ | OMNIAI_API_KEY, OMNIAI_API_URL |

| Model Provider | Models | OCR | JSON Extraction | Required ENV Variables |
| --- | --- | --- | --- | --- |
| Gemma 3 | google/gemma-3-27b-it | ✅ | ❌ | |
| Qwen 2.5 | qwen2.5-vl-32b-instruct, qwen2.5-vl-72b-instruct | ✅ | ❌ | |
| Llama 3.2 | meta-llama/Llama-3.2-11B-Vision-Instruct-Turbo, meta-llama/Llama-3.2-90B-Vision-Instruct-Turbo | ✅ | ❌ | |
| ZeroX | zerox | ✅ | ✅ | OPENAI_API_KEY |

| Model Provider | Models | OCR | JSON Extraction | Required ENV Variables |
| --- | --- | --- | --- | --- |
| AWS | aws-text-extract | ✅ | ❌ | AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_REGION |
| Azure | azure-document-intelligence | ✅ | ❌ | AZURE_DOCUMENT_INTELLIGENCE_ENDPOINT, AZURE_DOCUMENT_INTELLIGENCE_KEY |
| Google | google-document-ai | ✅ | ❌ | GOOGLE_LOCATION, GOOGLE_PROJECT_ID, GOOGLE_PROCESSOR_ID, GOOGLE_APPLICATION_CREDENTIALS_PATH |
| Unstructured | unstructured | ✅ | ❌ | UNSTRUCTURED_API_KEY |

LLMs are instructed with dedicated system prompts for OCR and JSON extraction.
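
Before running any provider, it can help to confirm that its credentials are set. The sketch below is one possible way to do that in Python; it is not part of the benchmark itself. The variable names are copied from the provider listing above, while the dictionary keys and the `missing_env` helper are hypothetical names chosen for illustration. Providers for which the listing shows no environment variables are left out.

```python
import os

# Required environment variables per provider, copied from the provider listing.
# The dictionary keys are illustrative labels, not identifiers used by the benchmark.
REQUIRED_ENV = {
    "anthropic": ["ANTHROPIC_API_KEY"],
    "openai": ["OPENAI_API_KEY"],
    "gemini": ["GOOGLE_GENERATIVE_AI_API_KEY"],
    "mistral": ["MISTRAL_API_KEY"],
    "omniai": ["OMNIAI_API_KEY", "OMNIAI_API_URL"],
    "zerox": ["OPENAI_API_KEY"],
    "aws": ["AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_REGION"],
    "azure": [
        "AZURE_DOCUMENT_INTELLIGENCE_ENDPOINT",
        "AZURE_DOCUMENT_INTELLIGENCE_KEY",
    ],
    "google": [
        "GOOGLE_LOCATION",
        "GOOGLE_PROJECT_ID",
        "GOOGLE_PROCESSOR_ID",
        "GOOGLE_APPLICATION_CREDENTIALS_PATH",
    ],
    "unstructured": ["UNSTRUCTURED_API_KEY"],
}


def missing_env(provider, env=None):
    """Return the required variables that are unset or empty for a provider.

    `env` defaults to the process environment; pass a dict to check other values.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_ENV[provider] if not env.get(name)]


if __name__ == "__main__":
    # Report, for every provider, which variables still need to be set.
    for provider in REQUIRED_ENV:
        missing = missing_env(provider)
        status = "ready" if not missing else "missing " + ", ".join(missing)
        print(f"{provider}: {status}")
```

Passing a plain dictionary as `env` makes the check easy to try without touching the real environment, for example `missing_env("omniai", {"OMNIAI_API_KEY": "x"})`.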
