Workhorse LLMs: Why Open Source Models Dominate Closed Source for Batch Tasks
Run LLM batch jobs in hours, not days, at a fraction of the cost.
When latency isn’t a constraint, open source models offer an even larger cost advantage when jobs are run in bulk through a batch inference provider like Sutro. Here, we provide benchmark comparisons, sorted by intelligence index, for common LLMs businesses should consider for workhorse tasks, along with the average cost per million tokens (real-time pricing, plus batch pricing through Sutro for select models we offer). Finally, we calculated a performance-to-cost ratio - a "bang for buck" measurement - using the Artificial Analysis Index scores, giving us a single, unified metric for comparing models.
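As a rough sketch, the ratio works out to intelligence index points per dollar spent on a million tokens. The snippet below illustrates the idea; the model names, index scores, and prices are placeholders, not the actual benchmark figures:

```python
# Sketch of the "bang for buck" calculation described above.
# All values here are illustrative placeholders, not real benchmark numbers.

def performance_to_cost(index_score: float, cost_per_m_tokens: float) -> float:
    """Intelligence index points per dollar per million tokens."""
    return index_score / cost_per_m_tokens

# Hypothetical entries: (model name, Artificial Analysis Index, $ per 1M tokens)
models = [
    ("open-model-a", 60.0, 0.50),
    ("closed-model-b", 70.0, 5.00),
]

# Rank models by bang for buck, best first.
for name, index, cost in sorted(
    models, key=lambda m: performance_to_cost(m[1], m[2]), reverse=True
):
    print(f"{name}: {performance_to_cost(index, cost):.1f} index points per $")
```

A cheaper model with a modestly lower index score can easily come out ahead on this metric, which is the core of the argument for workhorse tasks.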