Optimizing AI Inference at Character.AI

At Character.AI, we're building toward AGI. In that future state, large language models (LLMs) will enhance daily life, providing business productivity and entertainment and helping people with everything from education to coaching, support, brainstorming, creative writing and more. To make that a reality globally, it's critical to achieve highly

For perspective, our inference request volume is roughly 20% of that served by Google Search, which processes around 105,000 queries per second according to third-party estimates (Statista, 2024). That puts our volume on the order of 20,000 queries per second. Taken together, the innovations discussed above achieve unprecedented efficiency and reduce inference costs to a level that makes it far easier to serve LLMs at scale.