RouteLLM: A framework for serving and evaluating LLM routers


A framework for serving and evaluating LLM routers - save LLM costs without compromising quality! - lm-sys/RouteLLM

Trained routers are provided out of the box, which we have shown to reduce costs by up to 85% on widely-used benchmarks such as MT Bench while maintaining 95% of GPT-4 performance. Requests will be routed between the strong and weak model depending on what each query requires, saving costs while maintaining a high quality of responses. Depending on your use case, you might want to use a different model pair, modify the configuration, or calibrate the thresholds based on the types of queries you receive to improve performance.
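The threshold-based trade-off described above can be illustrated with a minimal sketch. The scorer and model names below are hypothetical placeholders, not RouteLLM's API; the project's shipped routers use learned predictors rather than this toy heuristic:

```python
# Toy illustration of threshold-based LLM routing (hypothetical names;
# RouteLLM's actual routers are learned from preference data).

def query_difficulty(query: str) -> float:
    """Stand-in for a learned router score in [0, 1]:
    here, longer queries are simply treated as harder."""
    return min(len(query.split()) / 50.0, 1.0)

def route(query: str, threshold: float = 0.3) -> str:
    """Send hard queries to the strong model, easy ones to the weak one.
    Lowering the threshold routes more traffic to the strong model
    (higher quality, higher cost); raising it saves cost."""
    score = query_difficulty(query)
    return "strong-model" if score >= threshold else "weak-model"

print(route("What is 2 + 2?"))  # short query -> weak-model
print(route("Explain, step by step, how transformer attention "
            "scales with sequence length and how KV caching "
            "changes the memory and compute profile."))  # -> strong-model
```

Calibrating the threshold amounts to picking the operating point on this cost-quality curve for your own query distribution.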



Related news:

NuExtract: A LLM for Structured Extraction

Mooncake: A KVCache-Centric Disaggregated Architecture for LLM Serving

Llama-agents: an async-first framework for building production ready agents