RouteLLM: A framework for serving and evaluating LLM routers
Save LLM costs without compromising quality (lm-sys/RouteLLM).
Trained routers are provided out of the box, which have been shown to reduce costs by up to 85% on widely-used benchmarks such as MT Bench while maintaining 95% of GPT-4 performance. Requests will be routed between the strong and weak model depending on what the query requires, saving costs while maintaining a high quality of responses. Depending on your use case, you might want to consider using a different model pair, modifying the configuration, or calibrating the thresholds based on the types of queries you receive to improve performance.
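The threshold-calibration idea above can be sketched in plain Python. This is a minimal illustration of the concept, not RouteLLM's actual API: the names `score_query` and `route` are hypothetical, and a real RouteLLM router is a model trained on preference data rather than a hand-written heuristic.

```python
# Hypothetical sketch of threshold-based LLM routing (not RouteLLM's API).
STRONG_MODEL = "gpt-4"       # placeholder name for the strong model
WEAK_MODEL = "mixtral-8x7b"  # placeholder name for the weak model

def score_query(query: str) -> float:
    """Toy difficulty score in [0, 1]. A trained router would instead
    predict how likely the weak model is to answer inadequately."""
    hard_markers = ("prove", "derive", "step by step", "edge case")
    hits = sum(marker in query.lower() for marker in hard_markers)
    return min(1.0, 0.25 * hits + (0.1 if len(query) > 200 else 0.0))

def route(query: str, threshold: float = 0.3) -> str:
    """Send the query to the strong model only when its score exceeds
    the calibrated threshold; otherwise use the cheaper weak model."""
    return STRONG_MODEL if score_query(query) > threshold else WEAK_MODEL
```

Calibrating `threshold` against a sample of your own traffic is what trades off cost against quality: a lower threshold routes more queries to the strong model.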