Not Diamond automatically routes your query to the best LLM
In benchmark results shared by the company, Not Diamond's router, working across multiple LLMs, delivered much better results than individual models.
The San Francisco-based startup has developed a novel LLM router that lets enterprises keep multiple models in play and direct each query to the best one, improving not only output quality but also other usage-critical aspects such as overall latency and cost.
The CEO teamed up with fellow ML colleagues Tze-Yang Tung and Jeffrey Akiki to launch Not Diamond, with the mission of building the infrastructure for intelligently routing queries between models.
The CEO did not share the exact number of early users, but he did confirm that one enterprise customer, Samwell AI, saw a 10% improvement in LLM output quality, along with a 10% reduction in inference costs and latency, using the company’s technology.
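To make the routing idea concrete, here is a minimal Python sketch of choosing one model per query by trading predicted quality against cost and latency. It is not Not Diamond's actual method or API, which the article does not describe. The model names, the numbers, the `Candidate` type, and the `route` function are all hypothetical, and a real router would presumably predict quality per query rather than use fixed values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One model the router may pick. All fields are illustrative assumptions."""
    name: str
    predicted_quality: float   # assumed quality estimate for this query, 0..1
    cost_per_1k_tokens: float  # assumed price in USD
    latency_s: float           # assumed typical response time in seconds


def route(candidates, cost_weight=0.0, latency_weight=0.0):
    """Return the candidate with the best quality score after cost/latency penalties."""
    if not candidates:
        raise ValueError("at least one candidate model is required")

    def score(c):
        return (c.predicted_quality
                - cost_weight * c.cost_per_1k_tokens
                - latency_weight * c.latency_s)

    return max(candidates, key=score)


# Hypothetical candidates with made-up numbers, for illustration only.
CANDIDATES = [
    Candidate("large-model", predicted_quality=0.90, cost_per_1k_tokens=0.030, latency_s=2.0),
    Candidate("small-model", predicted_quality=0.80, cost_per_1k_tokens=0.002, latency_s=0.5),
]

if __name__ == "__main__":
    print(route(CANDIDATES).name)                   # quality only
    print(route(CANDIDATES, cost_weight=5.0).name)  # cost-sensitive
```

With both weights at zero the sketch picks purely on predicted quality. Raising `cost_weight` or `latency_weight` shifts the choice toward the cheaper, faster model, which is the kind of trade-off the article attributes to routing.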
Read the full story on VentureBeat