Thefastest.ai
Benchmarks for the fastest AI models
Human conversations are fast, typically around 200 ms between turns, and we think LLMs should be just as quick.

Distributed Footprint: We run our tools daily in multiple data centers using Fly.io.

Try 3, Keep 1: For each provider, we run three separate inferences and keep the best result, which filters out outliers caused by queuing and similar delays.
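The "Try 3, Keep 1" rule can be illustrated with a minimal sketch. It assumes that "best" means the lowest measured latency, which the description does not state explicitly. The names `time_call`, `best_of`, and `fake_inference` are hypothetical and are not taken from the site's own tooling.

```python
import time
from typing import Callable


def time_call(fn: Callable[[], object]) -> float:
    """Return the wall-clock seconds taken by one call to fn."""
    start = time.perf_counter()
    fn()
    return time.perf_counter() - start


def best_of(measure: Callable[[], float], attempts: int = 3) -> float:
    """Run measure() `attempts` times and keep the lowest latency.

    Assumption: "best" is interpreted as the minimum, so a single slow
    run (for example, one delayed by queuing) does not affect the result.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    return min(measure() for _ in range(attempts))


if __name__ == "__main__":
    # Hypothetical stand-in for a provider request; replace with a real call.
    def fake_inference() -> None:
        time.sleep(0.01)

    latency = best_of(lambda: time_call(fake_inference))
    print(f"best of 3: {latency * 1000:.1f} ms")
```

Taking the minimum rather than the mean is what discards outliers here: one slow attempt cannot pull the reported number up, at the cost of reporting an optimistic figure.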