AWS brings prompt routing and caching to its Bedrock LLM service
As businesses move from trying out generative AI in limited prototypes to putting it into production, they are becoming increasingly price conscious.
“So basically, you want to create this notion of ‘Hey, at run time, based on the incoming prompt, send the right query to the right model,’” Deo explained. Startups like Martian and a number of open source projects also tackle this problem, but AWS would likely argue that what differentiates its offering is that the router can intelligently direct queries without a lot of human input.

Since customers are asking the company to support these models, AWS is also launching a marketplace for them. The one major difference is that users will have to provision and manage the capacity of their infrastructure themselves, a task that Bedrock typically handles automatically.
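The routing idea Deo describes can be sketched in a few lines. This is not the Bedrock API and not AWS's actual routing logic; it is a minimal illustration, with a made-up complexity heuristic, of dispatching each incoming prompt to the cheapest model tier that can plausibly handle it.

```python
# Hypothetical sketch of prompt routing: pick a model tier per prompt.
# The heuristic and model names are invented for illustration only.

def estimate_complexity(prompt: str) -> float:
    """Crude heuristic: longer, question-dense prompts score higher."""
    words = len(prompt.split())
    questions = prompt.count("?")
    return words / 50.0 + questions * 0.5

def route(prompt: str, threshold: float = 1.0) -> str:
    """Return the model tier that should handle this prompt."""
    if estimate_complexity(prompt) >= threshold:
        return "large-model"   # more capable, more expensive
    return "small-model"       # cheaper, adequate for simple queries

print(route("What time is it?"))   # a short query stays on the small model
print(route("Summarize and compare these ten papers... " * 20))
```

A production router would replace the heuristic with a learned classifier trained on which model tiers answered which prompts acceptably, which is presumably where the "without a lot of human input" claim comes in.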