Google launches ‘implicit caching’ to make accessing its latest AI models cheaper


Google is rolling out a feature in its Gemini API, implicit caching, that the company claims will make its latest AI models cheaper for third-party devs.

Caching, a widely adopted practice in the AI industry, reuses frequently accessed or pre-computed data from models to cut down on computing requirements and cost. According to Google’s developer documentation, the minimum prompt token count for implicit caching is 1,024 for 2.5 Flash and 2,048 for 2.5 Pro. That is not a high bar, so it shouldn’t take much to trigger these automatic savings.
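For developers who want to sanity-check their own prompts against those minimums, here is a minimal sketch in Python. It assumes the token count has already been obtained some other way. The table keys and the function name `meets_implicit_cache_minimum` are hypothetical labels chosen for illustration, not part of Google's API.

```python
# Minimum prompt sizes for implicit caching, as reported in the article.
# The keys are illustrative labels, not official model identifiers.
MIN_PROMPT_TOKENS = {
    "2.5 Flash": 1024,
    "2.5 Pro": 2048,
}


def meets_implicit_cache_minimum(model: str, prompt_tokens: int) -> bool:
    """Return True if the prompt is at least as long as the reported minimum.

    Hypothetical helper: it only compares against the thresholds above and
    makes no call to any Google service.
    """
    return prompt_tokens >= MIN_PROMPT_TOKENS[model]


if __name__ == "__main__":
    print(meets_implicit_cache_minimum("2.5 Flash", 1500))  # True
    print(meets_implicit_cache_minimum("2.5 Pro", 1500))    # False
```

The check covers only the size threshold quoted above. It does not predict whether a particular request will actually be served from cache.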

Read the full article on TechCrunch.
