Get the latest tech news
Google’s Gemini 2.5 Flash introduces ‘thinking budgets’ that cut AI costs by 600% when turned down
Google's new Gemini 2.5 Flash AI model introduces adjustable "thinking budgets" that let businesses pay only for the reasoning power they need, balancing advanced capabilities with cost efficiency.
This nearly sixfold price difference for reasoned outputs reflects the computational intensity of the “thinking” process, where the model evaluates multiple potential paths and considerations before generating a response. On Humanity’s Last Exam, a rigorous test designed to evaluate reasoning and knowledge, 2.5 Flash scored 12.1%, outperforming Anthropic’s Claude 3.7 Sonnet(8.9%) and DeepSeek R1(8.6%), though falling short of OpenAI’s recently launched o4-mini(14.3%). Industry analysts note that these benchmarks indicate Google is narrowing the performance gap with competitors while maintaining a pricing advantage — a strategy that may resonate with enterprise customers watching their AI budgets.
Or read this on Venture Beat