
Less is more: UC Berkeley and Google unlock LLM potential through simple sampling


With multiple sampling and self-verification, Gemini 1.5 Pro can outperform o1-preview in reasoning tasks.

The core finding is that even a minimalist implementation of sampling-based search, using random sampling and self-verification, can lift the reasoning performance of models like Gemini 1.5 Pro above that of o1-preview on popular benchmarks. The currently dominant approach to test-time scaling in LLMs is to train the model, via reinforcement learning, to generate longer responses with chain-of-thought (CoT) traces. Sampling-based search offers a simpler and highly scalable alternative: let the model generate multiple responses and select the best one through a verification mechanism.
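In a minimal sketch, the loop is just: sample k candidate responses at a nonzero temperature, ask the model to score each one (self-verification), and return the top-scoring candidate. The `generate` and `verify` functions below are hypothetical stand-ins for real LLM calls, stubbed so the sketch runs without an API; only the selection logic reflects the method described above.

```python
import random

# Hypothetical stand-ins for real LLM calls (not from the paper's code):
# generate() samples one candidate response, verify() scores a response.
# Both are stubbed with a seeded RNG so the sketch is runnable offline.
def generate(prompt: str, rng: random.Random) -> str:
    return f"candidate answer {rng.randint(0, 9)}"

def verify(prompt: str, response: str, rng: random.Random) -> float:
    # In the actual setup the model itself judges each response;
    # a random score is a placeholder for that judgment.
    return rng.random()

def sampling_based_search(prompt: str, k: int = 8, seed: int = 0) -> str:
    """Minimal sampling-based search: draw k responses, self-verify
    each, and return the highest-scoring candidate."""
    rng = random.Random(seed)
    candidates = [generate(prompt, rng) for _ in range(k)]
    scores = [verify(prompt, c, rng) for c in candidates]
    best = max(range(k), key=lambda i: scores[i])
    return candidates[best]

answer = sampling_based_search("What is 17 * 24?", k=4)
```

The appeal is that this scales trivially: increasing `k` buys more accuracy with no retraining, at the cost of more inference calls.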


Or read this on Venture Beat

Read more on: Google, UC Berkeley, simple sampling

Related news:


Low RAM on the Pixel 9a pushed Google toward a 'limited' version of Gemini


Fake Semrush ads used to steal SEO professionals’ Google accounts


Google says its European 'experiment' shows news is worthless to its ad business