Alibaba researchers unveil Marco-o1, an LLM with advanced reasoning capabilities

The model uses more cycles during inference to generate more tokens and review responses, improving its performance on reasoning tasks.

Building on the success of OpenAI's o1 and the concept of large reasoning models (LRMs), researchers at Alibaba have introduced Marco-o1, which enhances reasoning capabilities and tackles problems with open-ended solutions, where clear standards and quantifiable rewards are absent. The researchers also introduced a flexible reasoning action strategy that lets them adjust the granularity of Monte Carlo Tree Search (MCTS) steps by defining the number of tokens generated at each node in the tree. To probe the model on open-ended problems, the researchers tested it on translating colloquial and slang expressions, a task that requires understanding subtle nuances of language, culture and context.
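To make the granularity idea concrete, here is a minimal sketch of how a token budget per node changes the number of steps a search tree would have to branch over. It is not Marco-o1's implementation and it does not perform MCTS; the function name `split_into_nodes`, the whitespace "tokenizer" and the budget values are all assumptions chosen for illustration.

```python
from typing import List


def split_into_nodes(tokens: List[str], tokens_per_node: int) -> List[List[str]]:
    """Group a token sequence into consecutive steps of at most `tokens_per_node` tokens.

    Hypothetical helper: each returned group stands in for one node along a
    single path of a search tree. A smaller budget gives more, finer steps.
    """
    if tokens_per_node <= 0:
        raise ValueError("tokens_per_node must be positive")
    return [
        tokens[i:i + tokens_per_node]
        for i in range(0, len(tokens), tokens_per_node)
    ]


# Toy reasoning trace, split on whitespace instead of a real tokenizer (assumption).
tokens = "first add the two numbers then divide the sum by three".split()

coarse = split_into_nodes(tokens, tokens_per_node=6)  # few, large steps
fine = split_into_nodes(tokens, tokens_per_node=2)    # many, small steps

print(len(coarse), "coarse steps;", len(fine), "fine steps")
```

With a smaller budget per node, the same trace is cut into more steps, which gives a tree search more points at which it can branch or revise, at the cost of a deeper tree.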

Read this on VentureBeat