I got the highest score on ARC-AGI again swapping Python for English
Using Multi-Agent Collaboration with Evolutionary Test-Time Compute
It's surprising that LLMs can win the math olympiad yet struggle with simple puzzles that humans solve easily. ARC-AGI presents novel patterns through a few examples, then challenges the test-taker to continue the sequence, measuring the ability to identify and generalize underlying rules never encountered before. The transformations are often too complex to express elegantly in Python: they require nuanced pattern recognition and contextual understanding, and encoding them as code tends to produce unwieldy, brittle programs.
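To make the task format concrete, here is a minimal sketch of an ARC-style puzzle: a few input/output example grids plus one held-out input. The toy task, the `mirror` rule, and the `fits_examples` helper are all hypothetical illustrations, not taken from the benchmark or from the author's solution. This rule is deliberately easy to write in Python; the point of the passage above is that many real tasks are not.

```python
from typing import Callable, List, Tuple

Grid = List[List[int]]  # each cell is a small integer standing for a color

# Hypothetical toy task: every output is the input mirrored left-to-right.
train_pairs: List[Tuple[Grid, Grid]] = [
    ([[1, 0, 0], [2, 0, 0]], [[0, 0, 1], [0, 0, 2]]),
    ([[3, 4], [0, 5]], [[4, 3], [5, 0]]),
]
test_input: Grid = [[7, 0, 8]]


def mirror(grid: Grid) -> Grid:
    """Candidate rule: reverse every row."""
    return [row[::-1] for row in grid]


def fits_examples(rule: Callable[[Grid], Grid],
                  pairs: List[Tuple[Grid, Grid]]) -> bool:
    """A rule is only plausible if it reproduces every example output."""
    return all(rule(inp) == out for inp, out in pairs)


# Apply the rule to the held-out input only if it explains all the examples.
prediction = mirror(test_input) if fits_examples(mirror, train_pairs) else None
```

A solver has to infer the rule from the examples alone, so checking a candidate against every example pair before trusting it on the test input is the natural filter, whether the candidate is written as Python or as English instructions.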