LLMs are still surprisingly bad at some simple tasks
Simple tasks showing reasoning breakdown in state-of-the-art LLMs