GPT-4o's Memory Breakthrough – Needle in a Needlestack

Needle in a Needlestack (NIAN) is a new benchmark to measure how well LLMs pay attention to the information in their context window. NIAN creates a prompt that includes thousands of limericks and asks a question about one limerick at a specific location.

Even at the beginning of the prompt it could only answer the question correctly 50% of the time.

[Charts: open-mistral-7b at 16k tokens; open-mistral-7b at 32k tokens]

Repeating information can make a very big difference on this test.
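The prompt construction described above can be sketched roughly as follows. This is a minimal illustration, not the benchmark's actual code: the function name `build_nian_prompt`, its parameters, the `repeats` option, and the "Question:" suffix are all assumptions made for the example.

```python
def build_nian_prompt(limericks, needle, position, question, repeats=1):
    """Place `needle` at index `position` among filler `limericks`,
    optionally repeated, then append the question.

    Hypothetical helper: the real benchmark's prompt format may differ.
    """
    if not 0 <= position <= len(limericks):
        raise ValueError("position must be between 0 and len(limericks)")
    if repeats < 1:
        raise ValueError("repeats must be at least 1")

    # Fillers before the needle, the needle itself (possibly repeated),
    # then the remaining fillers.
    stack = limericks[:position] + [needle] * repeats + limericks[position:]

    # Separate limericks with blank lines and put the question last.
    return "\n\n".join(stack) + f"\n\nQuestion: {question}"


if __name__ == "__main__":
    fillers = [f"(filler limerick {i})" for i in range(5)]
    prompt = build_nian_prompt(
        fillers,
        needle="(the limerick the question is about)",
        position=2,
        question="(a question answerable only from that limerick)",
    )
    print(prompt)
```

Varying `position` across runs would show how accuracy depends on where the limerick sits in the prompt, and raising `repeats` is one simple way to try the repetition effect mentioned above.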
