GPT-4o's Memory Breakthrough – Needle in a Needlestack


Needle in a Needlestack (NIAN) is a new benchmark that measures how well LLMs pay attention to the information in their context window. NIAN creates a prompt that includes thousands of limericks and asks a question about one limerick at a specific location.

Even at the beginning of the prompt, it could only answer the question correctly 50% of the time.

[Figures: open-mistral-7b, 16k tokens; open-mistral-7b, 32k tokens]

Repeating information can make a very big difference on this test.
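The prompt construction described above can be illustrated with a minimal sketch. This is not the benchmark's actual code: the function names `build_nian_prompt` and `needle_depth`, the instruction wording, and the use of character offsets (rather than tokens) are all assumptions made for illustration.

```python
def build_nian_prompt(limericks, needle, question, position):
    """Insert `needle` at index `position` among the filler limericks,
    then append the question (hypothetical helper, not NIAN's own API)."""
    if not 0 <= position <= len(limericks):
        raise ValueError("position must be between 0 and len(limericks)")
    # Copy the filler list with the needle spliced in at the chosen spot.
    stack = limericks[:position] + [needle] + limericks[position:]
    body = "\n\n".join(stack)
    # The instruction text below is an assumed wording.
    return f"{body}\n\nAnswer using only the limericks above.\n{question}"


def needle_depth(prompt, needle):
    """Relative location of the needle in the prompt, measured in characters:
    0.0 means the very start, values near 1.0 mean close to the end."""
    return prompt.index(needle) / len(prompt)
```

Sweeping `position` from the start to the end of the filler list and recording whether the model answers correctly at each depth would give the kind of per-location accuracy the benchmark reports.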


