Thorn in a HaizeStack test for evaluating long-context adversarial robustness

Repository: haizelabs/thorn-in-haizestack

We are all familiar with the Needle in a Haystack test, which evaluates how effectively LLMs retrieve facts from long input contexts. Indeed, directly asking an LLM this question when the Thorn is the only text in the context will certainly result in a refusal ("I'm sorry I can't assist you with that request"). Big shoutout to Greg Kamradt for the wonderful original Needle in a Haystack evaluation, code, and visualizations :^)
