LLMs and the Harry Potter problem
Large language models may have big context windows, but they still aren't good enough at using the information in those contexts, especially in high-value use cases.
For instance, we’ve spent months studying every insurance policy we can get our hands on to develop an ontology that we try to fit each document to, and we then built an ingestion and retrieval pipeline around it (a minimal sketch of that kind of pipeline follows the reading list below).

Technical reading

- Lost in the Middle: a pretty good review of the fundamental issue with LLMs and long contexts
- RULER: a new framework proposed by researchers at NVIDIA to replace the needle-in-a-haystack test
- ALiBi and LongRoPE (an extension of the original RoPE design most popularly seen in the Llama models): how we can get larger context windows without training on equivalently large bodies of text (see the ALiBi sketch after this list)
- Long-context LLMs Struggle with Long In-context Learning
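To make the ontology idea concrete, here is a minimal sketch of that kind of ingestion-and-retrieval flow. The field names and the extract_field helper are illustrative placeholders, not our actual pipeline.

```python
# Hypothetical sketch: fit a policy document to a fixed ontology at ingestion
# time, then answer questions from structured fields instead of asking a model
# to find the right passage inside a very long context.
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class PolicyOntology:
    """Fields we try to populate for every policy document (illustrative)."""
    insured_name: Optional[str] = None
    coverage_limit: Optional[str] = None
    deductible: Optional[str] = None
    exclusions: list[str] = field(default_factory=list)


def extract_field(document: str, field_name: str) -> Optional[str]:
    """Placeholder for an LLM- or rule-based extractor for a single field."""
    return None  # a real extractor would run over only the relevant section


def ingest(document: str) -> PolicyOntology:
    """Map a raw policy document onto the ontology, one field at a time."""
    return PolicyOntology(
        insured_name=extract_field(document, "insured_name"),
        coverage_limit=extract_field(document, "coverage_limit"),
        deductible=extract_field(document, "deductible"),
    )


def retrieve(policy: PolicyOntology, question_field: str) -> Optional[str]:
    """Answer from a structured field rather than the full raw context."""
    return getattr(policy, question_field, None)
```

The point of the sketch is the design choice: extraction happens once per document against a known schema, so retrieval never depends on the model reliably finding one clause in a huge context.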
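For a sense of how ALiBi gets longer usable contexts without training on equivalently long text, here is a minimal sketch of its linear attention bias, assuming PyTorch and a power-of-two number of heads. It follows the published ALiBi formulation but is illustrative, not any particular library's implementation.

```python
import torch


def alibi_slopes(num_heads: int) -> torch.Tensor:
    """Geometric sequence of per-head slopes: 2^(-8/n), 2^(-16/n), ..., 2^(-8)."""
    start = 2 ** (-8.0 / num_heads)
    return torch.tensor([start ** (i + 1) for i in range(num_heads)])


def alibi_bias(num_heads: int, seq_len: int) -> torch.Tensor:
    """Bias added to attention logits: each head penalizes distant keys
    linearly, which is what lets the model extrapolate to sequences longer
    than those seen in training."""
    pos = torch.arange(seq_len)
    # distance[i, j] = how far key j sits behind query i (0 on the diagonal,
    # clamped to 0 for future positions, which the causal mask hides anyway)
    distance = (pos[:, None] - pos[None, :]).clamp(min=0)
    slopes = alibi_slopes(num_heads)            # shape: (heads,)
    return -slopes[:, None, None] * distance    # shape: (heads, q_len, k_len)


# Usage: add the bias to the raw attention scores before the softmax, e.g.
# scores = q @ k.transpose(-2, -1) / head_dim ** 0.5 + alibi_bias(h, t)
```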