Probably pay attention to tokenizers


Last week I was helping a friend of mine get one of his new apps off the ground. I can’t say much about it at the moment, other than that, like most apps nowadays, it has some AI sprinkled over it. OK, maybe more than just a bit; it depends on how you look at it, I suppose. There is retrieval-augmented generation (RAG) hiding somewhere in most AI apps. RAG is still all the RAGe: it even has its own Wikipedia page now! I’m not sure if anyone is tracking how fast a term gets its own Wikipedia page, but RAG must be somewhere near the top of the charts.

The app my friend has been building for the past couple of weeks deals with a lot of e-commerce data: descriptions of different items, invoices, reviews, etc. Another regular occurrence is users making typos in their queries and reviews, for example when talking to AI agents: “I hve received wrong pckage”. Curiously, just adding space characters to the end of a sentence makes the distance between the embeddings returned by OpenAI grow. This is a bit unexpected, and it has consequences for RAG, which typically retrieves documents by embedding distance.
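To make the trailing-space effect concrete, here is a minimal sketch of how one might measure it. It is not the setup used for the app: `padding_drift` and `toy_embed` are hypothetical names, and `toy_embed` is a character-count stand-in rather than a real embedding model. To reproduce the observation against an actual model, you would pass in your own function that wraps your embedding provider's client.

```python
import math
from typing import Callable, List, Sequence

ALPHABET = "abcdefghijklmnopqrstuvwxyz "  # letters plus the space character


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Return 1 - cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_a * norm_b)


def padding_drift(
    embed: Callable[[str], Sequence[float]],
    text: str,
    max_spaces: int = 3,
) -> List[float]:
    """Distance between embed(text) and embed(text + n spaces), for n = 0..max_spaces."""
    base = embed(text)
    return [
        cosine_distance(base, embed(text + " " * n))
        for n in range(max_spaces + 1)
    ]


def toy_embed(text: str) -> List[float]:
    """Hypothetical stand-in: per-character counts. NOT a real embedding model."""
    lowered = text.lower()
    return [float(lowered.count(ch)) for ch in ALPHABET]


if __name__ == "__main__":
    query = "I hve received wrong pckage"
    for n, dist in enumerate(padding_drift(toy_embed, query)):
        print(f"{n} trailing space(s): distance = {dist:.6f}")
```

With the toy embedder the distance grows with each extra space simply because the space count grows. A real model may behave differently, so treat this only as a harness for checking your own embeddings. If the drift matters for retrieval, stripping trailing whitespace before embedding is one cheap thing to try.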
