LLMs are still surprisingly bad at some simple tasks


I asked three different commercially available LLMs the same question: Which TLDs have the same name as valid HTML5 elements? This is a pretty simple question to answer: take two lists and compare them. I know it can be answered because I went through the lists myself two years ago. Answering the question was a little tedious and relied on my tired human eyes making no mistakes. So…

If an intern showed the same attention to detail as above, we'd be having a cosy little chat about their attitude to work. AI seems plausible because it relies on the Barnum effect: it tells people what they want to hear.
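For reference, the "take two lists and compare them" step can be sketched as a set intersection. This is only a minimal illustration, not the method used for the original manual comparison: the function names `normalise` and `common_names` are hypothetical, and the two sample lists are a tiny hand-picked subset of real TLDs and HTML elements, not the full IANA list or the full set of elements in the HTML specification.

```python
def normalise(names):
    """Lower-case each name and drop surrounding dots or angle brackets."""
    cleaned = (n.strip().strip(".<>").lower() for n in names)
    return {n for n in cleaned if n}


def common_names(tlds, elements):
    """Return, in sorted order, the names present in both lists."""
    return sorted(normalise(tlds) & normalise(elements))


# Illustrative subsets only (assumption: the real inputs would be the
# complete TLD list and the complete list of HTML elements).
SAMPLE_TLDS = ["COM", "ORG", "UK", "BR", "HR", "LI", "AUDIO", "MENU", "VIDEO"]
SAMPLE_ELEMENTS = ["a", "audio", "br", "div", "hr", "li", "menu",
                   "span", "table", "video"]

if __name__ == "__main__":
    print(common_names(SAMPLE_TLDS, SAMPLE_ELEMENTS))
    # prints ['audio', 'br', 'hr', 'li', 'menu', 'video']
```

Normalising both inputs first means differences in case or decoration (`.VIDEO` versus `<video>`) do not hide a match, which is exactly the kind of slip tired eyes are prone to.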
