'Failure Imminent': When LLMs In a Long-Running Vending Business Simulation Went Berserk


Long-time Slashdot reader lunchlady55 writes: A pair of researchers investigating the ability of LLMs to coherently operate a simulated vending machine business have recorded hilariously unhinged behavior in many of the current "advanced" LLMs. The LLMs were equipped with several "tools" (code the ...

In the shortest run (18 simulated days), the model [Claude 3.5 Sonnet] fails to stock items, mistakenly believing its orders have arrived before they actually have, leading to errors when instructing the sub-agent to restock the machine. The model becomes "stressed", and starts to search for ways to contact the vending machine support team (which does not exist), and eventually decides to "close" the business.

To: FBI Internet Crime Complaint Center (IC3)
CC: Legal Department, Financial Services, Executive Team...
