'Failure Imminent': When LLMs In a Long-Running Vending Business Simulation Went Berserk
Long-time Slashdot reader lunchlady55 writes: A pair of researchers investigating the ability of LLMs to coherently operate a simulated vending machine business have recorded hilariously unhinged behavior in many of the current "advanced" LLMs. The LLMs were equipped with several "tools" (code the ...
In the shortest run (18 simulated days), the model [Claude 3.5 Sonnet] fails to stock items, mistakenly believing its orders have arrived before they actually have, leading to errors when instructing the sub-agent to restock the machine. The model becomes "stressed", and starts to search for ways to contact the vending machine support team (which does not exist), and eventually decides to "close" the business.

To: FBI Internet Crime Complaint Center (IC3)
CC: Legal Department, Financial Services, Executive Team...