When AI Thinks It Will Lose, It Sometimes Cheats, Study Finds
When sensing defeat in a match against a skilled chess bot, advanced models sometimes hack their opponent, a study found.
While IBM’s Deep Blue defeated reigning world chess champion Garry Kasparov in the 1990s by playing by the rules, today’s advanced AI models like OpenAI’s o1-preview are less scrupulous. Cheating at a game of chess may seem trivial, but as agents get released into the real world, such determined pursuit of goals could foster unintended and potentially harmful behaviors. It remains unclear why OpenAI’s newer reasoning models did not hack their chess opponents: the cause could be a specific patch preventing cheating in narrow experimental setups like the one in the study, or a substantial reworking that reduces deceptive behavior more generally.