
UCSD: Large Language Models Pass the Turing Test


We evaluated 4 systems (ELIZA, GPT-4o, LLaMa-3.1-405B, and GPT-4.5) in two randomised, controlled, and pre-registered Turing tests on independent populations. Participants had 5-minute conversations simultaneously with another human participant and one of these systems before judging which conversational partner they thought was human. When prompted to adopt a humanlike persona, GPT-4.5 was judged to be the human 73% of the time: significantly more often than interrogators selected the real human participant. LLaMa-3.1, with the same prompt, was judged to be the human 56% of the time -- not significantly more or less often than the humans it was being compared to -- while baseline models (ELIZA and GPT-4o) achieved win rates significantly below chance (23% and 21% respectively). The results constitute the first empirical evidence that any artificial system passes a standard three-party Turing test. The results have implications for debates about what kind of intelligence is exhibited by Large Language Models (LLMs), and the social and economic impacts these systems are likely to have.
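To make "significantly above or below chance" concrete, here is a minimal sketch of an exact two-sided binomial test against a 50% chance rate. The abstract does not give sample sizes or name the statistical test the authors used, so the 100 trials per system below are purely hypothetical, and the function name `binom_two_sided_p` is an illustrative choice, not something from the paper.

```python
from math import comb


def binom_two_sided_p(successes: int, trials: int, p: float = 0.5) -> float:
    """Exact two-sided binomial p-value.

    Sums the probability of every outcome that is no more likely
    than the observed one under the null hypothesis (win rate == p).
    """
    pmf = [comb(trials, k) * p**k * (1 - p) ** (trials - k)
           for k in range(trials + 1)]
    observed = pmf[successes]
    # Small relative tolerance so ties are not lost to float rounding.
    total = sum(q for q in pmf if q <= observed * (1 + 1e-9))
    return min(1.0, total)


# HYPOTHETICAL counts: 100 trials per system is an assumption made only
# for illustration; the win rates are the percentages quoted in the abstract.
HYPOTHETICAL_TRIALS = 100
win_rates = {"GPT-4.5": 73, "LLaMa-3.1": 56, "ELIZA": 23, "GPT-4o": 21}

for system, wins in win_rates.items():
    p_value = binom_two_sided_p(wins, HYPOTHETICAL_TRIALS)
    verdict = "differs from chance" if p_value < 0.05 else "consistent with chance"
    print(f"{system}: {wins}/{HYPOTHETICAL_TRIALS} -> p = {p_value:.3g} ({verdict})")
```

With these assumed counts, a 56% win rate is not distinguishable from a coin flip, while 73%, 23%, and 21% are. The real conclusions depend on the study's actual sample sizes and analysis.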

Paper: Large Language Models Pass the Turing Test, by Cameron R. Jones and Benjamin K. Bergen.

