EsoLang-Bench: Evaluating Genuine Reasoning in LLMs via Esoteric Languages


EsoLang-Bench: A benchmark of 80 problems across 5 esoteric languages to evaluate genuine reasoning in LLMs.
