Get the latest tech news

Claude can sometimes prove it


Let me get right to the point without any nonsense about aliens: Interactive theorem proving tools such as Lean are the most powerful and trustworthy kind of formal methods tool. They have been used to formally verify important things such as cryptographic libraries, compilers, and operating systems.

It requires the ability to juggle abstractions as in pen-and-paper mathematics, a willingness to wrangle complex constraints in the exotic languages used by ITP tools, and a tolerance for microscopic pedantry which most people find almost impossible to summon. Proofs of individual theorems are important, but much of the work with Lean involves adjacent tasks such as selecting abstractions, refactoring code, analyzing requirements, fixing failures. Write a TODO list for itself describing how to answer the request Search the local codebase for the file that contains compose_is_assoc Read the surrounding context for the theorem in the file Run lake build <filename> to confirm that the proof actually failing, and review the resulting error messages Find and read the local definitions of the types involved with the theorem Look at the online documentation for Lean types that come from standard libraries Think about what might resolve the issue (entertainingly, ultrathink is a magic keyword that increases the agent’s ‘thinking budget’) Write a response proposing a high-level fix, and then ask the user for permission to edit the file Rerun the build to check the fix worked

Get the Android app

Or read this on Hacker News

Read more on:

Photo of Claude

Claude

Related news:

News photo

Anthropic Denies Federal Agencies Use of Claude for Surveillance Tasks

News photo

Anthropic Refuses Federal Agencies From Using Claude for Surveillance Tasks

News photo

How people actually use ChatGPT vs Claude - and what the differences tell us