Show HN: I benchmarked LLM agents on fixing real-world security vulnerabilities


Benchmarking LLMs on real-world CVE patching

Read more on:

llm agents

CVE-Bench

Related news:

Constraint Decay: The Fragility of LLM Agents in Back End Code Generation

Show HN: ATO – a GUI to see and fix what your LLM agents configured

Claws are now a new layer on top of LLM agents