LLM and Bug Finding: Insights from a $2M Winning Team in the White House's AIxCC
Introduction
Our team spent considerable time brainstorming how to accurately identify CWE categories, primarily through LLM prompts that combine crashing inputs, sanitizer reports, related code snippets, static analyzer outputs, and more.

One challenge in the Cyber Grand Challenge (CGC) was its focus on binaries: the organizers introduced rules such as a minimum number of changed bytes, and added performance overhead to the scoring rubric (e.g., the cost of instrumenting every memory access to prevent out-of-bounds errors).

To capture more attention from the DEF CON audience, it would help to expose more technical detail during the competition, such as showing each Cyber Reasoning System's (CRS's) current prompts in turn, its CPU usage, or even its stdout (for fun), along with explanations of the progress being made.
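As a rough illustration of the prompt-assembly idea, the sketch below combines a sanitizer report, a related code snippet, and static analyzer output into a single CWE-classification prompt. The prompt wording and the `build_cwe_prompt` helper are hypothetical assumptions for illustration, not the team's actual CRS prompts; a real system would send the result to an LLM and parse the returned CWE label.

```python
# Hypothetical sketch: assembling a CWE-classification prompt from crash
# artifacts. The wording below is illustrative, not the actual CRS prompt.
from textwrap import dedent


def build_cwe_prompt(sanitizer_report: str, code_snippet: str,
                     static_analysis: str = "") -> str:
    """Combine crash-time evidence into one classification prompt."""
    return dedent(f"""\
        You are a vulnerability triage assistant.
        Given the evidence below, identify the most likely CWE category
        (e.g., CWE-121 stack-based buffer overflow, CWE-416 use-after-free)
        and briefly justify your answer.

        === Sanitizer report ===
        {sanitizer_report}

        === Related code ===
        {code_snippet}

        === Static analyzer findings (may be empty) ===
        {static_analysis}
        """)


if __name__ == "__main__":
    report = "AddressSanitizer: heap-buffer-overflow WRITE of size 4 ..."
    snippet = "for (int i = 0; i <= n; i++) buf[i] = 0;"
    print(build_cwe_prompt(report, snippet))
    # A real CRS would pass this prompt to an LLM and parse the CWE label.
```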