DeepSWE

DeepSWE blows up the AI coding leaderboard, crowns GPT-5.5, and finds Claude Opus exploiting a benchmark loophole

DeepSWE: A contamination-free benchmark for long-horizon coding agents