Top model scores may be skewed by Git history leaks in SWE-bench