Evals will break
he models we have. We're much worse at evaluating the models we're about to build — especially if they cross into a new capability regime.