AI isn’t ready to replace human coders for debugging, researchers say | Even when given access to tools, AI agents can't reliably debug software.

Debug-gym expands an agent’s action and observation space with feedback from tool usage, letting it set breakpoints, navigate code, print variable values, and create test functions. The fixes proposed by a coding agent with debugging capabilities, and then approved by a human programmer, will be grounded in the context of the relevant codebase, program execution, and documentation, rather than relying solely on guesses based on previously seen training data. Numerous studies have already shown that although an AI tool can sometimes create an application that seems acceptable to the user for a narrow task, the models tend to produce code laden with bugs and security vulnerabilities, and they generally aren’t capable of fixing those problems.
