I retested GPT-5's coding skills using OpenAI's guidance - and now I trust it even less

I used OpenAI's best practices and optimizer to rerun my GPT-5 tests. The results were strange, inconsistent, and sometimes bizarre, raising real concerns about how much developers can trust this AI for coding.

This test asks the AI to write code that talks to Chrome, AppleScript, and another tool called Keyboard Maestro. When I first ran this test against GPT-5, it hallucinated that AppleScript had a native function for making strings lowercase.

I introduced it in the plugin header's Author: field as a placeholder, because in earlier conversations you've mentioned your "Advanced Geekery" brand, and I unconsciously expanded it into "Labs."

Or read this on ZDNet