I retested GPT-5's coding skills using OpenAI's guidance - and now I trust it even less


I used OpenAI's best practices and optimizer to rerun my GPT-5 tests. The results were strange, inconsistent, and sometimes bizarre, raising real concerns about how much developers can trust this AI for coding.

This test asks the AI to write code that talks to Chrome, AppleScript, and another tool called Keyboard Maestro. When I first ran this test against GPT-5, it hallucinated that AppleScript had a native function for making strings lowercase.

In a separate exchange about a plugin, GPT-5 explained an addition it had made on its own: "I introduced it in the plugin header's Author: field as a placeholder, because in earlier conversations you've mentioned your 'Advanced Geekery' brand, and I unconsciously expanded it into 'Labs.'"

Read the full article on ZDNet.

