
Teaching GPT-5 to Use a Computer


A copilot for your computer

Unlike previous models, which might hallucinate or lose track of the current state, GPT-5 uses chain-of-thought reasoning to break a request such as "start playing this game" into discrete, executable steps while adapting to unexpected UI changes. We run a 3 MB saliency scorer to identify interactive regions (buttons, text fields, links), take about 20 patches from those areas, and discard the dead space. For now, we want to keep a planner in the loop for rare edge cases and safety; as the executor absorbs those patterns (via streaming, macros, and distillation), the system becomes simpler and more end-to-end.
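The patch-selection step can be illustrated with a minimal sketch. It assumes the saliency scorer has already produced a 2D grid of per-pixel scores, and it simply tiles that grid and keeps the highest-scoring tiles. The function name `top_patches`, the `patch` and `k` parameters, the fixed non-overlapping tiling, and the mean-score ranking are all assumptions for illustration; the article does not describe how the actual system picks its patches.

```python
from typing import List, Tuple


def top_patches(
    saliency: List[List[float]], patch: int, k: int = 20
) -> List[Tuple[int, int]]:
    """Return (top, left) origins of the k tiles with the highest mean saliency.

    Hypothetical helper: the grid is split into non-overlapping
    patch x patch tiles, and partial tiles at the edges are ignored.
    """
    height, width = len(saliency), len(saliency[0])
    scored = []
    for top in range(0, height - patch + 1, patch):
        for left in range(0, width - patch + 1, patch):
            total = sum(
                saliency[r][c]
                for r in range(top, top + patch)
                for c in range(left, left + patch)
            )
            scored.append((total / (patch * patch), top, left))
    # Highest mean saliency first; ties are broken by position so the
    # result is deterministic.
    scored.sort(key=lambda item: (-item[0], item[1], item[2]))
    return [(top, left) for _, top, left in scored[:k]]


if __name__ == "__main__":
    # Toy 4x4 saliency map: one strongly salient tile, one weaker one,
    # and two tiles of dead space.
    demo = [
        [0.0, 0.0, 1.0, 1.0],
        [0.0, 0.0, 1.0, 1.0],
        [0.5, 0.5, 0.0, 0.0],
        [0.5, 0.5, 0.0, 0.0],
    ]
    print(top_patches(demo, patch=2, k=2))  # [(0, 2), (2, 0)]
```

In this toy run the two tiles of dead space are dropped and only the salient regions are kept, which is the effect the paragraph above describes: the downstream model sees a handful of relevant patches instead of the full screen.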
