Co-Adapting Human Interfaces and LMs
When the world adapts to LMs, the boundary between agent and environment starts to blur

12 Nov 2024

With the explosion of computer-use agents like Claude Computer Use, and the search wars between Perplexity, SearchGPT, and Gemini, it seems inevitable that AI will change the way we access information. So far, we've been obsessed with building agents that are better at understanding the world.
LMs have no such priors, so we just end up baking the same inductive biases back into the model, teaching it through lots of data that e.g. "if an `<img>` icon is to the left of a form field, they're probably semantically related." And LM-native interfaces have distinct benefits: efficiency (e.g. function calling vs. multimodal pixel control), safety (limiting the set of actions available to the LM), and a wider design space (parallelization without spinning up 500 VMs). I hope that calling in to reschedule my package delivery isn't something I still have to do manually in 10 years, and if that's the case, then maybe we don't need to keep building around a fundamentally human interface (voice) to solve this problem.
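To make the efficiency/safety contrast concrete, here is a minimal sketch (all names hypothetical, not from the post) of the two action spaces: raw pixel control, where the agent can click anywhere, versus a function-calling interface, where the environment exposes a small, typed set of actions and rejects anything outside it.

```python
from dataclasses import dataclass

# Pixel-level action: unconstrained; the agent can click anywhere on screen,
# and safety must be enforced after the fact.
@dataclass
class Click:
    x: int
    y: int

# Function-calling interface: a narrow, typed action set. Invalid calls are
# rejected up front, which is both cheaper and safer than interpreting pixels.
# (This tool schema is an illustrative assumption, not a real API.)
ALLOWED_TOOLS = {
    "reschedule_delivery": {"tracking_id": str, "new_date": str},
}

def call_tool(name, **kwargs):
    """Validate a tool call against the declared schema before executing it."""
    if name not in ALLOWED_TOOLS:
        raise ValueError(f"unknown tool: {name}")
    for arg, typ in ALLOWED_TOOLS[name].items():
        if not isinstance(kwargs.get(arg), typ):
            raise TypeError(f"{name}: {arg} must be {typ.__name__}")
    return {"tool": name, **kwargs}

# A valid call passes validation...
result = call_tool("reschedule_delivery", tracking_id="1Z999", new_date="2024-11-20")

# ...while an action outside the declared set is rejected outright.
try:
    call_tool("delete_account")
except ValueError:
    rejected = True
```

The point of the sketch: the interface itself, not the model, bounds what the agent can do.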