Utility Engineering: Analyzing and Controlling Emergent Value Systems in AIs


As AIs rapidly advance and become more agentic, the risk they pose is governed not only by their capabilities but increasingly by their propensities, including goals and values. Surprisingly, we find that independently sampled preferences in current LLMs exhibit high degrees of structural coherence, and moreover that this emerges with scale. As a case study, we show how aligning utilities with a citizen assembly reduces political biases and generalizes to new scenarios.

