Steering interpretable language models with concept algebra

We demonstrate reliable, fine-grained control over language model generation by directly injecting, suppressing, and composing human-interpretable concepts at inference time.
