Helix: A vision-language-action model for generalist humanoid control


Figure was founded with the ambition to change the world.

Video 2: Helix allows for fast, fine-grained motor adjustments, which are necessary when reacting to a collaborative partner while carrying out novel semantic goals.

Helix's design offers several key advantages over existing approaches.

When prompted to "Pick up the desert item", for instance, Helix not only recognizes that a toy cactus matches this abstract concept, but also selects the closest hand and executes the precise motor commands needed to grasp it securely. Helix displays strong object generalization: it can pick up thousands of novel household items, with varying shapes, sizes, colors, and material properties never encountered in training, when simply asked in natural language.
