You could have designed state of the art positional encoding


fleetwood.dev

We will achieve this by iteratively improving our approach to encoding position, arriving at Rotary Positional Encoding (RoPE), used in the latest Llama 3.2 release and most modern transformers. To reiterate, the self-attention mechanism enables the model to weigh the importance of different elements in an input sequence and dynamically adjust their influence on the output. By artfully applying our rotations to 2D chunks of $\mathbf{q}$ and $\mathbf{k}$ prior to their dot product, and switching from additive to multiplicative, we can gain a big performance boost in evaluations[4].
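To make the rotation step concrete, here is a minimal NumPy sketch of the idea rather than the post's exact implementation: each query or key vector is split into 2D pairs, and each pair is rotated by an angle proportional to the token's position before the dot product. The function name `rope_rotate` and the base of 10000 are illustrative assumptions.

```python
import numpy as np

def rope_rotate(x: np.ndarray, positions: np.ndarray, base: float = 10000.0) -> np.ndarray:
    """Rotate consecutive 2D chunks of each row of x by position-dependent angles.

    x:         (seq_len, d) query or key vectors, d must be even.
    positions: (seq_len,) integer token positions.
    """
    seq_len, d = x.shape
    assert d % 2 == 0, "embedding dimension must be even"

    # One frequency per 2D pair: theta_i = base^(-2i/d), the usual RoPE schedule.
    freqs = base ** (-np.arange(0, d, 2) / d)          # (d/2,)
    angles = positions[:, None] * freqs[None, :]       # (seq_len, d/2)
    cos, sin = np.cos(angles), np.sin(angles)

    # Apply a 2x2 rotation to each (even, odd) pair of coordinates.
    x1, x2 = x[:, 0::2], x[:, 1::2]
    out = np.empty_like(x)
    out[:, 0::2] = x1 * cos - x2 * sin
    out[:, 1::2] = x1 * sin + x2 * cos
    return out

# Rotations preserve norms, so the dot product of a rotated q and k depends only
# on their relative offset, not on the absolute positions.
rng = np.random.default_rng(0)
q, k = rng.normal(size=8), rng.normal(size=8)

score_near = rope_rotate(q[None], np.array([5]))[0] @ rope_rotate(k[None], np.array([2]))[0]
score_far  = rope_rotate(q[None], np.array([105]))[0] @ rope_rotate(k[None], np.array([102]))[0]
print(np.allclose(score_near, score_far))  # True: both pairs are 3 positions apart
```

The multiplicative nature of the encoding is what makes this work: rotating $\mathbf{q}$ by $\theta m$ and $\mathbf{k}$ by $\theta n$ leaves the attention score a function of $m - n$ alone, which a purely additive position embedding does not give you by construction.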
