Show HN: Llama 3.3 70B Sparse Autoencoders with API access


Our mission is to advance humanity's understanding of AI by examining the inner workings of advanced AI models, a field known as AI interpretability.

Interestingly, many features related to special formatting tokens or repetitive elements of chat data (such as the knowledge cutoff date) appear as isolated points or small clusters away from the central component.

Although the model-written evaluation score increases dramatically at a steering strength of around 0.5 (which is where the style fully shifts), at a strength of 0.4 the steered model already begins to exhibit a few elements of pirate speech.

Whether this is due to a fundamental limitation of the architecture (e.g. truly nonlinear features) or some more mundane cause is not yet known, although early evidence suggests that incomplete reconstruction is not a simple matter of scale.
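The mechanics being described can be sketched in a few lines: a sparse autoencoder maps a model's activation vector to non-negative feature activations, reconstructs the activation from a decoder matrix (imperfectly, hence the reconstruction error mentioned above), and "steering" adds a feature's decoder direction scaled by a strength coefficient. This is a minimal illustration with random weights and made-up dimensions, not the released SAE or its actual API:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: d_model is the residual-stream width, d_sae the
# (larger) number of sparse features learned by the autoencoder.
d_model, d_sae = 16, 64

# Random weights stand in for a trained SAE's encoder and decoder.
W_enc = rng.normal(size=(d_model, d_sae)) / np.sqrt(d_model)
W_dec = rng.normal(size=(d_sae, d_model)) / np.sqrt(d_sae)
b_enc = np.zeros(d_sae)

def sae_features(x):
    """Encode an activation vector into sparse (ReLU) feature activations."""
    return np.maximum(x @ W_enc + b_enc, 0.0)

def steer(x, feature_idx, strength):
    """Add `strength` times one feature's decoder direction to the activation."""
    return x + strength * W_dec[feature_idx]

x = rng.normal(size=d_model)          # a model activation at some token
f = sae_features(x)                   # sparse feature activations
recon = f @ W_dec                     # SAE reconstruction of x
recon_error = np.linalg.norm(x - recon)  # nonzero: reconstruction is incomplete

# Steering at strength 0.4 vs 0.5 changes how strongly the feature's
# direction is pushed into the activation.
x_steered = steer(x, feature_idx=3, strength=0.4)
```

In this framing, the gradual onset of pirate speech at strength 0.4 corresponds to the steered activation moving only partway along the feature's decoder direction before the style fully shifts at 0.5.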


Read more on: Llama, API access, autoencoders

Related news:

Meta unveils a new, more efficient Llama model

What happens if we remove 50 percent of Llama?

Ai2 releases new language models competitive with Meta’s Llama