Meta unleashes Llama API running 18x faster than OpenAI: Cerebras partnership delivers 2,600 tokens per second
When AI reasoning goes wrong: Microsoft Research shows more tokens can mean more problems
OpenAI’s new GPT-4.1 models can process a million tokens and solve coding problems better than ever
DeepSeek-V3 Now Runs At 20 Tokens Per Second On Mac Studio
DeepSeek-V3 now runs at 20 tokens per second on Mac Studio, and that’s a nightmare for OpenAI
Locks, leases, fencing tokens, FizzBee
Qwen2.5-1M: Deploy your own Qwen with context length up to 1M tokens
Malicious PyPi package steals Discord auth tokens from devs
Meta’s new BLT architecture replaces tokens to make LLMs more efficient and versatile
Byte Latent Transformer: Patches Scale Better Than Tokens
Hyrumtoken: A Go package to encrypt pagination tokens
Llama 3.1 405B now runs at 969 tokens/s on Cerebras Inference
Cerebras Inference now 3x faster: Llama3.1-70B breaks 2,100 tokens/s
Trump Crypto Project Website Crashes as Its Tokens Go on Sale
Llama 405B 506 tokens/second on an H200
The Role of Anchor Tokens in Self-Attention Networks
Elixir Games allows users to stake tokens directly within its launcher
Donald Trump is hawking tokens for a crypto project he still hasn’t explained
500,000 tokens: How Anthropic’s Claude Enterprise is pushing AI boundaries
Cerebras launches inference for Llama 3.1; benchmarked at 1846 tokens/s on 8B