Groq just made Hugging Face way faster — and it’s coming for AWS and Google
Groq challenges AWS and Google with lightning-fast AI inference, exclusive 131k context windows, and new Hugging Face partnership to reach millions of developers.
Groq, the artificial intelligence inference startup, is making an aggressive play to challenge established cloud providers like Amazon Web Services and Google with two major announcements that could reshape how developers access high-performance AI models. Independent benchmarking firm Artificial Analysis measured Groq's Qwen3 32B deployment running at approximately 535 tokens per second, a speed that would allow real-time processing of lengthy documents or complex reasoning tasks. The incumbents bring scale advantages of their own: Amazon's Bedrock service leverages AWS's massive global cloud infrastructure, while Google's Vertex AI benefits from the search giant's worldwide data center network.
Or read this on VentureBeat