Cerebras Inference: AI at Instant Speed


We are excited to announce the release of Cerebras DocChat, our first iteration of models designed for document-based conversational question answering. This series includes two models: Cerebras Llama3-DocChat, a large language model (LLM), and Cerebras Dragon-DocChat, a multi-turn retriever model.
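The pairing described above — a multi-turn retriever that surfaces relevant document chunks, plus an LLM that answers from them — can be sketched as a minimal pipeline. This is an illustrative stand-in, not the actual Cerebras models or APIs: the bag-of-words scorer below merely plays the role a learned retriever like Dragon-DocChat would, and all names are hypothetical.

```python
# Hypothetical sketch of multi-turn document QA: score chunks against
# the whole conversation history (the job of a multi-turn retriever),
# then hand the top chunks to an LLM as context. The toy bag-of-words
# "embedding" stands in for a real learned encoder.
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy bag-of-words vector standing in for a learned encoder."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def retrieve(history: list[str], chunks: list[str], k: int = 1) -> list[str]:
    """Rank chunks against the full conversation, not just the last turn."""
    query = embed(" ".join(history))
    ranked = sorted(chunks, key=lambda c: cosine(query, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(history: list[str], context: list[str]) -> str:
    """Assemble the context-plus-conversation prompt an LLM would answer from."""
    return "Context:\n" + "\n".join(context) + "\n\n" + "\n".join(history)


chunks = [
    "The warranty covers parts and labor for two years.",
    "Shipping takes five business days within the US.",
]
history = ["User: How long is the warranty?", "User: What does it cover?"]
top = retrieve(history, chunks)
print(build_prompt(history, top))
```

Scoring against the whole history is what makes the retriever "multi-turn": the follow-up question "What does it cover?" is ambiguous on its own, but combined with the earlier turn it still retrieves the warranty chunk.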



Related news:

Cerebras gives waferscale chips inferencing twist, claims 1,800 token per sec generation rates

Google Cloud Run embraces Nvidia GPUs for serverless AI inference

Groq Raises $640M to Meet Soaring Demand for Fast AI Inference