Cerebras Inference: AI at Instant Speed


We are excited to announce the release of Cerebras DocChat, our first iteration of models designed for document-based conversational question answering. This series includes two models: Cerebras Llama3-DocChat, a large language model (LLM), and Cerebras Dragon-DocChat, a multi-turn retriever model.
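The pairing described above — a multi-turn retriever that surfaces relevant document chunks, plus an LLM that answers from them — can be sketched as a minimal pipeline. This is an illustrative stand-in, not the actual Cerebras models or APIs: the bag-of-words scorer below merely plays the role a learned retriever like Dragon-DocChat would, and all names are hypothetical.

```python
# Hypothetical sketch of multi-turn document QA: score chunks against
# the whole conversation history (the job of a multi-turn retriever),
# then hand the top chunks to an LLM as context. The toy bag-of-words
# "embedding" stands in for a real learned encoder.
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy bag-of-words vector standing in for a learned encoder."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def retrieve(history: list[str], chunks: list[str], k: int = 1) -> list[str]:
    """Rank chunks against the full conversation, not just the last turn."""
    query = embed(" ".join(history))
    ranked = sorted(chunks, key=lambda c: cosine(query, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(history: list[str], context: list[str]) -> str:
    """Assemble the context-plus-conversation prompt an LLM would answer from."""
    return "Context:\n" + "\n".join(context) + "\n\n" + "\n".join(history)


chunks = [
    "The warranty covers parts and labor for two years.",
    "Shipping takes five business days within the US.",
]
history = ["User: How long is the warranty?", "User: What does it cover?"]
top = retrieve(history, chunks)
print(build_prompt(history, top))
```

Scoring against the whole history is what makes the retriever "multi-turn": the follow-up question "What does it cover?" is ambiguous on its own, but combined with the earlier turn it still retrieves the warranty chunk.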



Related news:

Cerebras gives waferscale chips inferencing twist, claims 1,800 token per sec generation rates

Google Cloud Run embraces Nvidia GPUs for serverless AI inference

Groq Raises $640M to Meet Soaring Demand for Fast AI Inference