faster generation

DSpark: Speculative decoding accelerates LLM inference [pdf]