DSpark: Speculative decoding accelerates LLM inference [pdf]