Show HN: Speeding up LLM inference 2x times (possibly)


A quick preview of the Effort algorithm. More details at https://kolinko.github.io/effort/
