
Cost of self-hosting Llama-3 8B-Instruct


All blog posts from the lytix.ai team

For the sake of simplicity, assuming an average input:output ratio, that means they charge $1 per 1M tokens, and that's the number to beat.

This was dead simple and just involved installing ray and vllm via pip3 and then changing my Docker entry point to:

Although this approach does come with drawbacks, such as having to manage and scale your own hardware, it does seem possible, in theory, to undercut the prices that ChatGPT offers by a significant amount.
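The comparison behind "the number to beat" can be sketched as a small calculation. This is only a rough illustration: the function names are hypothetical, and the input and output prices, the hourly hardware cost, and the throughput are placeholder assumptions chosen so that the blended API price comes out at the $1 per 1M tokens mentioned above. None of them are measurements from the post.

```python
def blended_price_per_million(input_price, output_price, input_share=0.5):
    """Weighted average API price (USD per 1M tokens).

    input_share is the fraction of all tokens that are input tokens;
    0.5 models an even input:output ratio (an assumption).
    """
    return input_share * input_price + (1.0 - input_share) * output_price


def self_hosted_cost_per_million(hourly_cost_usd, tokens_per_second):
    """Hardware cost (USD per 1M tokens) at a sustained throughput."""
    tokens_per_hour = tokens_per_second * 3600
    return hourly_cost_usd / tokens_per_hour * 1_000_000


# Placeholder inputs, assumed for illustration only.
api_price = blended_price_per_million(input_price=0.50, output_price=1.50)
hosted_price = self_hosted_cost_per_million(hourly_cost_usd=1.20,
                                            tokens_per_second=1000)

print(f"API (blended): ${api_price:.2f} per 1M tokens")
print(f"Self-hosted:   ${hosted_price:.2f} per 1M tokens")
print("Self-hosting is cheaper" if hosted_price < api_price
      else "API is cheaper")
```

The self-hosted figure depends entirely on sustained throughput, so a machine that sits idle part of the time costs proportionally more per token than this sketch suggests.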

