
Popping the GPU Bubble

Photon, Moondream's inference engine, achieves near-real-time vision-language model (VLM) inference (~33 ms on an NVIDIA B200). This is a look at how it delivers up to 35% higher decode throughput by optimizing how it uses the GPU.




Related news:

Legacy Nvidia RTX 3060 12GB returns to retail five years after original launch, priced at $339 — resurrected GPU strategy that Jensen called a 'good idea' apparently comes to fruition

WebGL Without a GPU

AWS raising GPU instance prices 20% on July 1
