Hypura – A storage-tier-aware LLM inference scheduler for Apple Silicon


Run models too big for your Mac's memory. The project is developed in the t8/hypura repository on GitHub.

Read more on:

nvme

GB Mac

streaming tensors

Related news:

Nvidia greenboost: transparently extend GPU VRAM using system RAM/NVMe

Open-Source "GreenBoost" Driver Aims To Augment NVIDIA GPUs vRAM With System RAM & NVMe To Handle Larger LLMs

Show HN: Llama 3.1 70B on a single RTX 3090 via NVMe-to-GPU bypassing the CPU