llama.cpp

Tinker with LLMs in the privacy of your own home using Llama.cpp

Mistral Integration Improved in Llama.cpp

Vision Now Available in Llama.cpp

Heap-overflowing Llama.cpp to RCE

Llama.cpp AI Performance with the GeForce RTX 5090 Review

Llama.cpp supports Vulkan. Why doesn't Ollama?

Llama.cpp Now Supports Qwen2-VL (Vision Language Model)

Llama.cpp guide – Running LLMs locally on any hardware, from scratch

Go library for in-process vector search and embeddings with llama.cpp

21.2× faster than llama.cpp? Plus a 40% reduction in memory usage

Show HN: Open-source load balancer for llama.cpp