llama.cpp
Heap-overflowing Llama.cpp to RCE

Llama.cpp AI Performance with the GeForce RTX 5090 Review

Llama.cpp supports Vulkan. Why doesn't Ollama?

Llama.cpp Now Supports Qwen2-VL (Vision Language Model)

Llama.cpp guide – Running LLMs locally on any hardware, from scratch

Go library for in-process vector search and embeddings with llama.cpp

21.2× faster than llama.cpp? Plus a 40% reduction in memory usage

Show HN: Open-source load balancer for llama.cpp