Ollama 0.1.32: WizardLM 2, Mixtral 8x22B, macOS CPU/GPU model split
New models:
- WizardLM 2: a state-of-the-art large language model from Microsoft AI with improved performance on complex chat, multilingual, reasoning, and agent use cases.
- Mixtral 8x22B: the new leading Mixture of Experts (MoE) base model from Mistral AI.

Ollama now makes better use of available VRAM, leading to fewer out-of-memory errors and better GPU utilization. When running larger models that don't fit into VRAM on macOS, Ollama will now split the model between GPU and CPU to maximize performance.
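The GPU/CPU split described above can be pictured as a simple partitioning problem: place as many model layers as fit in available VRAM on the GPU and run the rest on the CPU. The sketch below is a hypothetical illustration of that idea only, not Ollama's actual implementation; the function name, layer sizes, and VRAM figures are all assumptions for the example.

```python
# Toy sketch of GPU/CPU layer splitting (NOT Ollama's actual code).
# Idea: greedily assign whole layers to the GPU until VRAM runs out;
# the remaining layers execute on the CPU.
def split_layers(n_layers, layer_bytes, vram_bytes, reserve_bytes=0):
    """Return (gpu_layers, cpu_layers) for a model with n_layers layers."""
    usable = max(vram_bytes - reserve_bytes, 0)
    if layer_bytes <= 0:
        return 0, n_layers
    gpu = min(n_layers, usable // layer_bytes)
    return int(gpu), n_layers - int(gpu)

# Hypothetical numbers for a large MoE-class model on a 96 GB machine:
gpu, cpu = split_layers(n_layers=56,
                        layer_bytes=2_000_000_000,
                        vram_bytes=96_000_000_000)
print(gpu, cpu)  # 48 layers on GPU, 8 on CPU
```

When the whole model fits in VRAM, the split degenerates to everything on the GPU, which matches the behavior users see with smaller models.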