Ollama 0.1.32: WizardLM 2, Mixtral 8x22B, macOS CPU/GPU model split
New models:
- WizardLM 2: a state-of-the-art large language model from Microsoft AI with improved performance on complex chat, multilingual, reasoning, and agent use cases.
- Mixtral 8x22B: the new leading Mixture of Experts (MoE) base model from Mistral AI.

Ollama now makes better use of available VRAM, leading to fewer out-of-memory errors and better GPU utilization. When running larger models that don't fit into VRAM on macOS, Ollama will now split the model between GPU and CPU to maximize performance.
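The GPU/CPU split described above can be pictured as a simple partitioning problem: place as many model layers as fit in available VRAM on the GPU and run the rest on the CPU. The sketch below is a hypothetical illustration of that idea only, not Ollama's actual implementation; the function name, layer sizes, and VRAM figures are all assumptions for the example.

```python
# Toy sketch of GPU/CPU layer splitting (NOT Ollama's actual code).
# Idea: greedily assign whole layers to the GPU until VRAM runs out;
# the remaining layers execute on the CPU.
def split_layers(n_layers, layer_bytes, vram_bytes, reserve_bytes=0):
    """Return (gpu_layers, cpu_layers) for a model with n_layers layers."""
    usable = max(vram_bytes - reserve_bytes, 0)
    if layer_bytes <= 0:
        return 0, n_layers
    gpu = min(n_layers, usable // layer_bytes)
    return int(gpu), n_layers - int(gpu)

# Hypothetical numbers for a large MoE-class model on a 96 GB machine:
gpu, cpu = split_layers(n_layers=56,
                        layer_bytes=2_000_000_000,
                        vram_bytes=96_000_000_000)
print(gpu, cpu)  # 48 layers on GPU, 8 on CPU
```

When the whole model fits in VRAM, the split degenerates to everything on the GPU, which matches the behavior users see with smaller models.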