Updatify / Ollama | Release notes

Create your changelog

Update Dec 4, 2025 tracked by Updatify

v0.13.2

New models

Qwen3-Next: The first installment in the Qwen3-Next series with strong performance in terms of both parameter efficiency and inference speed.

What’s Changed

Flash attention is now enabled by default for vision models such as mistral-3, gemma3, qwen3-vl and more. This improves memory utilization and performance when providing images as input.
Fixed GPU detection on multi-GPU CUDA machines
Fixed issue where deepseek-v3.1 would always think even with thinking is disabled in Ollama’s app

New Contributors

@chengcheng84 made their first contribution in https://github.com/ollama/ollama/pull/13265
@nathan-hook made their first contribution in https://github.com/ollama/ollama/pull/13256

Full Changelog: https://github.com/ollama/ollama/compare/v0.13.1…v0.13.2

Read the original release on Ollama ↗

← All Ollama releases