Ollama v0.17.5

New models

Qwen3.5: the small Qwen 3.5 model series is now available in 0.8B, 2B, 4B and 9B parameter sizes.

Fixed a crash in Qwen 3.5 models when split across GPU and CPU
Fixed an issue where Qwen 3.5 models would repeat themselves because no presence penalty was applied (note: you may need to re-download the qwen3.5 models, for example with ollama pull qwen3.5:35b)
ollama run --verbose now shows peak memory usage when using Ollama’s MLX engine
Fixed memory issues and crashes in the MLX runner
Fixed an issue where Ollama could not run models imported from Qwen3.5 GGUF files
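
For readers who call Ollama over its local HTTP API rather than the CLI, the repetition fix above can be illustrated with a minimal sketch that sets a presence penalty explicitly on a request. This is not necessarily how the fix works internally. The helper names build_generate_request and generate are hypothetical, the value 1.5 is only an illustrative default, and the sketch assumes a server listening on the default local address with the qwen3.5:35b tag already pulled.

```python
import json
import urllib.request

# Assumption: Ollama is serving on its default local address.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_request(model, prompt, presence_penalty=1.5):
    """Build the JSON body for POST /api/generate.

    The presence_penalty default here is illustrative, not a value
    taken from the release notes.
    """
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,  # ask for a single JSON object instead of a stream
        "options": {"presence_penalty": presence_penalty},
    }


def generate(model, prompt, presence_penalty=1.5):
    """Send the request to a running Ollama server and return the reply text."""
    body = json.dumps(
        build_generate_request(model, prompt, presence_penalty)
    ).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL,
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]
```

With a server running, generate("qwen3.5:35b", "Hello") would return the model's reply as a string. After re-pulling the model on this release, an explicit override like this may not be needed.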

Full Changelog: https://github.com/ollama/ollama/compare/v0.17.4...v0.17.5