Update Mar 2, 2026

v0.17.5

New models

  • Qwen3.5: the small Qwen 3.5 model series is now available in 0.8B, 2B, 4B and 9B parameter sizes.

What’s Changed

  • Fixed crash in Qwen 3.5 models when split over GPU & CPU
  • Fixed an issue where Qwen 3.5 models would repeat themselves because no presence penalty was applied (note: you may need to re-download the qwen3.5 models, for example: ollama pull qwen3.5:35b)
  • ollama run --verbose will now show peak memory usage when using Ollama’s MLX engine
  • Fixed memory issues and crashes in MLX runner
  • Fixed an issue where Ollama could not run models imported from Qwen3.5 GGUF files
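
For readers unfamiliar with the term, here is a minimal sketch of what a presence penalty generally does during sampling and why its absence can lead to repetition. It illustrates the common technique only and is not Ollama's implementation; the function names, the toy logits, and the penalty value of 1.5 are all assumptions made for the example.

```python
def apply_presence_penalty(logits, generated_ids, penalty):
    """Return a copy of `logits` with `penalty` subtracted once from every
    token id that already appears in `generated_ids` (hypothetical helper)."""
    seen = set(generated_ids)
    return [score - penalty if tok in seen else score
            for tok, score in enumerate(logits)]


def greedy_pick(logits):
    """Pick the index of the highest-scoring token."""
    return max(range(len(logits)), key=logits.__getitem__)


# Toy vocabulary of three tokens; token 0 has already been emitted twice.
logits = [2.0, 1.8, 0.5]
history = [0, 0]

# Without a penalty, the already-used token 0 still has the top score.
no_penalty = greedy_pick(apply_presence_penalty(logits, history, 0.0))

# With a penalty (1.5 is an arbitrary illustrative value), token 0 is demoted.
with_penalty = greedy_pick(apply_presence_penalty(logits, history, 1.5))

print(no_penalty, with_penalty)
```

In this toy case the unpenalized pick stays on the token that was already generated, while the penalized pick moves to a different token, which is the behavior a presence penalty is meant to encourage.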

Full Changelog: https://github.com/ollama/ollama/compare/v0.17.4...v0.17.5