Update May 21, 2025 tracked by Updatify

v0.7.1

What’s Changed

  • Improved model memory management so that enough memory is allocated when running multimodal models, preventing crashes in certain situations
  • Enhanced memory estimation for models to prevent unintended memory offloading
  • ollama show will now show ... when data is truncated
  • Fixed crash that would occur with qwen2.5vl
  • Fixed crash on Nvidia’s CUDA for llama3.2-vision
  • Added support for Alibaba’s Qwen 3 and Qwen 2 architectures in Ollama’s new multimodal engine

Full Changelog: https://github.com/ollama/ollama/compare/v0.7.0...v0.7.1