Update Sep 18, 2025 tracked by Updatify

v0.12.0

Cloud models

Cloud models are now available in preview, letting you run a set of larger models on fast, datacenter-grade hardware.

To run a cloud model, use:

ollama run qwen3-coder:480b-cloud
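
The same model can probably also be reached programmatically. The following is a minimal Python sketch, not an official client. It assumes a local Ollama server listening on its default address (http://localhost:11434), that the machine is already signed in for cloud access, and that the /api/generate endpoint accepts the cloud model name in the same way `ollama run` does. The helper names `build_generate_request` and `generate` are hypothetical and exist only for this illustration.

```python
import json
import urllib.request

# Assumption: a local Ollama server on its default address.
OLLAMA_HOST = "http://localhost:11434"


def build_generate_request(prompt, model="qwen3-coder:480b-cloud", host=OLLAMA_HOST):
    """Build (but do not send) a POST request for /api/generate.

    Hypothetical helper for illustration; not part of Ollama itself.
    """
    payload = {
        "model": model,
        "prompt": prompt,
        "stream": False,  # ask for one JSON object instead of streamed chunks
    }
    return urllib.request.Request(
        f"{host}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt, **kwargs):
    """Send the request and return the generated text from the reply."""
    request = build_generate_request(prompt, **kwargs)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    # With "stream": False the reply is a single JSON object whose
    # "response" field holds the generated text.
    return body["response"]
```

With a server running, `print(generate("Write a function that reverses a string."))` would send the prompt to the cloud model. Keeping the request construction separate from the network call makes the payload easy to inspect without contacting a server.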

What’s Changed

  • Models with the BERT architecture now run on Ollama’s engine
  • Models with the Qwen 3 architecture now run on Ollama’s engine
  • Fixed an issue where older NVIDIA GPUs would not be detected if newer drivers were installed
  • Fixed an issue where models would not be imported correctly with ollama create
  • Ollama now skips parsing the initial <think> tag if it is provided in the prompt for /api/generate (by @rick-github)

Full Changelog: https://github.com/ollama/ollama/compare/v0.11.11...v0.12.0