Update Sep 18, 2025 tracked by Updatify
v0.12.0
Cloud models
Cloud models are now available in preview, letting you run a set of larger models on fast, datacenter-grade hardware.
To run a cloud model, use:
ollama run qwen3-coder:480b-cloud
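The same model name can presumably also be used through Ollama's local REST API instead of the interactive CLI. The following is a minimal sketch, assuming a local Ollama server on the default address http://localhost:11434, that any sign-in cloud models may require is already done, and that cloud models accept the same /api/generate request shape as local models. The helper names build_generate_request and generate are hypothetical.

```python
import json
import urllib.request

# Assumption: the local Ollama server listens on its default address.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming POST request for the /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model: str, prompt: str, timeout: float = 120.0) -> str:
    """Send the request and return the generated text (needs a running server)."""
    request = build_generate_request(model, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.loads(response.read().decode("utf-8"))
    # With "stream": False the server replies with one JSON object
    # whose "response" field holds the full generated text.
    return body["response"]
```

With a server running, generate("qwen3-coder:480b-cloud", "Write a function that reverses a string.") would return the reply as a single string. Building the request separately from sending it keeps the payload easy to inspect without a network call.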
What’s Changed
- Models with the Bert architecture now run on Ollama’s engine
- Models with the Qwen 3 architecture now run on Ollama’s engine
- Fixed issue where older NVIDIA GPUs would not be detected if newer drivers were installed
- Fixed issue where models would not be imported correctly with ollama create
- Ollama will skip parsing the initial <think> if provided in the prompt for /api/generate by @rick-github
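The last item can be pictured with a toy sketch. This is not Ollama's actual parser, only an illustration of the idea: when the prompt already ends with an opening <think> tag, the model's output starts inside the thinking block, so a parser should not wait for another opening tag. The function split_thinking and its exact behavior are hypothetical.

```python
from typing import Tuple

OPEN_TAG = "<think>"
CLOSE_TAG = "</think>"


def split_thinking(prompt: str, output: str) -> Tuple[str, str]:
    """Return (thinking, content) for a model output.

    Toy illustration only: if the prompt already ends with the opening
    tag, the output is treated as starting inside the thinking block.
    """
    already_open = prompt.rstrip().endswith(OPEN_TAG)
    text = output
    if not already_open:
        stripped = text.lstrip()
        if not stripped.startswith(OPEN_TAG):
            return "", output  # no thinking block at all
        text = stripped[len(OPEN_TAG):]
    thinking, sep, content = text.partition(CLOSE_TAG)
    if not sep:
        return thinking.strip(), ""  # thinking block never closed
    return thinking.strip(), content.strip()
```

For example, with a prompt ending in <think> and an output of "Scattering.</think>Because of scattering.", this sketch treats "Scattering." as thinking and the remainder as the answer, even though the output itself never contained an opening tag.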
New Contributors
- @egyptianbman made their first contribution in https://github.com/ollama/ollama/pull/12300
- @russcoss made their first contribution in https://github.com/ollama/ollama/pull/12280
Full Changelog: https://github.com/ollama/ollama/compare/v0.11.11...v0.12.0