Update Mar 14, 2026 tracked by Updatify

v0.18.0

Ollama 0.18 improves performance for OpenClaw and Ollama’s cloud models, which now include the new Nemotron-3-Super model by NVIDIA, designed for high-performance agentic reasoning tasks.

Improved OpenClaw performance with Kimi-K2.5

This release of Ollama improves the performance and reliability of cloud models.

  • Up to 2x faster speeds with Kimi-K2.5
  • Tool calling accuracy has been improved
ollama launch openclaw --model kimi-k2.5

Ollama is now a provider in OpenClaw

Ollama can now be selected as an authentication and model provider during OpenClaw onboarding (thanks @BruceMacD for contributing and @steipete for reviewing!)

openclaw onboard --auth-choice ollama

More information: https://docs.openclaw.ai/providers/ollama

Nemotron-3-Super

Nemotron-3-Super is a new 122B parameter model with strong reasoning and tool calling capability that also delivers top performance on modern hardware:

  • ollama run nemotron-3-super:cloud
  • ollama run nemotron-3-super to run locally (requires 96GB+ of VRAM)
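
The choice between the two commands above can be scripted. The following is a minimal sketch, assuming the caller already knows the available VRAM in whole gigabytes. The function name pick_nemotron_model and its argument are hypothetical, and the function does not detect hardware itself.

```shell
# Hypothetical helper: print the model name to run, given available VRAM in GB.
# The 96 GB threshold comes from the local requirement listed above.
pick_nemotron_model() {
  if [ "$1" -ge 96 ]; then
    echo "nemotron-3-super"        # enough VRAM to run locally
  else
    echo "nemotron-3-super:cloud"  # fall back to Ollama's cloud
  fi
}
```

A script could then call, for example, ollama run "$(pick_nemotron_model 48)".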

Nemotron-3-Super scores highest of any open model on PinchBench, a benchmark suite that measures how successful models are at completing tasks when used with OpenClaw.

ollama launch openclaw --model nemotron-3-super:cloud

Or using OpenClaw’s onboarding:

openclaw onboard \
	--auth-choice ollama \
	--custom-model-id nemotron-3-super:cloud

Non-interactive task support

ollama launch now supports non-interactive tasks by passing in --yes. This enables using Claude, Codex, Pi and more in scripts, GitHub Actions, and other non-interactive environments.

ollama launch claude \
	--model glm-5:cloud \
	--yes \
	-- "Do a quick code review of this pull request and respond on GitHub with a comment summarizing your feedback."
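
For scripts and CI steps, the command above can be wrapped in a small function. This is a sketch only: the function name launch_review and the MODEL, PROMPT and DRY_RUN variables are hypothetical conveniences of this script, not Ollama features. Only the flags already shown in this release note (--model, --yes and the -- separator) are used.

```shell
# Hypothetical wrapper around `ollama launch claude ... --yes`.
# Runs in a subshell (note the parentheses) so variables and `exit`
# do not leak into the calling script.
launch_review() (
  model="${MODEL:-glm-5:cloud}"
  prompt="${PROMPT:-Do a quick code review of this pull request.}"

  # Build the argument list without losing spaces inside the prompt.
  set -- ollama launch claude --model "$model" --yes -- "$prompt"

  # DRY_RUN=1 prints one argument per line instead of running anything.
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$@"
    exit 0
  fi

  # Fail early with a clear message when ollama is not installed.
  command -v ollama >/dev/null 2>&1 || {
    echo "ollama not found on PATH" >&2
    exit 127
  }

  "$@"
)
```

Setting DRY_RUN=1 before calling launch_review is a cheap way to check the exact arguments in a pipeline before letting the assistant act on a real pull request.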

Lower latency on MiniMax-M2.5 and Qwen3.5 on Ollama’s cloud

For customers in North America, MiniMax-M2.5 and Qwen3.5 on Ollama’s cloud now respond up to 10x and up to 2x faster respectively, often in less than a second. This is ideal for tasks that need a fast Time To First Token (TTFT), such as quick answers from OpenClaw or quick back-to-back coding tasks.

ollama launch claude --model minimax-m2.5

Driver updates required for ROCm 7

This version of Ollama ships with ROCm 7, and requires updating drivers to the latest version for continued support.

What’s Changed

  • Ollama’s cloud models no longer require downloading via ollama pull. Setting :cloud as a tag will now automatically connect to cloud models.
  • New --yes flag for ollama launch that skips all prompts, making it possible to run AI assistants and other tools in non-interactive environments.
  • Fixed an issue where “Reset to Defaults” in Ollama’s app would disable automatic update downloads.
  • Ollama will now ensure context compaction occurs at the correct context length for each model when using ollama launch claude.
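
Because models tagged :cloud no longer need ollama pull, a setup script can skip the download step for them. The sketch below assumes the :cloud tag is the only signal a script uses to tell cloud models apart. Both function names are hypothetical.

```shell
# Hypothetical helper: succeeds when the model name ends with the :cloud tag.
is_cloud_model() {
  case "$1" in
    *:cloud) return 0 ;;
    *)       return 1 ;;
  esac
}

# Hypothetical helper: print the setup command a script would need, if any.
setup_step() {
  if is_cloud_model "$1"; then
    echo "none"              # cloud models connect automatically
  else
    echo "ollama pull $1"    # local models are still downloaded first
  fi
}
```

For example, setup_step nemotron-3-super:cloud prints none, while setup_step nemotron-3-super prints the pull command for the local model.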

New Contributors

Full Changelog: https://github.com/ollama/ollama/compare/v0.17.7...v0.18.0