Update Feb 3, 2026 tracked by Updatify
v0.15.5
New models
- Qwen3-Coder-Next: a coding-focused language model from Alibaba’s Qwen team, optimized for agentic coding workflows and local development.
- GLM-OCR: a multimodal OCR model for complex document understanding, built on the GLM-V encoder–decoder architecture.
Improvements to ollama launch
- ollama launch can now be provided arguments, for example ollama launch claude -- --resume
- ollama launch will now run subagents when using ollama launch claude
- Ollama will now set context limits for a set of models when using ollama launch opencode
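In the example above, the bare -- follows the common CLI convention of separating ollama's own arguments from those handed through to the launched tool. For scripting, the command line could be assembled roughly as follows. This is a minimal sketch: build_launch_command is a hypothetical helper, not part of Ollama, and it only builds the argument list without running anything.

```python
import shlex


def build_launch_command(integration, passthrough_args=()):
    """Build an argv list for `ollama launch <integration> [-- <args>...]`.

    Hypothetical helper for illustration; not part of Ollama.
    """
    cmd = ["ollama", "launch", integration]
    if passthrough_args:
        # Everything after "--" is meant for the launched tool,
        # not for ollama itself.
        cmd.append("--")
        cmd.extend(passthrough_args)
    return cmd


cmd = build_launch_command("claude", ["--resume"])
print(shlex.join(cmd))  # prints: ollama launch claude -- --resume
```

The resulting list can be passed to subprocess.run(cmd) on a machine where ollama is installed.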
What’s Changed
- Sub-agent support for ollama launch for planning, deep research, and similar tasks
- ollama signin will now open a browser window to make signing in easier
- Ollama will now default to the following context lengths based on VRAM:
  - < 24 GiB VRAM: 4,096 context
  - 24-48 GiB VRAM: 32,768 context
  - >= 48 GiB VRAM: 262,144 context
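The VRAM tiers above can be read as a simple lookup. The sketch below is illustrative only and is not Ollama's actual implementation. It assumes that exactly 48 GiB falls into the top tier (the notes list 48 GiB under both the middle and top ranges), and it leaves open how Ollama measures VRAM, for example total versus free memory, or one GPU versus several.

```python
GIB = 1024 ** 3  # bytes per GiB


def default_context_length(vram_bytes):
    """Map an amount of VRAM to the default context length listed in the notes.

    Hypothetical helper. Assumption: exactly 48 GiB belongs to the top tier.
    """
    vram_gib = vram_bytes / GIB
    if vram_gib < 24:
        return 4096
    if vram_gib < 48:
        return 32768
    return 262144


for gib in (16, 24, 80):
    print(f"{gib} GiB -> {default_context_length(gib * GIB)} tokens")
```

These are defaults only, so a context length set explicitly should still be expected to take precedence.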
- GLM-4.7-Flash support on Ollama’s experimental MLX engine
- ollama signin will now open the browser to the connect page
- Fixed off-by-one error when using num_predict in the API
- Fixed issue where tokens from a previous sequence would be returned when hitting num_predict
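One way to check the num_predict fixes locally is to send a generate request with a small limit and compare it with the token count the server reports. The sketch below assumes a local Ollama server at its default address and uses the eval_count field of the response as the count of generated tokens. The helper names are hypothetical, and the model name is whatever you have pulled locally.

```python
import json
import urllib.request

# Default local Ollama endpoint (assumption: server runs on this machine).
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_payload(model, prompt, num_predict):
    """Build a non-streaming generate request capped at num_predict tokens."""
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,
        "options": {"num_predict": num_predict},
    }


def generate(payload, url=OLLAMA_URL):
    """POST the payload to the server and return the decoded JSON response."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def respects_limit(response, num_predict):
    """Return True if the reported generated-token count is within the limit."""
    return response.get("eval_count", 0) <= num_predict
```

With a server running, respects_limit(generate(build_generate_payload(model, "Hello", 8)), 8) should return True for any locally available model.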
New Contributors
- @avukmirovich made their first contribution in https://github.com/ollama/ollama/pull/13934
Full Changelog: https://github.com/ollama/ollama/compare/v0.15.4...v0.15.5