Update Feb 3, 2026 tracked by Updatify

v0.15.5

New models

  • Qwen3-Coder-Next: a coding-focused language model from Alibaba’s Qwen team, optimized for agentic coding workflows and local development.
  • GLM-OCR: a multimodal OCR model for complex document understanding, built on the GLM-V encoder–decoder architecture.

Improvements to ollama launch

  • ollama launch now accepts arguments, for example ollama launch claude -- --resume
  • ollama launch will now run subagents when using ollama launch claude
  • Ollama will now set context limits for a set of models when using ollama launch opencode
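
The ollama launch claude -- --resume example above uses the common command-line convention of a bare -- separating a command's own arguments from extra arguments. The sketch below illustrates that convention only. It is not Ollama's implementation, the helper name split_passthrough is hypothetical, and it assumes that everything after the first -- is meant for the launched tool.

```python
from typing import List, Tuple


def split_passthrough(argv: List[str]) -> Tuple[List[str], List[str]]:
    """Split argv at the first bare "--".

    Returns (own_args, passthrough_args). Hypothetical helper that only
    illustrates the "--" convention; not taken from Ollama's source.
    """
    if "--" in argv:
        i = argv.index("--")
        # Everything before "--" stays with the launcher,
        # everything after it is assumed to go to the launched tool.
        return argv[:i], argv[i + 1:]
    return list(argv), []


own, extra = split_passthrough(["launch", "claude", "--", "--resume"])
print(own)    # ['launch', 'claude']
print(extra)  # ['--resume']
```

Under this reading, --resume is not interpreted by ollama launch itself.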

What’s Changed

  • Sub-agent support for ollama launch for planning, deep research, and similar tasks
  • ollama signin will now open a browser window to make signing in easier
  • Ollama will now default to the following context lengths based on VRAM:
    • < 24 GiB VRAM: 4,096 context
    • 24-48 GiB VRAM: 32,768 context
    • >= 48 GiB VRAM: 262,144 context
  • GLM-4.7-Flash support on Ollama’s experimental MLX engine
  • ollama signin will now open the browser to the connect page
  • Fixed off-by-one error when using num_predict in the API
  • Fixed issue where tokens from a previous sequence would be returned when hitting num_predict
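
The VRAM-based default context lengths listed above can be read as a simple threshold lookup. The following is a minimal sketch of that table, not Ollama's actual code. The function name default_context_length is hypothetical, and the sketch assumes the middle tier is half-open, so exactly 24 GiB gets 32,768 and exactly 48 GiB gets 262,144, as the ">= 48 GiB" line suggests.

```python
def default_context_length(vram_gib: float) -> int:
    """Return the default context length for a given amount of VRAM in GiB.

    Hypothetical helper mirroring the tiers in these release notes.
    Assumption: the 24-48 GiB tier covers 24 <= VRAM < 48.
    """
    if vram_gib < 0:
        raise ValueError("vram_gib must be non-negative")
    if vram_gib < 24:
        return 4_096
    if vram_gib < 48:
        return 32_768
    return 262_144


for vram in (8, 24, 48):
    print(vram, default_context_length(vram))
```

These are defaults only. A context length set explicitly for a model or request would presumably still take precedence, though these notes do not say so.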

Full Changelog: https://github.com/ollama/ollama/compare/v0.15.4...v0.15.5