New models

Nemotron-3-Ultra: NVIDIA Nemotron 3 Ultra is built for high-throughput reasoning and long-running agent workflows.

What’s Changed

Fixed multimodal models not using the GPU on the llama.cpp backend: they can now use Metal GPU offload on Apple Silicon, improving multimodal performance on supported Macs.
ollama create --experimental now respects REQUIRES in Modelfiles for MLX-based models.
ollama launch codex now cleans up old conflicting Codex profile config before launching.
ollama launch pi now migrates users from the legacy Pi package to the official package and preserves the correct npm install prefix.
Pi web search setup now updates only when a newer package is available.
Windows cleanup now terminates the llama.cpp backend more reliably.
Updated the llama.cpp backend.

Full Changelog: https://github.com/ollama/ollama/compare/v0.30.3...v0.30.4