v0.12.10

`ollama run` now works with embedding models

ollama run can now run embedding models to generate vector embeddings from text:

ollama run embeddinggemma "Hello world"

Content can also be provided to ollama run via standard input:

echo "Hello world" | ollama run embeddinggemma

Fixed errors when running qwen3-vl:235b and qwen3-vl:235b-instruct
Enable flash attention for Vulkan (currently needs to be built from source)
Add Vulkan memory detection for Intel GPU using DXGI+PDH
Ollama will now return tool call IDs from the /api/chat API
Fixed hanging due to CPU discovery
Ollama will now show login instructions when switching to a cloud model in interactive mode
Fix reading stale VRAM data
ollama run now works with embedding models

@ryanycoleman made their first contribution in https://github.com/ollama/ollama/pull/11740
@Rajathbail made their first contribution in https://github.com/ollama/ollama/pull/12929
@virajwad made their first contribution in https://github.com/ollama/ollama/pull/12664
@AXYZdong made their first contribution in https://github.com/ollama/ollama/pull/8601

Full Changelog: https://github.com/ollama/ollama/compare/v0.12.9…v0.12.10