Logprobs
Ollama’s API and its OpenAI-compatible API now support log probabilities (logprobs). The log probability of an output token indicates how likely the model considered that token given the preceding context. This is useful for several use cases:
- Classification tasks
- Retrieval (Q&A) evaluation
- Autocomplete
- Token highlighting and outputting bytes
- Calculating perplexity
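As an illustration of the last use case, perplexity is the exponential of the negative mean token log probability. The sketch below assumes the log probabilities are natural logarithms (as in the OpenAI API); the `perplexity` helper and the sample values are hypothetical and not part of Ollama.

```python
import math

def perplexity(logprobs):
    """Return exp(-mean(logprob)) over a sequence of token log probabilities."""
    if not logprobs:
        raise ValueError("need at least one token log probability")
    return math.exp(-sum(logprobs) / len(logprobs))

# Hypothetical natural-log probabilities for four generated tokens.
sample = [-0.84, -0.12, -1.30, -0.05]
print(f"perplexity = {perplexity(sample):.3f}")
```

A perplexity of 1.0 means the model assigned probability 1 to every token; higher values mean the model was less certain about its own output.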
To enable Logprobs, provide "logprobs": true to Ollama’s API:
curl http://localhost:11434/api/generate -d '{
"model": "gemma3",
"prompt": "Why is the sky blue?",
"logprobs": true
}'
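The same request can be sent from Python. This is a minimal sketch using only the standard library; it assumes the default local address from the curl example and that the endpoint streams one JSON object per line. The helper names `build_request` and `stream_chunks` are illustrative, not part of any Ollama client.

```python
import json
import urllib.request

# Assumption: Ollama is listening on its default local address, as in the curl example.
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_request(prompt, model="gemma3", url=OLLAMA_URL):
    """Build a POST request that asks for log probabilities."""
    payload = {"model": model, "prompt": prompt, "logprobs": True}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def stream_chunks(request):
    """Yield each streamed response chunk as a dict (one JSON object per line)."""
    with urllib.request.urlopen(request) as response:
        for line in response:
            if line.strip():  # skip blank keep-alive lines, if any
                yield json.loads(line)
```

With a server running, iterating over `stream_chunks(build_request("Why is the sky blue?"))` would yield the response chunks one at a time.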
When log probabilities are requested, each response chunk includes a "logprobs" field containing the token, its log probability, and its raw bytes (useful when a token holds only part of a multi-byte Unicode character):
{
"model": "gemma3",
"created_at": "2025-11-14T22:17:56.598562Z",
"response": "Okay",
"done": false,
"logprobs": [
{
"token": "Okay",
"logprob": -1.3434503078460693,
"bytes": [
79,
107,
97,
121
]
}
]
}
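A chunk like this can be turned into something more readable with a few lines of Python. The sketch below converts a log probability to a plain probability (assuming natural logarithms) and joins the raw bytes of consecutive tokens before decoding, so a character split across tokens is still reassembled. The helper names and the two-token emoji example are hypothetical.

```python
import codecs
import math

def token_probability(entry):
    """Convert one entry's log probability into a probability in (0, 1]."""
    return math.exp(entry["logprob"])

def decode_tokens(entries):
    """Decode the bytes of consecutive tokens, buffering incomplete UTF-8 sequences."""
    decoder = codecs.getincrementaldecoder("utf-8")()
    return "".join(decoder.decode(bytes(e["bytes"])) for e in entries)

# Entry copied from the example chunk above.
okay = {"token": "Okay", "logprob": -1.3434503078460693, "bytes": [79, 107, 97, 121]}
print(decode_tokens([okay]), round(token_probability(okay), 3))

# Hypothetical pair of tokens that split the 4-byte emoji U+1F600 in half.
halves = [
    {"token": "", "logprob": -0.1, "bytes": [240, 159]},
    {"token": "", "logprob": -0.2, "bytes": [152, 128]},
]
print(decode_tokens(halves))
```

Decoding each token's bytes on its own would fail for the second example, which is why the bytes are fed through an incremental decoder instead.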
top_logprobs
When "top_logprobs" is set to a number, that many of the most likely tokens at each position are also returned, making it possible to inspect alternative tokens. Below is an example request:
curl http://localhost:11434/api/generate -d '{
"model": "gemma3",
"prompt": "Why is the sky blue?",
"logprobs": true,
"top_logprobs": 3
}'
This will generate a stream of response chunks with the following fields:
{
"model": "gemma3",
"created_at": "2025-11-14T22:26:10.466324Z",
"response": "The",
"done": false,
"logprobs": [
{
"token": "The",
"logprob": -0.8361086845397949,
"bytes": [
84,
104,
101
],
"top_logprobs": [
{
"token": "The",
"logprob": -0.8361086845397949,
"bytes": [
84,
104,
101
]
},
{
"token": "Okay",
"logprob": -1.2590975761413574,
"bytes": [
79,
107,
97,
121
]
},
{
"token": "That",
"logprob": -1.2686877250671387,
"bytes": [
84,
104,
97,
116
]
}
]
}
]
}
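To compare the alternatives, the log probabilities can be converted to probabilities and ranked. This is a minimal sketch using the values from the example chunk above, assuming natural logarithms; the `alternatives` helper is hypothetical.

```python
import math

# The top_logprobs list from the example chunk above (bytes omitted for brevity).
top = [
    {"token": "The", "logprob": -0.8361086845397949},
    {"token": "Okay", "logprob": -1.2590975761413574},
    {"token": "That", "logprob": -1.2686877250671387},
]

def alternatives(top_logprobs):
    """Return (token, probability) pairs sorted from most to least likely."""
    pairs = [(t["token"], math.exp(t["logprob"])) for t in top_logprobs]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)

ranked = alternatives(top)
covered = sum(p for _, p in ranked)  # probability mass of the listed tokens
for token, p in ranked:
    print(f"{token!r}: {p:.3f}")
print(f"remaining mass for all other tokens: {1 - covered:.3f}")
```

The same pattern applies to classification: constrain the prompt so the answer is a single label token, then read the probability of each candidate label from "top_logprobs".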
Special thanks
Thank you @baptistejamin for adding Logprobs to Ollama’s API.
Vulkan support (opt-in)
Ollama 0.12.11 includes support for Vulkan acceleration. Vulkan brings support for a broad range of GPUs, including AMD and Intel GPUs and iGPUs. Vulkan support is not yet enabled by default and requires opting in by running Ollama with an environment variable:
OLLAMA_VULKAN=1 ollama serve
On PowerShell, use:
$env:OLLAMA_VULKAN="1"
ollama serve
For issues or feedback on using Vulkan with Ollama, create an issue labelled Vulkan and make sure to include server logs where possible to aid in debugging.
What’s Changed
- Ollama’s API and the OpenAI-compatible API now support Logprobs
- Ollama’s new app now supports WebP images
- Improved rendering performance in Ollama’s new app, especially when rendering code
- The "required" field in tool definitions will now be omitted if not specified
- Fixed issue where "tool_call_id" would be omitted when using the OpenAI-compatible API
- Fixed issue where ollama create would import data from both consolidated.safetensors and other safetensor files
- Ollama will now prefer dedicated GPUs over iGPUs when scheduling models
- Vulkan can now be enabled by setting OLLAMA_VULKAN=1. For example: OLLAMA_VULKAN=1 ollama serve
New Contributors
Full Changelog: https://github.com/ollama/ollama/compare/v0.12.10…v0.12.11