Ollama
Ollama is one of the two model providers currently available in the Retriqs app.
Use Ollama if you want to run models locally instead of relying on a cloud provider.
Last updated: April 4, 2026
Prerequisites
To use Ollama in Retriqs, you need:
- a running Ollama instance
- a reachable Ollama host URL
- a local chat model
- a local embedding model
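A minimal sketch of satisfying these prerequisites from the command line, assuming a default Ollama install and the example models used later on this page (any locally available chat/embedding pair works):

```shell
# Start the Ollama server (skip if it already runs as a background service).
ollama serve &

# Pull a local chat model and a local embedding model.
# Model names are the examples from this page, not requirements.
ollama pull qwen3:0.6B
ollama pull bge-m3:latest

# Confirm both models are now available locally.
ollama list
```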
When to choose Ollama
Ollama is a good fit when you want:
- a local setup
- more control over where data is processed
- local generation
- local embeddings
What you can configure
When you choose Ollama in Retriqs, you can configure:
- provider
- model
- API host
- context size
- max async runners
For embeddings, you can also configure:
- embedding model
- embedding dimension
- token limit
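If you are unsure what embedding dimension to configure, you can ask the server directly. A sketch, assuming the default host, the `bge-m3:latest` example model, and `jq` installed; the `/api/embed` endpoint is available in recent Ollama versions (older versions use `/api/embeddings` with a slightly different payload):

```shell
# Request a test embedding and count its dimensions.
curl -s http://localhost:11434/api/embed \
  -d '{"model": "bge-m3:latest", "input": "hello"}' \
  | jq '.embeddings[0] | length'
```

The number printed is the value to use for the embedding dimension setting.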
Typical Ollama setup
A common setup looks like this:
- LLM provider: Ollama
- Embedding provider: Ollama
- API host: http://localhost:11434
Typical model examples:
- chat: qwen3:0.6B
- embeddings: bge-m3:latest
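The setup above can be sanity-checked before pointing Retriqs at it. A sketch, assuming the default host; a running Ollama instance answers the root URL with "Ollama is running":

```shell
# Quick connectivity check against the configured API host.
curl -s http://localhost:11434

# List the models the server can see, to confirm the chat and
# embedding models above are installed.
curl -s http://localhost:11434/api/tags
```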
Recommended use
Ollama is the best choice if you want Retriqs to run against local models on your own machine.
It is usually the better option for:
- local-first usage
- testing without cloud APIs
- setups where you do not want to send data to an external provider
Things to watch
- Ollama must be running before Retriqs can use it.
- The selected model must already exist locally.
- The configured host must match your Ollama setup.
- Context size must be large enough for your prompts plus any retrieved context; if it is too small, input will be truncated.
If the selected Ollama model is not installed locally, indexing or querying will fail.
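A quick pre-flight check can catch this before a failed run. A sketch using the example model names from this page; `grep -q` exits non-zero when a model is missing, so the hint only prints in that case:

```shell
# Confirm the configured models exist locally before indexing or querying.
ollama list | grep -q "qwen3:0.6b" || echo "chat model missing: run 'ollama pull qwen3:0.6B'"
ollama list | grep -q "bge-m3"     || echo "embedding model missing: run 'ollama pull bge-m3:latest'"
```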