Ollama

Ollama is one of the two model providers currently available in the Retriqs app.

Use Ollama if you want to run models locally instead of relying on a cloud provider.

Last updated: April 4, 2026


Prerequisites

To use Ollama in Retriqs, you need:

  • a running Ollama instance
  • the host URL where that instance is reachable
  • a chat model pulled locally
  • an embedding model pulled locally
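A quick way to confirm the first two prerequisites is to hit Ollama's HTTP API. This sketch assumes the default host http://localhost:11434; adjust it if your instance runs elsewhere:

```shell
# Check that the Ollama instance is up and the host is reachable.
# Returns a small JSON object with the server version when Ollama is running.
curl -s http://localhost:11434/api/version

# List the models already pulled to this machine.
ollama list
```

If the curl call hangs or errors, Ollama is not running at that host, and Retriqs will not be able to use it.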

When to choose Ollama

Ollama is a good fit when you want:

  • a fully local setup with no cloud dependency
  • more control over where your data is processed
  • chat generation that runs on your own machine
  • embeddings computed on your own machine

What you can configure

When you choose Ollama in Retriqs, you can configure:

  • provider
  • model
  • API host
  • context size
  • max async runners

For embeddings, you can also configure:

  • embedding model
  • embedding dimension
  • token limit
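The embedding dimension you configure must match what your chosen model actually emits. One way to check it, assuming a running Ollama at the default host and the `bge-m3:latest` example model, is to request a test embedding from Ollama's `/api/embeddings` endpoint and count the vector length:

```shell
# Ask Ollama to embed a short string, then count the returned vector's length.
# python3 is used here only to parse the JSON response; jq works just as well.
curl -s http://localhost:11434/api/embeddings \
  -d '{"model": "bge-m3:latest", "prompt": "dimension check"}' \
  | python3 -c 'import json, sys; print(len(json.load(sys.stdin)["embedding"]))'
```

The printed number is the value to enter as the embedding dimension in Retriqs.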

Typical Ollama setup

A common setup looks like this:

  • LLM provider: Ollama
  • Embedding provider: Ollama
  • API host: http://localhost:11434

Typical model examples:

  • chat: qwen3:0.6b
  • embeddings: bge-m3:latest
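Neither model is downloaded automatically, so pull them before pointing Retriqs at Ollama (the tags below are the examples from this page; swap in whatever you configure):

```shell
# Download the example chat and embedding models to the local Ollama store.
ollama pull qwen3:0.6b
ollama pull bge-m3:latest
```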

Recommended use

Ollama is the best choice if you want Retriqs to run against local models on your own machine.

It is usually the better option for:

  • local-first usage
  • testing without cloud APIs
  • setups where you do not want to send data to an external provider

Things to watch

  • Ollama must be running before Retriqs can use it.
  • The selected model must already exist locally.
  • The configured host must match your Ollama setup.
  • Context size should be large enough for your use case.

If the selected Ollama model is not installed locally, indexing or querying will fail.
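A small pre-flight check can catch the missing-model case before an indexing run. This sketch queries the host's `/api/tags` endpoint (which lists installed models) and greps for the configured model name; the host and model name are placeholders for your own settings:

```shell
# Fail fast if the configured model is not installed on the Ollama host.
MODEL="bge-m3:latest"
if curl -s http://localhost:11434/api/tags | grep -q "\"name\":\"$MODEL\""; then
  echo "model $MODEL is installed"
else
  echo "model $MODEL is missing: run 'ollama pull $MODEL'" >&2
  exit 1
fi
```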
