Ollama

Ollama is one of the two model providers currently available in the Retriqs app.

Use Ollama if you want to run models locally instead of relying on a cloud provider.

Last updated: April 4, 2026


Prerequisites

To use Ollama in Retriqs, you need:

  • a running Ollama instance
  • the host URL where that instance is reachable
  • a chat model pulled locally
  • an embedding model pulled locally
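A quick way to confirm the first two prerequisites is to hit Ollama's HTTP API. This sketch assumes the default host http://localhost:11434; adjust it if your instance runs elsewhere:

```shell
# Check that the Ollama instance is up and the host is reachable.
# Returns a small JSON object with the server version when Ollama is running.
curl -s http://localhost:11434/api/version

# List the models already pulled to this machine.
ollama list
```

If the curl call hangs or errors, Ollama is not running at that host, and Retriqs will not be able to use it.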

When to choose Ollama

Ollama is a good fit when you want:

  • a fully local setup with no cloud dependency
  • more control over where your data is processed
  • chat generation that runs on your own machine
  • embeddings computed on your own machine

What you can configure

When you choose Ollama in Retriqs, you can configure:

  • provider
  • model
  • API host
  • context size
  • max async runners

For embeddings, you can also configure:

  • embedding model
  • embedding dimension
  • token limit
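The embedding dimension you configure must match what your chosen model actually emits. One way to check it, assuming a running Ollama at the default host and the `bge-m3:latest` example model, is to request a test embedding from Ollama's `/api/embeddings` endpoint and count the vector length:

```shell
# Ask Ollama to embed a short string, then count the returned vector's length.
# python3 is used here only to parse the JSON response; jq works just as well.
curl -s http://localhost:11434/api/embeddings \
  -d '{"model": "bge-m3:latest", "prompt": "dimension check"}' \
  | python3 -c 'import json, sys; print(len(json.load(sys.stdin)["embedding"]))'
```

The printed number is the value to enter as the embedding dimension in Retriqs.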

Typical Ollama setup

A common setup looks like this:

  • LLM provider: Ollama
  • Embedding provider: Ollama
  • API host: http://localhost:11434

Typical model examples:

  • chat: qwen3:0.6b
  • embeddings: bge-m3:latest
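Neither model is downloaded automatically, so pull them before pointing Retriqs at Ollama (the tags below are the examples from this page; swap in whatever you configure):

```shell
# Download the example chat and embedding models to the local Ollama store.
ollama pull qwen3:0.6b
ollama pull bge-m3:latest
```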

Recommended use

Ollama is the best choice if you want Retriqs to run against local models on your own machine.

It is usually the better option for:

  • local-first usage
  • testing without cloud APIs
  • setups where you do not want to send data to an external provider

Things to watch

  • Ollama must be running before Retriqs can use it.
  • The selected model must already exist locally.
  • The configured host must match your Ollama setup.
  • Context size should be large enough for your use case.

If the selected Ollama model is not installed locally, indexing or querying will fail.
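A small pre-flight check can catch the missing-model case before an indexing run. This sketch queries the host's `/api/tags` endpoint (which lists installed models) and greps for the configured model name; the host and model name are placeholders for your own settings:

```shell
# Fail fast if the configured model is not installed on the Ollama host.
MODEL="bge-m3:latest"
if curl -s http://localhost:11434/api/tags | grep -q "\"name\":\"$MODEL\""; then
  echo "model $MODEL is installed"
else
  echo "model $MODEL is missing: run 'ollama pull $MODEL'" >&2
  exit 1
fi
```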
