
Integration

Using MetaMemory with Ollama (Self-Hosted)

Ollama enables fully self-hosted embedding generation, and MetaMemory's integration brings production-grade memory management to local deployments. For teams with strict data privacy requirements (healthcare, finance, government, or any environment where data cannot leave the network), Ollama eliminates the need to send content to third-party APIs entirely.

The nomic-embed-text model is the recommended default: it produces 768-dimensional vectors with quality that rivals commercial embedding APIs, runs efficiently on consumer hardware, and is fully open-source. For higher-dimensional representations, mxbai-embed-large offers 1024-dimensional vectors with stronger performance on complex content, while all-minilm provides a lightweight 384-dimensional alternative for resource-constrained environments.

MetaMemory's Ollama integration requires only a base URL pointing to your Ollama instance, which can run on the same machine, a local server, or any network-accessible host. The system automatically detects available models and their dimensions. Because Ollama runs locally, there are no API costs, no rate limits, and no network round-trip for embedding generation, which often makes it the lowest-latency option for high-throughput memory encoding, depending on your hardware. MetaMemory handles the specifics of Ollama's API format and provides the same multi-vector encoding, adaptive retrieval, and consolidation capabilities as cloud-hosted providers. For teams that want the full MetaMemory experience with zero external dependencies, Ollama is the clear choice.

Setup Guide

1. Install Ollama and Pull an Embedding Model

Download and install Ollama from ollama.com — it is available for macOS, Linux, and Windows. Once installed, open a terminal and run "ollama pull nomic-embed-text" to download the recommended embedding model. This downloads around 275MB and takes a few minutes depending on your connection. Verify the installation by running "ollama list" to confirm the model appears. You can also pull additional models like mxbai-embed-large for higher quality.
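The commands from this step, as a terminal session (model sizes are approximate):

```shell
# Pull the recommended embedding model (~275 MB download)
ollama pull nomic-embed-text

# Optional: pull the higher-dimensional alternative
ollama pull mxbai-embed-large

# Verify that the models appear in the local library
ollama list
```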

2. Configure Ollama in MetaMemory

In your MetaMemory dashboard, go to Settings then Provider Keys and select "Ollama" as the provider. Instead of an API key, you will provide the base URL where your Ollama instance is running — typically http://localhost:11434 for a local installation. Select nomic-embed-text as the default model. MetaMemory will connect to your Ollama instance, verify the model is available, and confirm the embedding dimensions. No API key is needed since Ollama runs locally.
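Before pointing MetaMemory at your instance, you can confirm Ollama is reachable and serving embeddings using its own HTTP API (a quick sanity check, assuming the default port 11434):

```shell
# List the models the local Ollama instance is serving
curl http://localhost:11434/api/tags

# Request a test embedding directly from Ollama; the response
# contains an "embedding" array (768 values for nomic-embed-text)
curl http://localhost:11434/api/embeddings \
  -d '{"model": "nomic-embed-text", "prompt": "hello world"}'
```

If both calls succeed, MetaMemory's verification step should pass with the same base URL.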

3. Test Local Memory Storage

Call the MetaMemory API to store a test memory. The system routes your content to your local Ollama instance for embedding generation — no data leaves your network. You should notice faster embedding latency compared to cloud providers since there is no network round-trip. Test retrieval to confirm the full pipeline works. If you plan to run MetaMemory in production with Ollama, consider dedicating GPU resources to the Ollama instance for consistent throughput.
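A sketch of a store-and-retrieve round-trip. The `/v1/memories` and `/v1/memories/search` paths are illustrative assumptions, not confirmed endpoints; check your MetaMemory API reference for the exact routes:

```shell
# Store a test memory (embedded locally via your Ollama instance)
# NOTE: endpoint path is assumed for illustration
curl -X POST https://api.metamemory.tech/v1/memories \
  -H "Authorization: Bearer YOUR_METAMEMORY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"content": "Ollama integration smoke test"}'

# Retrieve it to confirm the full pipeline works end to end
# NOTE: endpoint path is assumed for illustration
curl -X POST https://api.metamemory.tech/v1/memories/search \
  -H "Authorization: Bearer YOUR_METAMEMORY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"query": "smoke test"}'
```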

Configuration Example

curl -X POST https://api.metamemory.tech/v1/providers \
  -H "Authorization: Bearer YOUR_METAMEMORY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "provider": "ollama",
    "base_url": "http://localhost:11434",
    "default_model": "nomic-embed-text",
    "settings": {
      "keep_alive": "5m",
      "num_ctx": 8192
    }
  }'

Supported Models

nomic-embed-text (default)
mxbai-embed-large
all-minilm

Capabilities

Embeddings, LLM

Ready to use Ollama with MetaMemory?

Get started in minutes. Connect your Ollama instance and give your agents persistent, intelligent memory.

Your agents deserve to remember

Bring your own AI keys. Integrate in minutes. Your data stays yours.