Memory Consolidation

LLM-powered merging that achieves roughly 70% compression while preserving semantic fidelity.

Overview

Memory consolidation is an LLM-powered process that identifies redundant or overlapping memories and merges them, reducing storage footprint and retrieval noise while preserving key information. The system achieves approximately 70% compression with high semantic fidelity.

How It Works

  • Identification: the consolidation service scans for memories with high pairwise similarity
  • Clustering: similar memories are grouped into merge candidates
  • Merging: GPT-4 generates a consolidated memory that preserves all unique information from the cluster
  • Re-encoding: the merged memory is re-encoded across all four vector spaces
  • Graph maintenance: relationship edges are updated to point to the merged memory
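The identification and clustering steps above can be sketched as a greedy single-pass grouping over pairwise cosine similarity. This is illustrative only: the `Memory` shape and `clusterCandidates` helper are assumptions, not part of the metamemory API.

```typescript
// Hypothetical sketch of identification + clustering, not the library's implementation.
type Memory = { id: string; embedding: number[] };

// Cosine similarity between two embedding vectors.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Greedily assign each memory to the first cluster whose seed it resembles.
function clusterCandidates(memories: Memory[], threshold: number): Memory[][] {
  const clusters: Memory[][] = [];
  for (const m of memories) {
    const home = clusters.find((c) => cosine(c[0].embedding, m.embedding) >= threshold);
    if (home) home.push(m);
    else clusters.push([m]);
  }
  // Only clusters with at least two members are merge candidates.
  return clusters.filter((c) => c.length >= 2);
}
```

A real implementation would also cap cluster size (cf. maxClusterSize below) and use an index rather than brute-force pairwise comparison.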

Running Consolidation

import { ConsolidationService } from 'metamemory';

const consolidator = new ConsolidationService(engine);

const result = await consolidator.run({
  userId: 'agent-1',
  similarityThreshold: 0.85, // memories above this similarity are merge candidates
  minClusterSize: 2,
  maxClusterSize: 10,
});

console.log(result.merged);      // consolidated memories created
console.log(result.removed);     // original memories replaced
console.log(result.compression); // e.g., 0.72 (72% reduction)
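Assuming the compression figure is the net fraction of originals eliminated, it can be derived from the two counts above. The `compressionRatio` helper is a hypothetical illustration, not a metamemory export.

```typescript
// Hypothetical: compression as the net fraction of memories eliminated.
// E.g. 100 originals replaced by 28 merged memories -> (100 - 28) / 100 = 0.72.
function compressionRatio(removed: number, merged: number): number {
  if (removed === 0) return 0;
  return (removed - merged) / removed;
}
```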

Semantic Fidelity

The merging prompt instructs GPT-4 to preserve all unique facts, entities, relationships, and emotional context from the source memories. After merging, the system verifies that the consolidated memory's embeddings maintain high cosine similarity with each source memory.
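The post-merge verification described above can be sketched as a similarity gate: every source embedding must stay within a minimum cosine similarity of the merged memory's embedding. The function name and the 0.8 floor are illustrative assumptions.

```typescript
// Hypothetical fidelity gate: reject a merge if any source memory's embedding
// drifts too far from the consolidated memory's embedding.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

function passesFidelityCheck(
  merged: number[],
  sources: number[][],
  minSimilarity = 0.8, // assumed threshold, tune per deployment
): boolean {
  return sources.every((s) => cosine(merged, s) >= minSimilarity);
}
```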

Scheduling

Consolidation is typically run as a background process. You can trigger it manually or schedule it:

// Run after every N memories are created
engine.on('memoryCreated', async (count) => {
  if (count % 100 === 0) {
    await consolidator.run({ userId: 'agent-1' });
  }
});
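A fixed-interval schedule is another option. The sketch below is an assumption on top of standard timers, not a metamemory feature; the overlap guard prevents a slow run from stacking up behind the next tick.

```typescript
// Hypothetical interval-based scheduler with an overlap guard.
function scheduleConsolidation(runFn: () => Promise<void>, intervalMs: number): () => void {
  let running = false;
  const timer = setInterval(async () => {
    if (running) return; // skip this tick if the previous run is still in flight
    running = true;
    try {
      await runFn();
    } finally {
      running = false;
    }
  }, intervalMs);
  return () => clearInterval(timer); // call to stop the schedule
}
```

For example, `scheduleConsolidation(() => consolidator.run({ userId: 'agent-1' }), 60 * 60 * 1000)` would run a pass hourly.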