Memory

The memory service provides semantic memory stores for AI agents, enabling them to remember context across conversations using vector-based storage and retrieval.

Concepts

| Concept | Description |
| --- | --- |
| Memory Store | Configuration container linking an embedding model and vector index |
| Memory Item | A single memory entry with content, tags, scope, and importance |
| Search | Semantic similarity search across stored memories |
| Recall | Chat-optimized retrieval with token limit awareness |

Service Functions

| Function | Description |
| --- | --- |
| createMemoryStore() | Create store with vector index + embedding model |
| listMemoryStores() | List stores for project |
| getMemoryStore() | Get store by key |
| updateMemoryStore() | Update config/status |
| deleteMemoryStore() | Delete store + vector index |
| addMemory() | Add single item (embed → store) |
| addMemoryBatch() | Batch add items |
| listMemoryItems() | List items in store |
| getMemoryItem() | Get single item |
| updateMemoryItem() | Update content (re-embed if changed) |
| deleteMemoryItem() | Delete single item |
| deleteMemoryItemsBulk() | Bulk delete |
| searchMemories() | Semantic search |
| recallForChat() | Chat-optimized recall with token limit |

Create Store Request

```json
{
  "name": "Agent Working Memory",
  "vectorProviderKey": "pinecone-prod",
  "embeddingModelKey": "text-embedding-ada-002"
}
```

The store key is generated automatically, and creating a store provisions the backing vector index.

Client API

Create Memory Store

```http
POST /api/client/v1/memory/stores
Authorization: Bearer <token>
```

Add Memory

```http
POST /api/client/v1/memory/stores/:storeKey/memories
Authorization: Bearer <token>
```

```json
{
  "content": "User prefers dark mode and concise responses",
  "scope": "user",
  "tags": ["preferences", "ui"],
  "importance": 0.8,
  "metadata": { "userId": "user-123" }
}
```
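
A minimal TypeScript sketch of assembling this request on the client side. The helper `buildAddMemoryRequest`, the `baseUrl` parameter, the `Content-Type` header, and the choice of which body fields are optional are assumptions made for illustration; only the path, the bearer header, and the field names come from the example above.

```typescript
// Body fields mirror the JSON example above. Treating everything except
// `content` as optional is an assumption, not documented behavior.
interface AddMemoryBody {
  content: string;
  scope?: string;
  tags?: string[];
  importance?: number;
  metadata?: Record<string, unknown>;
}

interface PreparedRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Hypothetical helper: assembles the request without sending it.
function buildAddMemoryRequest(
  baseUrl: string,
  storeKey: string,
  token: string,
  memory: AddMemoryBody,
): PreparedRequest {
  // Encode the key so unusual characters cannot break the path.
  const path = `/api/client/v1/memory/stores/${encodeURIComponent(storeKey)}/memories`;
  return {
    url: `${baseUrl}${path}`,
    method: "POST",
    headers: {
      Authorization: `Bearer ${token}`,
      "Content-Type": "application/json", // assumption: not shown in the source
    },
    body: JSON.stringify(memory),
  };
}

const prepared = buildAddMemoryRequest(
  "https://example.invalid", // placeholder origin
  "mem-agent-working-memory",
  "<token>",
  {
    content: "User prefers dark mode and concise responses",
    scope: "user",
    tags: ["preferences", "ui"],
    importance: 0.8,
    metadata: { userId: "user-123" },
  },
);
console.log(prepared.url);
```

The `method`, `headers`, and `body` of the result can then be passed to whichever HTTP client the application already uses.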

Search Memories

```http
POST /api/client/v1/memory/stores/:storeKey/search
Authorization: Bearer <token>
```

```json
{
  "query": "What are the user's UI preferences?",
  "topK": 5,
  "scope": "user",
  "scopeId": "user-123",
  "tags": ["preferences"],
  "minScore": 0.7
}
```

Response:

```json
{
  "memories": [
    {
      "id": "mem_123",
      "content": "User prefers dark mode and concise responses",
      "score": 0.94,
      "scope": "user",
      "scopeId": "user-123",
      "tags": ["preferences", "ui"],
      "metadata": { "userId": "user-123" },
      "importance": 0.8
    }
  ],
  "query": "What are the user's UI preferences?",
  "storeKey": "mem-agent-working-memory"
}
```
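
Because the response arrives as untyped JSON, a client may want to check its shape before use. The sketch below is one way to do that in TypeScript. The names `MemoryHit`, `SearchResponse`, and `parseSearchResponse` are hypothetical, and the guard checks only a subset of the fields shown in the example above.

```typescript
// Field names come from the response example; the types are illustrative.
interface MemoryHit {
  id: string;
  content: string;
  score: number;
  tags: string[];
  importance: number;
}

interface SearchResponse {
  memories: MemoryHit[];
  query: string;
  storeKey: string;
}

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

function isMemoryHit(value: unknown): value is MemoryHit {
  if (!isRecord(value)) return false;
  const tags = value["tags"];
  return (
    typeof value["id"] === "string" &&
    typeof value["content"] === "string" &&
    typeof value["score"] === "number" &&
    typeof value["importance"] === "number" &&
    Array.isArray(tags) &&
    tags.every((tag: unknown) => typeof tag === "string")
  );
}

// Throws if the payload does not look like the documented example.
function parseSearchResponse(json: unknown): SearchResponse {
  if (!isRecord(json)) throw new Error("Search response is not an object");
  const memories = json["memories"];
  const query = json["query"];
  const storeKey = json["storeKey"];
  if (
    !Array.isArray(memories) ||
    !memories.every(isMemoryHit) ||
    typeof query !== "string" ||
    typeof storeKey !== "string"
  ) {
    throw new Error("Unexpected search response shape");
  }
  return { memories, query, storeKey };
}

const parsed = parseSearchResponse({
  memories: [
    {
      id: "mem_123",
      content: "User prefers dark mode and concise responses",
      score: 0.94,
      tags: ["preferences", "ui"],
      importance: 0.8,
    },
  ],
  query: "What are the user's UI preferences?",
  storeKey: "mem-agent-working-memory",
});
console.log(parsed.memories.map((hit) => `${hit.id} (${hit.score})`));
```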

Recall for Chat

```http
POST /api/client/v1/memory/stores/:storeKey/recall
Authorization: Bearer <token>
```

```json
{
  "query": "What does this user like?",
  "scope": "user",
  "scopeId": "user-123",
  "topK": 5,
  "maxTokens": 500
}
```

The recall function is optimized for chat contexts. It returns a compact context string together with the matched memories, trimmed to fit the requested token budget.
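
To make the token budget concrete, here is a rough TypeScript sketch of how matched memories could be trimmed into a context string. It is not the service's implementation: the four-characters-per-token estimate, the greedy strategy, the bullet formatting, and the names `estimateTokens` and `buildRecallContext` are all assumptions.

```typescript
interface RecalledMemory {
  content: string;
  score: number;
}

// Assumption: roughly four characters per token. A real tokenizer will differ.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Greedily keeps the best-scoring memories that still fit the budget.
function buildRecallContext(
  memories: RecalledMemory[],
  maxTokens: number,
): { context: string; used: RecalledMemory[] } {
  const used: RecalledMemory[] = [];
  const lines: string[] = [];
  let spent = 0;
  // Sort a copy so the caller's array is left untouched.
  const byScore = [...memories].sort((a, b) => b.score - a.score);
  for (const memory of byScore) {
    const line = `- ${memory.content}`;
    const cost = estimateTokens(line);
    if (spent + cost > maxTokens) continue; // too large, but a shorter one may still fit
    lines.push(line);
    used.push(memory);
    spent += cost;
  }
  return { context: lines.join("\n"), used };
}

const recallResult = buildRecallContext(
  [
    { content: "User prefers dark mode and concise responses", score: 0.94 },
    { content: "User is based in Berlin", score: 0.61 },
  ],
  500,
);
console.log(recallResult.context);
```

Skipping an oversized memory instead of stopping at it lets shorter, lower-ranked memories still use the remaining budget. Stopping at the first overflow would be an equally reasonable choice.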

Memory Properties

| Field | Description |
| --- | --- |
| content | The memory text |
| scope | Categorization (e.g., user, session, global) |
| source | Origin identifier |
| tags | Array of string tags for filtering |
| importance | Float 0-1, influences ranking |
| metadata | Arbitrary key-value pairs |
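
As an illustration, these properties could be modeled and checked on the client before a memory is sent. The type `MemoryItemInput`, the function `validateMemoryItem`, the optionality of each field, and the non-empty `content` rule are assumptions. Only the field names and the 0 to 1 range for `importance` come from the table above.

```typescript
interface MemoryItemInput {
  content: string;
  scope?: string; // e.g. "user", "session", "global"
  source?: string;
  tags?: string[];
  importance?: number; // expected to lie between 0 and 1
  metadata?: Record<string, unknown>;
}

// Returns a list of problems; an empty list means the item looks valid.
function validateMemoryItem(item: MemoryItemInput): string[] {
  const problems: string[] = [];
  if (item.content.trim().length === 0) {
    problems.push("content must not be empty"); // assumption, not a documented rule
  }
  if (
    item.importance !== undefined &&
    !(item.importance >= 0 && item.importance <= 1) // also rejects NaN
  ) {
    problems.push("importance must be between 0 and 1");
  }
  return problems;
}

console.log(
  validateMemoryItem({
    content: "User prefers dark mode and concise responses",
    scope: "user",
    tags: ["preferences", "ui"],
    importance: 0.8,
  }),
);
```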

Pipeline

Adding a memory:

Content → Embed (via inference service) → Vector Upsert → Store metadata
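
The order of these steps can be sketched with in-memory stand-ins. Everything here is hypothetical: `LocalBackend` replaces the vector service and the metadata store, `EmbedFn` replaces the inference service, the `mem_` identifier scheme is invented, and the calls are synchronous only to keep the example short.

```typescript
interface StoredMemory {
  id: string;
  content: string;
  tags: string[];
  importance: number;
}

// Stand-in for the vector index and the metadata store.
interface LocalBackend {
  vectors: Map<string, number[]>;
  items: Map<string, StoredMemory>;
}

// Stand-in for the inference service; a real call would be asynchronous.
type EmbedFn = (text: string) => number[];

function addMemoryLocally(
  backend: LocalBackend,
  embed: EmbedFn,
  input: Omit<StoredMemory, "id">,
): StoredMemory {
  const vector = embed(input.content); // 1. embed the content
  const id = `mem_${backend.items.size + 1}`; // illustrative id scheme
  backend.vectors.set(id, vector); // 2. upsert the vector
  const item: StoredMemory = { id, ...input };
  backend.items.set(id, item); // 3. store the metadata
  return item;
}

const backend: LocalBackend = { vectors: new Map(), items: new Map() };
// Toy embedding: character count and word count. Not a real model.
const toyEmbed: EmbedFn = (text) => [text.length, text.split(" ").length];
const savedItem = addMemoryLocally(backend, toyEmbed, {
  content: "User prefers dark mode and concise responses",
  tags: ["preferences", "ui"],
  importance: 0.8,
});
console.log(savedItem.id, backend.vectors.get(savedItem.id));
```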

Searching memories:

Query → Embed → Vector Search → Rank by score + importance → Return
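
The source does not say how score and importance are combined during ranking. One plausible reading is a weighted sum, sketched below. The function `rankMemories`, the `ScoredMemory` type, and the default weight of 0.2 are assumptions chosen for illustration.

```typescript
interface ScoredMemory {
  id: string;
  score: number; // similarity from the vector search
  importance: number; // 0 to 1, set when the memory was added
}

// Assumption: blend the two signals linearly. The real formula may differ.
function rankMemories(
  hits: ScoredMemory[],
  importanceWeight = 0.2,
): ScoredMemory[] {
  const blended = (hit: ScoredMemory): number =>
    (1 - importanceWeight) * hit.score + importanceWeight * hit.importance;
  // Sort a copy in descending order of the blended value.
  return [...hits].sort((a, b) => blended(b) - blended(a));
}

const rankedHits = rankMemories([
  { id: "a", score: 0.9, importance: 0.1 }, // blended: 0.74
  { id: "b", score: 0.85, importance: 0.9 }, // blended: 0.86
]);
console.log(rankedHits.map((hit) => hit.id)); // ["b", "a"]
```

With a weight of 0 the order falls back to pure similarity, which makes the influence of `importance` easy to inspect in isolation.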

Dependencies

  • Inference Service — For generating embeddings
  • Vector Service — For storing and querying vectors

Community edition is AGPL-3.0. Commercial licensing and support are available separately.