Recall & Context Injection

The recall tool is the primary way agents access their memory context. Rather than searching for specific information, agents pass their current context to recall and get back the memories most likely to be relevant.

How it works

recall combines two signals:

  1. Relevance: a hybrid BM25 + vector search scores the current context against stored memories
  2. Salience: an importance score computed with FSRS (Free Spaced Repetition Scheduler), which favors memories that have been reinforced or recently accessed

The final ranking fuses both scores. A memory that’s both semantically relevant and highly salient ranks highest.
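The fusion rule itself is not spelled out here. As a rough illustration only, the sketch below assumes both scores are normalized to [0, 1] and combines them with a weighted sum. The `Candidate` type, the `fuse` and `rank` helpers, and the 0.7 weight are all hypothetical, not memcp's actual implementation.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    memory_id: str
    relevance: float  # hybrid search score, assumed normalized to [0, 1]
    salience: float   # FSRS-derived importance, assumed in [0, 1]


def fuse(candidate: Candidate, relevance_weight: float = 0.7) -> float:
    """Weighted sum of relevance and salience (hypothetical fusion rule)."""
    return (relevance_weight * candidate.relevance
            + (1.0 - relevance_weight) * candidate.salience)


def rank(candidates: list[Candidate], limit: int = 5) -> list[Candidate]:
    """Return the top `limit` candidates by fused score, highest first."""
    return sorted(candidates, key=fuse, reverse=True)[:limit]


candidates = [
    Candidate("relevant_but_stale", relevance=0.9, salience=0.2),
    Candidate("relevant_and_salient", relevance=0.8, salience=0.9),
    Candidate("salient_but_off_topic", relevance=0.1, salience=0.95),
]
top = rank(candidates, limit=2)
print([c.memory_id for c in top])
```

With these made-up numbers, the memory that scores well on both signals comes first, and the highly salient but off-topic memory is dropped from the top two.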

Usage pattern

The recommended agent pattern is to call recall at the start of each conversation turn:

1. User sends message
2. Agent calls recall(context=user_message)
3. memcp returns the top N relevant memories (N is set by the limit argument)
4. Agent includes memories in its context window
5. Agent generates response
6. Agent calls store_memory for anything worth remembering

This keeps the agent’s working memory small while ensuring relevant long-term context is always available.
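The six steps above can be sketched as a single turn handler. Only the tool names `recall` and `store_memory` come from this page. The `MemoryClient` interface, the `InMemoryStub` stand-in, and the `generate` callback are hypothetical, and the stub's word-overlap matching is a toy replacement for the real ranking.

```python
from typing import Callable, Protocol


class MemoryClient(Protocol):
    """Hypothetical client interface; only the tool names come from the text."""

    def recall(self, context: str, limit: int = 5) -> list[dict]: ...

    def store_memory(self, content: str) -> None: ...


class InMemoryStub:
    """Toy stand-in for memcp so the sketch runs without a server."""

    def __init__(self) -> None:
        self.items: list[str] = []

    def recall(self, context: str, limit: int = 5) -> list[dict]:
        # Naive word-overlap match; the real tool uses hybrid search plus salience.
        words = set(context.lower().split())
        hits = [m for m in self.items if words & set(m.lower().split())]
        return [{"content": m} for m in hits[:limit]]

    def store_memory(self, content: str) -> None:
        self.items.append(content)


def handle_turn(client: MemoryClient, user_message: str,
                generate: Callable[[str], str]) -> str:
    memories = client.recall(context=user_message, limit=5)  # steps 2-3
    memory_block = "\n".join(f"- {m['content']}" for m in memories)
    prompt = f"Relevant memories:\n{memory_block}\n\nUser: {user_message}"  # step 4
    reply = generate(prompt)  # step 5
    # Step 6, simplified: a real agent would store only what is worth remembering.
    client.store_memory(f"User said: {user_message}")
    return reply


client = InMemoryStub()
client.store_memory("Decided to use PostgreSQL with UUID primary keys.")
# An echoing `generate` makes the assembled prompt visible as the reply.
reply = handle_turn(client, "Which database did we pick, PostgreSQL or MySQL?",
                    generate=lambda prompt: prompt)
print(reply)
```

Passing the model call in as `generate` keeps the memory steps independent of any particular LLM client.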

Salience decay

Memories that aren’t recalled lose salience over time, much as unrehearsed human memories fade. This prevents the memory store from becoming dominated by old, irrelevant entries.

You can override this by setting a high importance value when storing critical information that should resist decay.
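The exact decay formula is not given here. As an illustration only, the sketch below uses the power-law forgetting curve from FSRS-4.5, in which a memory's retrievability falls to 0.9 once the elapsed time equals its stability. Treating `stability_days` as the quantity that a high importance value raises is an assumption, not documented memcp behavior.

```python
DECAY = -0.5
FACTOR = 19 / 81  # chosen so retrievability is 0.9 when elapsed == stability


def retrievability(elapsed_days: float, stability_days: float) -> float:
    """FSRS-4.5 style power-law forgetting curve."""
    return (1 + FACTOR * elapsed_days / stability_days) ** DECAY


# Compare a fragile memory with a stable one (stability values are made up).
for days in (0, 10, 30, 90):
    low = retrievability(days, stability_days=10)
    high = retrievability(days, stability_days=100)
    print(f"day {days:>2}: stability=10 -> {low:.2f}, stability=100 -> {high:.2f}")
```

Under this curve a memory with higher stability decays more slowly, which is the kind of effect a high importance value is described as having.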

Example

{
  "tool": "recall",
  "arguments": {
    "context": "Can you remind me what we decided about the database schema last week?",
    "limit": 5
  }
}

Response:

{
  "memories": [
    {
      "id": "mem_01jk9ab123",
      "content": "Decided to use PostgreSQL with a separate audit log table. UUID primary keys throughout. Discussed May 22nd.",
      "salience": 0.87,
      "tags": ["decision", "database", "schema"]
    }
  ]
}
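One way an agent might turn such a response into text for its context window is sketched below. The field names match the example response. The `format_memories` helper and its output layout are hypothetical.

```python
import json

response_text = """
{
  "memories": [
    {
      "id": "mem_01jk9ab123",
      "content": "Decided to use PostgreSQL with a separate audit log table. UUID primary keys throughout. Discussed May 22nd.",
      "salience": 0.87,
      "tags": ["decision", "database", "schema"]
    }
  ]
}
"""


def format_memories(raw: str) -> str:
    """Render a recall response as a bulleted block for the prompt."""
    memories = json.loads(raw).get("memories", [])
    lines = []
    for m in memories:
        tags = ", ".join(m.get("tags", []))
        lines.append(f"- [{tags}] {m['content']} (salience {m['salience']:.2f})")
    return "\n".join(lines)


print(format_memories(response_text))
```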