Recall & Context Injection
The recall tool is the primary way agents access their memory context. Rather than searching for specific information, agents pass their current context to recall and get back the memories most likely to be relevant.
How it works
recall combines two signals:
- Relevance: a hybrid BM25 + vector search scores the current context against stored memories
- Salience: an importance score computed with FSRS (Free Spaced Repetition Scheduler), which favors memories that have been reinforced or recently accessed
The final ranking fuses both scores. A memory that’s both semantically relevant and highly salient ranks highest.
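To make the fusion step concrete, here is a minimal sketch in Python. The source does not say how the two scores are combined, so the weighted sum, the `relevance_weight` parameter, and the `ScoredMemory` type are all assumptions made for illustration. Both scores are assumed to be normalized to the range 0 to 1.

```python
from dataclasses import dataclass


@dataclass
class ScoredMemory:
    id: str
    relevance: float  # hybrid search score, assumed normalized to [0, 1]
    salience: float   # FSRS-derived salience, assumed to lie in [0, 1]


def fuse(memories, relevance_weight=0.7, limit=5):
    """Rank memories by a weighted sum of relevance and salience.

    The weighted sum is an assumed fusion rule, not the documented one.
    """
    salience_weight = 1.0 - relevance_weight
    ranked = sorted(
        memories,
        key=lambda m: relevance_weight * m.relevance + salience_weight * m.salience,
        reverse=True,
    )
    return ranked[:limit]


candidates = [
    ScoredMemory("relevant_but_stale", relevance=0.9, salience=0.1),
    ScoredMemory("relevant_and_salient", relevance=0.8, salience=0.9),
    ScoredMemory("salient_but_off_topic", relevance=0.1, salience=0.95),
]
top = fuse(candidates, limit=2)
print([m.id for m in top])
```

With these toy numbers, the memory that scores well on both signals comes first, and the salient but off-topic memory is dropped.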
Usage pattern
The recommended agent pattern is to call recall at the start of each conversation turn:
1. User sends message
2. Agent calls recall(context=user_message)
3. memcp returns top N relevant memories
4. Agent includes memories in its context window
5. Agent generates response
6. Agent calls store_memory for anything worth remembering

This keeps the agent’s working memory small while ensuring relevant long-term context is always available.
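The turn loop described above can be sketched as follows. `InMemoryClient` is a hypothetical stand-in written for this example, not the real memcp client, and its word-overlap matching is a toy substitute for the actual ranking. Only the tool names `recall` and `store_memory` and the `context` and `limit` arguments come from this page.

```python
class InMemoryClient:
    """Hypothetical stand-in for a memcp client (not the real API)."""

    def __init__(self):
        self.memories = []

    def recall(self, context, limit=5):
        # Toy relevance: number of words shared by the context and a memory.
        words = set(context.lower().split())
        scored = [(len(words & set(m.lower().split())), m) for m in self.memories]
        scored = [pair for pair in scored if pair[0] > 0]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [m for _, m in scored[:limit]]

    def store_memory(self, content):
        self.memories.append(content)


def handle_turn(client, user_message, generate):
    """Run one conversation turn: recall, inject, respond, store."""
    memories = client.recall(context=user_message, limit=5)  # steps 2 and 3
    prompt = "\n".join(
        ["Relevant memories:", *memories, "", "User: " + user_message]
    )  # step 4
    reply = generate(prompt)  # step 5
    # Step 6: a real agent would store selectively; this sketch stores every message.
    client.store_memory(user_message)
    return reply, prompt


client = InMemoryClient()
client.store_memory("We chose PostgreSQL for the database schema")
reply, prompt = handle_turn(
    client,
    "What database schema did we pick?",
    generate=lambda p: "stub reply",  # placeholder for the model call
)
print(prompt)
```

The `generate` callable is passed in so the loop stays independent of any particular model API.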
Salience decay
Memories that aren’t recalled lose salience over time, much as human memories fade. This keeps old, irrelevant entries from dominating the memory store.
You can override this by setting a high importance value when storing critical information that should resist decay.
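As a rough illustration of decay, the sketch below uses the power-law forgetting curve published for FSRS-4.5, in which retrievability falls to 0.9 once the elapsed time equals the memory's stability. Whether memcp derives salience from exactly this curve is an assumption, and modeling a high importance value as a larger stability is a hypothetical mapping chosen for the example.

```python
# Constants from the FSRS-4.5 forgetting curve.
FACTOR = 19 / 81
DECAY = -0.5


def retrievability(days_since_recall, stability_days):
    """FSRS-4.5 forgetting curve: 1.0 at day 0, 0.9 when elapsed == stability."""
    return (1 + FACTOR * days_since_recall / stability_days) ** DECAY


# Hypothetical mapping: higher importance is modeled as a longer stability.
ordinary = retrievability(days_since_recall=30, stability_days=7)
critical = retrievability(days_since_recall=30, stability_days=70)
print(f"ordinary: {ordinary:.2f}, critical: {critical:.2f}")
```

Under these assumptions, a memory stored with a longer stability retains more of its salience after the same 30 days without recall.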
Example
{
  "tool": "recall",
  "arguments": {
    "context": "Can you remind me what we decided about the database schema last week?",
    "limit": 5
  }
}

Response:

{
  "memories": [
    {
      "id": "mem_01jk9ab123",
      "content": "Decided to use PostgreSQL with a separate audit log table. UUID primary keys throughout. Discussed May 22nd.",
      "salience": 0.87,
      "tags": ["decision", "database", "schema"]
    }
  ]
}
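One way an agent might turn this response into text for its context window is sketched below. The `format_for_context` helper and its `min_salience` threshold are hypothetical and were introduced for this example. The response shape is taken from the example above.

```python
import json

# The example response from this page, as the agent would receive it.
RESPONSE = """
{
  "memories": [
    {
      "id": "mem_01jk9ab123",
      "content": "Decided to use PostgreSQL with a separate audit log table. UUID primary keys throughout. Discussed May 22nd.",
      "salience": 0.87,
      "tags": ["decision", "database", "schema"]
    }
  ]
}
"""


def format_for_context(response_json, min_salience=0.0):
    """Render recalled memories as a bullet list, skipping low-salience ones."""
    memories = json.loads(response_json).get("memories", [])
    lines = []
    for memory in memories:
        if memory.get("salience", 0.0) < min_salience:
            continue
        tags = ", ".join(memory.get("tags", []))
        lines.append(f"- {memory['content']} [tags: {tags}]")
    return "\n".join(lines)


context_block = format_for_context(RESPONSE)
print(context_block)
```

The resulting block can be prepended to the prompt before the user's message.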