Mental Models: Pre-Computed Reflections
Mental models are pre-computed, cached reflections that capture the current synthesized state of the memories in a bank. They are generated by running a Reflect query and can be automatically or manually refreshed as the bank's knowledge changes.
Because a mental model is pre-computed, retrieving one is a fast lookup — no LLM call is needed at read time, unlike a live Reflect call. This makes mental models ideal for information you need frequently and quickly, such as user preferences. For example, if you create an auto-refreshing mental model for user preferences, it will always contain an up-to-date synthesis of all the memories and observations about that user's preferences.
Mental models are updated asynchronously. Even with auto-refresh enabled, they are eventually consistent — there may be a brief delay between new memories being added and the model reflecting them.
Overview
Mental models provide:
- Pre-computed retrieval — Content is generated ahead of time, so reads are instant with no LLM call required
- Cached reflections — Each model is the result of a Reflect query, synthesizing information across multiple memories into a single up-to-date summary
- Automatic or manual refresh — Optionally auto-refresh after memory consolidation, or manually refresh on demand
- Hierarchical priority — During Reflect, mental models are checked first and injected as high-priority context
- Tag-based organization — Filter and categorize models with tags
- Eventually consistent — Updates happen asynchronously in the background
Use Cases
| Use Case | Source Query Example |
|---|---|
| User Profile | "What do we know about this user's preferences and background?" |
| FAQ Bot | "What are the most common questions and their answers?" |
| Status Report | "What is the current status of the project?" |
| Team Directory | "Who works here and what do they do?" |
| Onboarding Guide | "What does a new team member need to know?" |
| Policy Summary | "What are the key policies and guidelines?" |
Creating a Mental Model
Mental model creation is asynchronous — it runs a Reflect query in the background and returns an operation_id to track progress.
- Python
- TypeScript
- cURL
from hindsight_client import Hindsight
client = Hindsight(
base_url="https://api.hindsight.vectorize.io",
api_key="your-api-key"
)
# Create a mental model
result = client.create_mental_model(
bank_id="my-assistant",
name="User Profile",
source_query="What do we know about this user's preferences and background?"
)
print(f"Creating mental model (operation: {result.operation_id})")
import { HindsightClient } from '@vectorize-io/hindsight-client';
const client = new HindsightClient({
baseUrl: 'https://api.hindsight.vectorize.io',
apiKey: 'your-api-key'
});
// Create a mental model
const result = await client.createMentalModel(
'my-assistant',
'User Profile',
'What do we know about this user\'s preferences and background?'
);
console.log(`Creating mental model (operation: ${result.operation_id})`);
curl -X POST https://api.hindsight.vectorize.io/v1/default/banks/my-assistant/mental-models \
-H "Authorization: Bearer your-api-key" \
-H "Content-Type: application/json" \
-d '{
"name": "User Profile",
"source_query": "What do we know about this user'\''s preferences and background?"
}'
Create with Options
You can specify tags, token limits, and automatic refresh behavior:
- Python
- TypeScript
- cURL
result = client.create_mental_model(
bank_id="my-assistant",
name="Team Directory",
source_query="Who works here and what do they do?",
tags=["team", "directory"],
max_tokens=4096,
trigger={"refresh_after_consolidation": True}
)
const result = await client.createMentalModel(
'my-assistant',
'Team Directory',
'Who works here and what do they do?',
{
tags: ['team', 'directory'],
maxTokens: 4096,
trigger: { refreshAfterConsolidation: true }
}
);
curl -X POST https://api.hindsight.vectorize.io/v1/default/banks/my-assistant/mental-models \
-H "Authorization: Bearer your-api-key" \
-H "Content-Type: application/json" \
-d '{
"name": "Team Directory",
"source_query": "Who works here and what do they do?",
"tags": ["team", "directory"],
"max_tokens": 4096,
"trigger": {
"refresh_after_consolidation": true
}
}'
Create Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| name | string | Yes | Human-readable name for the mental model |
| source_query | string | Yes | The query to run through Reflect to generate content |
| tags | string[] | No | Tags for filtering and organization |
| max_tokens | integer | No | Maximum tokens for generated content (default: 2048, range: 256–8192) |
| trigger | object | No | Automatic refresh settings |
| trigger.refresh_after_consolidation | boolean | No | Regenerate after memory consolidation (default: false) |
Create Response
{
"operation_id": "op_abc123"
}
The operation_id can be used to track the creation progress. The model will be available once the background Reflect operation completes.
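Since creation completes in the background, callers typically poll until the operation reaches a terminal state. The exact operation-status endpoint is not shown in this document, so this sketch wraps whatever status lookup your SDK provides behind a zero-argument callable (`get_status` is a placeholder, not a documented API):

```python
import time

def wait_for_operation(get_status, timeout=60.0, interval=1.0):
    """Poll a status callable until it reports a terminal state or times out.

    get_status is any zero-argument callable returning a status string,
    e.g. a lambda wrapping your SDK's operation-status lookup.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = get_status()
        # Treat both success and failure as terminal; keep polling otherwise.
        if status in ("completed", "failed"):
            return status
        time.sleep(interval)
    raise TimeoutError(f"operation still pending after {timeout}s")
```

Adjust the status strings and the lookup call to match your SDK; the polling loop itself is generic.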
Listing Mental Models
- Python
- TypeScript
- cURL
# List all mental models
models = client.list_mental_models(bank_id="my-assistant")
for model in models.items:
    print(f"{model.name}: {(model.content or '')[:100]}...")
# Filter by tags
models = client.list_mental_models(bank_id="my-assistant", tags=["team"])
// List all mental models
const models = await client.listMentalModels('my-assistant');
models.items.forEach(model => {
console.log(`${model.name}: ${model.content?.substring(0, 100)}...`);
});
// Filter by tags
const filtered = await client.listMentalModels('my-assistant', {
tags: ['team']
});
# List all mental models
curl -X GET https://api.hindsight.vectorize.io/v1/default/banks/my-assistant/mental-models \
-H "Authorization: Bearer your-api-key"
# Filter by tags
curl -X GET "https://api.hindsight.vectorize.io/v1/default/banks/my-assistant/mental-models?tags=team" \
-H "Authorization: Bearer your-api-key"
List Response
{
"items": [
{
"id": "mm_abc123",
"bank_id": "my-assistant",
"name": "User Profile",
"source_query": "What do we know about this user?",
"content": "The user is a software engineer who prefers dark mode...",
"tags": [],
"max_tokens": 2048,
"trigger": {
"refresh_after_consolidation": false
},
"last_refreshed_at": "2024-03-15T10:30:00Z",
"created_at": "2024-03-15T10:30:00Z"
}
]
}
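The last_refreshed_at timestamp makes it easy to spot stale models client-side, for example to decide when a manual refresh is worth triggering. A minimal sketch that flags any model older than a threshold, assuming RFC 3339 timestamps like the one in the response above:

```python
from datetime import datetime, timedelta, timezone

def is_stale(last_refreshed_at: str, max_age: timedelta) -> bool:
    """True when a model's cached content is older than max_age."""
    # The API returns RFC 3339 timestamps; fromisoformat needs an explicit offset,
    # so rewrite the trailing "Z" for compatibility with older Python versions.
    refreshed = datetime.fromisoformat(last_refreshed_at.replace("Z", "+00:00"))
    return datetime.now(timezone.utc) - refreshed > max_age
```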
Getting a Mental Model
- Python
- TypeScript
- cURL
model = client.get_mental_model(
bank_id="my-assistant",
mental_model_id="mm_abc123"
)
print(f"Name: {model.name}")
print(f"Content: {model.content}")
print(f"Last refreshed: {model.last_refreshed_at}")
const model = await client.getMentalModel('my-assistant', 'mm_abc123');
console.log(`Name: ${model.name}`);
console.log(`Content: ${model.content}`);
console.log(`Last refreshed: ${model.last_refreshed_at}`);
curl -X GET https://api.hindsight.vectorize.io/v1/default/banks/my-assistant/mental-models/mm_abc123 \
-H "Authorization: Bearer your-api-key"
Mental Model Response
| Field | Description |
|---|---|
| id | Unique identifier for the mental model |
| bank_id | The memory bank this model belongs to |
| name | Human-readable name |
| source_query | The Reflect query used to generate content |
| content | The generated content (synthesized from memories) |
| tags | Tags for filtering |
| max_tokens | Maximum token limit for content |
| trigger | Automatic refresh settings |
| last_refreshed_at | When the model was last refreshed |
| created_at | When the model was created |
| reflect_response | Full Reflect response including based_on sources |
Refreshing a Mental Model
Refreshing re-runs the source query through Reflect to update the content with the latest memories. This is an asynchronous operation.
- Python
- TypeScript
- cURL
# Manually refresh a mental model
result = client.refresh_mental_model(
bank_id="my-assistant",
mental_model_id="mm_abc123"
)
print(f"Refresh operation: {result.operation_id}")
// Manually refresh a mental model
const result = await client.refreshMentalModel('my-assistant', 'mm_abc123');
console.log(`Refresh operation: ${result.operation_id}`);
curl -X POST https://api.hindsight.vectorize.io/v1/default/banks/my-assistant/mental-models/mm_abc123/refresh \
-H "Authorization: Bearer your-api-key"
Auto-Refresh
When trigger.refresh_after_consolidation is enabled, the mental model automatically refreshes after the Retain operation consolidates new memories into observations. This keeps the model's cached reflection up-to-date without manual intervention.
Because refreshes happen asynchronously, auto-refreshing mental models are eventually consistent — there may be a short delay between new memories being consolidated and the model's content being updated.
When to enable auto-refresh:
- Models that need to stay current with frequently changing data (e.g., user preferences, status reports)
- Banks that receive regular memory updates
When to keep manual refresh:
- Models that summarize stable information (e.g., policy documents, onboarding guides)
- When you want to control exactly when updates happen
- To minimize token usage on models that don't need frequent updates
Manual Refresh
You can refresh any mental model on demand by calling the refresh endpoint, regardless of whether auto-refresh is enabled. This is useful when you know the bank's contents have changed and you want the model updated immediately rather than waiting for the next consolidation cycle.
Updating a Mental Model
Update a model's name, source query, tags, token limit, or trigger settings:
- Python
- TypeScript
- cURL
model = client.update_mental_model(
bank_id="my-assistant",
mental_model_id="mm_abc123",
name="Updated Profile",
source_query="What are the user's key preferences?",
trigger={"refresh_after_consolidation": True}
)
const model = await client.updateMentalModel('my-assistant', 'mm_abc123', {
name: 'Updated Profile',
sourceQuery: 'What are the user\'s key preferences?',
trigger: { refreshAfterConsolidation: true }
});
curl -X PATCH https://api.hindsight.vectorize.io/v1/default/banks/my-assistant/mental-models/mm_abc123 \
-H "Authorization: Bearer your-api-key" \
-H "Content-Type: application/json" \
-d '{
"name": "Updated Profile",
"source_query": "What are the user'\''s key preferences?",
"trigger": {
"refresh_after_consolidation": true
}
}'
Updating the source_query does not automatically regenerate the content; call refresh afterward to regenerate it with the new query.
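Because of this, changing the query and regenerating the content is a two-step sequence. A small helper sketch that pairs the two documented client methods (the helper itself, `update_query_and_refresh`, is not part of the SDK):

```python
def update_query_and_refresh(client, bank_id, mental_model_id, new_query):
    """Change a model's source query, then trigger a refresh so the
    cached content is regenerated with the new query."""
    client.update_mental_model(
        bank_id=bank_id,
        mental_model_id=mental_model_id,
        source_query=new_query,
    )
    # Refresh is asynchronous; the returned operation_id tracks progress.
    result = client.refresh_mental_model(
        bank_id=bank_id,
        mental_model_id=mental_model_id,
    )
    return result.operation_id
```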
Deleting a Mental Model
- Python
- TypeScript
- cURL
client.delete_mental_model(
bank_id="my-assistant",
mental_model_id="mm_abc123"
)
await client.deleteMentalModel('my-assistant', 'mm_abc123');
curl -X DELETE https://api.hindsight.vectorize.io/v1/default/banks/my-assistant/mental-models/mm_abc123 \
-H "Authorization: Bearer your-api-key"
How Mental Models Work with Reflect
When you call Reflect, the system automatically:
- Retrieves relevant mental models based on the query
- Injects them as high-priority context alongside retrieved memories
- Synthesizes an answer that draws from both mental models and raw memories
This means mental models act as a layer of pre-computed knowledge that improves Reflect quality and consistency. The Reflect response includes a mental_models field showing which models were used:
{
"text": "Based on the stored memories...",
"based_on": [],
"mental_models": [
{
"id": "mm_abc123",
"text": "The user is a software engineer who prefers..."
}
]
}
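If you want to log or audit which cached models influenced an answer, the mental_models field is straightforward to inspect. A sketch assuming the Reflect response is available as a plain dict shaped like the example above:

```python
def mental_model_ids(reflect_response: dict) -> list[str]:
    """Collect the ids of mental models injected into a Reflect answer."""
    # Missing or empty mental_models simply yields an empty list.
    return [m["id"] for m in reflect_response.get("mental_models", [])]
```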
Best Practices
Source Query Design
Write clear, specific source queries that target the information you want synthesized:
- Good
- Less Effective
"What are the user's communication preferences, including preferred channels, response times, and meeting styles?"
"Tell me about the user"
Naming Conventions
Use descriptive names that make it clear what the model contains:
- "Customer Support FAQ" instead of "FAQ"
- "Q2 Project Status" instead of "Status"
- "Engineering Team Directory" instead of "Team"
Organization with Tags
Use tags to categorize models for easier retrieval:
# Create related models with shared tags
client.create_mental_model(
bank_id="my-assistant",
name="Team Skills Matrix",
source_query="What skills does each team member have?",
tags=["team", "skills"]
)
client.create_mental_model(
bank_id="my-assistant",
name="Team Availability",
source_query="What are the team members' schedules and availability?",
tags=["team", "scheduling"]
)
# Retrieve all team-related models
team_models = client.list_mental_models(bank_id="my-assistant", tags=["team"])
Using in the UI
The Mental Models view in memory banks provides a visual interface:
- Navigate to your memory bank
- Click Mental Models in the sidebar
- Click Create to add a new mental model
- View, edit, refresh, or delete existing models
- Toggle auto-refresh per model
RBAC is enforced: organization members have read-only access, while admins can create, edit, and delete.
Error Handling
- Python
- TypeScript
try:
    model = client.get_mental_model(
        bank_id="my-assistant",
        mental_model_id="mm_abc123"
    )
    print(model.content)
except Exception as e:
    print(f"Error: {e}")
try {
const model = await client.getMentalModel('my-assistant', 'mm_abc123');
console.log(model.content);
} catch (error) {
console.error('Error:', error.message);
}
Common Errors
| Error | Cause | Solution |
|---|---|---|
| 401 Unauthorized | Invalid API key | Check your API key |
| 402 Payment Required | Insufficient credits | Add credits to your account |
| 404 Not Found | Invalid bank_id or mental_model_id | Verify the bank and model exist |
| 400 Bad Request | Missing required fields | Provide name and source_query |
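One way to branch on these in client code is a simple mapping from the documented status codes to their remedies. The SDK's exception types are not shown in this document, so this sketch works from the raw HTTP status instead:

```python
# Remedies mirror the Common Errors table above.
REMEDIES = {
    400: "Provide the required name and source_query fields",
    401: "Check your API key",
    402: "Add credits to your account",
    404: "Verify the bank and mental model exist",
}

def remedy_for(status_code: int) -> str:
    """Map a documented error status to its suggested fix."""
    return REMEDIES.get(status_code, "Unexpected status; retry or contact support")
```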
Token Usage
Mental model operations consume tokens and are billed accordingly:
| Operation | Description |
|---|---|
| Get Model | Lightweight lookup returning cached content |
| Refresh Model | Runs Reflect to regenerate content |
Creating a mental model uses the same token cost as refreshing, since creation runs a Reflect query internally.
Current pricing for each operation is available in the application. Monitor your mental model token usage on the Usage Analytics page.