Mental Models: Pre-Computed Reflections

Mental models are pre-computed, cached reflections that capture the current synthesized state of the memories in a bank. They are generated by running a Reflect query and can be automatically or manually refreshed as the bank's knowledge changes.

Because a mental model is pre-computed, retrieving one is a fast lookup — no LLM call is needed at read time, unlike a live Reflect call. This makes mental models ideal for information you need frequently and quickly, such as user preferences. For example, if you create an auto-refreshing mental model for user preferences, it will always contain an up-to-date synthesis of all the memories and observations about that user's preferences.
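As a sketch of this fast read path, cached content can be dropped straight into a prompt instead of issuing a live Reflect call. The helper names below are illustrative, not part of the SDK; only `get_mental_model` is a documented call, and the `client` is passed in rather than constructed here:

```python
def build_profile_prompt(cached_profile: str, question: str) -> str:
    """Compose an LLM prompt around the cached reflection (illustrative helper)."""
    return f"Known user profile:\n{cached_profile}\n\nQuestion: {question}"


def prompt_from_cached_model(client, bank_id: str, mental_model_id: str, question: str) -> str:
    # Fast path: a plain lookup of pre-computed content -- no LLM call at read time.
    model = client.get_mental_model(bank_id=bank_id, mental_model_id=mental_model_id)
    return build_profile_prompt(model.content, question)
```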

note

Mental models are updated asynchronously. Even with auto-refresh enabled, they are eventually consistent — there may be a brief delay between new memories being added and the model reflecting them.

Overview

Mental models provide:

  • Pre-computed retrieval — Content is generated ahead of time, so reads are instant with no LLM call required
  • Cached reflections — Each model is the result of a Reflect query, synthesizing information across multiple memories into a single up-to-date summary
  • Automatic or manual refresh — Optionally auto-refresh after memory consolidation, or manually refresh on demand
  • Hierarchical priority — During Reflect, mental models are checked first and injected as high-priority context
  • Tag-based organization — Filter and categorize models with tags
  • Eventually consistent — Updates happen asynchronously in the background

Use Cases

| Use Case | Source Query Example |
| --- | --- |
| User Profile | "What do we know about this user's preferences and background?" |
| FAQ Bot | "What are the most common questions and their answers?" |
| Status Report | "What is the current status of the project?" |
| Team Directory | "Who works here and what do they do?" |
| Onboarding Guide | "What does a new team member need to know?" |
| Policy Summary | "What are the key policies and guidelines?" |

Creating a Mental Model

Mental model creation is asynchronous — it runs a Reflect query in the background and returns an operation_id to track progress.

from hindsight_client import Hindsight

client = Hindsight(
    base_url="https://api.hindsight.vectorize.io",
    api_key="your-api-key"
)

# Create a mental model
result = client.create_mental_model(
    bank_id="my-assistant",
    name="User Profile",
    source_query="What do we know about this user's preferences and background?"
)

print(f"Creating mental model (operation: {result.operation_id})")

Create with Options

You can specify tags, token limits, and automatic refresh behavior:

result = client.create_mental_model(
    bank_id="my-assistant",
    name="Team Directory",
    source_query="Who works here and what do they do?",
    tags=["team", "directory"],
    max_tokens=4096,
    trigger={"refresh_after_consolidation": True}
)

Create Parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| name | string | Yes | Human-readable name for the mental model |
| source_query | string | Yes | The query to run through Reflect to generate content |
| tags | string[] | No | Tags for filtering and organization |
| max_tokens | integer | No | Maximum tokens for generated content (default: 2048, range: 256–8192) |
| trigger | object | No | Automatic refresh settings |
| trigger.refresh_after_consolidation | boolean | No | Re-generate after memory consolidation (default: false) |

Create Response

{
  "operation_id": "op_abc123"
}

The operation_id can be used to track the creation progress. The model will be available once the background Reflect operation completes.
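A tracking loop might look like the sketch below. Note that `client.get_operation` is a hypothetical status call, and the `"completed"`/`"failed"` status values are assumptions; check the operations API reference for the actual method name and response shape:

```python
import time

# Assumed terminal states for an async operation (not confirmed by this page).
TERMINAL_STATUSES = {"completed", "failed"}


def is_terminal(status: str) -> bool:
    """True once an asynchronous operation has finished, successfully or not."""
    return status in TERMINAL_STATUSES


def wait_for_operation(client, operation_id: str, poll_seconds: float = 2.0, timeout: float = 120.0):
    # NOTE: `client.get_operation` is a hypothetical status endpoint used for
    # illustration; substitute the real operations API call.
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        op = client.get_operation(operation_id=operation_id)
        if is_terminal(op.status):
            return op
        time.sleep(poll_seconds)
    raise TimeoutError(f"operation {operation_id} did not finish within {timeout}s")
```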

Listing Mental Models

# List all mental models
models = client.list_mental_models(bank_id="my-assistant")

for model in models.items:
    print(f"{model.name}: {model.content[:100]}...")

# Filter by tags
models = client.list_mental_models(bank_id="my-assistant", tags=["team"])

List Response

{
  "items": [
    {
      "id": "mm_abc123",
      "bank_id": "my-assistant",
      "name": "User Profile",
      "source_query": "What do we know about this user?",
      "content": "The user is a software engineer who prefers dark mode...",
      "tags": [],
      "max_tokens": 2048,
      "trigger": {
        "refresh_after_consolidation": false
      },
      "last_refreshed_at": "2024-03-15T10:30:00Z",
      "created_at": "2024-03-15T10:30:00Z"
    }
  ]
}

Getting a Mental Model

model = client.get_mental_model(
    bank_id="my-assistant",
    mental_model_id="mm_abc123"
)

print(f"Name: {model.name}")
print(f"Content: {model.content}")
print(f"Last refreshed: {model.last_refreshed_at}")

Mental Model Response

| Field | Description |
| --- | --- |
| id | Unique identifier for the mental model |
| bank_id | The memory bank this model belongs to |
| name | Human-readable name |
| source_query | The Reflect query used to generate content |
| content | The generated content (synthesized from memories) |
| tags | Tags for filtering |
| max_tokens | Maximum token limit for content |
| trigger | Automatic refresh settings |
| last_refreshed_at | When the model was last refreshed |
| created_at | When the model was created |
| reflect_response | Full Reflect response including based_on sources |

Refreshing a Mental Model

Refreshing re-runs the source query through Reflect to update the content with the latest memories. This is an asynchronous operation.

# Manually refresh a mental model
result = client.refresh_mental_model(
    bank_id="my-assistant",
    mental_model_id="mm_abc123"
)

print(f"Refresh operation: {result.operation_id}")

Auto-Refresh

When trigger.refresh_after_consolidation is enabled, the mental model automatically refreshes after the Retain operation consolidates new memories into observations. This keeps the model's cached reflection up-to-date without manual intervention.

Because refreshes happen asynchronously, auto-refreshing mental models are eventually consistent — there may be a short delay between new memories being consolidated and the model's content being updated.

When to enable auto-refresh:

  • Models that need to stay current with frequently changing data (e.g., user preferences, status reports)
  • Banks that receive regular memory updates

When to keep manual refresh:

  • Models that summarize stable information (e.g., policy documents, onboarding guides)
  • When you want to control exactly when updates happen
  • To minimize token usage on models that don't need frequent updates
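Because the trigger lives on the model, you can switch between the two modes at any time with the documented update call. A minimal sketch (the helper name is illustrative):

```python
def set_auto_refresh(client, bank_id: str, mental_model_id: str, enabled: bool):
    """Toggle refresh_after_consolidation on an existing mental model."""
    return client.update_mental_model(
        bank_id=bank_id,
        mental_model_id=mental_model_id,
        trigger={"refresh_after_consolidation": enabled},
    )
```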

Manual Refresh

You can refresh any mental model on demand by calling the refresh endpoint, regardless of whether auto-refresh is enabled. This is useful when you know the bank's contents have changed and you want the model updated immediately rather than waiting for the next consolidation cycle.

Updating a Mental Model

Update a model's name, source query, tags, token limit, or trigger settings:

model = client.update_mental_model(
    bank_id="my-assistant",
    mental_model_id="mm_abc123",
    name="Updated Profile",
    source_query="What are the user's key preferences?",
    trigger={"refresh_after_consolidation": True}
)

note

Updating the source_query does not automatically regenerate the content. Call refresh after updating the source query to regenerate with the new query.
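This two-step pattern can be wrapped in a small helper. Both calls are documented on this page; only the wrapper function name is illustrative:

```python
def update_query_and_refresh(client, bank_id: str, mental_model_id: str, new_query: str):
    # Step 1: change the source query -- this does NOT regenerate the content.
    client.update_mental_model(
        bank_id=bank_id,
        mental_model_id=mental_model_id,
        source_query=new_query,
    )
    # Step 2: trigger an asynchronous refresh so the cached content is
    # regenerated with the new query.
    return client.refresh_mental_model(bank_id=bank_id, mental_model_id=mental_model_id)
```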

Deleting a Mental Model

client.delete_mental_model(
    bank_id="my-assistant",
    mental_model_id="mm_abc123"
)

How Mental Models Work with Reflect

When you call Reflect, the system automatically:

  1. Retrieves relevant mental models based on the query
  2. Injects them as high-priority context alongside retrieved memories
  3. Synthesizes an answer that draws from both mental models and raw memories

This means mental models act as a layer of pre-computed knowledge that improves Reflect quality and consistency. The Reflect response includes a mental_models field showing which models were used:

{
  "text": "Based on the stored memories...",
  "based_on": [],
  "mental_models": [
    {
      "id": "mm_abc123",
      "text": "The user is a software engineer who prefers..."
    }
  ]
}
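For observability, you may want to log which models contributed to an answer. A small sketch that extracts their IDs from a Reflect response, assuming either the raw JSON shape shown above or an SDK object exposing a `mental_models` attribute (the function name is illustrative):

```python
def mental_models_used(response):
    """Return the IDs of the mental models injected into a Reflect answer."""
    if isinstance(response, dict):
        models = response.get("mental_models", [])
    else:
        models = getattr(response, "mental_models", None) or []
    # Each entry may be a dict (raw JSON) or an SDK object with an `id` attribute.
    return [m["id"] if isinstance(m, dict) else m.id for m in models]
```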

Best Practices

Source Query Design

Write clear, specific source queries that target the information you want synthesized:

"What are the user's communication preferences, including preferred channels, response times, and meeting styles?"

Naming Conventions

Use descriptive names that make it clear what the model contains:

  • "Customer Support FAQ" instead of "FAQ"
  • "Q2 Project Status" instead of "Status"
  • "Engineering Team Directory" instead of "Team"

Organization with Tags

Use tags to categorize models for easier retrieval:

# Create related models with shared tags
client.create_mental_model(
    bank_id="my-assistant",
    name="Team Skills Matrix",
    source_query="What skills does each team member have?",
    tags=["team", "skills"]
)

client.create_mental_model(
    bank_id="my-assistant",
    name="Team Availability",
    source_query="What are the team members' schedules and availability?",
    tags=["team", "scheduling"]
)

# Retrieve all team-related models
team_models = client.list_mental_models(bank_id="my-assistant", tags=["team"])

Using in the UI

The Mental Models view in memory banks provides a visual interface:

  1. Navigate to your memory bank
  2. Click Mental Models in the sidebar
  3. Click Create to add a new mental model
  4. View, edit, refresh, or delete existing models
  5. Toggle auto-refresh per model

RBAC is enforced: organization members have read-only access, while admins can create, edit, and delete.

Error Handling

try:
    model = client.get_mental_model(
        bank_id="my-assistant",
        mental_model_id="mm_abc123"
    )
    print(model.content)
except Exception as e:
    print(f"Error: {e}")

Common Errors

| Error | Cause | Solution |
| --- | --- | --- |
| 401 Unauthorized | Invalid API key | Check your API key |
| 402 Payment Required | Insufficient credits | Add credits to your account |
| 404 Not Found | Invalid bank_id or mental_model_id | Verify the bank and model exist |
| 400 Bad Request | Missing required fields | Provide name and source_query |

Token Usage

Mental model operations consume tokens and are billed accordingly:

| Operation | Description |
| --- | --- |
| Get Model | Lightweight lookup returning cached content |
| Refresh Model | Runs Reflect to regenerate content |

Creating a mental model incurs the same token cost as refreshing one, since creation runs a Reflect query internally.

Current pricing for each operation is available in the application. Monitor your mental model token usage on the Usage Analytics page.