Mental Models: Pre-Computed Reflections
Mental models are pre-computed, cached reflections that capture the current synthesized state of the memories in a bank. They are generated by running a Reflect query and can be automatically or manually refreshed as the bank's knowledge changes.
Because a mental model is pre-computed, retrieving one is a fast lookup — no LLM call is needed at read time, unlike a live Reflect call. This makes mental models ideal for information you need frequently and quickly, such as user preferences. For example, if you create an auto-refreshing mental model for user preferences, it will always contain an up-to-date synthesis of all the memories and observations about that user's preferences.
Mental models are updated asynchronously. Even with auto-refresh enabled, they are eventually consistent — there may be a brief delay between new memories being added and the model reflecting them.
Overview
Mental models provide:
- Pre-computed retrieval — Content is generated ahead of time, so reads are instant with no LLM call required
- Cached reflections — Each model is the result of a Reflect query, synthesizing information across multiple memories into a single up-to-date summary
- Automatic or manual refresh — Optionally auto-refresh after memory consolidation, or manually refresh on demand
- Hierarchical priority — During Reflect, mental models are checked first and injected as high-priority context
- Tag-based organization — Filter and categorize models with tags
- Eventually consistent — Updates happen asynchronously in the background
Use Cases
| Use Case | Source Query Example |
|---|---|
| User Profile | "What do we know about this user's preferences and background?" |
| FAQ Bot | "What are the most common questions and their answers?" |
| Status Report | "What is the current status of the project?" |
| Team Directory | "Who works here and what do they do?" |
| Onboarding Guide | "What does a new team member need to know?" |
| Policy Summary | "What are the key policies and guidelines?" |
Creating a Mental Model
Mental model creation is asynchronous — it runs a Reflect query in the background and returns an operation_id to track progress.
- Python
- TypeScript
- cURL
from hindsight_client import Hindsight
client = Hindsight(
base_url="https://api.hindsight.vectorize.io",
api_key="your-api-key"
)
# Create a mental model
result = client.create_mental_model(
bank_id="my-assistant",
name="User Profile",
source_query="What do we know about this user's preferences and background?"
)
print(f"Creating mental model (operation: {result.operation_id})")
import { HindsightClient } from '@vectorize-io/hindsight-client';
const client = new HindsightClient({
baseUrl: 'https://api.hindsight.vectorize.io',
apiKey: 'your-api-key'
});
// Create a mental model
const result = await client.createMentalModel(
'my-assistant',
'User Profile',
'What do we know about this user\'s preferences and background?'
);
console.log(`Creating mental model (operation: ${result.operation_id})`);
curl -X POST https://api.hindsight.vectorize.io/v1/default/banks/my-assistant/mental-models \
-H "Authorization: Bearer your-api-key" \
-H "Content-Type: application/json" \
-d '{
"name": "User Profile",
"source_query": "What do we know about this user'\''s preferences and background?"
}'
Create with Options
You can specify tags, token limits, and automatic refresh behavior:
- Python
- TypeScript
- cURL
result = client.create_mental_model(
bank_id="my-assistant",
name="Team Directory",
source_query="Who works here and what do they do?",
tags=["team", "directory"],
max_tokens=4096,
trigger={"refresh_after_consolidation": True}
)
const result = await client.createMentalModel(
'my-assistant',
'Team Directory',
'Who works here and what do they do?',
{
tags: ['team', 'directory'],
maxTokens: 4096,
trigger: { refreshAfterConsolidation: true }
}
);
curl -X POST https://api.hindsight.vectorize.io/v1/default/banks/my-assistant/mental-models \
-H "Authorization: Bearer your-api-key" \
-H "Content-Type: application/json" \
-d '{
"name": "Team Directory",
"source_query": "Who works here and what do they do?",
"tags": ["team", "directory"],
"max_tokens": 4096,
"trigger": {
"refresh_after_consolidation": true
}
}'
Create Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| name | string | Yes | Human-readable name for the mental model |
| source_query | string | Yes | The query to run through Reflect to generate content |
| tags | string[] | No | Tags for filtering and organization |
| max_tokens | integer | No | Maximum tokens for generated content (default: 2048, range: 256–8192) |
| trigger | object | No | Automatic refresh settings |
| trigger.refresh_after_consolidation | boolean | No | Regenerate after memory consolidation (default: false) |
Create Response
{
"operation_id": "op_abc123"
}
The operation_id can be used to track the creation progress. The model will be available once the background Reflect operation completes.
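Since creation completes in the background, callers typically poll until the operation reaches a terminal state. The exact operation-status endpoint is not shown in this document, so this sketch wraps whatever status lookup your SDK provides behind a zero-argument callable (`get_status` is a placeholder, not a documented API):

```python
import time

def wait_for_operation(get_status, timeout=60.0, interval=1.0):
    """Poll a status callable until it reports a terminal state or times out.

    get_status is any zero-argument callable returning a status string,
    e.g. a lambda wrapping your SDK's operation-status lookup.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = get_status()
        # Treat both success and failure as terminal; keep polling otherwise.
        if status in ("completed", "failed"):
            return status
        time.sleep(interval)
    raise TimeoutError(f"operation still pending after {timeout}s")
```

Adjust the status strings and the lookup call to match your SDK; the polling loop itself is generic.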
Listing Mental Models
- Python
- TypeScript
- cURL
# List all mental models
models = client.list_mental_models(bank_id="my-assistant")
for model in models.items:
    print(f"{model.name}: {(model.content or '')[:100]}...")
# Filter by tags
models = client.list_mental_models(bank_id="my-assistant", tags=["team"])
// List all mental models
const models = await client.listMentalModels('my-assistant');
models.items.forEach(model => {
console.log(`${model.name}: ${model.content?.substring(0, 100)}...`);
});
// Filter by tags
const filtered = await client.listMentalModels('my-assistant', {
tags: ['team']
});
# List all mental models
curl -X GET https://api.hindsight.vectorize.io/v1/default/banks/my-assistant/mental-models \
-H "Authorization: Bearer your-api-key"
# Filter by tags
curl -X GET "https://api.hindsight.vectorize.io/v1/default/banks/my-assistant/mental-models?tags=team" \
-H "Authorization: Bearer your-api-key"
List Response
{
"items": [
{
"id": "mm_abc123",
"bank_id": "my-assistant",
"name": "User Profile",
"source_query": "What do we know about this user?",
"content": "The user is a software engineer who prefers dark mode...",
"tags": [],
"max_tokens": 2048,
"trigger": {
"refresh_after_consolidation": false
},
"last_refreshed_at": "2024-03-15T10:30:00Z",
"created_at": "2024-03-15T10:30:00Z"
}
]
}
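The last_refreshed_at timestamp makes it easy to spot stale models client-side, for example to decide when a manual refresh is worth triggering. A minimal sketch that flags any model older than a threshold, assuming RFC 3339 timestamps like the one in the response above:

```python
from datetime import datetime, timedelta, timezone

def is_stale(last_refreshed_at: str, max_age: timedelta) -> bool:
    """True when a model's cached content is older than max_age."""
    # The API returns RFC 3339 timestamps; fromisoformat needs an explicit offset,
    # so rewrite the trailing "Z" for compatibility with older Python versions.
    refreshed = datetime.fromisoformat(last_refreshed_at.replace("Z", "+00:00"))
    return datetime.now(timezone.utc) - refreshed > max_age
```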
Getting a Mental Model
- Python
- TypeScript
- cURL
model = client.get_mental_model(
bank_id="my-assistant",
mental_model_id="mm_abc123"
)
print(f"Name: {model.name}")
print(f"Content: {model.content}")
print(f"Last refreshed: {model.last_refreshed_at}")
const model = await client.getMentalModel('my-assistant', 'mm_abc123');
console.log(`Name: ${model.name}`);
console.log(`Content: ${model.content}`);
console.log(`Last refreshed: ${model.last_refreshed_at}`);
curl -X GET https://api.hindsight.vectorize.io/v1/default/banks/my-assistant/mental-models/mm_abc123 \
-H "Authorization: Bearer your-api-key"
Mental Model Response
| Field | Description |
|---|---|
| id | Unique identifier for the mental model |
| bank_id | The memory bank this model belongs to |
| name | Human-readable name |
| source_query | The Reflect query used to generate content |
| content | The generated content (synthesized from memories) |
| tags | Tags for filtering |
| max_tokens | Maximum token limit for content |
| trigger | Automatic refresh settings |
| last_refreshed_at | When the model was last refreshed |
| created_at | When the model was created |
| reflect_response | Full Reflect response including based_on sources |
Refreshing a Mental Model
Refreshing re-runs the source query through Reflect to update the content with the latest memories. This is an asynchronous operation.
- Python
- TypeScript
- cURL
# Manually refresh a mental model
result = client.refresh_mental_model(
bank_id="my-assistant",
mental_model_id="mm_abc123"
)
print(f"Refresh operation: {result.operation_id}")
// Manually refresh a mental model
const result = await client.refreshMentalModel('my-assistant', 'mm_abc123');
console.log(`Refresh operation: ${result.operation_id}`);
curl -X POST https://api.hindsight.vectorize.io/v1/default/banks/my-assistant/mental-models/mm_abc123/refresh \
-H "Authorization: Bearer your-api-key"
Auto-Refresh
When trigger.refresh_after_consolidation is enabled, the mental model automatically refreshes after the Retain operation consolidates new memories into observations. This keeps the model's cached reflection up-to-date without manual intervention.
Because refreshes happen asynchronously, auto-refreshing mental models are eventually consistent — there may be a short delay between new memories being consolidated and the model's content being updated.
When to enable auto-refresh:
- Models that need to stay current with frequently changing data (e.g., user preferences, status reports)
- Banks that receive regular memory updates
When to keep manual refresh:
- Models that summarize stable information (e.g., policy documents, onboarding guides)
- When you want to control exactly when updates happen
- To minimize token usage on models that don't need frequent updates
Manual Refresh
You can refresh any mental model on demand by calling the refresh endpoint, regardless of whether auto-refresh is enabled. This is useful when you know the bank's contents have changed and you want the model updated immediately rather than waiting for the next consolidation cycle.
Updating a Mental Model
Update a model's name, source query, tags, token limit, or trigger settings:
- Python
- TypeScript
- cURL
model = client.update_mental_model(
bank_id="my-assistant",
mental_model_id="mm_abc123",
name="Updated Profile",
source_query="What are the user's key preferences?",
trigger={"refresh_after_consolidation": True}
)
const model = await client.updateMentalModel('my-assistant', 'mm_abc123', {
name: 'Updated Profile',
sourceQuery: 'What are the user\'s key preferences?',
trigger: { refreshAfterConsolidation: true }
});
curl -X PATCH https://api.hindsight.vectorize.io/v1/default/banks/my-assistant/mental-models/mm_abc123 \
-H "Authorization: Bearer your-api-key" \
-H "Content-Type: application/json" \
-d '{
"name": "Updated Profile",
"source_query": "What are the user'\''s key preferences?",
"trigger": {
"refresh_after_consolidation": true
}
}'
Updating the source_query does not automatically regenerate the content; call refresh afterward to regenerate it with the new query.
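Because of this, changing the query and regenerating the content is a two-step sequence. A small helper sketch that pairs the two documented client methods (the helper itself, `update_query_and_refresh`, is not part of the SDK):

```python
def update_query_and_refresh(client, bank_id, mental_model_id, new_query):
    """Change a model's source query, then trigger a refresh so the
    cached content is regenerated with the new query."""
    client.update_mental_model(
        bank_id=bank_id,
        mental_model_id=mental_model_id,
        source_query=new_query,
    )
    # Refresh is asynchronous; the returned operation_id tracks progress.
    result = client.refresh_mental_model(
        bank_id=bank_id,
        mental_model_id=mental_model_id,
    )
    return result.operation_id
```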
Deleting a Mental Model
- Python
- TypeScript
- cURL
client.delete_mental_model(
bank_id="my-assistant",
mental_model_id="mm_abc123"
)
await client.deleteMentalModel('my-assistant', 'mm_abc123');
curl -X DELETE https://api.hindsight.vectorize.io/v1/default/banks/my-assistant/mental-models/mm_abc123 \
-H "Authorization: Bearer your-api-key"
How Mental Models Work with Reflect
When you call Reflect, the system automatically:
- Retrieves relevant mental models based on the query
- Injects them as high-priority context alongside retrieved memories
- Synthesizes an answer that draws from both mental models and raw memories
This means mental models act as a layer of pre-computed knowledge that improves Reflect quality and consistency. The Reflect response includes a mental_models field showing which models were used:
{
"text": "Based on the stored memories...",
"based_on": [],
"mental_models": [
{
"id": "mm_abc123",
"text": "The user is a software engineer who prefers..."
}
]
}
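If you want to log or audit which cached models influenced an answer, the mental_models field is straightforward to inspect. A sketch assuming the Reflect response is available as a plain dict shaped like the example above:

```python
def mental_model_ids(reflect_response: dict) -> list[str]:
    """Collect the ids of mental models injected into a Reflect answer."""
    # Missing or empty mental_models simply yields an empty list.
    return [m["id"] for m in reflect_response.get("mental_models", [])]
```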
Best Practices
Source Query Design
Write clear, specific source queries that target the information you want synthesized:
- Good
- Less Effective
"What are the user's communication preferences, including preferred channels, response times, and meeting styles?"
"Tell me about the user"
Naming Conventions
Use descriptive names that make it clear what the model contains:
- "Customer Support FAQ" instead of "FAQ"
- "Q2 Project Status" instead of "Status"
- "Engineering Team Directory" instead of "Team"
Organization with Tags
Use tags to categorize models for easier retrieval:
# Create related models with shared tags
client.create_mental_model(
bank_id="my-assistant",
name="Team Skills Matrix",
source_query="What skills does each team member have?",
tags=["team", "skills"]
)
client.create_mental_model(
bank_id="my-assistant",
name="Team Availability",
source_query="What are the team members' schedules and availability?",
tags=["team", "scheduling"]
)
# Retrieve all team-related models
team_models = client.list_mental_models(bank_id="my-assistant", tags=["team"])
Using in the UI
The Mental Models view in memory banks provides a visual interface:
- Navigate to your memory bank
- Click Mental Models in the sidebar
- Click Create to add a new mental model
- View, edit, refresh, or delete existing models
- Toggle auto-refresh per model
RBAC is enforced: organization members have read-only access, while admins can create, edit, and delete.
Error Handling
- Python
- TypeScript
try:
    model = client.get_mental_model(
        bank_id="my-assistant",
        mental_model_id="mm_abc123"
    )
    print(model.content)
except Exception as e:
    print(f"Error: {e}")
try {
const model = await client.getMentalModel('my-assistant', 'mm_abc123');
console.log(model.content);
} catch (error) {
console.error('Error:', error.message);
}
Common Errors
| Error | Cause | Solution |
|---|---|---|
| 401 Unauthorized | Invalid API key | Check your API key |
| 402 Payment Required | Insufficient credits | Add credits to your account |
| 404 Not Found | Invalid bank_id or mental_model_id | Verify the bank and model exist |
| 400 Bad Request | Missing required fields | Provide name and source_query |
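One way to branch on these in client code is a simple mapping from the documented status codes to their remedies. The SDK's exception types are not shown in this document, so this sketch works from the raw HTTP status instead:

```python
# Remedies mirror the Common Errors table above.
REMEDIES = {
    400: "Provide the required name and source_query fields",
    401: "Check your API key",
    402: "Add credits to your account",
    404: "Verify the bank and mental model exist",
}

def remedy_for(status_code: int) -> str:
    """Map a documented error status to its suggested fix."""
    return REMEDIES.get(status_code, "Unexpected status; retry or contact support")
```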
Token Usage
Mental model operations consume tokens and are billed accordingly:
| Operation | Description |
|---|---|
| Get Model | Lightweight lookup returning cached content |
| Refresh Model | Runs Reflect to regenerate content |
Creating a mental model uses the same token cost as refreshing, since creation runs a Reflect query internally.
Current pricing for each operation is available in the application. Monitor your mental model token usage on the Usage Analytics page.