Reflect: Reasoning Over Memories
The Reflect operation performs agentic reasoning over stored memories, guided by the bank's mission, directives, and disposition traits. Unlike Recall, which returns raw memories, Reflect synthesizes information and draws conclusions with confidence scores.
Overview
Reflect enables reasoning over memories:
- Synthesize insights from stored memories
- Reasoning influenced by disposition traits (skepticism, literalism, empathy)
- Confidence scores based on evidence strength
- Cited sources for transparency
- Automatically incorporates relevant mental models as additional context
Basic Usage
- Python
- TypeScript
- cURL
from hindsight_client import Hindsight
client = Hindsight(
    base_url="https://api.hindsight.vectorize.io",
    api_key="your-api-key"
)

# Ask a question
response = client.reflect(
    bank_id="your-bank-id",
    query="What are the key priorities for the project?"
)
print(response.text)
print("Based on:", response.based_on)
import { HindsightClient } from '@vectorize-io/hindsight-client';
const client = new HindsightClient({
  baseUrl: 'https://api.hindsight.vectorize.io',
  apiKey: 'your-api-key'
});

// Ask a question
const response = await client.reflect(
  'your-bank-id',
  'What are the key priorities for the project?'
);
console.log(response.text);
console.log('Based on:', response.based_on);
curl -X POST https://api.hindsight.vectorize.io/v1/default/banks/{bank_id}/reflect \
  -H "Authorization: Bearer your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "What are the key priorities for the project?"
  }'
How It Works
When you call Reflect:
1. Question Analysis - The AI understands your question
2. Memory Retrieval - Relevant memories are automatically recalled using TEMPR (semantic, keyword, graph, temporal search)
3. Mental Model Injection - Relevant mental models are included as additional context
4. Reasoning - The AI synthesizes information from memories and mental models, influenced by the bank's disposition traits
5. Answer Generation - A coherent response is formulated
6. Source Citation - Supporting memories and mental models are identified
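Conceptually, the steps above can be sketched as a small pipeline. The helpers here (`recall_fn`, `reason_fn`, `ReflectResult`) are illustrative stand-ins, not actual Hindsight internals:

```python
from dataclasses import dataclass, field

@dataclass
class ReflectResult:
    text: str
    based_on: list = field(default_factory=list)

def reflect_pipeline(query, recall_fn, reason_fn, context=""):
    """Illustrative sketch of the Reflect flow: retrieve, reason, cite."""
    memories = recall_fn(query)                    # memory retrieval step
    answer = reason_fn(query, memories, context)   # disposition-aware synthesis
    return ReflectResult(text=answer, based_on=memories)  # cited sources

# Stub retrieval and reasoning, just to show the data flow
result = reflect_pipeline(
    "key priorities?",
    recall_fn=lambda q: ["Ship auth module by March 15"],
    reason_fn=lambda q, mems, ctx: f"Priorities: {'; '.join(mems)}",
)
```

The real service performs each step server-side; the point is that the answer and its supporting memories travel back together.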
Request Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| bank_id | string | Yes | Memory bank to query (in URL path) |
| query | string | Yes | Question to answer |
| context | string | No | Additional context for the question |
| budget | string | No | Search depth: "low", "mid", "high" (default: "low") |
| max_tokens | integer | No | Max tokens in response (default: 4096) |
| response_schema | object | No | JSON Schema for structured output |
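Over HTTP, the optional parameters travel in the same JSON body as `query`. A request body that deepens the search budget and asks for structured output might look like the following (the schema shown is illustrative, not a required shape):

```python
import json

# Request body for POST /v1/default/banks/{bank_id}/reflect
payload = {
    "query": "What are the key priorities for the project?",
    "context": "Preparing for the Q2 planning meeting",  # optional extra context
    "budget": "high",                # "low" (default), "mid", or "high"
    "max_tokens": 2048,              # cap on response length
    "response_schema": {             # JSON Schema; parsed result lands in structured_output
        "type": "object",
        "properties": {
            "priorities": {"type": "array", "items": {"type": "string"}}
        },
        "required": ["priorities"],
    },
}
body = json.dumps(payload)
```

When `response_schema` is set, the `structured_output` field of the response holds the parsed JSON alongside the plain-text `text`.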
Response
{
  "text": "Based on the stored memories, the key priorities for the project are: 1) Completing the user authentication module by March 15th, 2) Improving API response times to under 200ms, and 3) Adding support for dark mode as requested by multiple users.",
  "based_on": [],
  "mental_models": [
    {
      "id": "mm_abc123",
      "text": "The project is focused on building a modern web application..."
    }
  ],
  "structured_output": null,
  "usage": {
    "input_tokens": 3352,
    "output_tokens": 806,
    "total_tokens": 4158
  }
}
| Field | Description |
|---|---|
| text | AI-generated response to your question |
| based_on | Memories used to form the answer |
| mental_models | Mental models that were used as context (if any) |
| structured_output | Parsed JSON if response_schema was provided |
| usage | Token consumption breakdown |
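The usage block adds up as you'd expect, which makes it easy to fold into running totals for cost tracking. Parsing the sample response above:

```python
import json

# Sample response body (abridged from the example above)
sample = '''{
  "text": "...",
  "based_on": [],
  "mental_models": [],
  "structured_output": null,
  "usage": {"input_tokens": 3352, "output_tokens": 806, "total_tokens": 4158}
}'''

response = json.loads(sample)
usage = response["usage"]
# total_tokens is the sum of input and output tokens
print(f"input={usage['input_tokens']} output={usage['output_tokens']} total={usage['total_tokens']}")
```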
Reflect vs. Recall
| Aspect | Recall | Reflect |
|---|---|---|
| Output | Raw memories | Synthesized answer |
| Processing | Search only | AI reasoning |
| Best for | Getting context | Answering questions |
| Token usage | Lower | Higher |
| Sources | The result IS the sources | Sources cited separately |
When to Use Recall
- You need raw data to process yourself
- Building prompts for another AI system
- Debugging or inspecting stored memories
- Minimizing token usage
When to Use Reflect
- Answering user questions directly
- Generating summaries or insights
- Need synthesized information from multiple memories
- Want automatic source citation
Question Best Practices
Ask Clear Questions
- Good
- Less Effective
response = client.reflect(
    bank_id=bank_id,
    query="What communication preferences has the user expressed?"
)
response = client.reflect(
    bank_id=bank_id,
    query="preferences?"
)
Provide Context When Helpful
response = client.reflect(
    bank_id=bank_id,
    query="What should I know before our meeting?",
    context="We're meeting with the client to discuss the Q2 roadmap"
)
Be Specific
- Good
- Less Effective
response = client.reflect(
    bank_id=bank_id,
    query="What technical requirements were mentioned for the mobile app?"
)
response = client.reflect(
    bank_id=bank_id,
    query="What are the requirements?"
)
Using in the UI
The Reflect view in memory banks provides an interactive chat interface:
1. Navigate to your memory bank
2. Click Reflect in the sidebar
3. Type your question
4. View the AI-generated answer
5. Examine source citations
This is useful for:
- Exploring what's stored in a memory bank
- Testing questions before API integration
- Understanding how memories inform answers
Advanced Patterns
Conversational Context
Build on previous answers:
# First question
resp1 = client.reflect(
    bank_id=bank_id,
    query="Who are the key stakeholders?"
)

# Follow-up with context
resp2 = client.reflect(
    bank_id=bank_id,
    query="What are their main concerns?",
    context=f"We identified these stakeholders: {resp1.text}"
)
Combining Operations
Use Recall for context, Reflect for synthesis:
# Get relevant context
context_memories = client.recall(
    bank_id=bank_id,
    query="project history",
    limit=5
)

# Ask a focused question with context
context_text = "\n".join([m.content for m in context_memories])
response = client.reflect(
    bank_id=bank_id,
    query="What lessons were learned from past projects?",
    context=context_text
)
Handling Uncertainty
Reflect can indicate when information is lacking:
response = client.reflect(
    bank_id=bank_id,
    query="What is the project budget?"
)

# Response might be:
# "I don't have information about the project budget in the stored memories."
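One way to surface this to callers is to treat an empty `based_on` list as a low-evidence signal, assuming (as in the response example above) that `based_on` is empty when no stored memories supported the answer. A minimal sketch, using stub response objects in place of real API results:

```python
from types import SimpleNamespace

def has_evidence(response) -> bool:
    """Treat an answer with no cited memories as low-evidence."""
    return bool(getattr(response, "based_on", []))

# Stub responses standing in for actual Reflect results
grounded = SimpleNamespace(
    text="The budget is $50k.",
    based_on=[{"id": "mem_1"}],
)
ungrounded = SimpleNamespace(
    text="I don't have information about the project budget in the stored memories.",
    based_on=[],
)
```

An application might fall back to a clarifying question, or flag the answer in the UI, when `has_evidence` returns `False`.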
Token Usage
Reflect operations typically use more tokens than Recall because:
- AI reasoning is computationally intensive
- Multiple memories are processed together
- Answer generation requires additional tokens
Monitor usage on the Usage Analytics page.
Source Citation
Every Reflect response includes sources:
response = client.reflect(bank_id=bank_id, query="...")

for source in response.based_on:
    print(f"Memory: {source.content}")
    print(f"Relevance: {source.relevance}")
Use sources to:
- Verify answer accuracy
- Provide transparency to users
- Link back to original information
- Debug unexpected answers
Error Handling
- Python
- TypeScript
try:
    response = client.reflect(bank_id=bank_id, query=query)
    print(response.text)
except Exception as e:
    print(f"Error: {e}")
try {
  const response = await client.reflect(bankId, query);
  console.log(response.text);
} catch (error) {
  console.error('Error:', error.message);
}
Common Errors
| Error | Cause | Solution |
|---|---|---|
| 401 Unauthorized | Invalid API key | Check your API key |
| 402 Payment Required | Insufficient credits | Add credits to your account |
| 404 Not Found | Invalid bank_id | Verify the bank exists |
| 400 Bad Request | Empty question | Provide a question |
Performance Considerations
- Questions are more expensive - Reflect uses more tokens than Recall
- Limit search depth - Use budget to control how much retrieval and reasoning Reflect performs
- Cache when possible - Store answers for frequently asked questions
- Consider Recall first - If raw memories suffice, prefer Recall
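A minimal caching sketch for frequently asked questions; here `fake_reflect` is a stub standing in for a real `client.reflect` call:

```python
from functools import lru_cache

def make_cached_reflect(reflect_fn):
    """Wrap a reflect callable so repeated questions skip the API."""
    @lru_cache(maxsize=256)
    def cached(query: str) -> str:
        return reflect_fn(query)
    return cached

# Stub in place of the real API call, tracking how often it runs
calls = []
def fake_reflect(query):
    calls.append(query)
    return f"answer to: {query}"

cached_reflect = make_cached_reflect(fake_reflect)
cached_reflect("What are the priorities?")
cached_reflect("What are the priorities?")  # served from cache; fake_reflect runs once
```

In production you would also want an expiry policy, since cached answers go stale as new memories are retained in the bank.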