Reflect: Reasoning Over Memories
The Reflect operation performs agentic reasoning over stored memories, guided by the bank's mission, directives, and disposition traits. Unlike Recall, which returns raw memories, Reflect synthesizes information and draws conclusions with confidence scores.
Overview
Reflect enables reasoning over memories:
- Synthesize insights from stored memories
- Reasoning influenced by disposition traits (skepticism, literalism, empathy)
- Confidence scores based on evidence strength
- Cited sources for transparency
- Automatically incorporates relevant mental models as additional context
Basic Usage
- Python
- TypeScript
- cURL
from hindsight_client import Hindsight
client = Hindsight(
    base_url="https://api.hindsight.vectorize.io",
    api_key="your-api-key"
)

# Ask a question
response = client.reflect(
    bank_id="your-bank-id",
    query="What are the key priorities for the project?"
)
print(response.text)
print("Based on:", response.based_on)
import { HindsightClient } from '@vectorize-io/hindsight-client';
const client = new HindsightClient({
  baseUrl: 'https://api.hindsight.vectorize.io',
  apiKey: 'your-api-key'
});

// Ask a question
const response = await client.reflect(
  'your-bank-id',
  'What are the key priorities for the project?'
);
console.log(response.text);
console.log('Based on:', response.based_on);
curl -X POST https://api.hindsight.vectorize.io/v1/default/banks/{bank_id}/reflect \
  -H "Authorization: Bearer your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "What are the key priorities for the project?"
  }'
How It Works
When you call Reflect:
1. Question Analysis - The AI understands your question
2. Memory Retrieval - Relevant memories are automatically recalled using TEMPR (semantic, keyword, graph, temporal search)
3. Mental Model Injection - Relevant mental models are included as additional context
4. Reasoning - The AI synthesizes information from memories and mental models, influenced by the bank's disposition traits
5. Answer Generation - A coherent response is formulated
6. Source Citation - Supporting memories and mental models are identified
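Conceptually, the steps above can be sketched as a small pipeline. The helpers here (`recall_fn`, `reason_fn`, `ReflectResult`) are illustrative stand-ins, not actual Hindsight internals:

```python
from dataclasses import dataclass, field

@dataclass
class ReflectResult:
    text: str
    based_on: list = field(default_factory=list)

def reflect_pipeline(query, recall_fn, reason_fn, context=""):
    """Illustrative sketch of the Reflect flow: retrieve, reason, cite."""
    memories = recall_fn(query)                    # memory retrieval step
    answer = reason_fn(query, memories, context)   # disposition-aware synthesis
    return ReflectResult(text=answer, based_on=memories)  # cited sources

# Stub retrieval and reasoning, just to show the data flow
result = reflect_pipeline(
    "key priorities?",
    recall_fn=lambda q: ["Ship auth module by March 15"],
    reason_fn=lambda q, mems, ctx: f"Priorities: {'; '.join(mems)}",
)
```

The real service performs each step server-side; the point is that the answer and its supporting memories travel back together.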
Request Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| bank_id | string | Yes | Memory bank to query (in URL path) |
| query | string | Yes | Question to answer |
| context | string | No | Additional context for the question |
| budget | string | No | Search depth: "low", "mid", "high" (default: "low") |
| max_tokens | integer | No | Max tokens in response (default: 4096) |
| response_schema | object | No | JSON Schema for structured output |
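Over HTTP, the optional parameters travel in the same JSON body as `query`. A request body that deepens the search budget and asks for structured output might look like the following (the schema shown is illustrative, not a required shape):

```python
import json

# Request body for POST /v1/default/banks/{bank_id}/reflect
payload = {
    "query": "What are the key priorities for the project?",
    "context": "Preparing for the Q2 planning meeting",  # optional extra context
    "budget": "high",                # "low" (default), "mid", or "high"
    "max_tokens": 2048,              # cap on response length
    "response_schema": {             # JSON Schema; parsed result lands in structured_output
        "type": "object",
        "properties": {
            "priorities": {"type": "array", "items": {"type": "string"}}
        },
        "required": ["priorities"],
    },
}
body = json.dumps(payload)
```

When `response_schema` is set, the `structured_output` field of the response holds the parsed JSON alongside the plain-text `text`.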
Response
{
  "text": "Based on the stored memories, the key priorities for the project are: 1) Completing the user authentication module by March 15th, 2) Improving API response times to under 200ms, and 3) Adding support for dark mode as requested by multiple users.",
  "based_on": [],
  "mental_models": [
    {
      "id": "mm_abc123",
      "text": "The project is focused on building a modern web application..."
    }
  ],
  "structured_output": null,
  "usage": {
    "input_tokens": 3352,
    "output_tokens": 806,
    "total_tokens": 4158
  }
}
| Field | Description |
|---|---|
| text | AI-generated response to your question |
| based_on | Memories used to form the answer |
| mental_models | Mental models that were used as context (if any) |
| structured_output | Parsed JSON if response_schema was provided |
| usage | Token consumption breakdown |
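The usage block adds up as you'd expect, which makes it easy to fold into running totals for cost tracking. Parsing the sample response above:

```python
import json

# Sample response body (abridged from the example above)
sample = '''{
  "text": "...",
  "based_on": [],
  "mental_models": [],
  "structured_output": null,
  "usage": {"input_tokens": 3352, "output_tokens": 806, "total_tokens": 4158}
}'''

response = json.loads(sample)
usage = response["usage"]
# total_tokens is the sum of input and output tokens
print(f"input={usage['input_tokens']} output={usage['output_tokens']} total={usage['total_tokens']}")
```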
Reflect vs. Recall
| Aspect | Recall | Reflect |
|---|---|---|
| Output | Raw memories | Synthesized answer |
| Processing | Search only | AI reasoning |
| Best for | Getting context | Answering questions |
| Token usage | Lower | Higher |
| Sources | The result IS the sources | Sources cited separately |
When to Use Recall
- You need raw data to process yourself
- Building prompts for another AI system
- Debugging or inspecting stored memories
- Minimizing token usage
When to Use Reflect
- Answering user questions directly
- Generating summaries or insights
- Need synthesized information from multiple memories
- Want automatic source citation
Question Best Practices
Ask Clear Questions
- Good
- Less Effective
response = client.reflect(
    bank_id=bank_id,
    query="What communication preferences has the user expressed?"
)
response = client.reflect(
    bank_id=bank_id,
    query="preferences?"
)
Provide Context When Helpful
response = client.reflect(
    bank_id=bank_id,
    query="What should I know before our meeting?",
    context="We're meeting with the client to discuss the Q2 roadmap"
)
Be Specific
- Good
- Less Effective
response = client.reflect(
    bank_id=bank_id,
    query="What technical requirements were mentioned for the mobile app?"
)
response = client.reflect(
    bank_id=bank_id,
    query="What are the requirements?"
)
Using in the UI
The Reflect view in memory banks provides an interactive chat interface:
1. Navigate to your memory bank
2. Click Reflect in the sidebar
3. Type your question
4. View the AI-generated answer
5. Examine source citations
This is useful for:
- Exploring what's stored in a memory bank
- Testing questions before API integration
- Understanding how memories inform answers
Advanced Patterns
Conversational Context
Build on previous answers:
# First question
resp1 = client.reflect(
    bank_id=bank_id,
    query="Who are the key stakeholders?"
)

# Follow-up with context
resp2 = client.reflect(
    bank_id=bank_id,
    query="What are their main concerns?",
    context=f"We identified these stakeholders: {resp1.text}"
)
Combining Operations
Use Recall for context, Reflect for synthesis:
# Get relevant context
context_memories = client.recall(
    bank_id=bank_id,
    query="project history",
    limit=5
)

# Ask a focused question with context
context_text = "\n".join([m.content for m in context_memories])
response = client.reflect(
    bank_id=bank_id,
    query="What lessons were learned from past projects?",
    context=context_text
)
Handling Uncertainty
Reflect can indicate when information is lacking:
response = client.reflect(
    bank_id=bank_id,
    query="What is the project budget?"
)

# Response might be:
# "I don't have information about the project budget in the stored memories."
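One way to surface this to callers is to treat an empty `based_on` list as a low-evidence signal, assuming (as in the response example above) that `based_on` is empty when no stored memories supported the answer. A minimal sketch, using stub response objects in place of real API results:

```python
from types import SimpleNamespace

def has_evidence(response) -> bool:
    """Treat an answer with no cited memories as low-evidence."""
    return bool(getattr(response, "based_on", []))

# Stub responses standing in for actual Reflect results
grounded = SimpleNamespace(
    text="The budget is $50k.",
    based_on=[{"id": "mem_1"}],
)
ungrounded = SimpleNamespace(
    text="I don't have information about the project budget in the stored memories.",
    based_on=[],
)
```

An application might fall back to a clarifying question, or flag the answer in the UI, when `has_evidence` returns `False`.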
Token Usage
Reflect operations typically use more tokens than Recall because:
- AI reasoning is computationally intensive
- Multiple memories are processed together
- Answer generation requires additional tokens
Monitor usage on the Usage Analytics page.
Source Citation
Every Reflect response includes sources:
response = client.reflect(bank_id=bank_id, query="...")

for source in response.based_on:
    print(f"Memory: {source.content}")
    print(f"Relevance: {source.relevance}")
Use sources to:
- Verify answer accuracy
- Provide transparency to users
- Link back to original information
- Debug unexpected answers
Error Handling
- Python
- TypeScript
try:
    response = client.reflect(bank_id=bank_id, query=query)
    print(response.text)
except Exception as e:
    print(f"Error: {e}")
try {
  const response = await client.reflect(bankId, query);
  console.log(response.text);
} catch (error) {
  console.error('Error:', error.message);
}
Common Errors
| Error | Cause | Solution |
|---|---|---|
| 401 Unauthorized | Invalid API key | Check your API key |
| 402 Payment Required | Insufficient credits | Add credits to your account |
| 404 Not Found | Invalid bank_id | Verify the bank exists |
| 400 Bad Request | Empty question | Provide a question |
Performance Considerations
- Questions are more expensive - Reflect uses more tokens than Recall
- Limit search depth - Use budget to control how much retrieval and reasoning Reflect performs
- Cache when possible - Store answers for frequently asked questions
- Consider Recall first - If raw memories suffice, prefer Recall
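A minimal caching sketch for frequently asked questions; here `fake_reflect` is a stub standing in for a real `client.reflect` call:

```python
from functools import lru_cache

def make_cached_reflect(reflect_fn):
    """Wrap a reflect callable so repeated questions skip the API."""
    @lru_cache(maxsize=256)
    def cached(query: str) -> str:
        return reflect_fn(query)
    return cached

# Stub in place of the real API call, tracking how often it runs
calls = []
def fake_reflect(query):
    calls.append(query)
    return f"answer to: {query}"

cached_reflect = make_cached_reflect(fake_reflect)
cached_reflect("What are the priorities?")
cached_reflect("What are the priorities?")  # served from cache; fake_reflect runs once
```

In production you would also want an expiry policy, since cached answers go stale as new memories are retained in the bank.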