Building Policies

Cloud Enterprise

The per-bank policy editor and the security events page are Cloud Enterprise features. Basic has no Console, no per-bank policy storage, and no security_events audit trail. The Basic extension always runs the 44-pattern sensitive_data rule with action redact; there is nothing to configure. The three recipes in this guide all reference Enterprise-only detectors (detect_secrets, base64_decode, llm_screen, prompt_injection, size_anomaly, protected_keys) and the Enterprise-only block action; they apply to Cloud Enterprise banks only.

A Memory Defense policy is a list of rules, each pairing a detector with an action. Policies are configured per bank from the Console, and the security events page provides the audit trail for everything those rules catch.

Building a Policy

The Console workflow for adding a rule:

  1. Pick the bank from the bank selector at the top of the Console.
  2. Open Bank Settings (gear icon).
  3. Find the Memory Defense card.
  4. Click Add rule.
  5. Pick a detector from the dropdown. Only detectors the org has entitlements for appear in the list.
  6. Pick an action: allow, redact, or block. The detector constrains which actions are valid: sensitive_data and the advanced secret detectors (detect_secrets, base64_decode, llm_screen) accept all three; prompt_injection and size_anomaly accept only block or allow; protected_keys is block-only.
  7. Set any detector-specific overrides. The only override currently exposed is size_anomaly.max_size, specified in bytes.
  8. Save.

Rules take effect immediately for new retain calls. Existing memories are not retroactively rescreened.
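
The detector and action constraints from step 6 can be checked locally before a policy is entered in the Console. The following is a minimal sketch, not part of any Hindsight API: the `ALLOWED_ACTIONS` table and the `validate_rule` helper are hypothetical names that only restate the constraints listed above.

```python
# Hypothetical local model of the detector/action constraints from step 6.
# This is not a Hindsight API; it only mirrors the rules described in the text.
ALLOWED_ACTIONS = {
    "sensitive_data": {"allow", "redact", "block"},
    "detect_secrets": {"allow", "redact", "block"},
    "base64_decode": {"allow", "redact", "block"},
    "llm_screen": {"allow", "redact", "block"},
    "prompt_injection": {"allow", "block"},
    "size_anomaly": {"allow", "block"},
    "protected_keys": {"block"},
}


def validate_rule(detector: str, action: str) -> list[str]:
    """Return a list of problems; an empty list means the pair is valid."""
    if detector not in ALLOWED_ACTIONS:
        return [f"unknown detector: {detector}"]
    if action not in ALLOWED_ACTIONS[detector]:
        allowed = ", ".join(sorted(ALLOWED_ACTIONS[detector]))
        return [f"{detector} does not accept {action}; allowed: {allowed}"]
    return []


# prompt_injection cannot redact, so this pair is reported as invalid.
print(validate_rule("prompt_injection", "redact"))
# protected_keys is block-only, so this pair passes with no problems.
print(validate_rule("protected_keys", "block"))
```

A check like this is useful when policies are drafted in a review document first and only later entered rule by rule in the Console.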

Three starter policies cover most banks. Each table is a recipe you can replicate rule-for-rule in the Console. Every recipe below relies on Enterprise-only detectors and actions; Basic cannot reproduce them.

Recipe 1: Research bank (Cloud Enterprise). A bank used internally for research, with a low blast radius if something leaks. The goal is hygiene without paying the LLM screen cost. Requires Cloud Enterprise (uses detect_secrets, size_anomaly, and the block action).

| Detector | Action | Rationale |
| --- | --- | --- |
| sensitive_data | redact | Catch the obvious credential formats |
| detect_secrets | redact | Catch the broader SaaS provider set |
| size_anomaly | block | Reject oversized payloads outright |

Recipe 2: Production agent bank (Cloud Enterprise). A bank attached to a customer-facing agent. Maximum protection without the LLM screen cost. Requires Cloud Enterprise; every detector listed except sensitive_data is Enterprise-only, and the block action only enforces on Enterprise.

| Detector | Action | Rationale |
| --- | --- | --- |
| sensitive_data | redact | First-line credential scrub |
| detect_secrets | redact | Provider coverage |
| base64_decode | redact | Catch encoded credentials in tool output |
| prompt_injection | block | Reject planted instructions before they reach memory |
| size_anomaly | block | Reject oversized payloads that signal exhaustion or exfil staging |
| protected_keys | block | Preserve a document's identity: tag across re-submissions |

Recipe 3: Multi-tenant SaaS bank (Cloud Enterprise). A bank serving end users in a regulated industry. Every layer enabled.

| Detector | Action | Rationale |
| --- | --- | --- |
| sensitive_data | redact | First line |
| detect_secrets | redact | Provider coverage |
| base64_decode | redact | Encoded credential coverage |
| llm_screen | redact | Conversational secrets in user prose |
| prompt_injection | block | OWASP ASI06 mitigation |
| size_anomaly | block | Reject oversized payloads outright as DoS and exfil staging defense |
| protected_keys | block | Lock identity: and compliance: tags against silent re-association |
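
Because the three recipes build on one another, it can help to write them down as data and compare them before moving a bank from one recipe to the next. The sketch below is illustrative only: the `RECIPES` dictionary, its keys, and the `added_detectors` helper are hypothetical names, and the rule pairs are copied from the three tables above.

```python
# The three starter recipes as (detector, action) pairs, copied from the tables.
# The dictionary keys and helper name are illustrative, not Console identifiers.
RECIPES = {
    "research": [
        ("sensitive_data", "redact"),
        ("detect_secrets", "redact"),
        ("size_anomaly", "block"),
    ],
    "production_agent": [
        ("sensitive_data", "redact"),
        ("detect_secrets", "redact"),
        ("base64_decode", "redact"),
        ("prompt_injection", "block"),
        ("size_anomaly", "block"),
        ("protected_keys", "block"),
    ],
    "multi_tenant_saas": [
        ("sensitive_data", "redact"),
        ("detect_secrets", "redact"),
        ("base64_decode", "redact"),
        ("llm_screen", "redact"),
        ("prompt_injection", "block"),
        ("size_anomaly", "block"),
        ("protected_keys", "block"),
    ],
}


def added_detectors(base: str, target: str) -> list[str]:
    """Detectors present in the target recipe but not in the base recipe."""
    base_set = {detector for detector, _ in RECIPES[base]}
    target_set = {detector for detector, _ in RECIPES[target]}
    return sorted(target_set - base_set)


# Which rules must be added to upgrade a bank from one recipe to the next.
print(added_detectors("research", "production_agent"))
print(added_detectors("production_agent", "multi_tenant_saas"))
```

The second comparison shows that Recipe 3 differs from Recipe 2 only by the llm_screen rule, which is the layer the first two recipes leave out to avoid the LLM screen cost.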

Error Case: Missing Entitlement

If a bank policy includes a rule for a detector the org does not have enabled, the save returns HTTP 400 with the offending detector names listed. The Console surfaces this as "These detectors require an entitlement your organization does not have. Contact your administrator." Owners can grant the entitlement on the Admin Feature Flags page, after which the bank admin can re-save the policy. This guardrail exists so a bank admin cannot quietly reference a detector that is not actually running, which would create a false sense of coverage.
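
The guardrail amounts to a set difference between the detectors a policy references and the detectors the org is entitled to. The sketch below models that check locally. Only the 400 status and the listing of offending detector names come from the description above; the function name, the response body shape, and the 200 success status are assumptions made for illustration.

```python
def check_entitlements(policy_detectors, entitled_detectors):
    """Model of the save-time guardrail.

    Returns (status, body). The body shape and the 200 success status are
    assumptions; only the 400 and the list of offending names are documented.
    """
    missing = sorted(set(policy_detectors) - set(entitled_detectors))
    if missing:
        return 400, {"missing_entitlements": missing}
    return 200, {}


# An org entitled to two detectors tries to save a policy that uses three.
status, body = check_entitlements(
    ["sensitive_data", "detect_secrets", "llm_screen"],
    ["sensitive_data", "detect_secrets"],
)
print(status, body)
```

Rejecting the whole save, rather than silently dropping the unentitled rule, is what keeps the stored policy an accurate description of what actually runs.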

Reading the Audit Trail (Cloud Enterprise)

Cloud Enterprise

The security_events table and the Console Security Events page are Cloud Enterprise features. Basic does not write security events.

Every non-ALLOW decision (REDACT, BLOCK) writes a row to security_events. The Console exposes these rows on the Security Events tab inside the bank navigation. The page is the single source of truth for what Memory Defense has caught and what it did about it.

Columns

  • Detector: the specific detector that fired (GitHub Token, AWS Access Key, prompt_injection, and so on). For the provider-coverage layer, the detector name is the provider-specific format, not the generic detect_secrets label, so routing to the right service owner is direct.
  • Action: redact or block. The action recorded is the one the rule resolved to, not the policy default.
  • Severity: low, medium, high, critical. Severity is set per detector and used by the webhook payload for downstream filtering.
  • Source class: which surface introduced the offending content. One of user_input, agent_authored, external_tool, system, or unknown. Source class is set on every retain item and gives a fast filter for "who introduced this content".
  • Captured fingerprint: event_metadata.hits[].preview carries a redacted but still identifiable fingerprint of each captured secret (e.g. ghp_AAAA...BBBB). It is never plaintext.
  • Timestamp: when the screen ran.
  • Submitting key (Enterprise): the Hindsight API key name that submitted the retain (api_key_name). Attributes the leak to the specific agent or service that produced it.
  • Identity (Enterprise): the validated identity from the provider whoami probe (GitHub login, AWS ARN, Stripe account id, and so on).
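
As an illustration of how the columns combine, the sketch below models two event rows as dictionaries and filters them by a minimum severity, the kind of downstream filtering the severity field is meant to support. The dictionary keys other than api_key_name and event_metadata.hits[].preview are assumed names, and the sample severities and key names are invented for the example rather than taken from real detector settings.

```python
# Severity levels in ascending order, as listed in the Severity column.
SEVERITY_ORDER = ["low", "medium", "high", "critical"]


def at_least(events, min_severity):
    """Keep events whose severity is min_severity or higher."""
    floor = SEVERITY_ORDER.index(min_severity)
    return [e for e in events if SEVERITY_ORDER.index(e["severity"]) >= floor]


# Illustrative rows only: field names and values are assumptions, apart from
# api_key_name and event_metadata.hits[].preview, which the text names.
events = [
    {
        "detector": "GitHub Token",
        "action": "redact",
        "severity": "critical",
        "source_class": "external_tool",
        "api_key_name": "ci-agent",
        "event_metadata": {"hits": [{"preview": "ghp_AAAA...BBBB"}]},
    },
    {
        "detector": "size_anomaly",
        "action": "block",
        "severity": "low",
        "source_class": "user_input",
        "api_key_name": "support-bot",
        "event_metadata": {"hits": []},
    },
]

for event in at_least(events, "high"):
    print(event["detector"], event["action"], event["source_class"])
```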

Filters

The Console exposes filters across the top of the page: by detector, by action, by date range, and (on Enterprise) by submitting API key name. Saved filters are scoped per user and persist across sessions.

Drill-In

Clicking any row opens a detail panel with the full event_metadata payload. The panel includes:

  • All hits captured by the detector, with offsets into the original payload
  • The document id of the offending retain so you can pivot to the source memory
  • The source class and the originating session id where available
  • The LLM screen reason (for llm_screen rows), explaining in natural language why the LLM flagged the content
  • The full event metadata, including captured-secret fingerprints and the submitting API key name

Use the drill-in to reconstruct what happened, then send the source class and document id to the responsible owner for cleanup. The Security Events page is read-only by design: it is the forensic record, not a remediation surface. Remediation happens in the Console memory browser or via the retain API.
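
The handoff step can be sketched as grouping events by source class and collecting the document ids each owner needs to clean up. This is a hypothetical helper over exported event data, not a Console or API feature; the `source_class` and `document_id` key names and the sample ids are assumptions.

```python
from collections import defaultdict


def cleanup_worklist(events):
    """Group offending document ids by source class for handoff to owners.

    Assumes each event is a dict with a source_class key and a document_id
    inside event_metadata; both key names are illustrative.
    """
    worklist = defaultdict(set)
    for event in events:
        doc_id = event.get("event_metadata", {}).get("document_id")
        if doc_id is None:
            continue  # nothing to pivot to without a document id
        worklist[event.get("source_class", "unknown")].add(doc_id)
    # Sort ids so the output is stable and duplicates are collapsed.
    return {source: sorted(ids) for source, ids in worklist.items()}


sample_events = [
    {"source_class": "external_tool", "event_metadata": {"document_id": "doc-2"}},
    {"source_class": "external_tool", "event_metadata": {"document_id": "doc-1"}},
    {"source_class": "external_tool", "event_metadata": {"document_id": "doc-2"}},
    {"source_class": "user_input", "event_metadata": {"document_id": "doc-7"}},
    {"source_class": "system", "event_metadata": {}},
]

print(cleanup_worklist(sample_events))
```

Since the Security Events page is read-only, a list like this is only a pointer: the cleanup itself still happens in the memory browser or through the retain API.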