How to Design Memory Schemas (Python)
Learn how to structure AI memory schemas in Python for preferences, decisions, context, and facts.
Why Memory Schema Design Matters
When building AI agents with persistent memory, how you structure that memory is just as important as what you store. A well-designed schema enables efficient retrieval, prevents data duplication, and helps your agent provide contextually relevant responses.
In this tutorial, we'll explore four fundamental memory types and show you how to model them in Python using Pydantic for type safety and validation.
The Four Memory Types
Not all memories are equal. Understanding the nature of what you're storing helps you design better retrieval strategies and expiration policies.
1. Preference Memories
Preferences are user-defined choices that guide how the agent behaves. They're relatively stable and should be retrieved frequently.
```python
# Examples: "I prefer TypeScript", "Use 2-space indentation", "Always use dark mode"
from datetime import datetime
from pydantic import BaseModel

class PreferenceMemory(BaseModel):
    category: str          # e.g., "coding_style", "language", "tooling"
    key: str               # e.g., "indentation", "framework"
    value: str             # e.g., "2 spaces", "React"
    strength: float = 1.0  # 0.0-1.0, how strongly held
    created_at: datetime
    last_used: datetime
```

2. Decision Memories
Decisions capture architectural or design choices made in a specific context. They include reasoning and are crucial for maintaining consistency.
```python
# Examples: "We chose PostgreSQL for ACID compliance", "Using microservices for scaling"
from datetime import datetime
from pydantic import BaseModel

class DecisionMemory(BaseModel):
    decision: str                 # What was decided
    reasoning: str                # Why this choice was made
    alternatives: list[str] = []  # What else was considered
    context: str                  # Project or scope
    decided_at: datetime
    decided_by: str               # User or agent
```

3. Context Memories
Context memories are ephemeral information about the current state of work. They have shorter lifespans and are scoped to specific sessions or tasks.
```python
# Examples: "Currently refactoring auth module", "User is debugging login flow"
from datetime import datetime
from pydantic import BaseModel

class ContextMemory(BaseModel):
    scope: str                           # "session", "task", "project"
    description: str                     # What's happening
    active: bool = True                  # Is this still relevant?
    started_at: datetime
    expires_at: datetime | None = None   # Auto-cleanup
    related_files: list[str] = []
```

4. Fact Memories
Facts are objective pieces of information that don't change based on preference. They're reference data the agent can rely on.
```python
# Examples: "API endpoint is /v2/users", "Database runs on port 5432"
from datetime import datetime
from pydantic import BaseModel

class FactMemory(BaseModel):
    subject: str            # What this fact is about
    predicate: str          # The relationship or property
    object: str             # The value
    source: str             # Where this info came from
    verified: bool = False  # Has this been confirmed?
    recorded_at: datetime
```

Unified Memory Schema
In practice, you'll want a unified schema that can represent all memory types while enabling efficient querying. Here's a production-ready pattern:
```python
from enum import Enum
from datetime import datetime
from uuid import uuid4
from typing import Any

from pydantic import BaseModel, Field

class MemoryType(str, Enum):
    PREFERENCE = "preference"
    DECISION = "decision"
    CONTEXT = "context"
    FACT = "fact"

class Memory(BaseModel):
    id: str = Field(default_factory=lambda: str(uuid4()))
    type: MemoryType
    content: str                          # Human-readable description
    metadata: dict[str, Any] = {}         # Type-specific fields
    tags: list[str] = []                  # For filtering
    project: str | None = None            # Scope to project
    embedding: list[float] | None = None  # For semantic search
    created_at: datetime = Field(default_factory=datetime.utcnow)
    updated_at: datetime = Field(default_factory=datetime.utcnow)
    expires_at: datetime | None = None

    class Config:
        use_enum_values = True
```

Schema Design Patterns
Pattern 1: Layered Metadata
Keep common fields at the top level and use metadata for type-specific attributes. This enables unified querying while preserving flexibility.
Pattern 2: Semantic Embeddings
Store vector embeddings alongside memories for semantic search. This lets you find related memories even when keywords don't match exactly.
```python
async def add_memory(content: str, type: MemoryType) -> Memory:
    embedding = await generate_embedding(content)
    memory = Memory(
        type=type,
        content=content,
        embedding=embedding,
    )
    return await store.save(memory)
```

Pattern 3: TTL-Based Cleanup
Context memories should auto-expire. Set `expires_at` based on memory type: preferences never expire, context expires in hours, decisions might expire in months.
Pattern 4: Conflict Resolution
When memories conflict, use timestamps and specificity to resolve. More recent, more specific memories take precedence.
```python
def resolve_conflicts(memories: list[Memory]) -> Memory:
    # Sort by specificity (project-scoped > global), then by recency
    return sorted(
        memories,
        key=lambda m: (m.project is not None, m.updated_at),
        reverse=True,
    )[0]
```

Practical Tips
- Index wisely: Create indexes on `type`, `project`, and `tags` for fast filtering
- Validate early: Use Pydantic's validators to catch bad data before storage
- Version your schema: Include a `schema_version` field for future migrations
- Log retrievals: Track which memories are actually used to prune stale data
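As one way to apply the "validate early" tip, a field validator can reject malformed data before it reaches storage. This sketch uses Pydantic v2's `field_validator`; the specific bounds and non-blank rules are assumptions for illustration.

```python
from pydantic import BaseModel, field_validator

class PreferenceMemory(BaseModel):
    category: str
    key: str
    value: str
    strength: float = 1.0

    @field_validator("strength")
    @classmethod
    def strength_in_range(cls, v: float) -> float:
        # Assumed invariant: strength is a 0.0-1.0 confidence score.
        if not 0.0 <= v <= 1.0:
            raise ValueError("strength must be between 0.0 and 1.0")
        return v

    @field_validator("category", "key", "value")
    @classmethod
    def non_empty(cls, v: str) -> str:
        # Reject blank identifiers so they never reach the store.
        if not v.strip():
            raise ValueError("field must not be blank")
        return v.strip()
```

With validation at the model boundary, the storage layer only ever sees well-formed memories, which keeps retrieval and conflict-resolution logic simple.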
Ready to Build?
A well-designed memory schema is the foundation of an effective AI agent. With CodeMem, you don't have to build this infrastructure from scratch—we handle storage, embeddings, and retrieval so you can focus on your agent's logic.