
Memory-Driven Refactoring (Python)

Learn how AI memory ensures pattern consistency and boosts productivity during refactoring tasks.

CodeMem Team

The Hidden Cost of Inconsistent Refactoring

Refactoring is supposed to improve code quality. But when AI agents refactor without memory, they introduce a subtle problem: inconsistency. One function gets renamed to snake_case, another stays camelCase. One module uses dataclasses, another uses plain dicts. The codebase slowly fragments.

Memory-driven refactoring solves this by giving your AI agent access to established patterns, past decisions, and project conventions. The result? Consistent transformations across your entire codebase—and dramatic productivity gains.

How Memory Enables Consistent Patterns

When an AI agent refactors code, it needs to answer questions like: "How should I name this?" "What pattern does this project use for error handling?" "Should I use type hints here?" Without memory, it guesses. With memory, it knows.

Let's see how to implement a memory-aware refactoring agent in Python.

Step 1: Define Pattern Memories

First, we create a structure for storing coding patterns the agent should follow:

from pydantic import BaseModel
from datetime import datetime
from enum import Enum

class PatternType(str, Enum):
    NAMING = "naming"
    ERROR_HANDLING = "error_handling"
    TYPING = "typing"
    STRUCTURE = "structure"
    IMPORTS = "imports"

class PatternMemory(BaseModel):
    pattern_type: PatternType
    description: str        # Human-readable rule
    example_before: str     # What the pattern transforms
    example_after: str      # What it transforms into
    project: str | None = None  # Project-scoped (name) or global (None)
    priority: int = 1       # Higher = stronger preference
    created_at: datetime
    
    def to_prompt(self) -> str:
        # Use .value so the enum renders as "naming", not "PatternType.NAMING"
        return f"""Pattern ({self.pattern_type.value}): {self.description}
Before: {self.example_before}
After: {self.example_after}"""
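As a quick sanity check, a pattern can be constructed and rendered into prompt text. The snippet below uses a plain dataclass stand-in for the pydantic model so it runs without third-party dependencies; the field values are illustrative:

```python
from dataclasses import dataclass
from enum import Enum

class PatternType(str, Enum):
    NAMING = "naming"
    TYPING = "typing"

# Dataclass stand-in for the pydantic PatternMemory defined above
@dataclass
class PatternMemory:
    pattern_type: PatternType
    description: str
    example_before: str
    example_after: str

    def to_prompt(self) -> str:
        # .value renders the enum as "naming" rather than "PatternType.NAMING"
        return (f"Pattern ({self.pattern_type.value}): {self.description}\n"
                f"Before: {self.example_before}\n"
                f"After: {self.example_after}")

pattern = PatternMemory(
    pattern_type=PatternType.NAMING,
    description="Use snake_case for all function names",
    example_before="def getUser(): ...",
    example_after="def get_user(): ...",
)
print(pattern.to_prompt())
```

In the real model, swap the dataclass for the pydantic `PatternMemory` from Step 1 to get validation for free.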

Step 2: Build the Memory Store

The memory store retrieves relevant patterns based on what kind of refactoring is happening:

class PatternMemoryStore:
    def __init__(self, codemem_client):
        self.client = codemem_client
    
    async def get_patterns(
        self, 
        pattern_types: list[PatternType],
        project: str | None = None
    ) -> list[PatternMemory]:
        """Retrieve patterns relevant to the refactoring task."""
        memories = await self.client.search(
            tags=[pt.value for pt in pattern_types],
            project=project,
            limit=10
        )
        
        patterns = [PatternMemory(**m.metadata) for m in memories]
        # Sort by priority, project-specific first
        return sorted(
            patterns,
            key=lambda p: (p.project == project, p.priority),
            reverse=True
        )
    
    async def learn_pattern(
        self,
        pattern: PatternMemory
    ) -> None:
        """Store a new pattern from user feedback."""
        await self.client.store(
            content=pattern.description,
            tags=[pattern.pattern_type.value, "refactoring"],
            metadata=pattern.model_dump(),
            project=pattern.project
        )
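The sort key in `get_patterns` ranks in two tiers: patterns scoped to the current project outrank global or other-project patterns, and higher priority wins within each tier. Plain tuples can stand in for `PatternMemory` to verify the ordering:

```python
# Each tuple stands in for (PatternMemory.project, .priority, description)
patterns = [
    ("other-proj", 5, "global error handling"),
    ("my-proj", 1, "project naming"),
    (None, 3, "global typing"),
    ("my-proj", 4, "project structure"),
]

project = "my-proj"
# Same two-level key as PatternMemoryStore.get_patterns:
# project match first, then priority, both descending via reverse=True
ranked = sorted(
    patterns,
    key=lambda p: (p[0] == project, p[1]),
    reverse=True,
)
print([name for _, _, name in ranked])
```

Both project-scoped patterns come first, each tier sorted by priority, so a high-priority global rule never overrides a project-specific one.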

Step 3: Create the Refactoring Agent

Now we wire memory into the refactoring workflow. The agent loads relevant patterns before transforming any code:

class MemoryDrivenRefactorer:
    def __init__(self, llm, memory_store: PatternMemoryStore):
        self.llm = llm
        self.memory = memory_store
    
    async def refactor(
        self,
        code: str,
        intent: str,  # e.g., "convert to dataclass", "add type hints"
        project: str
    ) -> str:
        # 1. Detect what pattern types are relevant
        pattern_types = self._detect_pattern_types(intent)
        
        # 2. Load patterns from memory
        patterns = await self.memory.get_patterns(pattern_types, project)
        
        # 3. Build context-aware prompt
        pattern_context = "\n\n".join(p.to_prompt() for p in patterns)
        
        prompt = f"""You are refactoring Python code. Follow these established patterns:

{pattern_context}

Intent: {intent}

Code to refactor:
```python
{code}
```

Apply the patterns consistently. Return only the refactored code."""

        # 4. Execute with pattern awareness
        result = await self.llm.generate(prompt)
        return result
    
    def _detect_pattern_types(self, intent: str) -> list[PatternType]:
        """Map refactoring intent to relevant pattern types."""
        mapping = {
            "rename": [PatternType.NAMING],
            "type": [PatternType.TYPING],
            "error": [PatternType.ERROR_HANDLING],
            "dataclass": [PatternType.STRUCTURE, PatternType.TYPING],
            "import": [PatternType.IMPORTS],
        }
        matched = [pt for key, pts in mapping.items()
                   if key in intent.lower() for pt in pts]
        # Deduplicate while preserving order: intents like "add dataclass
        # type hints" match two keys that both contribute TYPING
        return list(dict.fromkeys(matched))
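The keyword matching can be exercised standalone, with a dedupe step so overlapping keys don't repeat a pattern type:

```python
from enum import Enum

class PatternType(str, Enum):
    NAMING = "naming"
    ERROR_HANDLING = "error_handling"
    TYPING = "typing"
    STRUCTURE = "structure"
    IMPORTS = "imports"

MAPPING = {
    "rename": [PatternType.NAMING],
    "type": [PatternType.TYPING],
    "error": [PatternType.ERROR_HANDLING],
    "dataclass": [PatternType.STRUCTURE, PatternType.TYPING],
    "import": [PatternType.IMPORTS],
}

def detect_pattern_types(intent: str) -> list[PatternType]:
    matched = [pt for key, pts in MAPPING.items()
               if key in intent.lower() for pt in pts]
    return list(dict.fromkeys(matched))  # dedupe, preserve order

print(detect_pattern_types("convert to dataclass"))
```

Substring matching is a deliberately simple heuristic; a production agent might classify the intent with the LLM itself when no keyword matches.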

Learning from Corrections

The real power comes when the agent learns from user corrections. When a developer fixes an AI refactoring, that correction becomes a new pattern:

# Method of MemoryDrivenRefactorer
async def learn_from_correction(
    self,
    original: str,
    ai_output: str,
    user_correction: str,
    project: str
) -> None:
    """Extract and store a pattern from user feedback."""

    # Ask the LLM to identify the pattern difference
    analysis = await self.llm.generate(f"""
    The original code was:
    {original}

    The AI produced this refactoring:
    {ai_output}

    The user corrected it to:
    {user_correction}

    What pattern rule explains this correction?
    Respond with: pattern_type, description, before_example, after_example
    """)

    pattern = self._parse_pattern(analysis, project)
    await self.memory.learn_pattern(pattern)
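`_parse_pattern` is referenced but not defined above. A minimal sketch, assuming the LLM answers with one `key: value` pair per line (the field names match what the prompt requests), might look like this; it returns a plain dict for clarity, which in practice you would validate into a `PatternMemory`:

```python
from datetime import datetime

def parse_pattern(analysis: str, project: str) -> dict:
    # Hypothetical parser: expects lines like
    #   pattern_type: naming
    #   description: Use snake_case for function names
    fields = {}
    for line in analysis.strip().splitlines():
        key, sep, value = line.partition(":")  # split on first colon only
        if sep:
            fields[key.strip()] = value.strip()
    return {
        "pattern_type": fields.get("pattern_type", "structure"),
        "description": fields.get("description", ""),
        "example_before": fields.get("before_example", ""),
        "example_after": fields.get("after_example", ""),
        "project": project,
        "created_at": datetime.now(),
    }
```

Splitting on the first colon only matters here, because code examples like `def get_user(): ...` contain colons of their own. A more robust variant would ask the LLM for JSON and validate it with the pydantic model directly.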

Productivity Gains in Practice

Memory-driven refactoring delivers compounding returns:

  • First refactoring: Agent uses baseline patterns—still requires some corrections
  • After 10 corrections: Agent has learned project-specific conventions—fewer fixes needed
  • After 50 corrections: Agent matches team style almost perfectly—near-autonomous operation

The key insight is that every correction makes future refactorings better. Instead of repeating the same feedback, you're training a pattern library that persists across sessions and team members.

Pattern Categories to Track

Start by capturing these high-impact patterns:

  • Naming conventions: get_user() vs fetch_user() vs load_user()
  • Error handling: Custom exceptions vs result types vs early returns
  • Type patterns: list[str] vs List[str], optional handling
  • Import style: Absolute vs relative, grouping order
  • Docstring format: Google style vs NumPy vs Sphinx

Start Refactoring Smarter

Memory-driven refactoring transforms AI from a tool that requires constant supervision into a partner that learns your team's patterns. The agent gets better with every interaction, and your codebase stays consistent.

With CodeMem, you get production-ready memory infrastructure—pattern storage, semantic search, and project scoping—so you can focus on building your refactoring workflows instead of reinventing memory systems.

Get started with CodeMem for free →