
Deterministic Memory Retrieval in Go

Learn how to reduce incorrect memory retrievals and build confidence in your AI memory system with vector similarity thresholds, filtering strategies, and scoring techniques.

CodeMem Team

The Wrong Memory Problem

If you've built AI applications with vector-based memory retrieval, you've likely encountered this frustrating scenario: your system confidently returns a memory that's completely irrelevant to the query. The user asks about "database connection pooling" and gets back a memory about "swimming pool construction project."

This happens because vector similarity search returns the most similar results—not necessarily relevant results. When your memory store lacks truly matching content, the system still returns something. This article shows you how to build deterministic, high-confidence memory retrieval in Go.

Understanding Similarity Scores

Vector similarity typically uses cosine similarity, which returns values between -1 and 1 (in practice, scores from most text-embedding models cluster in the upper half of that range). Most developers treat this as a black box, but understanding the score distribution is crucial:

  • 0.95+ — Near-exact semantic match
  • 0.85-0.95 — Strong relevance, same topic
  • 0.70-0.85 — Related, but may be tangential
  • Below 0.70 — Likely noise, proceed with caution
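For concreteness, here is how cosine similarity itself is computed: the dot product of two vectors divided by the product of their magnitudes. This is a minimal sketch, not tied to any particular vector store:

```go
package main

import (
	"fmt"
	"math"
)

// cosineSimilarity returns dot(a, b) / (|a| * |b|).
// The result is in [-1, 1]; assumes len(a) == len(b).
func cosineSimilarity(a, b []float64) float64 {
	var dot, normA, normB float64
	for i := range a {
		dot += a[i] * b[i]
		normA += a[i] * a[i]
		normB += b[i] * b[i]
	}
	if normA == 0 || normB == 0 {
		return 0 // avoid division by zero for empty/zero vectors
	}
	return dot / (math.Sqrt(normA) * math.Sqrt(normB))
}

func main() {
	// Identical direction vs. orthogonal vectors.
	fmt.Printf("%.2f\n", cosineSimilarity([]float64{1, 0}, []float64{1, 0}))
	fmt.Printf("%.2f\n", cosineSimilarity([]float64{1, 0}, []float64{0, 1}))
}
```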

The key insight: never return results below your confidence threshold. It's better to say "no relevant memories found" than to poison your context with wrong information.
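The bands above can be encoded as a small helper; the band names and exact cutoffs here are illustrative, and should be tuned for your embedding model and domain:

```go
package main

import "fmt"

// classifyBand maps a cosine similarity score to the rough
// relevance bands described above. Cutoffs are illustrative defaults.
func classifyBand(score float64) string {
	switch {
	case score >= 0.95:
		return "near-exact"
	case score >= 0.85:
		return "strong"
	case score >= 0.70:
		return "related"
	default:
		return "noise"
	}
}

func main() {
	for _, s := range []float64{0.97, 0.90, 0.75, 0.50} {
		fmt.Printf("%.2f => %s\n", s, classifyBand(s))
	}
}
```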

Implementing Threshold-Based Retrieval

Here's a production-ready Go implementation with configurable thresholds:

package memory

type RetrievalConfig struct {
    MinSimilarity    float64 // Hard floor, reject below this
    HighConfidence   float64 // Flag as high-confidence above this
    MaxResults       int     // Limit returned memories
    RequireRecency   bool    // Boost recent memories
}

func DefaultConfig() RetrievalConfig {
    return RetrievalConfig{
        MinSimilarity:  0.72,
        HighConfidence: 0.88,
        MaxResults:     5,
        RequireRecency: true,
    }
}

type ScoredMemory struct {
    Memory        Memory
    Similarity    float64
    Confidence    string // "high", "medium", "low"
    AdjustedScore float64
}
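A typical pattern is to start from DefaultConfig and override fields per call site, for example tightening the floor for a high-stakes context window. This sketch repeats the config types above so it stands alone:

```go
package main

import "fmt"

// Mirrors the RetrievalConfig definition from the article.
type RetrievalConfig struct {
	MinSimilarity  float64
	HighConfidence float64
	MaxResults     int
	RequireRecency bool
}

func DefaultConfig() RetrievalConfig {
	return RetrievalConfig{
		MinSimilarity:  0.72,
		HighConfidence: 0.88,
		MaxResults:     5,
		RequireRecency: true,
	}
}

func main() {
	// Stricter settings for a context window where noise is costly.
	cfg := DefaultConfig()
	cfg.MinSimilarity = 0.80
	cfg.MaxResults = 3
	fmt.Printf("floor=%.2f max=%d\n", cfg.MinSimilarity, cfg.MaxResults)
}
```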

Multi-Factor Scoring

Raw similarity alone isn't enough. Combine multiple signals for robust retrieval:

func (r *Retriever) Score(mem Memory, query string, similarity float64) ScoredMemory {
    score := similarity
    
    // Recency boost: memories from the last 7 days get up to a 10% boost
    if r.config.RequireRecency {
        daysSinceUpdate := time.Since(mem.UpdatedAt).Hours() / 24
        if daysSinceUpdate < 7 {
            score += 0.10 * (1 - daysSinceUpdate/7)
        }
    }
    
    // Access frequency: well-used memories are likely valuable
    if mem.AccessCount > 10 {
        score += 0.05
    }
    
    // Exact keyword match bonus
    if containsExactTerms(mem.Content, extractKeyTerms(query)) {
        score += 0.08
    }
    
    // Cap at 1.0
    if score > 1.0 {
        score = 1.0
    }
    
    confidence := "low"
    if similarity >= r.config.HighConfidence {
        confidence = "high"
    } else if similarity >= 0.80 {
        confidence = "medium"
    }
    
    return ScoredMemory{
        Memory:        mem,
        Similarity:    similarity,
        Confidence:    confidence,
        AdjustedScore: score,
    }
}
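To see how the boosts interact, here is a reduced, self-contained version of the same boost arithmetic, with the memory's signals passed in as plain parameters rather than read from a Memory struct (the parameter names are illustrative):

```go
package main

import "fmt"

// adjustedScore applies the recency, access-frequency, and keyword
// boosts from the scoring logic above, capped at 1.0.
func adjustedScore(similarity, daysSinceUpdate float64, frequentlyUsed, keywordMatch bool) float64 {
	score := similarity
	if daysSinceUpdate < 7 {
		score += 0.10 * (1 - daysSinceUpdate/7)
	}
	if frequentlyUsed {
		score += 0.05
	}
	if keywordMatch {
		score += 0.08
	}
	if score > 1.0 {
		score = 1.0
	}
	return score
}

func main() {
	// A 2-day-old memory at 0.84 similarity with an exact keyword hit:
	// 0.84 + 0.10*(1 - 2/7) + 0.08 ≈ 0.991
	fmt.Printf("%.3f\n", adjustedScore(0.84, 2, false, true))
}
```

Note that the boosts can push a borderline 0.84 match past a 0.88 high-confidence bar, which is why the pipeline keeps the raw similarity for the confidence label and uses the adjusted score only for ranking.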

The Retrieval Pipeline

Putting it all together—a deterministic retrieval function that refuses to return anything below your confidence floor:

func (r *Retriever) Retrieve(ctx context.Context, query string) ([]ScoredMemory, error) {
    // Step 1: Get candidates from vector store (request more than needed)
    candidates, err := r.vectorStore.Search(ctx, query, r.config.MaxResults*3)
    if err != nil {
        return nil, fmt.Errorf("vector search failed: %w", err)
    }
    
    // Step 2: Filter by minimum threshold
    var filtered []ScoredMemory
    for _, c := range candidates {
        if c.Similarity < r.config.MinSimilarity {
            continue // Hard reject
        }
        scored := r.Score(c.Memory, query, c.Similarity)
        filtered = append(filtered, scored)
    }
    
    // Step 3: Sort by adjusted score
    sort.Slice(filtered, func(i, j int) bool {
        return filtered[i].AdjustedScore > filtered[j].AdjustedScore
    })
    
    // Step 4: Take top N
    if len(filtered) > r.config.MaxResults {
        filtered = filtered[:r.config.MaxResults]
    }
    
    // Step 5: Final validation - check whether any result cleared the
    // high-confidence bar; if not, log it so thresholds can be tuned
    hasHighConfidence := false
    for _, m := range filtered {
        if m.Confidence == "high" {
            hasHighConfidence = true
            break
        }
    }
    
    if !hasHighConfidence && len(filtered) > 0 {
        // Log for monitoring, but still return medium-confidence results
        r.logger.Warn("no high-confidence matches", 
            "query", query,
            "best_score", filtered[0].Similarity)
    }
    
    return filtered, nil
}
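At the call site, the empty-result case is where "prefer silence over noise" pays off: tell the model explicitly that nothing relevant exists instead of padding the prompt with weak matches. A hypothetical caller, with retrieval stubbed out so the sketch is self-contained:

```go
package main

import "fmt"

// Reduced stand-in for the article's ScoredMemory.
type ScoredMemory struct {
	Content    string
	Confidence string
}

// fetchMemories stands in for retriever.Retrieve; here it returns
// nothing so the empty path is exercised.
func fetchMemories(query string) []ScoredMemory {
	return nil
}

// buildContext formats retrieved memories for the prompt, with an
// explicit sentinel when nothing cleared the threshold.
func buildContext(query string) string {
	memories := fetchMemories(query)
	if len(memories) == 0 {
		return "No relevant memories found."
	}
	out := ""
	for _, m := range memories {
		out += fmt.Sprintf("[%s] %s\n", m.Confidence, m.Content)
	}
	return out
}

func main() {
	fmt.Println(buildContext("database connection pooling"))
}
```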

Contextual Filtering

Add project and tag-based filtering to narrow the search space before vector similarity:

type RetrievalFilter struct {
    ProjectID    *string
    Tags         []string
    CreatedAfter *time.Time
    MemoryTypes  []string
}

func (r *Retriever) RetrieveFiltered(
    ctx context.Context, 
    query string, 
    filter RetrievalFilter,
) ([]ScoredMemory, error) {
    // Pre-filter at the database level
    candidates, err := r.vectorStore.SearchWithFilter(ctx, query, filter, r.config.MaxResults*3)
    if err != nil {
        return nil, err
    }
    
    // Same scoring pipeline...
    return r.scoreAndFilter(candidates, query)
}
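A caller might build a filter like this; the project ID, tags, and 30-day window are hypothetical values, and the struct is repeated so the sketch compiles on its own:

```go
package main

import (
	"fmt"
	"time"
)

// Mirrors the RetrievalFilter struct above.
type RetrievalFilter struct {
	ProjectID    *string
	Tags         []string
	CreatedAfter *time.Time
	MemoryTypes  []string
}

func main() {
	// Restrict retrieval to one project's memories from the last 30 days.
	project := "proj_backend"
	since := time.Now().AddDate(0, 0, -30)
	filter := RetrievalFilter{
		ProjectID:    &project,
		Tags:         []string{"architecture", "database"},
		CreatedAfter: &since,
	}
	fmt.Println(*filter.ProjectID, filter.Tags)
}
```

Pointer fields let the store distinguish "not set" from an explicit zero value, so unset fields simply skip that predicate.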

Monitoring and Tuning

Track these metrics to continuously improve retrieval quality:

  • Empty result rate — If too high, lower MinSimilarity or improve memory coverage
  • Average similarity score — Baseline for your domain
  • High-confidence ratio — Percentage of queries with at least one high-confidence match
  • User feedback loops — Track when users ignore or correct retrieved memories
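Before wiring up a real metrics backend, the first two signals can be tracked in-process with a small counter struct; this is a minimal sketch, not a production metrics design:

```go
package main

import "fmt"

// RetrievalStats accumulates the monitoring signals described above.
// In production these would likely be exported to a metrics system;
// this sketch just keeps running totals.
type RetrievalStats struct {
	Queries       int
	EmptyResults  int
	SimilaritySum float64 // sum of best scores on non-empty queries
}

// Record logs one retrieval: the best similarity seen and result count.
func (s *RetrievalStats) Record(bestSimilarity float64, resultCount int) {
	s.Queries++
	if resultCount == 0 {
		s.EmptyResults++
		return
	}
	s.SimilaritySum += bestSimilarity
}

// EmptyRate returns the fraction of queries that returned nothing.
func (s *RetrievalStats) EmptyRate() float64 {
	if s.Queries == 0 {
		return 0
	}
	return float64(s.EmptyResults) / float64(s.Queries)
}

func main() {
	var stats RetrievalStats
	stats.Record(0.91, 3)
	stats.Record(0, 0)
	fmt.Printf("empty rate: %.2f\n", stats.EmptyRate())
}
```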

Key Takeaways

Building reliable memory retrieval comes down to these principles:

  1. Set hard thresholds — Never return results below your minimum similarity
  2. Use multi-factor scoring — Combine similarity with recency, access patterns, and keyword matching
  3. Filter early — Use project/tag filters to narrow the search space
  4. Prefer silence over noise — Empty results are better than wrong results
  5. Monitor continuously — Track metrics and adjust thresholds based on real usage

Build Reliable Memory Systems with CodeMem

CodeMem handles all of this complexity for you—deterministic retrieval, multi-factor scoring, and project-based filtering are built into our API. Start free and give your AI agents memory they can trust.
