Go Patterns for MCP Server Middleware

Learn how to build production-ready MCP servers in Go using middleware patterns for logging, authentication, and rate limiting.

CodeMem Team

Why Middleware Matters for MCP Servers

Building an MCP server is one thing—running it in production is another. When AI agents start hammering your server with requests, you need logging to debug issues, authentication to prevent abuse, and rate limiting to keep costs under control.

Go's composable HTTP handler pattern makes middleware a natural fit. Instead of polluting your business logic with cross-cutting concerns, you wrap handlers in layers that each handle one responsibility. Let's build a production-ready middleware stack for MCP servers.

The Middleware Pattern in Go

In Go, middleware is simply a function that takes an http.Handler and returns a new http.Handler. This allows you to chain behaviors:

// Middleware is a function that wraps an http.Handler
type Middleware func(http.Handler) http.Handler

// Chain applies middlewares in order (first middleware runs first)
func Chain(h http.Handler, middlewares ...Middleware) http.Handler {
    for i := len(middlewares) - 1; i >= 0; i-- {
        h = middlewares[i](h)
    }
    return h
}

// Usage
handler := Chain(mcpHandler,
    WithLogging(logger),
    WithAuth(authService),
    WithRateLimit(100, time.Minute),
)

With this foundation, requests flow through logging, then auth, then rate limiting, before reaching your MCP handler. Responses flow back through in reverse order.

Logging Middleware: Know What's Happening

Structured logging is essential for debugging MCP interactions. You want to capture the method being called, timing information, and any errors—without logging sensitive data like API keys or memory contents.

func WithLogging(logger *slog.Logger) Middleware {
    return func(next http.Handler) http.Handler {
        return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
            start := time.Now()
            
            // Wrap response writer to capture status code
            wrapped := &responseWriter{ResponseWriter: w, status: 200}
            
            // Extract request ID for tracing
            reqID := r.Header.Get("X-Request-ID")
            if reqID == "" {
                reqID = uuid.NewString()[:8]
            }
            
            // Add to context for downstream use (a string key is used here
            // for brevity; production code should use an unexported typed key)
            ctx := context.WithValue(r.Context(), "request_id", reqID)
            
            next.ServeHTTP(wrapped, r.WithContext(ctx))
            
            logger.Info("mcp request",
                "request_id", reqID,
                "method", r.Method,
                "path", r.URL.Path,
                "status", wrapped.status,
                "duration_ms", time.Since(start).Milliseconds(),
                "user_agent", r.UserAgent(),
            )
        })
    }
}

type responseWriter struct {
    http.ResponseWriter
    status int
}

func (rw *responseWriter) WriteHeader(code int) {
    rw.status = code
    rw.ResponseWriter.WriteHeader(code)
}

Authentication Middleware: Control Access

MCP servers typically use Bearer tokens for authentication. Your middleware should validate tokens and inject user context for downstream handlers:

type AuthService interface {
    ValidateToken(ctx context.Context, token string) (*User, error)
}

func WithAuth(auth AuthService) Middleware {
    return func(next http.Handler) http.Handler {
        return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
            token := extractBearerToken(r)
            if token == "" {
                writeJSONRPCError(w, -32001, "missing authorization")
                return
            }
            
            user, err := auth.ValidateToken(r.Context(), token)
            if err != nil {
                writeJSONRPCError(w, -32001, "invalid token")
                return
            }
            
            // Inject user into context (string key for brevity; prefer an
            // unexported typed key in production to avoid collisions)
            ctx := context.WithValue(r.Context(), "user", user)
            next.ServeHTTP(w, r.WithContext(ctx))
        })
    }
}

func extractBearerToken(r *http.Request) string {
    auth := r.Header.Get("Authorization")
    if !strings.HasPrefix(auth, "Bearer ") {
        return ""
    }
    return strings.TrimPrefix(auth, "Bearer ")
}

// writeJSONRPCError sends a JSON-RPC 2.0 error object with HTTP 401;
// codes in the -32000..-32099 range are reserved for server-defined errors
func writeJSONRPCError(w http.ResponseWriter, code int, msg string) {
    w.Header().Set("Content-Type", "application/json")
    w.WriteHeader(http.StatusUnauthorized)
    json.NewEncoder(w).Encode(map[string]interface{}{
        "jsonrpc": "2.0",
        "error":   map[string]interface{}{"code": code, "message": msg},
        "id":      nil,
    })
}

Rate Limiting: Protect Your Resources

AI agents can be chatty. Without rate limiting, a single runaway agent could exhaust your resources or rack up unexpected costs. Go's golang.org/x/time/rate package provides an efficient token bucket implementation:

import "golang.org/x/time/rate"

type RateLimiter struct {
    limiters sync.Map // map[string]*rate.Limiter
    rate     rate.Limit
    burst    int
}

func NewRateLimiter(requests int, per time.Duration) *RateLimiter {
    return &RateLimiter{
        rate:  rate.Limit(float64(requests) / per.Seconds()),
        burst: requests,
    }
}

func (rl *RateLimiter) getLimiter(key string) *rate.Limiter {
    // LoadOrStore avoids the race where two goroutines both miss the
    // lookup and create separate limiters for the same key
    v, _ := rl.limiters.LoadOrStore(key, rate.NewLimiter(rl.rate, rl.burst))
    return v.(*rate.Limiter)
}

func WithRateLimit(requests int, per time.Duration) Middleware {
    rl := NewRateLimiter(requests, per)
    
    return func(next http.Handler) http.Handler {
        return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
            // Rate limit by user ID if authenticated, otherwise by client IP
            // (net.SplitHostPort strips the port from RemoteAddr, so one
            // client maps to one key)
            var key string
            if user, ok := r.Context().Value("user").(*User); ok {
                key = "user:" + user.ID
            } else {
                host, _, err := net.SplitHostPort(r.RemoteAddr)
                if err != nil {
                    host = r.RemoteAddr
                }
                key = "ip:" + host
            }
            
            if !rl.getLimiter(key).Allow() {
                // 429 is the correct status here; writeJSONRPCError is
                // auth-specific and always sends 401
                w.Header().Set("Retry-After", "60")
                w.Header().Set("Content-Type", "application/json")
                w.WriteHeader(http.StatusTooManyRequests)
                json.NewEncoder(w).Encode(map[string]interface{}{
                    "jsonrpc": "2.0",
                    "error":   map[string]interface{}{"code": -32000, "message": "rate limit exceeded"},
                    "id":      nil,
                })
                return
            }
            
            next.ServeHTTP(w, r)
        })
    }
}

Putting It Together

Here's how to wire up your production MCP server with the complete middleware stack:

func main() {
    logger := slog.New(slog.NewJSONHandler(os.Stdout, nil))
    authService := NewAuthService(os.Getenv("JWT_SECRET"))
    
    mcpHandler := NewMCPHandler(/* your tools */)
    
    handler := Chain(mcpHandler,
        WithLogging(logger),
        WithAuth(authService),
        WithRateLimit(100, time.Minute),
    )
    
    server := &http.Server{
        Addr:         ":8080",
        Handler:      handler,
        ReadTimeout:  10 * time.Second,
        WriteTimeout: 30 * time.Second,
    }
    
    logger.Info("starting MCP server", "addr", server.Addr)
    if err := server.ListenAndServe(); err != nil {
        logger.Error("server exited", "error", err)
        os.Exit(1)
    }
}

Production Considerations

  • Order matters: Logging should be first to capture all requests, including rejected ones
  • Context propagation: Use context.Context to pass request-scoped data between middlewares
  • Graceful shutdown: Use server.Shutdown() to drain in-flight requests
  • Health checks: Exempt /health endpoints from auth and rate limiting
  • Metrics: Add Prometheus middleware to track latencies and error rates

Build Production MCP Servers Faster

Focus on your AI agent's logic, not infrastructure. CodeMem handles authentication, rate limiting, and persistence out of the box—so you can ship memory-powered agents today.

Start Building for Free →