Go Patterns for MCP Server Middleware
Learn how to build production-ready MCP servers in Go using middleware patterns for logging, authentication, and rate limiting.
Why Middleware Matters for MCP Servers
Building an MCP server is one thing—running it in production is another. When AI agents start hammering your server with requests, you need logging to debug issues, authentication to prevent abuse, and rate limiting to keep costs under control.
Go's composable HTTP handler pattern makes middleware a natural fit. Instead of polluting your business logic with cross-cutting concerns, you wrap handlers in layers that each handle one responsibility. Let's build a production-ready middleware stack for MCP servers.
The Middleware Pattern in Go
In Go, middleware is simply a function that takes an http.Handler and returns a new http.Handler. This allows you to chain behaviors:
```go
// Middleware is a function that wraps an http.Handler.
type Middleware func(http.Handler) http.Handler

// Chain applies middlewares in order: the first middleware becomes the
// outermost wrapper, so it runs first on the way in.
func Chain(h http.Handler, middlewares ...Middleware) http.Handler {
	for i := len(middlewares) - 1; i >= 0; i-- {
		h = middlewares[i](h)
	}
	return h
}

// Usage
handler := Chain(mcpHandler,
	WithLogging(logger),
	WithAuth(authService),
	WithRateLimit(100, time.Minute),
)
```
With this foundation, requests flow through logging, then auth, then rate limiting, before reaching your MCP handler. Responses flow back through the same layers in reverse order.
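To see that onion ordering concretely, here's a minimal, runnable sketch. The `tag` helper and the recorded event names are illustrative only, not part of the server code:

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// Middleware wraps an http.Handler, as defined above.
type Middleware func(http.Handler) http.Handler

func Chain(h http.Handler, middlewares ...Middleware) http.Handler {
	for i := len(middlewares) - 1; i >= 0; i-- {
		h = middlewares[i](h)
	}
	return h
}

// events records the order in which each layer runs.
var events []string

// tag is a toy middleware that records when it enters and leaves.
func tag(name string) Middleware {
	return func(next http.Handler) http.Handler {
		return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			events = append(events, "enter "+name)
			next.ServeHTTP(w, r)
			events = append(events, "leave "+name)
		})
	}
}

func main() {
	inner := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		events = append(events, "handler")
	})
	h := Chain(inner, tag("logging"), tag("auth"), tag("ratelimit"))
	h.ServeHTTP(httptest.NewRecorder(), httptest.NewRequest("POST", "/", nil))
	for _, e := range events {
		fmt.Println(e)
	}
}
```

Running this prints `enter logging`, `enter auth`, `enter ratelimit`, `handler`, then the `leave` lines in reverse.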
Logging Middleware: Know What's Happening
Structured logging is essential for debugging MCP interactions. You want to capture the method being called, timing information, and any errors—without logging sensitive data like API keys or memory contents.
```go
func WithLogging(logger *slog.Logger) Middleware {
	return func(next http.Handler) http.Handler {
		return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			start := time.Now()

			// Wrap the response writer to capture the status code.
			wrapped := &responseWriter{ResponseWriter: w, status: http.StatusOK}

			// Extract the request ID for tracing, generating a short one
			// (github.com/google/uuid) if the client didn't send any.
			reqID := r.Header.Get("X-Request-ID")
			if reqID == "" {
				reqID = uuid.NewString()[:8]
			}

			// Add it to the context for downstream use. In production,
			// prefer an unexported typed key over a bare string to avoid
			// collisions with other packages.
			ctx := context.WithValue(r.Context(), "request_id", reqID)
			next.ServeHTTP(wrapped, r.WithContext(ctx))

			logger.Info("mcp request",
				"request_id", reqID,
				"method", r.Method,
				"path", r.URL.Path,
				"status", wrapped.status,
				"duration_ms", time.Since(start).Milliseconds(),
				"user_agent", r.UserAgent(),
			)
		})
	}
}
```
```go
type responseWriter struct {
	http.ResponseWriter
	status int
}

func (rw *responseWriter) WriteHeader(code int) {
	rw.status = code
	rw.ResponseWriter.WriteHeader(code)
}
```
Authentication Middleware: Control Access
MCP servers typically use Bearer tokens for authentication. Your middleware should validate tokens and inject user context for downstream handlers:
```go
type AuthService interface {
	ValidateToken(ctx context.Context, token string) (*User, error)
}

func WithAuth(auth AuthService) Middleware {
	return func(next http.Handler) http.Handler {
		return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			token := extractBearerToken(r)
			if token == "" {
				writeJSONRPCError(w, -32001, "missing authorization")
				return
			}

			user, err := auth.ValidateToken(r.Context(), token)
			if err != nil {
				writeJSONRPCError(w, -32001, "invalid token")
				return
			}

			// Inject the user into the context. The "user" key must match
			// whatever downstream middleware (e.g. rate limiting) reads.
			ctx := context.WithValue(r.Context(), "user", user)
			next.ServeHTTP(w, r.WithContext(ctx))
		})
	}
}

func extractBearerToken(r *http.Request) string {
	auth := r.Header.Get("Authorization")
	if !strings.HasPrefix(auth, "Bearer ") {
		return ""
	}
	return strings.TrimPrefix(auth, "Bearer ")
}
```
```go
// writeJSONRPCError sends a JSON-RPC 2.0 error object with a 401 status.
func writeJSONRPCError(w http.ResponseWriter, code int, msg string) {
	w.Header().Set("Content-Type", "application/json")
	w.WriteHeader(http.StatusUnauthorized)
	json.NewEncoder(w).Encode(map[string]interface{}{
		"jsonrpc": "2.0",
		"error":   map[string]interface{}{"code": code, "message": msg},
		"id":      nil,
	})
}
```
Rate Limiting: Protect Your Resources
AI agents can be chatty. Without rate limiting, a single runaway agent could exhaust your resources or rack up unexpected costs. Go's golang.org/x/time/rate package provides an efficient token bucket implementation:
```go
import "golang.org/x/time/rate"

type RateLimiter struct {
	limiters sync.Map // map[string]*rate.Limiter
	rate     rate.Limit
	burst    int
}

func NewRateLimiter(requests int, per time.Duration) *RateLimiter {
	return &RateLimiter{
		rate:  rate.Limit(float64(requests) / per.Seconds()),
		burst: requests,
	}
}

// getLimiter returns the per-key limiter, creating it on first use.
// LoadOrStore avoids the race where two concurrent requests for the same
// key each create their own limiter. Note that the map grows with the
// number of distinct keys; long-lived servers should evict stale entries.
func (rl *RateLimiter) getLimiter(key string) *rate.Limiter {
	v, _ := rl.limiters.LoadOrStore(key, rate.NewLimiter(rl.rate, rl.burst))
	return v.(*rate.Limiter)
}
```
```go
func WithRateLimit(requests int, per time.Duration) Middleware {
	rl := NewRateLimiter(requests, per)
	return func(next http.Handler) http.Handler {
		return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			// Rate limit by user ID if authenticated, otherwise by IP.
			var key string
			if user, ok := r.Context().Value("user").(*User); ok {
				key = "user:" + user.ID
			} else {
				// RemoteAddr includes the port; strip it so all
				// connections from one client share a bucket.
				host, _, err := net.SplitHostPort(r.RemoteAddr)
				if err != nil {
					host = r.RemoteAddr
				}
				key = "ip:" + host
			}

			if !rl.getLimiter(key).Allow() {
				// Respond 429 here; the 401-only writeJSONRPCError
				// helper is wrong for rate-limit rejections.
				w.Header().Set("Retry-After", "60")
				w.Header().Set("Content-Type", "application/json")
				w.WriteHeader(http.StatusTooManyRequests)
				json.NewEncoder(w).Encode(map[string]interface{}{
					"jsonrpc": "2.0",
					"error":   map[string]interface{}{"code": -32000, "message": "rate limit exceeded"},
					"id":      nil,
				})
				return
			}

			next.ServeHTTP(w, r)
		})
	}
}
```
Putting It Together
Here's how to wire up your production MCP server with the complete middleware stack:
```go
func main() {
	logger := slog.New(slog.NewJSONHandler(os.Stdout, nil))
	authService := NewAuthService(os.Getenv("JWT_SECRET"))
	mcpHandler := NewMCPHandler(/* your tools */)

	handler := Chain(mcpHandler,
		WithLogging(logger),
		WithAuth(authService),
		WithRateLimit(100, time.Minute),
	)

	server := &http.Server{
		Addr:         ":8080",
		Handler:      handler,
		ReadTimeout:  10 * time.Second,
		WriteTimeout: 30 * time.Second,
	}

	logger.Info("starting MCP server", "addr", server.Addr)
	log.Fatal(server.ListenAndServe())
}
```
Production Considerations
- Order matters: logging should be the outermost middleware so it captures every request, including ones rejected by auth or rate limiting
- Context propagation: use `context.Context` to pass request-scoped data between middlewares
- Graceful shutdown: use `server.Shutdown()` to drain in-flight requests
- Health checks: exempt `/health` endpoints from auth and rate limiting
- Metrics: add Prometheus middleware to track latencies and error rates
Build Production MCP Servers Faster
Focus on your AI agent's logic, not infrastructure. CodeMem handles authentication, rate limiting, and persistence out of the box—so you can ship memory-powered agents today.
Start Building for Free →