MCP in Go: Production Readiness Checklist
The complete checklist for deploying MCP servers in Go to production. Security, observability, reliability, and performance—everything you need before going live.
You've built your MCP server in Go. It works on your machine. Now comes the hard part: running it in production where real AI agents depend on it 24/7. This checklist covers everything you need to verify before deploying—from security hardening to graceful shutdown.
Print this out. Check every box. Your future self (and your on-call rotation) will thank you.
🔒 Security Checklist
- ☐ TLS Enabled: All traffic encrypted. Use `http.ListenAndServeTLS()` or terminate TLS at your load balancer.
- ☐ Authentication Required: Every request validated with Bearer tokens or API keys. No anonymous access to tools.
- ☐ Secrets Management: No hardcoded credentials. Use environment variables or a secrets manager like Vault.
- ☐ Input Validation: All tool inputs sanitized. Never trust data from AI agents—they can be manipulated.
- ☐ Rate Limiting: Per-user and per-IP limits configured. Protects against runaway agents and abuse.
- ☐ CORS Configured: If browser-accessible, whitelist only trusted origins.
```go
// Security essentials in your main.go
server := &http.Server{
	Addr:         ":8080",
	Handler:      rateLimitMiddleware(authMiddleware(mcpHandler)),
	ReadTimeout:  10 * time.Second, // prevent slowloris attacks
	WriteTimeout: 30 * time.Second,
	IdleTimeout:  60 * time.Second,
	TLSConfig: &tls.Config{
		MinVersion: tls.VersionTLS12,
	},
}
```

📊 Observability Checklist
- ☐ Structured Logging: Use `slog` with JSON output. Include request IDs for tracing.
- ☐ Metrics Exposed: Prometheus endpoint at `/metrics`. Track request count, latency percentiles, error rates.
- ☐ Health Endpoint: `/health` returns 200 when healthy, checks dependencies (DB, cache).
- ☐ Readiness Endpoint: `/ready` returns 200 only when ready to serve traffic.
- ☐ Distributed Tracing: OpenTelemetry configured if running multiple services.
- ☐ Alerting Rules: Alerts for error rate spikes, latency degradation, and service unavailability.
```go
// Essential metrics to track
var (
	requestsTotal = prometheus.NewCounterVec(
		prometheus.CounterOpts{Name: "mcp_requests_total"},
		[]string{"method", "status"},
	)
	requestDuration = prometheus.NewHistogramVec(
		prometheus.HistogramOpts{
			Name:    "mcp_request_duration_seconds",
			Buckets: []float64{.005, .01, .025, .05, .1, .25, .5, 1},
		},
		[]string{"method"},
	)
)
```

🛡️ Reliability Checklist
- ☐ Graceful Shutdown: Handle SIGTERM, drain connections before exit. Kubernetes gives 30s by default.
- ☐ Context Timeouts: All external calls (DB, APIs) have context deadlines. No infinite hangs.
- ☐ Retry Logic: Exponential backoff for transient failures. Use `github.com/cenkalti/backoff`.
- ☐ Circuit Breakers: Prevent cascading failures when dependencies are down.
- ☐ Connection Pooling: Database connections pooled and limited. Don't exhaust DB connections.
- ☐ Panic Recovery: Middleware catches panics, logs stack trace, returns 500 instead of crashing.
```go
// Graceful shutdown pattern
func main() {
	server := &http.Server{Addr: ":8080", Handler: handler}
	go func() {
		if err := server.ListenAndServe(); err != http.ErrServerClosed {
			log.Fatal(err)
		}
	}()
	quit := make(chan os.Signal, 1)
	signal.Notify(quit, syscall.SIGTERM, syscall.SIGINT)
	<-quit
	ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
	defer cancel()
	if err := server.Shutdown(ctx); err != nil {
		log.Printf("forced shutdown: %v", err)
	}
}
```

⚡ Performance Checklist
- ☐ Resource Limits: Memory and CPU limits set in container/deployment config.
- ☐ GOMAXPROCS: Set to container CPU limit with `go.uber.org/automaxprocs`.
- ☐ Connection Reuse: HTTP clients reuse connections. Don't create new clients per request.
- ☐ JSON Optimization: Use `github.com/json-iterator/go` for hot paths if needed.
- ☐ Load Tested: Benchmarked under expected load. Know your breaking point.
- ☐ Profiling Ready: pprof endpoints available (protected!) for production debugging.
🚀 Deployment Checklist
- ☐ Container Image: Multi-stage build, scratch/distroless base, non-root user.
- ☐ Replicas: At least 2 instances for high availability.
- ☐ Rolling Updates: Zero-downtime deployments configured. MaxUnavailable=0.
- ☐ Liveness Probe: Kubernetes knows when to restart unhealthy pods.
- ☐ Readiness Probe: Traffic only routed to ready pods.
- ☐ Rollback Plan: Know how to quickly revert to previous version.
```dockerfile
# Minimal production Dockerfile
FROM golang:1.22-alpine AS builder
WORKDIR /app
COPY go.* ./
RUN go mod download
COPY . .
RUN CGO_ENABLED=0 go build -ldflags="-s -w" -o server .

FROM gcr.io/distroless/static-debian12
COPY --from=builder /app/server /server
USER nonroot:nonroot
EXPOSE 8080
ENTRYPOINT ["/server"]
```

The Bottom Line
Production readiness isn't about perfection—it's about knowing your system's boundaries and having visibility when things go wrong. This checklist won't prevent every incident, but it will ensure you can detect problems quickly and recover gracefully.
Start with security and observability. Add reliability patterns as you scale. And always, always test your graceful shutdown—it's the one thing that bites everyone in production.
Skip the Checklist. Ship Today.
CodeMem handles the production complexity for you—authentication, rate limiting, observability, and high availability out of the box. Focus on building AI agents, not infrastructure.
Start Building for Free →