
Canary Deploy

Set up canary deployment with gradual rollout

Works with OpenClaude

You are a DevOps engineer setting up canary deployments. The user wants to implement gradual traffic shifting to a new version with automated rollback on error metrics.

What to check first

  • Kubernetes version is supported by your GitOps tooling (Flux with Flagger, or Argo CD)
  • Service mesh installed (Istio, Linkerd, or Consul) with traffic-splitting support (for example, Istio VirtualService)
  • Prometheus scraping metrics from your application (request latency, error rate, custom metrics)
  • Current deployment manifest with resource requests/limits defined
  • Load testing tool available (k6, locust, or Apache JMeter)

Steps

  1. Bootstrap Flux: flux bootstrap github --owner=YOUR_ORG --repo=YOUR_REPO --personal --path=clusters/my-cluster
  2. Install Flagger, which provides the Canary CRD and makes the automated promotion decisions; the Flux helm-controller can deploy its Helm chart, and the notification-controller handles alerting on Flux events
  3. Create a Canary resource defining traffic weights, analysis window, and success criteria thresholds
  4. Configure Prometheus queries for error rate, latency p99, and custom business metrics
  5. Set up alerts to trigger rollback if metrics exceed thresholds (e.g., error_rate > 5%)
  6. Run smoke tests against canary endpoint before proceeding to next weight increment
  7. Monitor canary metrics in real-time dashboard and validate before manual or automatic promotion
  8. Execute progressive rollout: 10% → 25% → 50% → 100% with 5-minute analysis windows between stages
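
Steps 4 and 5 can be sketched with a Flagger MetricTemplate, assuming Flagger is the controller that owns the Canary resource. The Prometheus address, the `http_requests_total` metric, and its `namespace`, `pod`, and `status` labels are assumptions about your setup, not something the source specifies; substitute whatever your application actually exports.

```yaml
# Hypothetical sketch: custom error-rate metric for Flagger's canary analysis.
apiVersion: flagger.app/v1beta1
kind: MetricTemplate
metadata:
  name: error-rate            # illustrative name
  namespace: canary-demo
spec:
  provider:
    type: prometheus
    address: http://prometheus.monitoring:9090   # assumed Prometheus service URL
  # Percentage of requests answered with a 5xx status over the analysis interval.
  # Flagger fills in {{ namespace }}, {{ target }}, and {{ interval }} at query time.
  # The pod regex matches canary pods (<target>-<hash>-<hash>) but not <target>-primary pods.
  query: |
    100 * sum(
      rate(
        http_requests_total{
          namespace="{{ namespace }}",
          pod=~"{{ target }}-[0-9a-zA-Z]+(-[0-9a-zA-Z]+)",
          status=~"5.."
        }[{{ interval }}]
      )
    )
    /
    sum(
      rate(
        http_requests_total{
          namespace="{{ namespace }}",
          pod=~"{{ target }}-[0-9a-zA-Z]+(-[0-9a-zA-Z]+)"
        }[{{ interval }}]
      )
    )
```

A Canary would reference this template under `analysis.metrics` using `templateRef` together with `thresholdRange: max: 5`, so an error rate above 5% counts as a failed check.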

Code

# fluxcd-canary-deployment.yaml
---
apiVersion: v1
kind: Namespace
metadata:
  name: canary-demo
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app-stable
  namespace: canary-demo
spec:
  replicas: 3
  selector:
    matchLabels:
      app: myapp
      version: stable
  template:
    metadata:
      labels:
        app: myapp
        version: stable
    spec:
      containers:
      - name: myapp
        image: myregistry.azurecr.io/myapp:v1.0.0
        resources:
          requests:
            memory: "64Mi"
            cpu: "100m"
          limits:
            memory: "128Mi"
            cpu: "200m"
        livenessProbe:
          httpGet:
            path: /health
            port: 8080
          initialDelaySeconds: 30
          periodSeconds: 10
---
apiVersion: flagger.app/v1beta1
kind: Canary
metadata:
  name: myapp-canary
  namespace: canary-demo
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: app-stable
  skipAnalysis: false
  progressDeadlineSeconds: 600
  service:

Note: this example was truncated in the source. See the GitHub repo for the latest full version.
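
Because the manifest above stops at `service:`, the following is a separate, illustrative sketch rather than a reconstruction of the missing text. It shows how a Flagger Canary might express the rollout described in the steps (10%, 25%, 50%, then promotion, with 5-minute analysis intervals). The port number is taken from the liveness probe; the threshold values and the choice of Flagger's built-in `request-success-rate` and `request-duration` metrics are assumptions.

```yaml
# Illustrative sketch only; values marked "assumed" are not from the source.
apiVersion: flagger.app/v1beta1
kind: Canary
metadata:
  name: myapp-canary
  namespace: canary-demo
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: app-stable
  progressDeadlineSeconds: 600
  service:
    port: 8080                 # assumed: matches the probe port in the Deployment
  analysis:
    interval: 5m               # time between checks and weight increments
    threshold: 3               # assumed: failed checks tolerated before rollback
    stepWeights: [10, 25, 50]  # canary traffic percentages; promotion follows the last step
    metrics:
    - name: request-success-rate   # built-in metric: percentage of non-5xx responses
      thresholdRange:
        min: 95                    # a success rate below 95% (error rate above 5%) fails the check
      interval: 1m
    - name: request-duration       # built-in metric: P99 latency in milliseconds
      thresholdRange:
        max: 500                   # assumed latency budget
      interval: 1m
```

With a configuration like this, Flagger would shift traffic back to the stable version once the number of failed checks reaches `threshold`.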

Common Pitfalls

  • Treating this skill as a one-shot solution — most workflows need iteration and verification
  • Skipping the verification steps — you don't know it worked until you measure
  • Applying this skill without understanding the underlying problem — read the related docs first

When NOT to Use This Skill

  • When a simpler manual approach would take less than 10 minutes
  • On critical production systems without testing in staging first
  • When you don't have permission or authorization to make these changes

How to Verify It Worked

  • Run the smoke tests from step 6 against the canary endpoint at each weight increment
  • Compare the canary's error rate and p99 latency against the stable version's baseline
  • Check logs for any warnings or errors — silent failures are the worst kind

Production Considerations

  • Test in staging before deploying to production
  • Have a rollback plan — every change should be reversible
  • Monitor the affected systems for at least 24 hours after the change

Quick Info

Difficulty: advanced
Version: 1.0.0
Author: Claude Skills Hub
Tags: devops, canary, deployment

Install command:

curl -o ~/.claude/skills/canary-deploy.md https://claude-skills-hub.vercel.app/skills/devops/canary-deploy.md
