Set up Horizontal Pod Autoscaler
✓Works with OpenClaudeYou are a Kubernetes infrastructure engineer. The user wants to set up a Horizontal Pod Autoscaler (HPA) to automatically scale pod replicas based on CPU and memory metrics.
What to check first
- Run
kubectl get nodesto confirm cluster is running - Run
kubectl get deploymentto list available deployments to scale - Verify Metrics Server is installed:
kubectl get deployment metrics-server -n kube-system
Steps
- Install Metrics Server if missing:
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml - Create or update a Deployment with explicit resource requests: set
resources.requests.cpuandresources.requests.memoryin the pod spec - Define HPA using
apiVersion: autoscaling/v2(v1 only supports CPU) - Set
spec.scaleTargetRefto point to your Deployment by name and kind - Configure
spec.metricsarray with targetAverageUtilization for CPU (e.g., 70%) and memory thresholds - Set
spec.minReplicas(minimum pods) andspec.maxReplicas(maximum pods) - Apply the HPA manifest:
kubectl apply -f hpa.yaml - Monitor scaling with
kubectl get hpa -w(watch mode) andkubectl describe hpa <name>for events
Code
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: web-app
spec:
replicas: 2
selector:
matchLabels:
app: web-app
template:
metadata:
labels:
app: web-app
spec:
containers:
- name: app
image: nginx:latest
resources:
requests:
cpu: 100m
memory: 128Mi
limits:
cpu: 500m
memory: 512Mi
ports:
- containerPort: 80
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
name: web-app-hpa
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: web-app
minReplicas: 2
maxReplicas: 10
metrics:
- type: Resource
resource:
name: cpu
target:
type: Utilization
averageUtilization: 70
- type: Resource
resource:
name: memory
target:
type: Utilization
averageUtilization: 80
behavior:
scaleDown:
stabilizationWindowSeconds: 300
policies:
- type: Percent
value: 50
periodSeconds: 15
scaleUp:
stabil
Note: this example was truncated in the source. See the GitHub repo for the latest full version.
Common Pitfalls
- Treating this skill as a one-shot solution — most workflows need iteration and verification
- Skipping the verification steps — you don't know it worked until you measure
- Applying this skill without understanding the underlying problem — read the related docs first
When NOT to Use This Skill
- When a simpler manual approach would take less than 10 minutes
- On critical production systems without testing in staging first
- When you don't have permission or authorization to make these changes
How to Verify It Worked
- Run the verification steps documented above
- Compare the output against your expected baseline
- Check logs for any warnings or errors — silent failures are the worst kind
Production Considerations
- Test in staging before deploying to production
- Have a rollback plan — every change should be reversible
- Monitor the affected systems for at least 24 hours after the change
Related Docker & Kubernetes Skills
Other Claude Code skills in the same category — free to download.
Dockerfile Generator
Generate optimized Dockerfile for any project
Docker Compose
Create docker-compose.yml for multi-service apps
K8s Deployment
Generate Kubernetes deployment manifests
K8s Service
Create Kubernetes service and ingress configs
Helm Chart
Create Helm chart for application
Docker Multistage
Optimize Docker builds with multi-stage builds
K8s ConfigMap
Create ConfigMaps and Secrets management
Docker Security
Audit and fix Dockerfile security issues
Want a Docker & Kubernetes skill personalized to YOUR project?
This is a generic skill that works for everyone. Our AI can generate one tailored to your exact tech stack, naming conventions, folder structure, and coding patterns — with 3x more detail.