    k8s-autoscaling

    Configure Kubernetes autoscaling with HPA, VPA, and KEDA.

    By @rohitg00
    SKILL.md
    ---
    name: k8s-autoscaling
    description: Configure Kubernetes autoscaling with HPA, VPA, and KEDA. Use for horizontal/vertical pod autoscaling, event-driven scaling, and capacity management.
    ---
    
    # Kubernetes Autoscaling
    
    Configure and inspect autoscaling with HPA, VPA, and KEDA using kubectl-mcp-server tools.
    
    ## Quick Reference
    
    ### HPA (Horizontal Pod Autoscaler)
    
    Basic CPU-based scaling:
    ```yaml
    apiVersion: autoscaling/v2
    kind: HorizontalPodAutoscaler
    metadata:
      name: my-app-hpa
    spec:
      scaleTargetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: my-app
      minReplicas: 2
      maxReplicas: 10
      metrics:
      - type: Resource
        resource:
          name: cpu
          target:
            type: Utilization
            averageUtilization: 70
    ```
    
    Apply and verify:
    ```
    apply_manifest(hpa_yaml, namespace)
    get_hpa(namespace)
    ```
    
    ### VPA (Vertical Pod Autoscaler)
    
    Right-size resource requests:
    ```yaml
    apiVersion: autoscaling.k8s.io/v1
    kind: VerticalPodAutoscaler
    metadata:
      name: my-app-vpa
    spec:
      targetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: my-app
      updatePolicy:
        updateMode: "Auto"  # VPA may evict running pods to apply new requests
    ```
    
    ## KEDA (Event-Driven Autoscaling)
    
    ### Detect KEDA Installation
    ```
    keda_detect_tool()
    ```
    
    ### List ScaledObjects
    ```
    keda_scaledobjects_list_tool(namespace)
    keda_scaledobject_get_tool(name, namespace)
    ```
    
    ### List ScaledJobs
    ```
    keda_scaledjobs_list_tool(namespace)
    ```
    
    ### Trigger Authentication
    ```
    keda_triggerauths_list_tool(namespace)
    keda_triggerauth_get_tool(name, namespace)
    ```
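
    For orientation, the object these tools list could look like the sketch below. It assumes credential-based authentication for an AWS scaler; the names `aws-credentials` and `aws-secret` and the Secret keys are hypothetical placeholders.

    ```yaml
    # Sketch only: maps keys of an existing Kubernetes Secret to scaler parameters.
    apiVersion: keda.sh/v1alpha1
    kind: TriggerAuthentication
    metadata:
      name: aws-credentials          # hypothetical name
    spec:
      secretTargetRef:
      - parameter: awsAccessKeyID    # parameter expected by KEDA's AWS scalers
        name: aws-secret             # hypothetical Secret in the same namespace
        key: AWS_ACCESS_KEY_ID
      - parameter: awsSecretAccessKey
        name: aws-secret
        key: AWS_SECRET_ACCESS_KEY
    ```

    A trigger refers to it by adding `authenticationRef` with `name: aws-credentials` next to its `metadata` block.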
    
    ### KEDA-Managed HPAs
    ```
    keda_hpa_list_tool(namespace)
    ```
    
    See [KEDA-TRIGGERS.md](KEDA-TRIGGERS.md) for trigger configurations.
    
    ## Common KEDA Triggers
    
    ### Queue-Based Scaling (AWS SQS)
    ```yaml
    apiVersion: keda.sh/v1alpha1
    kind: ScaledObject
    metadata:
      name: sqs-scaler
    spec:
      scaleTargetRef:
        name: queue-processor
      minReplicaCount: 0  # Scale to zero!
      maxReplicaCount: 100
      triggers:
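      - type: aws-sqs-queue
        metadata:
          queueURL: https://sqs.region.amazonaws.com/...
          queueLength: "5"     # target number of messages per replica
          awsRegion: region    # required by the scaler; placeholder matching the URL above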
    ```
    
    ### Cron-Based Scaling
    ```yaml
    triggers:
    - type: cron
      metadata:
        timezone: America/New_York
        start: 0 8 * * 1-5   # 8 AM weekdays
        end: 0 18 * * 1-5    # 6 PM weekdays
        desiredReplicas: "10"
    ```
    
    ### Prometheus Metrics
    ```yaml
    triggers:
    - type: prometheus
      metadata:
        serverAddress: http://prometheus:9090
        metricName: http_requests_total  # deprecated in recent KEDA releases; can be omitted there
        query: sum(rate(http_requests_total{app="myapp"}[2m]))
        threshold: "100"
    ```
    
    ## Scaling Strategies
    
    | Strategy | Tool | Use Case |
    |----------|------|----------|
    | CPU/Memory | HPA | Steady traffic patterns |
    | Custom metrics | HPA v2 | Business metrics |
    | Event-driven | KEDA | Queue processing, cron |
    | Vertical | VPA | Right-size requests |
    | Scale to zero | KEDA | Cost savings, idle workloads |
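
    As an illustration of the "Custom metrics" row, the sketch below extends the basic HPA with a per-pod metric. It assumes a custom metrics adapter (for example prometheus-adapter) is installed and exposes a metric named `http_requests_per_second`; that metric name and the target value are hypothetical.

    ```yaml
    apiVersion: autoscaling/v2
    kind: HorizontalPodAutoscaler
    metadata:
      name: my-app-hpa
    spec:
      scaleTargetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: my-app
      minReplicas: 2
      maxReplicas: 10
      metrics:
      # Resource metric, served by metrics-server
      - type: Resource
        resource:
          name: cpu
          target:
            type: Utilization
            averageUtilization: 70
      # Per-pod custom metric, served by a custom metrics adapter (assumed)
      - type: Pods
        pods:
          metric:
            name: http_requests_per_second   # hypothetical metric name
          target:
            type: AverageValue
            averageValue: "100"
    ```

    When several metrics are listed, the HPA computes a desired replica count for each and uses the largest.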
    
    ## Cost-Optimized Autoscaling
    
    ### Scale to Zero with KEDA
    Reduce costs for idle workloads:
    ```
    keda_scaledobjects_list_tool(namespace)
    # ScaledObjects with minReplicaCount: 0 can scale to zero
    ```
    
    ### Right-Size with VPA
    Get recommendations and apply:
    ```
    get_resource_recommendations(namespace)
    # Apply VPA recommendations
    ```
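
    One cautious way to collect recommendations before letting VPA change anything is to run it in `Off` mode. The sketch below assumes the same `my-app` Deployment as the Quick Reference; the `minAllowed` and `maxAllowed` bounds are illustrative values, not recommendations.

    ```yaml
    apiVersion: autoscaling.k8s.io/v1
    kind: VerticalPodAutoscaler
    metadata:
      name: my-app-vpa
    spec:
      targetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: my-app
      updatePolicy:
        updateMode: "Off"        # compute recommendations only, never evict pods
      resourcePolicy:
        containerPolicies:
        - containerName: "*"     # apply the bounds to every container
          minAllowed:
            cpu: 100m
            memory: 128Mi
          maxAllowed:
            cpu: "2"
            memory: 2Gi
    ```

    The recommendations then appear under `status.recommendation.containerRecommendations` of the VPA object. Avoid combining VPA with an HPA that scales on the same CPU or memory metric.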
    
    ### Predictive Scaling
    Use cron triggers for known patterns:
    ```yaml
    # Scale up before traffic spike
    triggers:
    - type: cron
      metadata:
        timezone: America/New_York  # required by the cron scaler; adjust to your zone
        start: 0 7 * * *  # 7 AM
        end: 0 9 * * *    # 9 AM
        desiredReplicas: "20"
    ```
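
    The fragment above only shows the trigger. A complete ScaledObject that pairs the schedule with a CPU trigger might look like the following sketch; the object name, target name, replica bounds, and timezone are assumptions for illustration.

    ```yaml
    apiVersion: keda.sh/v1alpha1
    kind: ScaledObject
    metadata:
      name: morning-peak-scaler      # hypothetical name
    spec:
      scaleTargetRef:
        name: my-app                 # Deployment to scale (assumed)
      minReplicaCount: 2             # baseline outside the cron window
      maxReplicaCount: 30            # must be at least desiredReplicas
      triggers:
      # Hold 20 replicas during the expected spike
      - type: cron
        metadata:
          timezone: America/New_York # assumed; any IANA time zone name
          start: 0 7 * * *
          end: 0 9 * * *
          desiredReplicas: "20"
      # Keep reacting to real load at all times
      - type: cpu
        metricType: Utilization
        metadata:
          value: "70"
    ```

    KEDA feeds both triggers into one HPA, which follows whichever trigger asks for more replicas, so load above the scheduled level can still scale the workload further.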
    
    ## Multi-Cluster Autoscaling
    
    Inspect KEDA ScaledObjects in each cluster by passing its context:
    ```
    keda_scaledobjects_list_tool(namespace, context="production")
    keda_scaledobjects_list_tool(namespace, context="staging")
    ```
    
    ## Troubleshooting
    
    ### HPA Not Scaling
    ```
    get_hpa(namespace)
    get_pod_metrics(name, namespace)  # Metrics available?
    describe_pod(name, namespace)     # Resource requests set?
    ```
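
    If `describe_pod` shows no requests, a CPU utilization target has nothing to divide by. A minimal Deployment sketch with requests set is shown below; the image reference and the request sizes are hypothetical.

    ```yaml
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: my-app
    spec:
      selector:
        matchLabels:
          app: my-app
      template:
        metadata:
          labels:
            app: my-app
        spec:
          containers:
          - name: my-app
            image: registry.example.com/my-app:1.0   # hypothetical image
            resources:
              requests:
                cpu: 250m        # HPA utilization = usage / this request
                memory: 256Mi
              limits:
                memory: 512Mi
    ```

    With a 250m request and a 70% target, the HPA starts adding replicas once average CPU usage per pod exceeds roughly 175m.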
    
    ### KEDA Not Triggering
    ```
    keda_scaledobject_get_tool(name, namespace)  # Check status
    get_events(namespace)                        # Check events
    ```
    
    ### Common Issues
    
    | Symptom | Check | Resolution |
    |---------|-------|------------|
    | HPA targets show `<unknown>` | Metrics server | Install metrics-server and set resource requests |
    | KEDA not scaling | Trigger auth | Verify the TriggerAuthentication and the secrets it references |
    | VPA not updating pods | Update mode | Set `updateMode: "Auto"` |
    | Scale-down is slow | Stabilization | Lower `behavior.scaleDown.stabilizationWindowSeconds` |
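
    For the slow scale-down row, the stabilization window lives under `behavior` in an `autoscaling/v2` HPA. The fragment below is a sketch with illustrative values rather than tuned settings.

    ```yaml
    # Fragment of an autoscaling/v2 HorizontalPodAutoscaler spec
    spec:
      behavior:
        scaleDown:
          stabilizationWindowSeconds: 120   # default is 300
          policies:
          - type: Percent
            value: 50                       # remove at most 50% of replicas...
            periodSeconds: 60               # ...per 60-second period
        scaleUp:
          stabilizationWindowSeconds: 0     # default: scale up immediately
    ```

    A shorter window makes scale-down faster but raises the risk of flapping when load oscillates.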
    
    ## Best Practices
    
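    1. **Always Set Resource Requests**
       - HPA computes utilization as a percentage of each container's resource requests, so pods without requests report no utilization

    2. **Use Multiple Metrics**
       - Combine CPU with custom metrics; HPA scales to the highest replica count that any metric proposes

    3. **Stabilization Windows**
       - Prevent flapping with `behavior.scaleDown.stabilizationWindowSeconds`

    4. **Scale to Zero Carefully**
       - Consider cold start time
       - Use an activation threshold so a weak signal does not wake the workload from zero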
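
    The scale-to-zero advice can be sketched as a ScaledObject that separates the activation level from the scaling target. The object name, replica bounds, and threshold values below are assumptions; the Prometheus address and query reuse the earlier trigger example.

    ```yaml
    apiVersion: keda.sh/v1alpha1
    kind: ScaledObject
    metadata:
      name: myapp-scaler             # hypothetical name
    spec:
      scaleTargetRef:
        name: myapp                  # Deployment to scale (assumed)
      minReplicaCount: 0             # allow scale to zero
      maxReplicaCount: 20
      cooldownPeriod: 300            # seconds after the last active trigger before returning to zero
      triggers:
      - type: prometheus
        metadata:
          serverAddress: http://prometheus:9090
          query: sum(rate(http_requests_total{app="myapp"}[2m]))
          threshold: "100"           # target value per replica once active
          activationThreshold: "5"   # stay at zero until the query result exceeds this
    ```

    A longer `cooldownPeriod` keeps pods warm between bursts, which trades some cost for fewer cold starts.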
    
    ## Related Skills
    - [k8s-cost](../k8s-cost/SKILL.md) - Cost optimization
    - [k8s-troubleshoot](../k8s-troubleshoot/SKILL.md) - Debug scaling issues