GitLab Runner Autoscaling with Fleeting Plugin Prompt

You are a senior platform engineer who has migrated GitLab Runner autoscaling from the deprecated docker-machine to the modern fleeting plugin. You know AWS/GCP/Azure autoscaling group integration, capacity policies, and spot usage at scale. I will provide: - The cloud provider (AWS, GCP, Azure) - Current setup (legacy docker-machine, fleeting, or new install) - Workload patterns (peak/off-peak, job duration distribution, concurrency) - The goal: install / migrate / tune Your job: 1. **Understand the architecture**: - **Fleeting** is a plugin system; GitLab Runner uses it to spin up VM-based workers on demand - Plugins: `aws`, `gcp`, `azure`, `static`, `googlecloud` - Replaces docker-machine (deprecated since 2024) - Each "instance" is a full VM that runs jobs; can have its own Docker daemon, executor config 2. **Key concepts**: - **Capacity per worker** — concurrent jobs per VM (`use_static_credentials` controls) - **`max` per ASG/MIG/VMSS** — upper bound - **`max_use_count`** — how many jobs before a VM is replaced (mitigates state leakage) - **`idle.count` / `idle.time`** — minimum idle VMs and idle scale-down delay - **`min` / `max`** instances 3. **Migration from docker-machine**: - Install fleeting binary on runner host - Update `config.toml` from `[runners.machine]` to `[runners.autoscaler]` - Re-create ASG/MIG/VMSS with fleeting-friendly config - Cutover: drain old, switch traffic 4. **For AWS**: - `[runners.autoscaler.plugin]` = `aws` - Requires IAM permissions for EC2 and ASG operations - Launch template defines AMI, instance type, user-data - Spot supported via mixed instances policy 5. **For Kubernetes executor with autoscaler**: - Different: Kubernetes executor uses pods (no VM autoscaling needed at runner level) - K8s itself handles autoscaling (cluster autoscaler / Karpenter scales nodes for new pods) - Fleeting isn't needed in this case 6. **For capacity tuning**: - Peak concurrent jobs observed (P) → set `max = P × 1.2` - Average job duration → cost of leaving idle vs scale-up latency - For bursty workloads: keep `idle.count > 0` to absorb peaks - Spot reclaim: ensure jobs are tolerant (rerun on failure) 7. **For cost optimization**: - Spot/preemptible for non-critical CI - Larger instance types with higher capacity per worker - Aggressive scale-down for off-peak - Pre-pull common Docker images via user-data to speed first-job 8. **For monitoring**: - Fleeting plugin emits Prometheus metrics - Per-instance lifecycle events - Capacity utilization dashboard Mark DESTRUCTIVE: scaling down idle VMs aggressively (jobs queue waiting for cold-start), spot instances for long-running critical jobs (reclaim mid-job), modifying launch template while jobs running (next-replaced VM uses new config). --- Cloud provider: [AWS / GCP / Azure / K8s] Current setup: [docker-machine legacy / fleeting / new] Workload pattern: [DESCRIBE — peak/off-peak, concurrency] Goal: [install / migrate / tune cost / tune performance]

Why this prompt works

Runner autoscaling has changed significantly with fleeting. Many teams still run docker-machine and don’t realize it’s deprecated. This prompt walks the modern setup.

How to use it

Install fleeting on a runner host (not the workers).
Configure plugin per cloud.
Tune capacity based on observed peaks.
Monitor via Prometheus.

Useful commands

# Install fleeting plugin
sudo apt install gitlab-runner

# Install AWS plugin
sudo gitlab-runner exec fleeting-plugin-aws --version    # or similar; check docs

# Verify
gitlab-runner --version
gitlab-runner verify

# Plugin binaries (depending on OS package)
ls /usr/local/bin/fleeting-plugin-*

# Check pre-deprecation usage
grep -r "MachineDriver" /etc/gitlab-runner/config.toml

Config patterns

AWS fleeting

[[runners]]
  name = "aws-autoscaler"
  url = "https://gitlab.example.com"
  token = "TOKEN"
  executor = "docker-autoscaler"

  [runners.docker]
    image = "alpine:latest"

  [runners.autoscaler]
    plugin = "aws"
    capacity_per_instance = 4
    max_use_count = 10
    max_instances = 50

    [runners.autoscaler.plugin_config]
      name             = "gitlab-runner-pool"     # ASG name
      region           = "us-east-1"

    [[runners.autoscaler.policy]]
      idle_count = 2
      idle_time = "20m"
      periods = ["* * 9-17 * * mon-fri *"]   # peak hours

    [[runners.autoscaler.policy]]
      idle_count = 0
      idle_time = "5m"                        # off-peak: no idle

GCP fleeting

[[runners]]
  [runners.autoscaler]
    plugin = "googlecloud"
    capacity_per_instance = 4
    max_use_count = 10
    max_instances = 50

    [runners.autoscaler.plugin_config]
      project          = "my-project"
      instance_group   = "gitlab-runner-mig"
      zone             = "us-central1-a"

Static (no autoscaling, for fixed worker pool)

[[runners]]
  [runners.autoscaler]
    plugin = "static"
    capacity_per_instance = 1

    [runners.autoscaler.plugin_config]
      instances = ["host1.example.com:22", "host2.example.com:22"]

ASG / Launch Template (AWS)

# Terraform / CloudFormation for the launch template
LaunchTemplate:
  Type: AWS::EC2::LaunchTemplate
  Properties:
    LaunchTemplateName: gitlab-runner-workers
    LaunchTemplateData:
      ImageId: ami-XXX                  # Ubuntu 22.04 with Docker pre-installed
      InstanceType: m6a.large
      IamInstanceProfile:
        Name: gitlab-runner-worker
      UserData: !Base64 |
        #!/bin/bash
        # Pre-pull common images for faster first-job
        docker pull node:20-alpine
        docker pull python:3.12-slim
        # Configure runner OS

AutoScalingGroup:
  Type: AWS::AutoScaling::AutoScalingGroup
  Properties:
    AutoScalingGroupName: gitlab-runner-pool
    MinSize: 0
    MaxSize: 50
    DesiredCapacity: 2
    HealthCheckType: EC2
    LaunchTemplate:
      LaunchTemplateName: gitlab-runner-workers
      Version: $Latest
    MixedInstancesPolicy:
      LaunchTemplate:
        LaunchTemplateSpecification:
          LaunchTemplateName: gitlab-runner-workers
          Version: $Latest
        Overrides:
        - InstanceType: m6a.large
        - InstanceType: m5a.large
        - InstanceType: m6i.large
      InstancesDistribution:
        OnDemandBaseCapacity: 1               # always keep 1 on-demand
        OnDemandPercentageAboveBaseCapacity: 25
        SpotAllocationStrategy: capacity-optimized

Migration from docker-machine

# OLD: docker-machine
[runners.machine]
  IdleCount = 2
  MachineDriver = "amazonec2"
  MachineName = "gitlab-runner-%s"
  MachineOptions = ["amazonec2-region=us-east-1", ...]

# NEW: fleeting (replace with [runners.autoscaler])
[runners.autoscaler]
  plugin = "aws"
  capacity_per_instance = 4
  max_use_count = 10
  max_instances = 50
  [runners.autoscaler.plugin_config]
    name   = "gitlab-runner-pool"
    region = "us-east-1"
  [[runners.autoscaler.policy]]
    idle_count = 2

Common findings this catches

Jobs queue every Monday morning → idle.count too low for peak; add scheduled policy.
Spot reclaim killing jobs mid-run → use on-demand for critical, spot for retryable.
Cold-start every job → max_use_count: 1 is over-conservative; raise to 5-10.
Idle VMs costing money off-hours → time-window policy: zero idle outside business hours.
First-job-of-day slow due to image pull → pre-pull common images in user-data.
ASG max reached but jobs still queue → raise max; capacity plan.
Worker host failing health checks → user-data error; check CloudWatch / instance logs.

When to escalate

Cloud capacity unavailable (instance types out of stock in AZ) — coordinate with cloud team.
Spot pricing spike — fallback to on-demand or scale down.
Persistent autoscaler errors — engage fleeting plugin maintainers (open issue with logs).

GitLab Runner Autoscaling with Fleeting Plugin Prompt

Why this prompt works

How to use it

Useful commands

Config patterns

AWS fleeting

GCP fleeting

Static (no autoscaling, for fixed worker pool)

ASG / Launch Template (AWS)

Migration from docker-machine

Common findings this catches

When to escalate

Related prompts

GitLab Pipeline Audit & Slow Job Hunt Prompt

GitLab Runner Troubleshooting Prompt

Kubernetes Cluster Autoscaler / Karpenter Debug Prompt

Why this prompt works

How to use it

Useful commands

Config patterns

AWS fleeting

GCP fleeting

Static (no autoscaling, for fixed worker pool)

ASG / Launch Template (AWS)

Migration from docker-machine

Common findings this catches

When to escalate

Related prompts

GitLab Pipeline Audit & Slow Job Hunt Prompt

GitLab Runner Troubleshooting Prompt

Kubernetes Cluster Autoscaler / Karpenter Debug Prompt

Free: the DevOps AI Incident-Triage Cheat Sheet