ARC 0.14.0 silently fails to dispatch jobs on GitHub Enterprise Server 3.19 #4443

@Schwartz-Matthew-bah

Controller Version

0.14.0

Deployment Method

Helm

Checks

  • This isn't a question or user support case (For Q&A and community support, go to Discussions).
  • I've read the Changelog before submitting this issue and I'm sure it's not due to any recently-introduced backward-incompatible changes.

To Reproduce

  1. Deploy gha-runner-scale-set-controller and gha-runner-scale-set at version 0.14.0 against a GitHub Enterprise Server 3.19 instance.
  2. Configure one or more org-level scale sets with any minRunners/maxRunners configuration.
  3. Verify that listener pods start, create sessions, and begin long-polling for messages (all succeed).
  4. Trigger a workflow job targeting one of the scale sets via runs-on.
  5. Observe that the job remains queued indefinitely. The listener never receives the job assignment.

Rolling back to 0.13.1 on the same cluster, same GHES instance, same runner groups, and same configuration immediately restores job dispatch.

Describe the bug

After upgrading from 0.13.1 to 0.14.0, all ARC scale sets stop receiving job assignments from GitHub Enterprise Server 3.19. The failure is completely silent:

  • Listener pods start normally (no crashes, no errors)
  • Session creation with the GHES Actions pipeline service succeeds
  • Long-polling via GET /_apis/runtime/runnerscalesets/{id}/messages succeeds (HTTP 200)
  • Listener logs show assigned job: 0 on every poll cycle
  • Jobs queued on GHES show runner_group_name: null via the API — GHES never matches them to a runner group

There are no errors, warnings, or any indication of incompatibility in the controller or listener logs. The listeners appear to be functioning normally, but GHES never dispatches jobs to them.

This affects all scale sets — we tested with 11 scale sets across two GitHub orgs, including a brand-new scale set name that had never been registered before. None received jobs on 0.14.0; all received jobs immediately on 0.13.1.
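The `runner_group_name: null` symptom can be checked mechanically. The sketch below filters a jobs payload for queued jobs that have no runner group. It assumes the payload has the shape returned by the REST "list jobs for a workflow run" endpoint (a `jobs` array whose entries carry `status` and `runner_group_name`); the function name and sample data are illustrative only and do not come from our environment.

```python
from typing import Any


def unmatched_queued_jobs(jobs_payload: dict[str, Any]) -> list[dict[str, Any]]:
    """Return queued jobs that the server has not matched to any runner group.

    Assumes each job dict has "status" and "runner_group_name" keys,
    as in the jobs listing of the REST API.
    """
    return [
        job
        for job in jobs_payload.get("jobs", [])
        if job.get("status") == "queued" and job.get("runner_group_name") is None
    ]


# Illustrative payload, not real data from the affected instance.
sample = {
    "jobs": [
        {"id": 1, "name": "build", "status": "queued", "runner_group_name": None},
        {"id": 2, "name": "test", "status": "in_progress", "runner_group_name": "default"},
    ]
}

for job in unmatched_queued_jobs(sample):
    print(job["id"], job["name"])
```

On an affected instance, every job targeting an ARC scale set would be expected to show up in this list for as long as it stays queued.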

Describe the expected behavior

0.14.0 should either:

  1. Work correctly with GHES 3.19 (same as 0.13.1), or
  2. Fail loudly at startup or session creation with a clear error indicating the GHES version is not supported, or
  3. Document the minimum required GHES version in the release notes and chart README
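To illustrate option 2, here is a rough sketch of what a fail-loud version gate could look like. It is not ARC code: the function names are made up, how the GHES version string is obtained is left out, and `ASSUMED_MIN_GHES` is a placeholder, since this issue does not establish which GHES version 0.14.0 actually requires.

```python
import re

# Placeholder only: the real minimum supported GHES version is unknown.
ASSUMED_MIN_GHES = (3, 20)


def parse_ghes_version(raw: str) -> tuple[int, int]:
    """Extract (major, minor) from a version string such as "3.19.1"."""
    match = re.match(r"^(\d+)\.(\d+)", raw.strip())
    if match is None:
        raise ValueError(f"unrecognised GHES version: {raw!r}")
    return int(match.group(1)), int(match.group(2))


def check_ghes_supported(raw: str, minimum: tuple[int, int] = ASSUMED_MIN_GHES) -> None:
    """Raise at startup instead of polling forever against an unsupported server."""
    version = parse_ghes_version(raw)
    if version < minimum:
        raise RuntimeError(
            f"GHES {version[0]}.{version[1]} is older than the assumed minimum "
            f"{minimum[0]}.{minimum[1]}; refusing to start the listener"
        )


try:
    check_ghes_supported("3.19.0")
except RuntimeError as err:
    print(f"startup aborted: {err}")
```

A check like this at startup or session creation would have turned hours of debugging into a single log line.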

Additional Context

Root cause hypothesis: 0.14.0 rewrote the client library used by both the controller and listener to communicate with the GHES Actions pipeline service.

The new scaleset client library appears to use a protocol or message format that GHES 3.19's Actions pipeline service does not fully support. The server accepts sessions and responds to poll requests, but never actually dispatches jobs through the new client.

GHES version details:

GHES Version | Bundled Runner Version
3.19 (our version) | v2.328.0
3.20 (latest) | v2.330.0

ARC 0.14.0 ships runner v2.332.0, which is newer than what either GHES 3.19 or 3.20 bundles. We have not yet tested whether GHES 3.20 resolves the issue.

What we tried before discovering the version was the problem:

  • Full cluster wipe and clean redeploy of all ARC resources (same result)
  • Deleting and re-registering all scale sets (same result)
  • Creating a brand-new scale set with a name never registered before (same result)
  • Deleting all CRDs and letting the controller recreate them (same result)

Only rolling back to 0.13.1 resolved the issue.

Environment:

  • Kubernetes: EKS 1.34
  • GHES: 3.19
  • Deployment: ArgoCD app-of-apps pattern
  • Scale sets: org-level registration, multiple runner groups

Controller Logs

Controller logs from 0.14.0 show normal operation — no errors or warnings. The controller successfully reconciles all resources.

Runner Pod Logs

No runner pods are ever created because jobs are never assigned. The listener logs below show the silent failure:

0.14.0 listener (broken) — jobs never assigned:

{"time":"2026-04-10T04:40:35Z","level":"INFO","source":{"function":"...scaler.(*Scaler).setDesiredWorkerState","file":"...scaler/scaler.go","line":250},"msg":"Calculated target runner count","component":"worker","assigned job":0,"decision":0,"min":0,"max":100,"currentRunnerCount":0}
{"time":"2026-04-10T04:40:35Z","level":"INFO","source":{"function":"...listener.(*Listener).Run","file":"...listener/listener.go","line":178},"msg":"Getting next message","component":"listener","lastMessageID":1}

0.13.1 listener (working) — job assigned within seconds:

{"severity":"info","ts":"2026-04-10T04:53:07Z","logger":"listener-app.listener","message":"Getting next message","lastMessageID":1}
{"severity":"info","ts":"2026-04-10T04:53:14Z","logger":"listener-app.listener","message":"Processing message","messageId":2,"messageType":"RunnerScaleSetJobMessages"}
{"severity":"info","ts":"2026-04-10T04:53:14Z","logger":"listener-app.listener","message":"New runner scale set statistics.","statistics":{"totalAvailableJobs":0,"totalAcquiredJobs":1,"totalAssignedJobs":1,"totalRunningJobs":0,"totalRegisteredRunners":0,"totalBusyRunners":0,"totalIdleRunners":0}}
{"severity":"info","ts":"2026-04-10T04:53:14Z","logger":"listener-app.listener","message":"Job assigned message received","jobId":""}
{"severity":"info","ts":"2026-04-10T04:53:15Z","logger":"listener-app.worker.kubernetesworker","message":"Calculated target runner count","assigned job":1,"decision":1,"min":0,"max":100,"currentRunnerCount":1,"jobsCompleted":0}
{"severity":"info","ts":"2026-04-10T04:53:15Z","logger":"listener-app.worker.kubernetesworker","message":"Ephemeral runner set scaled.","namespace":"arc-runners","name":"<redacted>","replicas":1}

Note the completely different log format between versions — the 0.14.0 listener is a different binary built from the new scaleset library.
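Because the two versions log different field names (`msg`/`time` in 0.14.0, `message`/`ts` in 0.13.1), comparing them by eye is tedious. A small sketch like the following can pull the `assigned job` count out of either format. The function name is ours, and the sample lines are abbreviated copies of the log entries above.

```python
import json
from typing import Iterable


def assigned_job_counts(log_lines: Iterable[str]) -> list[tuple[str, int]]:
    """Return (timestamp, assigned job) pairs from listener logs of either version."""
    counts = []
    for line in log_lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip anything that is not a JSON log line
        if not isinstance(record, dict):
            continue
        # 0.14.0 uses "msg"/"time"; 0.13.1 uses "message"/"ts".
        message = record.get("msg") or record.get("message")
        if message == "Calculated target runner count" and "assigned job" in record:
            timestamp = record.get("time") or record.get("ts")
            counts.append((timestamp, record["assigned job"]))
    return counts


sample_lines = [
    '{"time":"2026-04-10T04:40:35Z","level":"INFO","msg":"Calculated target runner count","assigned job":0,"decision":0}',
    '{"time":"2026-04-10T04:40:35Z","level":"INFO","msg":"Getting next message","lastMessageID":1}',
    '{"severity":"info","ts":"2026-04-10T04:53:15Z","message":"Calculated target runner count","assigned job":1,"decision":1}',
]

for timestamp, count in assigned_job_counts(sample_lines):
    print(timestamp, count)
```

Feeding a full listener log through this makes the difference obvious: on 0.14.0 the count never leaves 0, while on 0.13.1 it rises within seconds of a job being queued.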
