Many teams hang a self-hosted GitHub Runner on Cloud Mac or a Mac mini and think "no queue, xcodebuild runs" is enough — the queue and TCO article covers that layer.
The real trap is the next layer: the shared workspace. Reusing DerivedData, global dependency caches, and one PAT for both CI and Agent is fine when only maintainers touch workflows. Once fork PRs, malicious post steps, and Claude Code / OpenHands auto-editing .github/workflows join the stack, a self-hosted runner without isolation is wide open: secrets from job A are still readable in job B.
The 2026 industry floor aligns here: one job, one workspace — exclusive directory per job, cleanup on exit, plus token rotation. This is Cloud Mac AI Stack · L1 part 3 (prereqs: ① execution engine · ② queue and trade-offs): where the CI/CD security trap comes from, then a copy-paste runbook. Series index at § L1 series.
Before you read · L1 series and Stack entry
L0 foundation: buy vs rent Cloud Mac · move AI workstation to cloud
L1 series (read in order): ① execution engine → ② queue and TCO → ③ this article · CI/CD security and one job, one workspace
Often in the same stack (L3–L5): Claude Code workflow · MCP setup · Ollama parallel scheduling · OpenClaw pipeline
Featured answer
Self-hosted runners without workspace isolation leave CI/CD wide open; the 2026 baseline is one job, one workspace + token rotation.
- Where traps come from: cross-job file residue, global cache poisoning, long-lived PATs on disk
- Industry baseline: isolated directory per job, if: always() cleanup + host prune
- Same stack as MCP least privilege and OpenHands: L4 governs Agent tokens; L1 governs Fact-layer disk boundaries
Compare: how teams used to configure vs 2026 baseline · where traps hide
Moving from hosted macos-latest to self-hosted, many teams only kept "no queue" (see L1 ②) and did not inherit the hosted "fresh disk per job". The spotlight table and diagrams below mark what felt "convenient" then and looks "wide open" now.
Yesterday's convenience = today's exposure — seven rows at a glance
| Dimension | Old common practice (circa 2024) | 2026 industry baseline | Trap consequence |
|---|---|---|---|
| Job working directory | Multiple jobs share _work, run dirs never deleted | One job, one workspace, cleared when job ends | Job B reads Job A's .env, implanted scripts |
| Build cache | Shared DerivedData / global actions/cache keys | Repo-scoped cache keys + periodic prune | Cache poisoning; fork PR scans global cache |
| Credentials | One PAT for CI + Agent; secrets rotated but disk not wiped | Separate CI / MCP tokens; 30–90 day rotation + workspace wipe | PAT copies on disk stay valid |
| Signing material | Temp keychain in $HOME, no post delete | Job-scoped keychain, destroyed with if: always() | Next job or malicious step exports certs |
| Who edits workflows | Default: 2–3 maintainers only | Maintainers + fork PRs + Agent editing CI | Attack surface grows from people to semi-autonomous processes |
| Runner segmentation | One Mac runs all repos, all PRs | Prod / staging / fork use different labels | Low-trust workflow touches prod signing env |
| Hosted vs self-hosted myth | "Self-hosted = my machine, optimize for speed" | Self-hosted = you draw the security boundary (hosted VM auto-isolates) | Assumed self-hosted is safer; actually more exposed |
Why could teams tolerate the old way? Private repo, no fork PRs, daily CI only from internal jobs — shared disk just saved 3–8 minutes. Why must it change in 2026? The same Cloud Mac often runs Agent, MCP, and multi-repo jobs — any old habit turns convenience into a security incident.
Old habit: shared workspace

    Same self-hosted runner (same macOS user)

    Job A · nightly signing
      │
      ├─► unpack *.mobileprovision, temp keychain
      ├─► write ~/Library/Developer/Xcode/DerivedData   ← felt like "speed"
      └─► PAT in ~/.netrc (lazy script)                 ← felt like "convenience"
      │
      │  not cleaned · _work and cache residue
      ▼
    Job B · daytime fork PR "doc fix"
      │
      └─► post step scans _work / DerivedData / .netrc  ← trap: reads Job A residue

    Result: self-hosted runner wide open (hosted macos-latest drops whole VM per job)

2026 baseline: one job, one workspace

    Job A ──► workspace A (_work/.../run-id) ──► if: always() cleanup ──► ✓
    Job B ──► workspace B (fresh directory)  ──► if: always() cleanup ──► ✓
      │
      ├─ sensitive cache: repo-prefixed cache keys, or cron prune
      ├─ PAT: separate CI and MCP tokens, rotated on schedule
      └─ high-trust / low-trust jobs: different runner labels (second Cloud Mac if needed)

    Result: auditable Fact layer; aligns with "Claude Code produces Diff, Runner produces Fact"
At a glance · old vs now
- Old trap: treated self-hosted as "a faster server" and shared everything shareable
- 2026 baseline: treat self-hosted as "an execution environment you must disinfect"
- Biggest myth: migrating from macos-latest means moving only the CPU; the isolation does not come with it
Where wide-open starts: three common CI/CD security traps on self-hosted runners
Hosted macos-latest gets a fresh disk each job, so many teams first self-hosting assume "my machine, optimize for convenience". The real cost of self-hosted: you draw the security boundary — the hidden bill beyond "no queue" in L1 part 2.
The three misconfigs below show up repeatedly on Cloud Mac deployments; hit any two and you should adopt one job, one workspace immediately.
Why 2026 pushes one job, one workspace everywhere
Not because GitHub shipped a new rule — the attack surface grew: workflows run from maintainers, fork PRs, supply-chain scripts, and AI Agents on the same host. Three reasons "shared disk for speed" no longer holds in 2026.
1. Cross-job contamination: job A secrets still readable in job B
Self-hosted _work does not auto-disinfect between jobs. Signing certs unpacked in job A, .env from scripts, .netrc from post steps — job B can read them via relative paths or symlinks. In 2026 the surface is larger: Claude Code and OpenHands can edit .github/workflows on the same machine — you audit not just the diff but what still sits on disk when CI changes.
2. Global cache poisoning: DerivedData / npm cache as a shared backdoor
Sharing ~/Library/Developer/Xcode/DerivedData or broad actions/cache keys may work in a closed internal repo; once fork PRs arrive, a malicious post step can scan global cache — hosted VMs are destroyed, self-hosted without cleanup is a persistent attack surface. Classic iOS failure: nightly signing job and daytime "doc fix" PR on the same runner.
3. Long-lived PAT on disk: rotating secrets in UI is not enough
In many Cloud Mac stacks one GitHub PAT serves both MCP repo access and Runner artifact push. If Agent exposes it, CI falls too. Rotating secrets in GitHub UI without wiping workspace is like changing the lock cylinder but leaving copies on disk.
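To make "copies on disk" concrete, here is a minimal sketch of a residue scan. It assumes GitHub's documented token prefixes (`ghp_` for classic PATs, `github_pat_` for fine-grained ones) and a loose length match; the function name `scan_token_residue` is hypothetical and not part of any GitHub tooling.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list files under a directory that appear to contain
# a GitHub token. Prints file names only, never the matching text.
scan_token_residue() {
  local root="$1"
  # -r recurse (hidden files included), -l file names only, -E extended regex.
  # The length bounds are deliberately loose; tighten them if you see noise.
  grep -rlE 'ghp_[A-Za-z0-9]{20,}|github_pat_[A-Za-z0-9_]{20,}' "$root" 2>/dev/null || true
}
```

Running `scan_token_residue "$HOME"` as the runner user after rotating a secret in the GitHub UI shows whether an old copy still sits in `.netrc`, a stray `.env`, or a leftover run directory.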
Stack roles · Fact layer cannot run wide open
Series slogan (from L1 opener): Claude Code produces Diff; GitHub Runner produces Fact. Pretty diffs mean nothing if Fact runs green in a dirty workspace. L4 MCP least privilege governs Agent tokens; L1 here governs disk and job boundaries — both layers, not either/or.
Align the model: what GitHub Actions leaves on a runner
Many equate "workspace" with the checked-out git tree — it is only part. After a macOS job, disk may still hold:
| Path / object | Typical contents | Risk if not cleaned |
|---|---|---|
| _work/<repo>/<run-id>/ | checkout, build artifacts, test output | next job reads generated files, malicious scripts outside source |
| ~/.npm, ~/Library/Caches | dependency and tool caches | cache poisoning, cross-repo dependency confusion |
| DerivedData, .swiftpm | Xcode / Swift build cache | symbol leakage, stale signing config embedded |
| Temp keychain, *.mobileprovision | signing material | high risk: next job or malicious step exports certs |
| env injection files, .netrc | credentials written by CI scripts | plaintext PAT persists |
Hosted runners discard the whole disk when a job ends; self-hosted does not. On an Agent stack it gets worse: Claude Code session files, Ollama weights, and Runner _work may share one user home (same-machine scheduling in L2 parallel scheduling) — one job, one workspace means: assume the environment is dirty from Agent or another job at start, and at end keep only layers that should cross jobs (usually none).
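The paths in the table above can be turned into a quick inventory. This is a sketch under the assumption that the runner user's home follows the default macOS layout; `list_residue` is a hypothetical name and the path list is only the subset named in this article.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print which well-known residue paths exist under a
# given home directory (relative paths, one per line).
list_residue() {
  local home="$1" rel
  for rel in .netrc .npm Library/Caches Library/Developer/Xcode/DerivedData; do
    if [ -e "$home/$rel" ]; then
      echo "$rel"
    fi
  done
}
```

Calling `list_residue "$HOME"` at the start of a job gives a cheap answer to "is this environment already dirty?" before any build step runs.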
Industry baseline in practice: what one job, one workspace means
Four layers, easy to hard:
- Directory isolation: each job uses its own RUNNER_TEMP / run directory; ban scripts writing to /tmp/shared or repo-external "team shared" folders.
- Process boundary: one runner process may run jobs serially, but must not share uncleaned global state between jobs (e.g. API keys exported into ~/.zshrc).
- Credential boundary: signing temp keychains deleted in job post; secrets only via env vars, never disk. If disk is required, the path must live inside the run dir and be deleted with the job.
- Ops boundary: high-risk repos get dedicated runners for trusted workflows only (label separation), physically apart from fork PR runners. On Cloud Mac that usually means a second node, not betting one Mac that cleanup scripts never fail.
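The directory and credential layers above can be sketched as a wrapper that gives one command a scratch directory and removes it no matter how the command ends. `run_in_job_dir` and the exported `JOB_DIR` variable are hypothetical names; the only runner-provided input assumed here is `RUNNER_TEMP`.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: run one command inside a scratch directory that
# exists only for the duration of that command.
run_in_job_dir() {
  # Prefer the runner's per-job temp dir; fall back to the system default.
  local base="${RUNNER_TEMP:-${TMPDIR:-/tmp}}"
  (
    job_dir="$(mktemp -d "${base%/}/job.XXXXXX")" || exit 1
    # The EXIT trap fires when the subshell ends, whether the command
    # succeeded or failed, so the directory is always removed.
    trap 'rm -rf "$job_dir"' EXIT
    export JOB_DIR="$job_dir"
    cd "$job_dir" || exit 1
    "$@"
    status=$?
    exit "$status"
  )
}
```

For example, `run_in_job_dir ./scripts/sign.sh` lets a (hypothetical) signing script write its temporary files under `$JOB_DIR`; nothing it writes there survives into the next job. Files written outside `$JOB_DIR` are not covered, which is why the host prune is still needed.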
Relation to ephemeral runners
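GitHub's ephemeral self-hosted runners (registered with the --ephemeral flag of config.sh) take a single job and are then deregistered automatically. Combined with a fresh VM or user environment per registration, that automates one job, one workspace. If you use a persistent runner (common on Cloud Mac), reach a similar effect with scripts + workflow conventions.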
Token rotation: why wiping directories is not enough
Workspace cleanup fixes file residue; token rotation ensures copied files expire even if exfiltrated. Rotate at least three credential types:
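- Runner registration: the registration token itself expires after one hour, but the credentials the runner stores on disk after configuration do not. Remove and re-register the runner (or rotate per org policy) so a stale registration on an old machine cannot be abused.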
- CI GitHub PAT / App: minimal scopes (read repo vs write packages), managed separately from MCP PAT policy — avoid one token for Agent + CI.
- Apple signing and third-party API keys: short-lived credentials or per-job injection from secrets; never write into runner home plist.
No silver-bullet cadence: private repo, no fork PRs, workflows editable only by maintainers — 90 days is often enough; with open contributors or Agent auto-submitting workflows, shorten to 30 days and after any incident immediately remove and re-register the runner.
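The 30 and 90 day cadences above are easy to enforce mechanically. The sketch below assumes you record each credential's issue time yourself as Unix epoch seconds (GitHub does not export it to the runner); `rotation_due` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed (exit 0) when a credential has reached its
# maximum allowed age. Times are Unix epoch seconds, which keeps the
# arithmetic portable between GNU date (Linux) and BSD date (macOS).
rotation_due() {
  local issued_epoch="$1" max_age_days="$2"
  local now_epoch="${3:-$(date +%s)}"   # optional third argument for testing
  local age_days=$(( (now_epoch - issued_epoch) / 86400 ))
  [ "$age_days" -ge "$max_age_days" ]
}
```

A cron job could call `rotation_due "$CI_PAT_ISSUED_AT" 30 && echo "rotate CI PAT"`, where `CI_PAT_ISSUED_AT` is a value you maintain in your own rotation calendar.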
Runbook: bake one job, one workspace into workflow
Paste these snippets into existing pipelines to make the baseline auditable. Assumes macOS self-hosted runner with default _work layout. If the same Cloud Mac also runs Claude Code, prefer a dedicated macOS user for Runner. MCP wiring: setup guide.
    # .github/workflows/ios-ci.yml snippet
    jobs:
      build:
        runs-on: [self-hosted, macos, cloud-mac]
        steps:
          - uses: actions/checkout@v4
          - name: Build iOS
            run: xcodebuild -scheme App -destination 'platform=iOS Simulator,name=iPhone 16' build
          - name: Scrub workspace (always)
            if: always()
            run: |
              # ":?" aborts the step if the variable is empty, so a missing
              # value can never expand to "rm -rf /*"
              rm -rf "${RUNNER_TEMP:?}"/*
              rm -rf "${GITHUB_WORKSPACE:?}"/build
              security delete-keychain ci_temp.keychain-db 2>/dev/null || true

Host-side prune script, run from cron:

    #!/usr/bin/env bash
    # /usr/local/bin/runner-prune-work.sh · daily 03:00 cron
    set -euo pipefail
    WORK_ROOT="${HOME}/actions-runner/_work"
    # Delete run dirs older than 48h (by mtime)
    find "$WORK_ROOT" -mindepth 3 -maxdepth 3 -type d -mtime +2 -exec rm -rf {} +
    # Optional: prune DerivedData entries older than 7 days
    find ~/Library/Developer/Xcode/DerivedData -mindepth 1 -maxdepth 1 -type d -mtime +7 -exec rm -rf {} + 2>/dev/null || true
Verification: in two consecutive jobs, print ls -la "$GITHUB_WORKSPACE/.." and key cache paths; confirm the second job cannot see marker files from the first (e.g. touch /tmp/job-marker-$GITHUB_RUN_ID in post and check for residue).
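The two-job marker check described above could look like the following sketch. The marker name follows the `job-marker-$GITHUB_RUN_ID` example in the text; the helper names `drop_marker` and `check_residue` are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for the two-job verification. The marker directory
# defaults to /tmp, matching the example path in the text.

# Call near the end of a job: leave a marker named after the run.
drop_marker() {
  local run_id="$1" dir="${2:-/tmp}"
  touch "$dir/job-marker-$run_id"
}

# Call at the start of the next job: print markers left by other runs
# and return non-zero if any exist.
check_residue() {
  local run_id="$1" dir="${2:-/tmp}" found=0 f
  for f in "$dir"/job-marker-*; do
    [ -e "$f" ] || continue                     # glob matched nothing
    if [ "$f" != "$dir/job-marker-$run_id" ]; then
      echo "residue: $f"
      found=1
    fi
  done
  [ "$found" -eq 0 ]
}
```

In a workflow, the first job would end with `drop_marker "$GITHUB_RUN_ID"` and the second would start with `check_residue "$GITHUB_RUN_ID"`. A failure means something outside the workspace survived between jobs and the host prune needs to cover that path.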
Common misconfiguration
Using actions/cache to cache unsigned third-party binaries under a global cache key without repo scope — that builds a cross-job, cross-repo shared layer on the runner. Either tighten cache keys and branches, or include the cache directory in prune scripts.
Cloud Mac co-deployment: Agent + Runner security boundary
Typical Cloud Mac AI Stack: Claude Code (Diff) + Runner (Fact) + optional Ollama on one host. Saves queue time and git pull, but sharing uncleanable global directories means wide-open CI drags Agent sessions into the blast radius:
- User separation: Runner under runner user, Agent under developer user; never mix ANTHROPIC_API_KEY and signing keys in one ~/.zshrc.
- Agent workspace ≠ CI workspace: Claude Code project dir must not point at Runner _work; Agent patches go through git, not direct writes to CI cache trees.
- Memory contention ≠ shared disk: Ollama vs Runner memory is parallel scheduling; high Swap is not an excuse to keep sharing DerivedData.
- Egress IP and labels: runners that reach internal staging must not also take fork PRs; Agent-submitted workflows hit low-privilege labels first, prod runner only after human promotion.
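The "Agent workspace ≠ CI workspace" rule above can be checked before an Agent session starts. The sketch below resolves symlinks so a project directory linked into `_work` is also caught; `is_inside` is a hypothetical helper and both arguments are assumed to be existing directories.

```shell
#!/usr/bin/env bash
# Hypothetical guard: succeed when `path` resolves to a location inside
# `root` (or is `root` itself). Returns 2 if either directory is missing.
is_inside() {
  local path root
  path="$(cd "$1" 2>/dev/null && pwd -P)" || return 2   # -P resolves symlinks
  root="$(cd "$2" 2>/dev/null && pwd -P)" || return 2
  case "$path/" in
    "$root"/*) return 0 ;;
    *) return 1 ;;
  esac
}
```

A launcher script could refuse to start the Agent when `is_inside "$PROJECT_DIR" "$HOME/actions-runner/_work"` succeeds, where `PROJECT_DIR` is whatever directory the Agent is about to open.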
When you can defer (and when you must act now)
| Scenario | Can defer? | Notes |
|---|---|---|
| Private monorepo, 2–3 maintainers only, no fork PRs | Short term yes | Still do monthly manual prune + quarterly token rotation |
| Open-source repo runs Actions on external PRs | No | Dedicated runner or return to hosted macOS |
| Claude Code / OpenHands / MCP writes to repo | No | Default one job, one workspace; ban shared sensitive cache |
| Signing certs decrypted in CI | No | Job-scoped keychain + post delete required |
Pre-launch checklist (printable)
- Every job has if: always() cleanup step or equivalent host prune
- Temp keychain / signing files do not land in permanent $HOME paths
- High-risk repos and low-trust workflows use different runner labels
- Runner registration token and CI PAT have a rotation calendar (30–90 days suggested)
- First workflow PR from new contributor gets human review, not direct hit on prod runner
- Agent and CI use different PAT / App — no single token for MCP and Runner
- Cross-check L1 part 2: ② fixes "slow"; ③ fixes "is self-hosted wide open"
L1 series · how Stack layers connect
This article closes the L1 (Fact layer) security line: why Runner exists → whether self-hosted is worth it → how to land the 2026 one job, one workspace baseline. Read the table in order; go vertical to L0, horizontal to L3–L5.
| Part | Topic | Status |
|---|---|---|
| ① · Execution engine | Why Runner is Cloud Mac AI Stack L1 (Diff → Fact) | Published |
| ② · Queue and TCO | macOS CI queue time · self-hosted vs macos-latest | Published |
| ③ · this article | Self-hosted runner security · one job, one workspace baseline | Published |
| ④ · OpenClaw pipeline | Runner runs steps · OpenClaw orchestrates triggers and receipts (L1 extension) | Published |
Stack vertical links (one entry per layer):
- L0 · foundation: Mac mini vs cloud Mac · cloud AI workstation
- L2 · inference: Ollama private inference · parallel scheduling with Runner
- L3 · Diff: Claude Code workflow on Cloud Mac · CodeGraph and missed edits
- L4 · context: MCP triple-connect hub · least-privilege exposure
- L5 · workflow: OpenHands Agent platform
After the L1 trilogy, if Agent and CI share a Cloud Mac, the next layer is usually L3 Diff decisions: why Claude Code replaces a traditional IDE, covered in a standalone opener that is separate from the vs Cursor comparison and the workstation article. After that comes the L6 end-to-end map (planned).
FAQ
Will one job, one workspace slow CI?
Cold starts get slower — that is why teams shared disk. The 2026 balance: delete only sensitive artifacts and per-job dirs; use repo-prefixed cache keys for renewable cache, not never-pruned global DerivedData.
When is a self-hosted runner "wide open"?
When multiple jobs/repos share signing material, .netrc, or broad global cache in one user home without if: always() cleanup — especially with fork PRs or Agent-edited workflows.
Can MCP least privilege alone fix shared runner disk?
No. MCP least privilege governs Agent tool calls; Runner still must govern files on disk. Malicious fork workflows skip MCP and can still scan _work residue.
Is there a GitHub one-click switch?
Hosted runners approximate "one click"; self-hosted needs workflow post steps, host cron, and optional ephemeral mode. No single actions/checkout flag replaces the full boundary.
How does this split from OpenClaw orchestration?
OpenClaw handles trigger order and receipts; Runner executes steps. Isolation lives in workflow and host — do not assume OpenClaw wipes disk (see L1 ④ · OpenClaw pipeline).
Where should I start the L1 series?
Suggested: ① execution engine → ② queue → ③ this article. Full table at § L1 series.
L1 trilogy done · next layer Diff
Fact layer secured — time for Claude Code
L1 answers where CI runs Fact. Next is usually L3: Claude Code workflow on Cloud Mac and IDE-replacement logic (different from the vs Cursor deep dive).