The most common mistake is treating Claude Code as "another AI autocomplete" — install the CLI, bind an API key, ship a function. That is the same illusion as calling a self-hosted runner "macos-latest without a queue": you see speed, not boundaries.
Claude Code is a terminal Agent: it does not only suggest code — it can run shell, edit many files, read env vars, call MCP tools, and loop tests until green. Default-trust it on your main repo and you are not adding a plugin; you are handing over a full stack of code and execution permissions.
This is the L3 decision opener (L3-Q01) of the Cloud Mac AI Stack: after the L0 foundation and the L1 Fact layer (read L1 ①②③ first), it answers when an Agent should officially join your dev workflow. The L3 series table is at § L3 series; vertical links to L0 and horizontal links to L4–L5 are at § Stack links. The workstation benchmark (L3 ③) and the Cursor comparison (L3 ②) cover different ground; this article is about decision and permissions only.
Before you read · L3 series and Stack entry
L0 foundation: buy vs rent Cloud Mac · move AI workstation to cloud
L1 series (read in order): ① execution engine → ② queue and TCO → ③ workspace isolation
L2 Inference: Ollama private inference · parallel scheduling with Runner
L3 series (starts here): ① this article · permission handover and onboarding decision → ② vs Cursor → ③ workstation benchmark (full table at § L3 series)
Often in the same stack (L4–L5): MCP setup · MCP least privilege · OpenHands
Featured answer
Claude Code is not "another AI tool" — it hands shell, Git, secrets, and multi-file execution to an Agent; formal onboarding requires boundary audit and supervised merge first.
- Ready to onboard: cross-directory refactors, test–fix loops; L4 MCP least privilege and CLAUDE.md boundaries in place
- Do not default to full access: production secrets on the same host, no code review; L1 Runner and Agent sharing disk
- Trial path: read-only sandbox → supervised write → formal L3 Diff with L1 Fact (Runner) split
What you are actually handing over: a permission map
Most teams obsess over "how smart is the model" and skip what the Agent can do on disk and in processes. Use the table below in onboarding — six permission types; check each before go-live.
| Permission type | Typical Claude Code capability | What handing it over means |
|---|---|---|
| Shell execution | npm test, xcodebuild, git, arbitrary scripts | Bad prompts or malicious steps can delete files, install deps, change system config |
| Filesystem | Read/write repo, generate patches, edit config | One delegation can touch dozens of files; missed edits are harder to review than single-file bugs |
| Git history | commit, branch, sometimes push | A bad merge to main costs far more than "one wrong line" |
| Env vars / secrets | Read .env, ~/.zshrc, CI-injected secrets | Mixed with L4 MCP PAT and L1 Runner PAT, exposure multiplies |
| Network / tools | MCP pull repos, call APIs, read Issues | Toolchain permissions = Agent permissions; see L4 MCP triple-connect hub |
| Persistent state | Session memory, CLAUDE.md, local cache | Context from the last task shapes the next decision |
So the question is never "should we use AI to write code" but: are you willing to default all six rows above to a semi-autonomous process? If you hesitate, skip "everyone installs Claude Code by default" and use phased adoption below.
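One way to make the six-row check concrete is a small sign-off tracker. The sketch below is illustrative only: the identifiers (`PERMISSION_TYPES`, `PermissionReview`, `unreviewed`) are hypothetical names introduced here, and the six entries simply mirror the rows of the permission map.

```python
from dataclasses import dataclass

# The six permission types from the permission map (names are illustrative).
PERMISSION_TYPES = (
    "shell_execution",
    "filesystem",
    "git_history",
    "env_vars_secrets",
    "network_tools",
    "persistent_state",
)


@dataclass
class PermissionReview:
    permission: str        # one of PERMISSION_TYPES
    reviewed: bool = False  # True once someone has signed off on this row
    note: str = ""          # free-form reviewer comment


def unreviewed(reviews):
    """Return the permission types that have no signed-off review, in table order."""
    signed_off = {r.permission for r in reviews if r.reviewed}
    return [p for p in PERMISSION_TYPES if p not in signed_off]
```

An empty list from `unreviewed(...)` would mean every row has been looked at by a person. It says nothing about whether the decision on each row was the right one.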
It is not Copilot, and not an IDE plugin
Copilot (Copilot / Cursor Tab): you drive in the editor; AI completes or edits the current file — small diffs, fast feedback. Chauffeur (Claude Code Agent): you state a goal; the Agent plans steps, opens shell, edits many files, retries on failure — you review outcomes, not the wheel.
This is not about which is "better" (see Claude Code vs Cursor) but task type: daily completion stays in the IDE; cross-module migrations, large test–fix loops, and delegating GitHub Actions CI edits belong to the Agent. Using Agent as Copilot is often slow and hard to audit; using Copilot as Agent cannot solve "47 files changed" delegations.
Blind rollout vs formal onboarding · compare
"Blind rollout" in teams often looks like: the lead loves it, so everyone gets Max and the main repo runs the CLI with default trust. Formal onboarding writes the Agent into engineering policy: boundaries, audit, and CI split.
You think you are "trying a tool" — you are changing the security model
| Dimension | Blind rollout (2024–2025 common) | Formal onboarding (2026 recommended) | Trap consequence |
|---|---|---|---|
| Permission mindset | "Just an AI assistant, should be fine" | Default: Agent = trusted code executor | Mistakes blamed on "dumb model," shell logs never checked |
| Secrets | One PAT / API key for IDE, Agent, and CI | Agent / MCP / Runner separate tokens | Agent session leak takes down CI and private repos |
| Repo boundary | Run claude at monorepo root | CLAUDE.md + directory rules + read-only trial | Edits wrong modules, deletes generated artifacts |
| CI relationship | SSH green = done | Diff local / Fact on Runner split | Local green, Actions red; or dirty workspace poisons CI |
| Review | Glance at diff and merge | Large delegations need human review + test checklist | "47 files changed" slips into production |
| Toolchain | MCP wide open for convenience | Least-privilege MCP + audit | Agent reads repos it should not via MCP |
| Team rhythm | Hero workflow, no docs | Onboarding gates in runbook | New hires copy "guru config" and repeat incidents |
Three gates: must pass before formal onboarding
Treat the three gates below as the minimum bar for formal onboarding — not perfection, but avoiding "full permission handover with zero audit."
Gate ① · Disk and CI boundary (L1)
Do the Agent and the GitHub Runner share global directories that cannot be cleaned between jobs? Do production signing material, .env files, and broad caches live in the same home directory as Agent sessions? If L1 ③ (one job, one workspace) is not done, fix the Fact layer before opening Diff wide (see the L1 series).
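A rough way to test Gate ① is to list the directories each side can write to and look for overlap. The sketch below assumes you can enumerate those directories yourself. The function names and the example paths in the usage note are hypothetical, not taken from any Runner or Claude Code configuration.

```python
from pathlib import PurePosixPath


def overlaps(a: str, b: str) -> bool:
    """True when the two paths are equal or one contains the other."""
    pa, pb = PurePosixPath(a), PurePosixPath(b)
    return pa == pb or pa in pb.parents or pb in pa.parents


def shared_dirs(agent_dirs, runner_dirs):
    """Return every (agent_dir, runner_dir) pair that overlaps on disk."""
    return [(a, r) for a in agent_dirs for r in runner_dirs if overlaps(a, r)]
```

For example, `shared_dirs(["/Users/dev"], ["/Users/dev/actions-runner/_work"])` reports one overlapping pair, while separate home directories for the Agent and the Runner report none. The comparison is purely lexical, so symlinks and mounts still need a manual look.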
Gate ② · Tools and secrets boundary (L4)
Is MCP "connect everything"? Does the Agent PAT overlap CI and personal GitHub? Formal onboarding needs separate tokens, minimal scopes, rotation, plus a team-readable L4 MCP setup and least-privilege checklist.
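The "separate tokens" part of Gate ② can be checked mechanically. This is a minimal sketch, assuming you can collect the token each role uses into a dictionary; the role names and both function names are hypothetical. Tokens are compared by a short SHA-256 fingerprint so raw values never need to appear in a report.

```python
import hashlib


def fingerprint(token: str) -> str:
    """Short SHA-256 fingerprint, so raw tokens stay out of logs and reports."""
    return hashlib.sha256(token.encode("utf-8")).hexdigest()[:12]


def shared_tokens(tokens_by_role: dict[str, str]) -> list[tuple[str, ...]]:
    """Group roles that present the same token; each group is a Gate 2 violation."""
    roles_by_fp: dict[str, list[str]] = {}
    for role, token in tokens_by_role.items():
        roles_by_fp.setdefault(fingerprint(token), []).append(role)
    return [tuple(sorted(roles)) for roles in roles_by_fp.values() if len(roles) > 1]
```

Scope and rotation are not covered here: two distinct tokens with identical broad scopes would pass this check and still fail the gate.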
Gate ③ · People and process boundary (team)
Who may merge Agent output directly? Does a large repo have L4 CodeGraph or equivalent to cut "missed file" risk? If the answer is "whoever is fastest merges," the Agent only amplifies existing process debt.
When to onboard formally · when to wait
| Scenario | Recommendation | Notes |
|---|---|---|
| Cross 10+ file refactor / migration, many test–fix loops | Formal onboarding | Agent strength; pair with review and Runner verification |
| Cloud Mac / Mac mini ready with L1 isolation in place | Formal onboarding | Diff and Fact can split; L2 parallel scheduling in stack |
| Single-file completion, daily small edits only | Wait | IDE + Cursor is cheaper; see vs Cursor |
| Production secrets same user / same home as Agent | Not yet | Split users / tokens / workspace first |
| Open-source repo with many fork PRs + self-hosted CI | Do not default to full access | An Agent editing workflows compounds the risks covered in L1 ③ workspace isolation |
| Personal private repo, solo maintainer, willing to review diffs | Pilot OK | Still phased; avoid one PAT for everything |
What L3 owns in the Stack: Diff, not Fact
Series slogan (L1–L3): Claude Code produces Diff; GitHub Runner produces Fact. This L3 opener asks: when are you willing to hand Diff production to an Agent? Fact (CI green, signing pass) still happens on isolated Runners — Agent saying "tests passed" is not release.
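The Diff/Fact split can be stated as a one-line rule. The sketch below is an illustration, not part of any tool: `Evidence` and `release_ready` are hypothetical names, and the point is only that the Agent's own claim is recorded but never consulted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Evidence:
    agent_claims_green: bool  # what the Agent session reported (Diff side)
    runner_ci_green: bool     # CI result from the isolated Runner (Fact side)
    runner_signing_ok: bool   # signing step result from the Runner (Fact side)


def release_ready(e: Evidence) -> bool:
    """Only Runner-produced Facts count; the Agent's claim is never sufficient."""
    return e.runner_ci_green and e.runner_signing_ok
```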
L2 Inference: Ollama for drafts and offline; L3 Claude Code for delegated execution and Diff — can coexist on one host with separate permissions. L4 Context: MCP hub and least privilege govern tools. L5 Workflow: OpenHands leans orchestration; Claude Code leans terminal depth — onboarding gates apply to both.
Phased adoption (~30% of workflow)
After the decision is clear, land in three phases — not day one "everyone on Max + production repo wide open":
- Phase A · read-only sandbox (1–3 days): fork or copy repo, no push; watch how Agent breaks down tasks and which shell it runs. Goal: feel the permission map.
- Phase B · supervised write (1–2 weeks): main repo read-only clone in a separate dir, or branch-only work; every merge needs human review; MCP only essential tools.
- Phase C · formal Diff layer: fixed split with L1 Runner; CLAUDE.md, token rotation, workspace isolation in runbook; optional L1 ④ OpenClaw pipeline trigger chain.
# Agent onboarding gates (Phase B)
1. Repo root must have CLAUDE.md (allowed/forbidden paths, test commands)
2. Agent uses dedicated PAT, scope ≤ current task; never same as CI secrets
3. Single delegation >15 file changes → second reviewer required
4. Before merge, same test command must pass locally or on Runner
5. Different macOS user or Cloud Mac node than Runner (recommended)
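The first four Phase B gates above are simple enough to encode as a pre-merge check. The following is a sketch under stated assumptions: the `Delegation` fields and the `gate_failures` function are hypothetical, and the caller supplies the facts (file count, reviewer count, token identifiers, test result) from its own tooling. Rule 5 is only a recommendation, so it is left out.

```python
from dataclasses import dataclass

MAX_FILES_SINGLE_REVIEWER = 15  # rule 3 threshold from the gate list


@dataclass
class Delegation:
    has_claude_md: bool   # rule 1: CLAUDE.md present at the repo root
    agent_token_id: str   # identifier (not the secret) of the Agent's PAT
    ci_token_id: str      # identifier of the token CI uses
    changed_files: int    # files touched by this single delegation
    reviewers: int        # human reviewers who approved the diff
    tests_passed: bool    # rule 4: same test command passed locally or on Runner


def gate_failures(d: Delegation) -> list[str]:
    """Return a message for every Phase B gate the delegation fails."""
    failures = []
    if not d.has_claude_md:
        failures.append("rule 1: CLAUDE.md missing at repo root")
    if d.agent_token_id == d.ci_token_id:
        failures.append("rule 2: Agent shares a token with CI")
    if d.changed_files > MAX_FILES_SINGLE_REVIEWER and d.reviewers < 2:
        failures.append("rule 3: large delegation needs a second reviewer")
    if not d.tests_passed:
        failures.append("rule 4: tests have not passed")
    return failures
```

In practice `has_claude_md` could be filled with something like `(repo_root / "CLAUDE.md").is_file()` using `pathlib`. Keeping it as a plain boolean keeps the check itself free of filesystem access.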
Hardware and environment choice (buy Mac mini vs rent Cloud Mac) is out of scope — that is the workstation benchmark story. This decision opener only says: once permissions and process pass the bar, then talk daily use.
L3 series · how articles split
This article opens the L3 (Diff layer) decision line: first answer whether to hand permissions to an Agent, then read the tool comparison and the hands-on piece. Read the table in order; vertical links back to L0–L2 and horizontal links to L4–L5 are at § Stack links.
| Part | Topic | Role vs this article |
|---|---|---|
| ① · this article | Permission handover · when to formally onboard Agent | Decision opener · this article |
| ② · vs Cursor | Terminal Agent vs AI IDE choice | Tool compare · not permission framework |
| ③ · workstation benchmark | Hardware / Cloud Mac trial and screenshots | Hands-on story · not team gates |
Stack layer links · vertical entry
Stack vertical links (one entry per layer; read alongside L1 series):
- L0 · foundation: Mac mini vs cloud Mac · cloud AI workstation
- L1 · Fact: Runner execution engine · CI queue · workspace isolation · OpenClaw pipeline
- L2 · Inference: Ollama private inference · parallel scheduling with Runner
- L3 · Diff: ① this article · onboarding decision · vs Cursor · workstation benchmark
- L4 · Context: MCP triple-connect hub · least-privilege exposure · MCP setup · CodeGraph and missed edits
- L5 · Workflow: OpenHands Agent platform
After this L3 opener, if gates pass, next is usually L3 ③ workstation benchmark (hardware and billing); if still choosing an IDE, read L3 ② vs Cursor first. L6 end-to-end map is planned.
FAQ
How is Claude Code different from Cursor autocomplete?
Cursor is an in-editor copilot — changes are usually line-by-line visible. Claude Code is a terminal Agent that can run shell, edit many files, and loop tests — it hands execution to a semi-autonomous process.
Do solo developers need all three gates?
You can simplify, but three habits should stay: separate tokens, separate directories, and review of large diffs. A private repo is not zero risk; deletes and secret leaks still happen.
Does formal onboarding mean replacing the IDE?
No. Common pattern: IDE for features, Agent for delegations. This article answers when to write Agent into process, not when to drop VS Code.
How does this relate to L1 Runner security?
L1 ③ workspace isolation governs CI disk boundaries; this article (L3 ①) governs who may run Agent on production repos. L1 first, then L3 wide open. Stack entry at § Stack links.
Cloud Mac or local machine for trial?
Running Phases A/B as an isolated trial on an L0 Cloud Mac is often cheaper: you can reset a messy environment at will and buy dedicated hardware only after daily use is confirmed. See L3 ③ workstation benchmark.
Decision passed · next hands-on
Gates met — read the workstation benchmark
This article answers whether to formally onboard an Agent. Next is a week on Cloud Mac / Mac mini with Claude Code — screenshots and billing to turn decision into daily use.