What is the difference between OpenHands and Claude Code?

Claude Code is the paired coding layer (L3 Diff): you are present, confirming step by step. OpenHands is the autonomous Workflow layer (L5): it receives a goal, then plans, executes, and debugs multi-step tasks, producing auditable workflow results.

Can I deploy OpenHands without a GitHub Runner?

Technically yes, but organizationally risky: if the Agent edits code without stable macOS CI (L1 Fact), the team cannot judge whether autonomous tasks are truly mergeable. Establish Runner first, then add OpenHands.

How do OpenHands and OpenClaw divide work?

OpenClaw focuses on orchestration and receipts (who triggers what, command sequence, audit logs). OpenHands focuses on autonomous code intelligence (plan→execute→debug loop). Both can run on the same machine with different roles.

Is OpenHands suitable for directly editing production code?

Not as an unguarded prod write channel. Recommended for sandbox repos, internal tools, and rollback-friendly branches; regulated or high-compliance systems should limit Agent write permissions and enforce human review.

How do I self-host OpenHands on Mac?

Recommended: always-on Cloud Mac macOS node, OpenHands sandbox via Docker, and GitHub Runner on the same host for CI validation. Full docker compose and env in L5-Q02 tutorial.

How much memory does OpenHands need?

OpenHands only: M4 16GB to start. With Claude Code or Ollama 7B, M4 24GB; with Ollama 14B and Runner, M4 Pro 48GB.

OpenHands: From Tool Collection to Agent Platform on Cloud Mac

Q: How does OpenHands work? What is the agent loop?

OpenHands completes tasks through a Plan→Execute→Observe→Debug cycle: Plan with Context, Execute to produce Diff, Observe Fact (tests/build), Debug and loop until a PR is ready.

Over the past two weeks in this Stack series we stood up L1 Runner (Fact), L2 Ollama (Inference), and L4 MCP (Context) layer by layer. Reader feedback keeps repeating one line: “Every tool is connected, but I still manually string the workflow every day.” Claude Code can produce a diff, MCP can pull GitHub context, Runner can go green after push — yet “fix issue #142 and open a PR” still means someone staring at a terminal for forty minutes.

That is what L5 · OpenHands answers: not another CLI purchase, but upgrading Cloud Mac from a tool collection into an Agent platform that can autonomously finish multi-step engineering tasks. This page is L5-Q01 · R1 · series Hub: it moves readers from “coding tools” to “Agent platform” thinking, covering where Workflow sits in the Stack, why OpenHands vs Claude Code is not a replacement story, typical tasks, and the architecture of self-hosted OpenHands on macOS. There are no Docker install steps here (that is the L5-Q02 SEO landing page).

L5 Workflow layer · 4-step agent loop · 24GB suggested RAM with Ollama

Cloud Mac AI Stack · series slogan (fourth ring)

Claude Code produces Diff; GitHub Runner produces Fact; OpenHands produces Workflow.

MCP supplies Context; Ollama supplies optional Inference. Workflow consumes Context / Diff / Fact and calls the latter two repeatedly in a loop — not a one-way pipeline. See Stack language.

The “tool collection” trap: every piece works; the chain still runs on humans

A typical week on site (we have seen this shape in customer repos many times):

Monday: Claude Code edits the API layer, MCP pulls the GitHub issue list — smooth inside the session.
Tuesday: a teammate pushes from another machine and Runner goes red, because nobody aligned Agent edits with the CI scripts (this repeats until someone has read the L1 execution engine article).
Wednesday: manual test runs, config tweaks, another Claude Code session to patch files.
Thursday: checks finally green, but docs, migration scripts, sample tests are still missing — because “edit code” and “deliver the requirement” were treated as the same job.

A tool collection means: each step has a best tool, but no layer owns the whole requirement. An Agent platform adds a Workflow layer that can decompose tasks, execute, and retry from failure on its own — OpenHands is the open-source option in the Stack for that layer (evolved from the OpenDevin ecosystem).

Stack language: Workflow with Context / Diff / Fact

Series-wide notation: do not draw Workflow as a one-way downstream of Fact. Workflow (L5) is the orchestration layer that repeatedly consumes Context, produces Diff, and validates with Fact until it decides “requirement done”:

Cloud Mac AI Stack · output relationships (not call order)

  Workflow (L5 · OpenHands)
  ├── Context (L4 · MCP)          ← read repo / issue / API
  ├── Diff (L3 · Claude Code etc) ← edit code / write files
  └── Fact (L1 · Runner / tests)  ← run test / build / CI signal

Agent loop (inside Workflow · may iterate many times)
       Diff  ↔  Fact
         ↑       ↓
      Observe → re-Plan → re-Execute …

Four outputs to remember: Context · Diff · Fact · Workflow (MCP · coding layer · Runner · OpenHands). Inference (L2 · Ollama) is optional and omitted above to avoid confusion with the Agent loop.

Layer	Component	Output	Question answered
L4	MCP	Context	What can the Agent see?
L3	Claude Code	Diff	What is this change?
L1	GitHub Runner	Fact	Will the org trust it?
L5	OpenHands	Workflow	Is the whole requirement done?

Workflow is not “another CI job” but a multi-step, interruptible and resumable task state machine: it calls Diff and Fact many times in a loop until the PR is deliverable. Claude Code excels at single-round Diff; OpenHands excels at running the whole loop unattended — provided Context and Fact are already in place.
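One way to picture “interruptible and resumable” is a task record that is written to disk after every step and reloaded on restart. The sketch below is a minimal illustration under that assumption; `TaskState`, `advance`, `save`, and `load` are hypothetical names and not OpenHands' actual storage format.

```python
import json
from dataclasses import asdict, dataclass, field
from pathlib import Path

PHASES = ("plan", "execute", "observe", "debug", "done")


@dataclass
class TaskState:
    """Hypothetical record of one Workflow task (not OpenHands' real schema)."""
    goal: str
    phase: str = "plan"
    trajectory: list = field(default_factory=list)  # one entry per finished step

    def advance(self, next_phase: str, note: str) -> None:
        """Log the step that just finished, then move to the next phase."""
        if next_phase not in PHASES:
            raise ValueError(f"unknown phase: {next_phase}")
        self.trajectory.append({"phase": self.phase, "note": note})
        self.phase = next_phase


def save(state: TaskState, path: Path) -> None:
    """Checkpoint the task so a restart can pick it up."""
    path.write_text(json.dumps(asdict(state), indent=2))


def load(path: Path) -> TaskState:
    """Resume a task from its last checkpoint."""
    return TaskState(**json.loads(path.read_text()))
```

Saving after each `advance` call means a host reboot loses at most the step in flight; the trajectory already on disk tells the resumed task which phase to re-enter.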

OpenHands in one minute (not an encyclopedia)

OpenHands is an open-source autonomous software engineering Agent platform: in a sandbox (often Docker) it accepts natural-language goals and automatically plans → writes code / runs commands → reads output → debugs, with GitHub integration (issues, PRs, CI status). In the Cloud Mac AI Stack it does not replace Claude Code’s paired experience or Runner’s objective build proof — it orchestrates multi-step delivery on top of both.

Different from “install another MCP Server”

MCP extends the context boundary (read repo, call APIs); OpenHands extends task depth (decide the next tool call, whether to retry). Without L4, OpenHands edits blind; without L1, OpenHands “done” cannot be accepted by the org.

OpenHands vs Claude Code: why they are not competitors

People searching OpenHands vs Claude Code or Claude Code alternative often ask: can one replace the other? In the Cloud Mac AI Stack the answer is no, and it should not — they sit on different layers with different outputs:

Claude Code (L3) → produces Diff: paired coding, you are present.
OpenHands (L5) → produces Workflow: autonomous agent, you set the goal.

Treating OpenHands as “another Claude Code” fails fast: OpenHands will not align itself with the fuzzy product intuition in your head, and Claude Code will not run an eight-step issue unattended. The right pattern is stacked use: Claude Code for hard problems by day, OpenHands clearing the issue queue at night.

Dimension	Claude Code (L3 · Diff)	OpenHands (L5 · Workflow)
Interaction	Human present, step-by-step confirm	Goal-driven, multi-step autonomy
Typical duration	5–30 minute session	30 minutes to hours per task
Strength	Complex single-point refactor, align intent	Scripted requirements, batch small changes, templated features
Risk	Session ends → partial work	Runaway edits, too many files, excess permissions
Output	Diff	Workflow (PR, logs, step trajectory)
Replaces the other?	❌ No (not an OpenHands alternative)	❌ No (not a Claude Code replacement)

Rule of thumb (not a contract): if you can state the change intent in one PR sentence → Claude Code; if you say “finish the issue” → consider OpenHands. In large repos with CodeGraph indexing, the paired layer often stays Claude Code; OpenHands fits templatized backend tasks. Both share MCP Context but not the same responsibility.

What tasks can OpenHands do? (the first search question)

Many people search OpenHands agent or OpenHands github to ask: what work is reliable to hand off? Below are task types we suggest trying first with tests + CI — also the example pool for the L5-Q02 tutorial:

Task type	Typical input	Expected delivery	Fit
Fix bug	GitHub issue + repro steps	Patch + tests + PR	⭐⭐⭐⭐
Dependency upgrade	“Upgrade React 18→19”	Lockfile + breaking-change fixes	⭐⭐⭐⭐
Lint cleanup	ESLint / SwiftLint report	Batch fix warnings, no behavior change	⭐⭐⭐⭐⭐
Generate tests	Uncovered module list	Unit test PR	⭐⭐⭐
Documentation sync	API change diff	README / OpenAPI sync	⭐⭐⭐⭐
Scaffold / boilerplate	“Add REST endpoint” template	Routes + test skeleton	⭐⭐⭐⭐

Poor first OpenHands tasks: untested large refactors, major UX redesigns, schema migrations needing business approval, and anything touching production secrets. Keep those in Claude Code paired sessions, and let Runner produce Fact after a human gate.
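The poor-fit list above can be turned into a pre-flight screen that runs before an issue is handed to the Agent. This is only a sketch: `TaskProfile` and `blockers` are hypothetical names, and the fields simply mirror the four exclusions named in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskProfile:
    """Hypothetical description of a candidate task."""
    has_tests: bool
    large_refactor: bool = False
    ux_redesign: bool = False
    migration_needs_approval: bool = False
    touches_prod_secrets: bool = False


def blockers(task: TaskProfile) -> list:
    """Return why a task is a poor first OpenHands task (empty list = fine to try)."""
    reasons = []
    if task.large_refactor and not task.has_tests:
        reasons.append("untested large refactor")
    if task.ux_redesign:
        reasons.append("major UX redesign")
    if task.migration_needs_approval:
        reasons.append("schema migration needing business approval")
    if task.touches_prod_secrets:
        reasons.append("touches production secrets")
    return reasons
```

A task with any blocker stays in a paired session; a task with none still goes through tests and CI like every other change.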

How OpenHands works

People searching How OpenHands works, OpenHands architecture, or OpenHands agent loop want to know: how does an autonomous agent turn one sentence into a mergeable PR? OpenHands centers on a four-step loop — often written Plan → Execute → Observe → Debug:

Phase	What it does	Consumes
Plan	Read issue, split subtasks, file list	Context (MCP, GitHub, repo tree)
Execute	Write patches, run shell, call tools	Produces Diff
Observe	Read test output, lint, build logs	Consumes Fact (local test or Runner)
Debug	Revise plan or code from Observe	Back to Execute; loop until pass

OpenHands agent loop (concept · not a single pipeline)

        ┌──────────┐
        │   Plan   │  ← Context
        └────┬─────┘
             ▼
        ┌──────────┐
        │ Execute  │  → Diff
        └────┬─────┘
             ▼
        ┌──────────┐
        │ Observe  │  ← Fact (test / build / CI)
        └────┬─────┘
             │
             ├── pass ──▶ Workflow done → PR / delivery
             │
             │ fail
             ▼
        ┌──────────┐
        │  Debug   │
        └────┬─────┘
             │
             └──── back to Plan or Execute (next round)

This is not “install a stronger Chat.” OpenHands architecture hinges on a stateful task machine: each Observe result is written into the trajectory and feeds the next Plan. Fix bug, lint cleanup, and similar tasks run the same loop with different Plan entry sentences. The real task replay below walks one issue through all four steps; Docker and UI config land in L5-Q02.
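The four phases can be sketched as an ordinary loop with an iteration cap. The code below is a conceptual model, not OpenHands' implementation: `run_agent_loop`, `Fact`, and `LoopResult` are hypothetical names, and the four phase functions are passed in as plain callables so the control flow stays visible.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Fact:
    """Outcome of one Observe step (tests, lint, or build)."""
    passed: bool
    log: str


@dataclass
class LoopResult:
    done: bool
    iterations: int
    trajectory: list = field(default_factory=list)


def run_agent_loop(
    plan: Callable[[], List[str]],
    execute: Callable[[List[str]], str],
    observe: Callable[[str], Fact],
    debug: Callable[[List[str], Fact], List[str]],
    max_iterations: int = 40,
) -> LoopResult:
    """Plan once, then Execute -> Observe -> Debug until Fact passes or the cap is hit."""
    trajectory = []
    steps = plan()                       # consumes Context
    trajectory.append(("plan", steps))
    for iteration in range(1, max_iterations + 1):
        diff = execute(steps)            # produces Diff
        trajectory.append(("execute", diff))
        fact = observe(diff)             # consumes Fact
        trajectory.append(("observe", fact.log))
        if fact.passed:
            return LoopResult(True, iteration, trajectory)
        steps = debug(steps, fact)       # revise the plan from the failure
        trajectory.append(("debug", steps))
    return LoopResult(False, max_iterations, trajectory)
```

The cap matters as much as the loop: without it, a task whose tests can never pass would keep producing Diffs indefinitely.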

Stack L0–L4 before L5 — or the Agent performs in a sandbox alone

We oppose “install OpenHands on day one” tool stacking. Recommended order matches the L1 rollout sequence, with L5 after MCP:

L0 — Always-on Cloud Mac macOS node.
L1 — Runner: repeatable push → green/red.
L2–L3 — Optional Ollama + Claude Code coding on top of Fact.
L4 — MCP Hub + permission model: auditable read/write for Agents.
L5 — OpenHands: multi-step Workflow.

Without L1, OpenHands can technically run and open PRs, but the team cannot judge merge risk. It is the same org incident as “Claude Code SSH all green, Actions all red.” Without L4 permissions, the tokens an autonomous Agent holds are exposed more broadly; see the MCP security spec.

Real task replay: one full Plan → Execute → Observe → Debug round

Below is a task shape we replay on an OpenHands sandbox fork (numbers illustrative). Match each step to the agent loop:

Goal: fix issue #218 "CSV export missing UTF-8 BOM"

  ① Read issue + related src/export/*.ts     ~2 min · Context (MCP/Git)
  ② Generate 6-step plan                     ~1 min · Plan
  ③ Edit 4 files + add 1 test                ~8 min · Execute
  ④ Run pnpm test → fail (snapshot mismatch) ~3 min · Observe · Fact
  ⑤ Read logs → edit 2 more files            ~5 min · Debug → re-Execute
  ⑥ Re-run tests → green                     ~3 min
  ⑦ Open PR, link issue                      ~1 min · Workflow delivery

~23 min wall time · human: approve goal + final merge review only

Note step ④ (Observe): test failure is not disaster — it is agent loop input. In paired tools you fix on the spot; OpenHands feeds Fact back through Observe → Debug into the next Execute. Without a stable test command (L1 not solid), Observe has no signal and the loop spins — another reason Runner comes before OpenHands.
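To make “Observe has no signal” concrete, here is a minimal sketch of turning one test command into a Fact. `observe` and `CommandFact` are hypothetical names; the command is whatever your repo treats as its stable test entry point (the replay above uses `pnpm test`).

```python
import subprocess
import sys
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CommandFact:
    passed: bool
    exit_code: int
    tail: str  # last lines of output, enough context for the Debug step


def observe(
    command: List[str],
    cwd: Optional[str] = None,
    timeout_s: int = 600,
    tail_lines: int = 20,
) -> CommandFact:
    """Run the test command and reduce it to pass/fail plus a log tail.

    A missing executable raises FileNotFoundError on purpose: that is
    "no signal", which should stop the loop rather than count as a red test.
    """
    try:
        proc = subprocess.run(
            command, cwd=cwd, capture_output=True, text=True, timeout=timeout_s
        )
    except subprocess.TimeoutExpired:
        return CommandFact(False, -1, f"timed out after {timeout_s}s")
    lines = (proc.stdout + proc.stderr).splitlines()
    return CommandFact(proc.returncode == 0, proc.returncode, "\n".join(lines[-tail_lines:]))


if __name__ == "__main__":
    # Stand-in for a real test command such as ["pnpm", "test"].
    print(observe([sys.executable, "-c", "print('green')"]))
```

Keeping “failed” and “could not run” apart is the design choice here: the first feeds Debug, the second means L1 is not solid yet.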

Trigger-side concept (not an install tutorial — only how it connects to GitHub):

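# Concept: an issue label triggers an autonomous task (illustrative workflow).
# The `openhands run` command and its flags are placeholders, not a documented CLI;
# substitute the invocation your own OpenHands setup uses.
name: openhands-on-label
on:
  issues:
    types: [labeled]
jobs:
  autonomous-task:
    # Only react to the dedicated Agent label
    if: github.event.label.name == 'agent:openhands'
    runs-on: [self-hosted, macOS]
    steps:
      - name: Hand the labeled issue to OpenHands
        run: |
          openhands run \
            --repo "${{ github.repository }}" \
            --issue "${{ github.event.issue.number }}" \
            --max-iterations 40 \
            --sandbox docker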
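If the trigger is received as a webhook rather than through a workflow file, the same label check can be done in a few lines. This sketch assumes the standard GitHub `issues` webhook payload (`action`, `label.name`, `issue.number`, `repository.full_name`); `task_from_event` is a hypothetical helper.

```python
from typing import Optional

TRIGGER_LABEL = "agent:openhands"  # same label as in the workflow concept above


def task_from_event(event: dict) -> Optional[dict]:
    """Return {"repo", "issue"} when an issue was just given the trigger label, else None."""
    if event.get("action") != "labeled":
        return None
    if event.get("label", {}).get("name") != TRIGGER_LABEL:
        return None
    return {
        "repo": event["repository"]["full_name"],
        "issue": event["issue"]["number"],
    }
```

Filtering on both `action` and the label name keeps unrelated issue activity (comments, other labels, edits) from starting a task.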

Runner · OpenClaw · OpenHands: three names, three roles

The triangle we get asked about most on L5 articles:

Component	Stack layer	Metaphor	Typical action
GitHub Runner	L1 · Fact	Legs	`xcodebuild`, `pnpm test`, archive
OpenClaw	Orchestration (not main Stack tier)	Dispatch desk	Trigger order, receipts, audit, ACK
OpenHands	L5 · Workflow	Autonomous engineer	Read requirement, edit code, iterate to PR

OpenClaw does not make architecture decisions — it answers “when to run, how to notify when done.” OpenHands does not sign iOS packages for Runner — it produces reviewable PRs and step logs. All three can stack on one Cloud Mac, but do not merge their duties into one runbook.

Typical OpenHands architecture on Cloud Mac (self-hosted · Docker · macOS)

Engineers searching OpenHands Mac, OpenHands macOS, OpenHands self-hosted, or OpenHands Docker want a deployable topology — not install steps, but “where components live.” Our recommended minimum production shape on Apple Silicon Cloud Mac:

OpenHands self-hosted on macOS (Cloud Mac · L0 base)

  GitHub (issues / webhooks / PR)
           │
           ▼
  ┌─────────────────────────────────────┐
  │  Cloud Mac · macOS · Apple Silicon   │
  │  ┌─────────────┐  ┌───────────────┐  │
  │  │ OpenHands   │  │ Claude Code   │  │  L5 Workflow + L3 Diff (same host OK)
  │  │ (Docker)    │  │ (SSH/terminal)│  │
  │  └──────┬──────┘  └───────────────┘  │
  │         │ sandbox workspace           │
  │         ▼                             │
  │  ┌─────────────┐  ┌───────────────┐  │
  │  │ MCP Servers │  │ Ollama (opt)  │  │  L4 Context · L2 Inference
  │  └─────────────┘  └───────────────┘  │
  │         │ git push                    │
  │         ▼                             │
  │  ┌─────────────────────────────────┐  │
  │  │ GitHub Runner (self-hosted)     │  │  L1 Fact
  │  └─────────────────────────────────┘  │
  └─────────────────────────────────────┘

Why Workflow on Cloud Mac, not a laptop?

Duration: OpenHands tasks often run 30–90 minutes; closing a laptop lid interrupts them.
OpenHands Docker: the sandbox needs a stable daemon, which a 24/7 Cloud Mac provides more reliably.
Same stack as Runner: the Agent edits and Runner validates on the same macOS node, so there are fewer “SSH green, Actions red” cases.
ABI alignment: repos targeting iOS / macOS run natively on Apple Silicon, which beats forcing them into Docker on a Linux VPS.

Minimal OpenHands Docker start shape (concept snippet; full docker compose and env in L5-Q02 tutorial):

# Cloud Mac · OpenHands self-hosted (illustrative; image tag and mount paths vary by release)
docker pull docker.all-hands.dev/all-hands-ai/openhands:0.9
# The Docker socket mount lets OpenHands start sandbox containers on the host daemon.
docker run -d --name openhands \
  -e SANDBOX_USER_ID="$(id -u)" \
  -v /var/run/docker.sock:/var/run/docker.sock \
  -v "$HOME/.openhands:/.openhands" \
  -p 3000:3000 \
  docker.all-hands.dev/all-hands-ai/openhands:0.9

Dedicated node ≠ automatic safety

Cloud Mac solves compute and macOS ABI; OpenHands still needs repo-level least privilege (bot branches, no prod secrets). Future L6 Agent Ops / Governance will cover audit, policy, and human gates — this page establishes Workflow; governance is the next ring in the series schedule.
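Repo-level least privilege can be checked mechanically before an Agent push is accepted. The guard below is a sketch under assumed conventions: the `bot/openhands/` branch prefix and the secret file patterns are hypothetical and should be replaced with your own policy.

```python
from fnmatch import fnmatchcase

BOT_BRANCH_PREFIX = "bot/openhands/"  # hypothetical naming convention
SECRET_PATTERNS = (".env", ".env.*", "*.pem", "*.p12", "secrets/*")  # hypothetical deny-list


def push_violations(branch: str, changed_paths: list) -> list:
    """List policy violations for an Agent push; an empty list means the push is allowed."""
    violations = []
    if not branch.startswith(BOT_BRANCH_PREFIX):
        violations.append(f"branch '{branch}' is not under {BOT_BRANCH_PREFIX}")
    for path in changed_paths:
        name = path.rsplit("/", 1)[-1]
        # Match against both the full path and the bare file name.
        if any(fnmatchcase(path, p) or fnmatchcase(name, p) for p in SECRET_PATTERNS):
            violations.append(f"'{path}' matches a secret pattern")
    return violations
```

A check like this belongs on the receiving side (a pre-receive hook or a CI gate), since a guard that runs inside the Agent's own sandbox can be edited by the Agent.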

L5 Agent Stack sizing: from Workflow to which Cloud Mac to rent

After architecture, the natural question: what machine for the Workflow layer? The sizing below comes from real stacked workloads; it is a decision shortcut, not a contractual SLA:

Scenario	Suggested config	Notes
OpenHands Only (light issues, no local inference)	M4 · 16GB	Docker sandbox + API LLM; good to trial agent workflows
OpenHands + Claude Code (paired + autonomous same host)	M4 · 24GB	Diff by day, Workflow by night; avoid CI memory fights
OpenHands + Ollama 7B	M4 · 24GB	Private Inference + Agent; see off-peak scheduling
OpenHands + Ollama 14B + Runner	M4 Pro · 48GB	14B resident + sandbox + daily macOS CI; lowest Swap risk
iOS team (OpenHands issues + xcodebuild CI)	M4 · 24GB+	Agent and Runner co-located; reserve 8GB+ for archive peaks

This is the commercial narrative Workflow → Cloud Mac → sizing: confirm you need L5 capability first, then pick a node that holds Docker + (optional) Ollama + Runner — not rent hardware then stack tools backward.
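For teams that script their provisioning, the sizing table can be kept as a small lookup. The component keys (`claude-code`, `ollama-7b`, and so on) are hypothetical labels; the values are copied from the table, and combinations the table does not list deliberately return `None` rather than a guess.

```python
from typing import Iterable, Optional

# Extras stacked on top of OpenHands -> suggested config (from the sizing table).
SIZING = {
    frozenset(): "M4 · 16GB",
    frozenset({"claude-code"}): "M4 · 24GB",
    frozenset({"ollama-7b"}): "M4 · 24GB",
    frozenset({"ollama-14b", "runner"}): "M4 Pro · 48GB",
    frozenset({"xcodebuild-ci"}): "M4 · 24GB+",
}


def suggest_config(extras: Iterable[str]) -> Optional[str]:
    """Return the table's suggestion for this stack, or None if the table has no such row."""
    return SIZING.get(frozenset(extras))
```

For example, `suggest_config(["ollama-14b", "runner"])` returns the 48GB row, while `suggest_config(["ollama-14b"])` returns `None` because the table has no row for 14B without Runner.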

Fit and misfit: boundaries matter more than Agent hype

Better for OpenHands (L5)	Poor fit / use caution
Internal tools, scaffolding, docs sites, test backfill	Regulated finance / healthcare core paths without human gate
Repos with clear issue templates and decent test coverage	Repos with no tests, no CI — “ship first”
Repeatable migrations (dependency upgrades, lint batch fixes)	Major UX needing strong product intuition
Existing L1 Runner + L4 MCP permission policy	Secrets scattered in repo, no token rotation
Team accepts “Agent PR + human merge”	Agent must push main / auto-release prod

Scripts and small services: yes; unguarded compliant prod writes: no. OpenHands is an engineering accelerator, not a liability-free “auto DevOps.”

Decision: should your Cloud Mac upgrade to an Agent platform?

Self-check below — hit ≥3 left-column rows before investing in L5; otherwise shore up L1/L4 first.

Ready for OpenHands	Not yet
≥5 “small but complete” issues queued per week	Main pain is “no macOS CI”
Runner green but lots of manual step stringing	Claude Code sessions still unstable
MCP permissions and bot accounts tiered	GitHub PAT with full repo admin
Willing to maintain sandbox and task logs	No one does merge review
Cloud Mac 24GB or Ollama off-peak scheduled	16GB running 14B + Agent + Xcode together
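The self-check above is a simple threshold rule, sketched here so it can sit in a team checklist script. The signal keys are hypothetical short names for the five left-column rows; the threshold of 3 is the one stated in the text.

```python
READY_SIGNALS = {
    "issue_queue": "≥5 small but complete issues queued per week",
    "manual_stringing": "Runner green but lots of manual step stringing",
    "tiered_permissions": "MCP permissions and bot accounts tiered",
    "sandbox_upkeep": "Willing to maintain sandbox and task logs",
    "enough_memory": "Cloud Mac 24GB or Ollama off-peak scheduled",
}


def ready_for_l5(met: set, threshold: int = 3) -> bool:
    """True when at least `threshold` of the left-column rows apply."""
    unknown = set(met) - set(READY_SIGNALS)
    if unknown:
        raise ValueError(f"unknown signals: {sorted(unknown)}")
    return len(set(met)) >= threshold
```

Rejecting unknown keys keeps a typo from silently counting toward (or against) the threshold.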

Decision (not a summary): OpenHands value is not “smarter Chat” but Cloud Mac becoming a platform accountable to requirements — provided Fact (Runner) and Context/permissions (MCP) already stand. Otherwise you only automate manual stringing; the org still will not merge.

L5 series: from decision to first autonomous task

Part	qid	Topic	Status
① · this page	L5-Q01	Tool collection → Agent platform (decision R1)	Published
②	L5-Q02	Install OpenHands on Cloud Mac + first autonomous task	Next
③	L5-Q03	OpenHands vs OpenClaw division in depth	Planned
④	L5-Q04	Runner + OpenHands: auto PR after CI failure?	Planned
⑤ · L6 extension	L6-Q05	Agent Ops / Governance (Context→Workflow→policy)	📅 6/16

Before the L6 loop closes, finish at least ②: without a reproducible OpenHands tutorial, the Hub lacks a landing page. Full stack map at L6-Q01; the series evolves from “AI Tool Stack” to “AI Engineering Platform,” with L6-Q05 Agent Governance as the final ring.

FAQ

How does OpenHands work / what is the agent loop?
Plan → Execute → Observe → Debug; consumes Context, produces Diff, validates Fact — see how it works.

OpenHands vs Claude Code — which to pick?
Not either/or. Diff with Claude Code, multi-step issues with OpenHands — see OpenHands vs Claude Code.

OpenHands tutorial / how to install?
This Hub has no step-by-step install. Docker + first issue→PR in L5-Q02 (planned 6/14).

Can OpenHands self-host on Mac?
Yes. Always-on Cloud Mac + Docker sandbox; architecture at typical architecture.

OpenHands without Runner?
It runs; we do not recommend it. Establish L1 first.

What about OpenClaw?
Orchestration vs autonomous engineering — see triangle roles and OpenClaw notes.

What Cloud Mac size?
See L5 Agent Stack sizing; OpenHands only M4 16GB; with Claude Code or Ollama 7B prefer 24GB.

L5 Agent Stack · sizing

Pick Cloud Mac specs for your workload

OpenHands Only → M4 16GB · + Claude Code → M4 24GB · + Ollama 14B + Runner → M4 Pro 48GB. Clarify the Workflow layer in this Hub, then run your first autonomous task in the L5-Q02 tutorial.
