Codex Mobile
OpenAI shipped Codex inside the ChatGPT mobile app. Not a mobile IDE — a remote control surface for agent work. Pair once, supervise all your projects from the phone. The execution boundary is the product.
Codex in the ChatGPT app is available on iOS and Android, in preview, on all tiers.
It is not a mobile IDE. It is a remote control surface for agent work running somewhere else. The interesting question is where that somewhere is.
Mobile is a cockpit, not a development machine. Start threads, approve actions, review diffs, watch test output, switch hosts. Not for typing code.
Two Codex shapes, one mobile surface. Codex cloud = isolated sandbox for scoped work. Codex remote = phone controls Codex against a connected Mac, SSH host, or managed devbox.
Pair-once beats per-session. The Codex Mac app pairs to the phone via QR code one time. From then on, the phone supervises every project on that Mac. Claude Code Remote defaults to per-session activation. The pair-once model has the better mobile-first ergonomics, and that is my honest preference.
Setup is consumer-grade. Update both apps, scan the QR code, done. The Mac has to stay awake, online, and running Codex. Windows host support: coming soon.
The execution boundary is the product. Cloud sandbox vs connected host is the trade-off that decides almost everything else. Mobile sits above both.
Route tasks by where truth lives. Sandbox for disposable. Host when local state matters. Keyboard for sustained judgment.
What it is
OpenAI's launch post: Codex runs on laptops, devboxes, or remote environments; the mobile app loads live state from wherever it is operating. Project context, approvals, plugins, screenshots, terminal output, diffs, test results. All on the phone.
From mobile you can start threads, continue them, approve or reject actions, review everything streaming back, switch between connected hosts, and run multiple tasks in parallel.
The phone is not doing the work. The phone is holding the operator loop open.
The pairing model is where Codex pulls ahead
This is the part of Codex Mobile that pushed it above Claude Code Remote for me. OpenAI's Codex remote connections docs describe the pairing as host-level: scan the QR code once on the Mac, and the phone gets access to the same projects, threads, credentials, plugins, and local setup across every Codex task you run on that host. Pair once, supervise everything.
Claude Code Remote takes the opposite default. Per Anthropic's Remote Control docs: Remote Control only activates when you explicitly run claude remote-control, claude --remote-control, or /remote-control inside a session. Per-session pairing by default. New project? New session? You have to be at the desktop to enable the mobile surface.
Anthropic does offer a /config flag to auto-enable Remote Control on every interactive session, plus a server mode that can manage multiple sessions. So the gap closes if you configure for it. But the default is opt-in per session, and defaults are what most operators actually run.
For mobile-first supervision across a working day, host-level pairing wins on friction. You wake up, open the phone, the agent stack is already reachable. The desktop is not the gate.
Privacy-conscious operators will prefer the opt-in default. That is a real trade-off, not a bug. Pick the one whose default matches how you actually use mobile.
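The difference between the two defaults can be modeled in a few lines. This is a toy sketch, not either vendor's API: the Host class, its fields, and both functions are hypothetical names made up for illustration. It only shows which sessions a phone could reach under each default when a new project starts after pairing.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    """A machine running agent sessions. All names here are hypothetical."""
    name: str
    sessions: list[str] = field(default_factory=list)
    paired: bool = False  # host-level pairing: done once
    remote_enabled: set[str] = field(default_factory=set)  # per-session opt-in


def reachable_host_level(host: Host) -> list[str]:
    """Pair-once model: every session on a paired host is reachable."""
    return list(host.sessions) if host.paired else []


def reachable_per_session(host: Host) -> list[str]:
    """Opt-in model: only sessions that were explicitly enabled are reachable."""
    return [s for s in host.sessions if s in host.remote_enabled]


mac = Host(name="work-mac", sessions=["api", "web", "infra"])
mac.paired = True              # scanned the QR code once
mac.remote_enabled.add("api")  # enabled remote control in one session only

mac.sessions.append("billing")  # a new project started later in the day

print(reachable_host_level(mac))   # all four sessions, including the new one
print(reachable_per_session(mac))  # only the session that was opted in
```

The same sketch also shows the privacy trade-off: under the host-level model the new session is exposed to the phone without anyone deciding that it should be.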
The comparison that matters
Not which mobile UI is nicer. Where the agent executes, and what that implies.
The execution boundary is the product
Codex has two shapes operators should keep separate. Codex cloud: isolated sandbox per task, preloaded with the repo, bounded blast radius. Right for disposable work — small features, branch reviews, scoped bug investigations. Codex remote access: the phone controls Codex against a real environment (your Mac, a Mac mini, an SSH host, a managed devbox). Right when local state matters.
Sandboxes give you safety, reproducibility, cleaner enterprise compliance posture, and the ability to let agents run without a babysitting laptop. They cost you environment truth — your VPN, your local Postgres, your .env, your half-applied migrations, the company DNS that only resolves from inside the office. Approximation is fine for many tasks. It is deadly for some.
Connected hosts give you the real machine — and the real consequences. Mobile approval becomes the policy checkpoint. Approve while distracted and the trust model collapses. The phone shows the diff; it does not make the judgment.
Why mobile as control surface is the right shape
Long agent runs are wall-clock time interrupted by decision points. The operator does not need a desk for all of it; they need to be reachable when judgment changes the trajectory. Approvals are taps. Rejections are taps plus a sentence. Diff review on a phone is not great but is often enough to decide whether to continue or stop.
Multi-thread management is where the cockpit pays off. One thread investigating a test failure, one drafting a refactor, one checking a dependency upgrade. The phone becomes the interrupt queue.
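One way to picture the phone as an interrupt queue is a priority queue over events from several threads. The sketch below is an illustration only. The urgency levels, class names, and messages are assumptions, and nothing here reflects how Codex actually orders notifications.

```python
import heapq
import itertools
from dataclasses import dataclass, field

# Hypothetical urgency levels: a lower number is surfaced first.
BLOCKING_APPROVAL = 0  # the agent is stopped until the operator decides
REVIEW_READY = 1       # a diff or test output is ready to look at
INFO = 2               # progress update, nothing is waiting on it


@dataclass(order=True)
class Interrupt:
    urgency: int
    seq: int  # tie-breaker so equal urgencies keep arrival order
    thread: str = field(compare=False)
    message: str = field(compare=False)


class InterruptQueue:
    def __init__(self) -> None:
        self._heap: list[Interrupt] = []
        self._counter = itertools.count()

    def push(self, thread: str, urgency: int, message: str) -> None:
        item = Interrupt(urgency, next(self._counter), thread, message)
        heapq.heappush(self._heap, item)

    def pop(self) -> Interrupt:
        return heapq.heappop(self._heap)

    def __len__(self) -> int:
        return len(self._heap)


queue = InterruptQueue()
queue.push("dependency-upgrade", INFO, "lockfile regenerated")
queue.push("refactor-draft", REVIEW_READY, "diff ready: 4 files changed")
queue.push("test-failure", BLOCKING_APPROVAL, "wants to run the migration script")

order = []
while queue:  # drains the most urgent item first
    item = queue.pop()
    order.append(item.thread)
    print(f"[{item.thread}] {item.message}")
```

The blocking approval arrives last but is handled first, which is the property that matters when three agents are waiting and only one of them is actually stopped.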
What it does not do
- Not a full IDE. No file tree editing, no sustained syntax work. Correct choice, still a limit.
- Not CLI parity. Deep config and slash commands stay on the host.
- Not universal host support yet. Mobile pairing is macOS-first; Windows coming.
- Not a review substitute. A pretty approval UI does not make a risky migration safe.
Five implications
1. The phone is part of the agent runtime loop. Long-running agents need low-friction human interrupts.
2. Pair-once is the right default for mobile-first. Per-session activation forces you to the desktop just to enable the mobile surface.
3. Codex is a multi-boundary product. Cloud sandbox, connected Mac, SSH host, managed devbox. Mobile is the control plane above all four.
4. Cloud safety and local truth keep pulling against each other. Sandboxes are safer and cleaner. Hosts are more truthful and more dangerous. Serious teams use both.
5. Mobile approval is becoming table stakes. Once tasks run for real wall-clock time, requiring a desk is a throughput bug.
Route tasks by where truth lives
Codex cloud when the task is scoped, reviewable, disposable. Sandbox is a feature.
Codex Mobile over a connected host when the environment matters but you do not need to be physically present.
Claude Code Remote when the local-first model is the point. You accept the per-session setup or configure away the default.
Neither mobile path when the work demands sustained judgment. Security-sensitive merges, incident response, irreversible migrations, broad architecture changes. Sit at a keyboard.
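The routing rules above reduce to a small decision function. This is a minimal sketch under stated assumptions: the Task fields and the Surface enum are hypothetical, and it leaves out the choice between Codex Mobile and Claude Code Remote on a connected host, since that one turns on which pairing default you prefer.

```python
from dataclasses import dataclass
from enum import Enum


class Surface(Enum):
    CLOUD_SANDBOX = "Codex cloud"
    CONNECTED_HOST = "mobile over a connected host"
    KEYBOARD = "sit at a keyboard"


@dataclass
class Task:
    name: str
    needs_local_state: bool         # VPN, local database, .env, half-applied migrations
    needs_sustained_judgment: bool  # security merges, incidents, irreversible changes


def route(task: Task) -> Surface:
    """Pick an execution surface; sustained judgment overrides everything else."""
    if task.needs_sustained_judgment:
        return Surface.KEYBOARD
    if task.needs_local_state:
        return Surface.CONNECTED_HOST
    return Surface.CLOUD_SANDBOX


tasks = [
    Task("branch review", needs_local_state=False, needs_sustained_judgment=False),
    Task("debug against local Postgres", needs_local_state=True, needs_sustained_judgment=False),
    Task("irreversible migration", needs_local_state=True, needs_sustained_judgment=True),
]

for task in tasks:
    print(f"{task.name}: {route(task).value}")
```

The order of the checks is the design choice. A task that needs both local state and sustained judgment goes to the keyboard, not to the phone.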
Closing
Codex Mobile is the right product shape for where coding agents are going. Not because phones are good development machines — they are not. Because agent work is becoming longer, more parallel, and more interrupt-driven.
The architectural question is the same every time: what does the phone control? Cloud sandbox. Connected Mac. SSH host. Managed devbox. Real filesystem with real secrets and real consequences.
Pair once. Route by truth. Keep agents off YOLO.
Grok Build. Same-week companion piece on xAI’s terminal coding agent. Different bets.
Claude Code Remote Control. The local-first mobile control surface. Per-session by default; configurable to auto-enable.
Two Models, One Branch. Running multiple coding agents on the same repo.
My Agent Filed Its Own Ticket. Prompt injection lite. Why mobile approval is the policy checkpoint.