
How AI Is Changing iOS Development: What Actually Works

After integrating Claude Code, Codex, and supporting tools across real projects, here's what consistently delivers — and where AI still reliably fails.

AI coding tools have been part of iOS development workflows long enough to separate the genuine time-savers from the hype. In practice the picture is nuanced: there are patterns that consistently work, areas where AI still reliably fails, and a shift in what it means to do mobile development day-to-day.

This article shares what has proven useful in practice — not as a benchmark of any single tool, but as a set of workflow patterns worth adopting.

1. Keep AGENTS.md minimal

Configuration files for AI agents tend to grow over time. A file that starts at 10 lines often reaches 150+ as edge cases accumulate. The evidence from practice points the other way: shorter configs produce more relevant, first-pass-correct edits.

The reason is mechanical — language models have effective context limits, and after roughly a hundred thousand tokens, the "working" portion of attention degrades. A bloated instruction file sitting at the top of context displaces the actual code being edited.

A config of around 8 lines — quick links to spec documents, four core rules, and nothing else — consistently outperforms a comprehensive 180-line rules file. Everything else can be linked and pulled in on demand.
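
As a rough illustration, a minimal file might look like this (the links and rules below are hypothetical placeholders, not a prescription):

    # AGENTS.md
    Specs: docs/specs/ (read the relevant spec before editing)
    Architecture overview: docs/architecture.md

    1. Never touch .xcconfig files or project settings.
    2. Branches: feature/<ticket>, fix/<ticket>.
    3. Run make verify before proposing a commit.
    4. Ask before adding any new dependency.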

2. Use AGENTS.md as the single source of truth

Working with multiple tools — Cursor, Claude Code, Codex — means managing multiple config formats: .cursorrules, CLAUDE.md, AGENTS.md. Keeping these in sync by hand is error-prone and creates subtle divergences in agent behavior across tools.

The practical solution is to treat AGENTS.md as the canonical file, since it has the broadest support across OpenAI, Cursor, and the open-source ecosystem. Other tool configs become symlinks to it rather than independent files.
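
A minimal sketch of that setup, assuming all configs live at the repository root:

    # AGENTS.md stays canonical; the other tools read it through symlinks
    ln -sf AGENTS.md CLAUDE.md
    ln -sf AGENTS.md .cursorrules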

One important caveat: the Claude Code system prompt changes silently and takes priority over AGENTS.md. For rules that must never be overridden — branch naming, commit format, file ownership — pre-commit hooks are a more reliable enforcement mechanism.
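
As an example, a branch-naming rule can be enforced with a few lines of hook script (the allowed patterns below are an assumption; adjust them to the project's convention):

    #!/bin/sh
    # .git/hooks/pre-commit: reject commits made on non-conforming branches
    branch=$(git rev-parse --abbrev-ref HEAD)
    case "$branch" in
      main|feature/*|fix/*) ;;   # allowed patterns (project-specific)
      *) echo "Branch '$branch' violates the naming convention" >&2; exit 1 ;;
    esac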

3. CLI over MCP for Xcode interactions

Apple released an official MCP server for Xcode, but it operates through UI automation of the Xcode application itself. In practice this makes it slower and less predictable than direct CLI access.

The tools worth integrating directly:

  • xcodebuild — for building targets and running tests
  • xcrun simctl — for simulator management, screenshots, and log access
  • xcbeautify — for human-readable build output

Common operations that benefit from CLI access: taking simulator screenshots for visual verification, streaming app logs without opening Console.app, running clean builds in CI.
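
For illustration (the scheme, destination, and log subsystem are placeholders):

    # Screenshot of the booted simulator
    xcrun simctl io booted screenshot /tmp/screen.png

    # Stream the app's logs without opening Console.app
    xcrun simctl spawn booted log stream --predicate 'subsystem == "com.example.app"'

    # Clean build with readable output
    xcodebuild clean build -scheme App \
        -destination 'platform=iOS Simulator,name=iPhone 16' | xcbeautify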

The one area where Xcode MCP remains useful is rendering SwiftUI Previews to images — the CLI path for this is possible but significantly more involved.

4. Vision models for Figma-to-implementation verification

One of the more effective patterns for maintaining design fidelity is using a vision model to compare a rendered SwiftUI Preview against the source Figma frame. The workflow:

  • Render the target view via Xcode MCP
  • Pull the reference frame from Figma via Figma MCP
  • Pass both images to a vision model with a comparison prompt
  • Receive a structured report of discrepancies — spacing, colors, typography

This is worth integrating into a pre-commit step via make verify, generating a diff report before the PR is opened. It catches the mechanical mismatches early, before a human design review that should be focused on composition and intent rather than pixel counting.
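
A sketch of the step make verify might run, where compare_screens.py stands in for a hypothetical helper that sends both images to a vision model and writes the discrepancy report:

    # All paths and the helper script are hypothetical
    python3 scripts/compare_screens.py \
        --rendered build/preview.png \
        --reference build/figma-frame.png \
        --report build/design-diff.md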

This doesn't replace design review — it makes design review more valuable by removing the low-level noise before a human ever looks at the screen.

5. Video as the primary review artifact

Watching a one-minute screen recording of an agent executing a user flow is significantly faster to evaluate than reading a several-hundred-line diff. Video makes the actual behavior visible rather than requiring the reviewer to reconstruct it mentally from code changes.

The recommended pattern:

  • Spin up a clean simulator
  • Have the agent execute the scenario described in the spec
  • Record via xcrun simctl io booted recordVideo (sketched after this list)
  • Attach the video, screenshots, and description to the merge request
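
A sketch of the recording step (the device name is a placeholder; recordVideo finalizes the file on SIGINT):

    xcrun simctl boot "iPhone 16"                        # clean simulator
    xcrun simctl io booted recordVideo /tmp/flow.mp4 &   # start recording in background
    REC_PID=$!
    # ... agent drives the scenario from the spec ...
    kill -INT "$REC_PID"                                 # stop recording, finalize video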

A useful side effect: agents that review their own recordings before submitting a PR frequently catch regressions themselves and self-correct — without any human intervention.

For purely logical or unit-level changes, video adds overhead without benefit. Apply it selectively to UI flows and interaction-heavy features.

6. Write the spec before writing code

The most consistent predictor of a smooth AI-assisted sprint is whether a written spec exists in the repository before development starts. Not in Jira, not in a Slack thread — in the repo, versioned alongside the code.

A spec that works in practice covers:

  • Context and goals
  • Technical drivers and constraints
  • Current state of the affected code
  • Options considered and why one was chosen
  • Implementation details
  • Definition of Done
  • QA notes and edge cases

Three to eight pages depending on scope. The spec survives the session — two weeks later, any developer (or agent) can reconstruct full context without relying on chat history or memory.

For greenfield code, looser specs are acceptable — the code will likely be rewritten several times anyway. For legacy codebases, the cost of a wrong move is higher and tighter specs pay back quickly.

7. Definition of Done as a verifiable contract

Vague completion criteria lead to feedback loops where each fix introduces a new problem. The pattern worth adopting is a Definition of Done built entirely from commands with boolean outcomes — either they pass or they don't.

A working DoD for an iOS feature looks like:

  • xcodebuild build completes with zero warnings
  • All unit tests pass
  • All E2E tests pass with video attached
  • SwiftUI Preview matches Figma frame (diff report attached)
  • Spec updates reflected in the index

With this structure, agents improvise less and stay inside the boundaries of what was agreed. It also makes progress legible — every item is either checked or it isn't.
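
A sketch of how these checks can be wired into a single gate script (scheme and destination are placeholders; grepping the build log for warnings is one crude but workable approach):

    #!/bin/bash
    set -euo pipefail   # any failing check aborts the run
    DEST='platform=iOS Simulator,name=iPhone 16'

    # 1. Build completes with zero warnings
    xcodebuild build -scheme App -destination "$DEST" 2>&1 | tee build.log | xcbeautify
    if grep -q " warning:" build.log; then
        echo "Build produced warnings" >&2
        exit 1
    fi

    # 2. All tests pass
    xcodebuild test -scheme App -destination "$DEST" 2>&1 | xcbeautify

    echo "All DoD checks passed"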

8. Parallel sessions with git worktrees

Running multiple agent sessions on the same working directory creates conflicts. The clean solution is one session per git worktree — each agent works in isolation with no risk of overwriting another's changes or contaminating context.

Setting this up is straightforward:

  • git worktree add ../app-feature-a feature/a
  • git worktree add ../app-feature-b feature/b
  • Run separate agent sessions in each directory
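
A fuller sketch, including cleanup once a branch has been merged:

    # One isolated checkout per agent session
    git worktree add ../app-feature-a feature/a
    git worktree add ../app-feature-b feature/b

    git worktree list                       # inspect active worktrees
    git worktree remove ../app-feature-a    # clean up after merge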

For teams running several parallel features, this pattern essentially eliminates merge conflicts caused by concurrent AI-assisted work.

9. Where AI consistently underperforms

Honest assessment requires naming the failure modes, not just the wins:

  • Swift Concurrency — @MainActor, Sendable conformance, isolated contexts. Models produce plausible-looking code that contains race conditions or won't compile.
  • ObservedObject → Observable migration — old and new APIs get mixed. Requires explicit spec and frequent checkpoints.
  • Build settings and .xcconfig files — faster and safer to handle manually.
  • App Store Connect and provisioning — not a good use of agent time.
  • Objective-C interop — Bridging Header and import chains confuse most models reliably.

Knowing where not to use AI saves more time than optimizing the areas where it works. Handing Swift Concurrency work to an agent without close supervision creates debugging work that costs more than the initial time savings.

10. A realistic picture of the productivity impact

The tools in regular use across a typical iOS project:

  • Claude Code — primary agent for implementation
  • Codex via ChatGPT — cross-checking and second opinions
  • GitHub Copilot — PR-level review
  • Serena MCP — codebase navigation and symbol lookup
  • Xcode MCP — SwiftUI Preview rendering only
  • Figma MCP + Vision — design verification

The cost for a solo developer runs to roughly $100/month across subscriptions. The impact varies by sprint — some weeks it saves two days, others it adds overhead when the AI pursues the wrong approach and needs to be redirected.

The honest average over time is positive, but not by an order of magnitude. The gains are real and worth pursuing — they just require the workflow infrastructure described above to materialize.

The shift happening in iOS development

The nature of the work is changing. Less time is spent writing code directly; more time goes into writing specs, reviewing video recordings of agent behavior, and verifying outputs against design references.

This isn't a degradation of the craft — it's a change in what the craft looks like. The judgment about what to build, how to structure it, and whether the result is actually correct still belongs entirely to the developer. AI handles more of the mechanical translation from spec to working code.

What makes the difference between workflows that benefit from AI and those that don't is investment in process: minimal configs, written specs, verifiable completion criteria, and honest limits on where AI is trusted to work unsupervised.
