Comparison

Anvil vs Maesto

Maesto asks you to write every tap. Anvil writes the taps for you — then ships the findings to Slack, Linear, and GitHub before your CI finishes.

Maesto pioneered the YAML-scripted mobile flow. It's elegant on the tutorial repo and painful on a production backlog: every new flow is another file to maintain, every onboarding change is a schema migration, and every locale means another matrix.

Anvil takes a different approach. The specs are YAML too, but they describe intent ("onboard a new user and land on the home feed"), not steps. The autonomous agent decomposes the intent against the live app, recovers from layout drift, and writes the step-by-step replay afterwards, so the spec stays high-level and the replay stays faithful.

When the App Store rejects your binary, Anvil's rejection regression suite keeps that rejection as a permanent spec and catches a repeat of the same issue in CI on the next release. Maesto has no equivalent. You write the test yourself, from the rejection email, and hope you remember to run it next time.

Feature-by-feature comparison of Anvil and Maesto
Feature	Anvil	Maesto
Test authoring model	Intent-driven specs + agent	Step-by-step YAML flows
AI agent runtime	Yes	No
Rejection regression suite (every prior App Store rejection is a permanent spec)	Yes	No
Adversarial matrix (9-axis pairwise)	359 specs across device / network / locale / a11y / interrupts	No
iOS driver	KID — 82 native verbs	Embedded WebDriver bridge
Android driver	KAD — UiAutomator2 + Accessibility	UiAutomator wrapper
visionOS / macOS / watchOS	KVD / KMD / KWD — same verb shape	No
Real-device orchestration	BYOD + rentable fleet	Local emulators only
Perf gating in CI	Baselines + P95 budgets per release	No
Customer support	Dedicated Customer Success on every plan	Community forum
MCP server (Claude / Cursor / Windsurf)	Yes	No
Pricing	Free tier + transparent usage-based	Usage-based
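To make the "Baselines + P95 budgets per release" row concrete, here is a minimal sketch of a P95 gate in Python. It is not Anvil's implementation: the function names, the nearest-rank percentile method, and the 10% default budget are all assumptions chosen for illustration.

```python
def p95(samples_ms):
    """Nearest-rank 95th percentile of a non-empty list of timings (ms)."""
    if not samples_ms:
        raise ValueError("need at least one sample")
    ordered = sorted(samples_ms)
    # ceil(0.95 * n) in integer arithmetic, to avoid floating-point rounding
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def perf_gate(samples_ms, baseline_p95_ms, budget_pct=10.0):
    """Hypothetical CI gate: pass if the current P95 stays within
    baseline * (1 + budget_pct / 100). Returns (passed, current, limit)."""
    current = p95(samples_ms)
    limit = baseline_p95_ms * (1 + budget_pct / 100)
    return current <= limit, current, limit


if __name__ == "__main__":
    timings = list(range(1, 101))  # stand-in for 100 measured run times (ms)
    passed, current, limit = perf_gate(timings, baseline_p95_ms=90)
    print(f"P95={current}ms limit={limit:.1f}ms passed={passed}")
```

A gate like this fails the build only when the slow tail moves past the budget, which is why a percentile is used rather than the mean.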

Why teams switch to Anvil

Teams migrating from Maesto usually delete 60–80% of their spec files in the first week — the agent subsumes the happy paths.
The adversarial matrix (9 stress axes crossed pairwise) catches real-world combinations that hand-scripted tests rarely cover, for example weak network × non-Latin locale × memory pressure × incoming call.
Drivers are native (Swift, Kotlin, Objective-C++), not WebDriver bridges: a real tap, a real accessibility tree, and a real Perfetto trace per run.
The rejection suite is the kind of institutional memory most startups only build after their third App Store rejection. Anvil ships it on day one.
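
For readers unfamiliar with pairwise crossing, the idea behind the adversarial matrix can be sketched in a few lines of Python. This is a generic greedy all-pairs generator, not Anvil's actual algorithm, and the axis names and values below are hypothetical. It builds rows until every pair of values from two different axes appears together at least once, which takes far fewer rows than the full cross product.

```python
from itertools import combinations

# Hypothetical axes for illustration only (the real matrix has 9).
AXES = {
    "network": ["wifi", "weak-3g", "offline"],
    "locale": ["en-US", "ar-SA", "ja-JP"],
    "memory": ["normal", "pressure"],
    "interrupt": ["none", "incoming-call"],
}


def pairwise_suite(axes):
    """Greedy all-pairs: return rows (dicts) covering every value pair
    drawn from two different axes."""
    names = list(axes)
    index = {n: i for i, n in enumerate(names)}

    def key(a, va, b, vb):
        # Store each pair in axis-declaration order so lookups are consistent
        return ((a, va), (b, vb)) if index[a] < index[b] else ((b, vb), (a, va))

    uncovered = {
        key(a, va, b, vb)
        for a, b in combinations(names, 2)
        for va in axes[a]
        for vb in axes[b]
    }
    rows = []
    while uncovered:
        # Seed the row with one uncovered pair so every row makes progress
        (a, va), (b, vb) = min(uncovered)
        row = {a: va, b: vb}
        for n in names:
            if n in row:
                continue
            # Pick the value that covers the most still-uncovered pairs
            row[n] = max(
                axes[n],
                key=lambda v: sum(key(n, v, m, row[m]) in uncovered for m in row),
            )
        rows.append({n: row[n] for n in names})
        for x, y in combinations(names, 2):
            uncovered.discard(key(x, row[x], y, row[y]))
    return rows


if __name__ == "__main__":
    for spec in pairwise_suite(AXES):
        print(spec)
```

Each row still assigns a value to every axis, so a single generated spec can combine several stressors at once, even though the coverage guarantee is only about pairs.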

See it on your own suite

Free for 100 runs/month. No credit card. No call required to start.

Start free · Install the CLI
