open source · python sdk

AI mobile automation framework
Android and iOS

Open-source framework for AI agents that control Android and iOS devices. LLM-agnostic. Multi-agent architecture. Built for developers.

Get Started

☆ Star on GitHub

MIT Licensed · Python 3.10+ · Android 10+

agent.py

import asyncio
from droidrun import DroidAgent, DroidConfig


async def main():
    # Start from the default configuration.
    config = DroidConfig()
    agent = DroidAgent(
        goal="Open Twitter and post 'Hello from droidrun!'",
        config=config,
    )
    # Run the agent and wait for it to finish.
    result = await agent.run()
    print(result)


asyncio.run(main())

Works with any LLM

Gemini

OpenAI

Anthropic

Ollama

DeepSeek

OpenRouter

Qwen

Gemini 2.5 Pro/Flash · OpenAI GPT-5.4 / o3 · Claude Opus 4.6 / Sonnet 4.6 · DeepSeek R2 · Qwen 3.5 · any Ollama model

watch it in action

See droidrun control real Android devices with natural language.

automate anything on mobile

Reliably and effortlessly. Three agent archetypes for every workflow.

Task Agents

Automate complex multi-app workflows through natural language

Testing Agents

Write mobile tests in plain English — no Appium, no XPath

Research Agents

Extract structured data from any mobile app

built in the open

Join thousands of developers building the future of mobile automation.

GitHub Stars

Discord Members

Product Hunt

Contributors

“Replaced our entire Appium suite in a weekend. 400 test cases, all natural language now.”

— @mchen

senior QA eng

“The A11Y tree approach is genius. 2KB vs 1MB screenshots changed everything for our CI pipeline.”

— @danielk

infra lead

“Cross-app automation is the killer feature. Nothing else can chain WhatsApp to Sheets to Gmail.”

— @priya_s

automation eng

“Action replay saved us thousands in LLM costs. Record once, replay at 100ms/step.”

— @jmartinez

founding eng

“pip install and running in 30 seconds. Compared to 2 days setting up Appium.”

— @tomr

mobile dev

“Typed extraction with Pydantic models is clean. App store scraping that returns real objects.”

— @sg_data

data eng

AndroidWorld benchmark

droidrun achieves state-of-the-art performance on AndroidWorld with a 91.4% success rate across 116 diverse tasks.

Success Rate 91.4%

106 out of 116 tasks completed successfully
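
The headline percentage follows directly from the two task counts stated above. A quick check, using only those numbers:

```python
# Task counts as stated on this page.
completed = 106
total = 116

# Success rate as a percentage of all benchmark tasks.
success_rate = completed / total * 100
print(f"{success_rate:.1f}%")  # 91.4%
```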

View Full Benchmark →

Accessibility tree vs. screenshots

Traditional mobile automation uses screenshots and pixel matching. We use the accessibility tree — 500x smaller payloads.

screenshot

~1,024 KB

a11y tree

~2 KB (500x smaller)
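
To make the size comparison concrete, here is a minimal sketch that serializes a small, made-up accessibility tree and compares it with the screenshot figure quoted above. The node fields (`index`, `class`, `text`, `bounds`) are illustrative assumptions, not droidrun's actual wire format, and real trees vary in size from screen to screen.

```python
import json

# Hypothetical accessibility nodes for one screen; field names are illustrative only.
nodes = [
    {
        "index": i,
        "class": "android.widget.Button",
        "text": f"Item {i}",
        "bounds": [0, i * 96, 1080, (i + 1) * 96],
    }
    for i in range(20)
]

# Size of the serialized tree versus the ~1,024 KB screenshot figure quoted above.
tree_bytes = len(json.dumps(nodes).encode("utf-8"))
screenshot_bytes = 1024 * 1024
measured_ratio = screenshot_bytes / tree_bytes

# The quoted figures themselves: 1,024 KB / 2 KB, which the page rounds to 500x.
quoted_ratio = 1024 / 2

print(f"tree: {tree_bytes} bytes, ratio vs. screenshot: {measured_ratio:.0f}x")
print(f"quoted ratio: {quoted_ratio:.0f}x")
```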

everything you need to ship

Six primitives that cover the full mobile automation lifecycle.

LLM Agnostic

Swap providers with one line. Gemini, OpenAI, Anthropic, Ollama, DeepSeek, OpenRouter, Qwen and more.

Vision Mode

Screenshot-based fallback when the accessibility tree is insufficient. Combine both for maximum accuracy.

Reasoning Mode

Multi-step planning with chain-of-thought. The agent reasons about UI state before acting.

Structured Output

Extract typed data with Pydantic models. Get real objects, not raw strings.

Custom Tools

App Cards

Pre-built interaction patterns for popular apps. Skip the boilerplate, start automating.
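
As an illustration of the Structured Output idea, the sketch below validates a raw JSON payload against a Pydantic model. The `AppListing` schema and the payload are hypothetical, and the step where an agent would produce the JSON is left out; only the Pydantic usage is shown.

```python
import json

from pydantic import BaseModel, ValidationError


class AppListing(BaseModel):
    """Hypothetical schema for one app store entry."""

    name: str
    rating: float
    downloads: int


# Stand-in for raw text an extraction step might return.
raw = '{"name": "Example App", "rating": 4.5, "downloads": 120000}'

# Validation turns the untyped JSON into a typed object.
listing = AppListing(**json.loads(raw))
print(listing.name, listing.rating, listing.downloads)

# Malformed data is rejected instead of passing through as a raw string.
try:
    AppListing(name="Broken", rating="not a number", downloads=1)
    rejected = False
except ValidationError:
    rejected = True
```

Downstream code can then rely on `listing.rating` being a float rather than re-parsing strings.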

device setup

Use your own Android or iOS devices with the open-source Portal app. Secure, real-time, fully wireless. No root or jailbreak required.

terminal

droidrun setup
droidrun ping
[ok] device paired -- ready

Setup Guide: Android · Setup Guide: iOS

start local. scale to cloud.

Same agent framework. Run locally on your device or remote on mobilerun.

open source

droidrun

• pip install
• Local devices
• Full control

Shared Protocol

SDK / Trajectories / Prompts

cloud scale

mobilerun

• Hosted devices
• Parallel agents
• Stealth mode

Start with droidrun

Run automation locally using your own Android devices or emulators. Install with pip, inspect every line of code, and stay fully in control — no cloud setup, no vendor lock-in.

Scale with Cloud

When you're ready to go beyond local, mobilerun offers hosted devices, unlimited parallel agents, global proxies, and stealth-mode execution. Production-grade infrastructure, zero maintenance.

	Legacy	droidrun	mobilerun
Core Features
Can read screen content	✓	✓	✓
Can interact with device	✓	✓	✓
Continues to work when UI changes	✗	✓	✓
Extract data using natural language	✗	✓	✓
Framework Standards
Easy to set up	✗	◐	✓
Self-healing	✗	✓	✓
Open source	✓	✓	✗
Infrastructure
Stealth & anti-detection	✗	✗	✓
Hosted device fleet	✗	✗	✓
No setup required	✗	✗	✓

Try mobilerun →

from zero to agent in 60 seconds

Three commands. One device. Full automation.

terminal

$ pip install droidrun
Successfully installed droidrun-0.5.8

$ droidrun setup
Portal is installed and accessible. You're good to go!

$ droidrun run "Open Settings and check battery level"

[agent] planning steps...
[step 1/2] tap "Settings" on home screen
[step 2/2] navigate to "Battery"

[done] task completed in 3.8s -- 2 actions, 0 retries

frequently asked questions

How is droidrun different from using Claude/GPT with computer use?

Generic LLMs send full screenshots to the model (~1MB per frame). droidrun uses the accessibility tree (~2KB) — that’s 500x smaller, 10x faster, and dramatically cheaper. Plus droidrun has Action Replay (record once, replay without LLM costs). A generic agent explores blindly. droidrun knows what it’s doing.
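
The record-once, replay-later idea behind Action Replay can be sketched in a few lines. This is a conceptual illustration only: `record`, `replay`, the action dictionaries, and the stand-in planner are all hypothetical and do not reflect droidrun's actual API or trajectory format.

```python
import json
from typing import Callable

Action = dict  # hypothetical shape, e.g. {"op": "tap", "target": "Settings"}


def record(plan: Callable[[str], list[Action]], goal: str) -> str:
    """Call the (expensive) planner once and serialize the resulting actions."""
    return json.dumps(plan(goal))


def replay(trajectory: str, execute: Callable[[Action], None]) -> int:
    """Re-run saved actions without invoking the planner; return steps executed."""
    actions = json.loads(trajectory)
    for action in actions:
        execute(action)
    return len(actions)


# Stand-ins for an LLM-backed planner and a device executor.
planner_calls = 0


def fake_planner(goal: str) -> list[Action]:
    global planner_calls
    planner_calls += 1
    return [{"op": "tap", "target": "Settings"}, {"op": "tap", "target": "Battery"}]


executed: list[Action] = []
saved = record(fake_planner, "Open Settings and check battery level")
for _ in range(3):
    replay(saved, executed.append)

print(planner_calls, len(executed))  # 1 6
```

The planner runs once while the saved steps run three times, which is where the cost saving would come from.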

How is droidrun different from Appium?

Appium requires writing test scripts with exact element selectors that break when the UI changes. droidrun is AI-native — describe what you want in natural language and the agent figures it out. It self-heals when UI changes, works on real devices, and supports both Android and iOS. Users report replacing entire Appium suites in a weekend.

Which LLMs does droidrun support?

droidrun is fully LLM-agnostic. Works with OpenAI (GPT-4, GPT-4o), Anthropic (Claude Sonnet, Opus), Google (Gemini 2, Gemini 3), DeepSeek, Ollama (local models), OpenRouter, and Qwen. Switch models with a single config change.

What is the difference between droidrun and mobilerun?

droidrun is the open-source framework (the agent brain) — install with pip, run on your own devices, full control. mobilerun is the cloud platform (the device infrastructure) — hosted real Android and iOS phones with stealth mode, residential proxies, and fleet management. Start local with droidrun, scale to the cloud with mobilerun. Same agent protocol.

Is droidrun free?

Yes — droidrun is open-source under the MIT license. pip install droidrun and you’re running in 30 seconds. For hosted cloud devices with stealth mode and residential proxies, mobilerun plans start at $5/month with a free tier for OpenClaw users.

start building today.

Clone the repo, run your first automation, and scale to production on mobilerun.ai when you're ready.

Read the Docs ☆ Star on GitHub

Documentation PyPI Discord
