open source · python sdk

AI mobile automation framework
Android and iOS

Open-source framework for AI agents that control Android and iOS devices. LLM-agnostic. Multi-agent architecture. Built for developers.

MIT Licensed · Python 3.10+ · Android 10+
agent.py
import asyncio
from droidrun import DroidAgent, DroidConfig

async def main():
    # Start from the default configuration.
    config = DroidConfig()
    agent = DroidAgent(
        goal="Open Twitter and post 'Hello from droidrun!'",
        config=config,
    )
    # Run the agent and wait for it to finish.
    result = await agent.run()

asyncio.run(main())

Works with any LLM

Gemini
OpenAI
Anthropic
Ollama
DeepSeek
OpenRouter
Qwen

Gemini 2.5 Pro/Flash · OpenAI GPT-5.4 / o3 · Claude Opus 4.6 / Sonnet 4.6 · DeepSeek R2 · Qwen 3.5 · any Ollama model

watch it in action

See droidrun control real Android devices with natural language.

automate anything on mobile

Reliably and effortlessly. Three agent archetypes for every workflow.

Task Agents

Automate complex multi-app workflows through natural language

Testing Agents

Write mobile tests in plain English — no Appium, no XPath

Research Agents

Extract structured data from any mobile app

“Replaced our entire Appium suite in a weekend. 400 test cases, all natural language now.”

— @mchen
senior QA eng

“The A11Y tree approach is genius. 2KB vs 1MB screenshots changed everything for our CI pipeline.”

— @danielk
infra lead

“Cross-app automation is the killer feature. Nothing else can chain WhatsApp to Sheets to Gmail.”

— @priya_s
automation eng

“Action replay saved us thousands in LLM costs. Record once, replay at 100ms/step.”

— @jmartinez
founding eng

“pip install and running in 30 seconds. Compared to 2 days setting up Appium.”

— @tomr
mobile dev

“Typed extraction with Pydantic models is clean. App store scraping that returns real objects.”

— @sg_data
data eng

AndroidWorld benchmark

droidrun achieves state-of-the-art performance on AndroidWorld with 91.4% success rate across 116 diverse tasks.

Success Rate 91.4%

106 out of 116 tasks completed successfully

Accessibility tree vs. screenshots

Traditional mobile automation uses screenshots and pixel matching. We use the accessibility tree — 500x smaller payloads.

screenshot
~1,024 KB
a11y tree
~2 KB (500x smaller)
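
To make the arithmetic behind that comparison concrete, here is a minimal sketch using only the approximate sizes quoted above. The 20-step task is a hypothetical example, not a measured workload.

```python
# Approximate payload sizes quoted above, in kilobytes.
SCREENSHOT_KB = 1024
A11Y_TREE_KB = 2


def payload_ratio(large_kb: float, small_kb: float) -> float:
    """Return how many times larger the first payload is than the second."""
    return large_kb / small_kb


ratio = payload_ratio(SCREENSHOT_KB, A11Y_TREE_KB)
print(f"screenshot / a11y tree = {ratio:.0f}x")  # 512x, rounded to "500x" above

# Hypothetical 20-step task with one observation sent per step.
STEPS = 20
print(f"screenshots: {SCREENSHOT_KB * STEPS / 1024:.1f} MB")  # 20.0 MB
print(f"a11y trees:  {A11Y_TREE_KB * STEPS} KB")              # 40 KB
```

Real payloads vary with screen resolution, image compression, and how many UI nodes are on screen, so treat these as order-of-magnitude figures.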

everything you need to ship

Six primitives that cover the full mobile automation lifecycle.

LLM Agnostic

Swap providers with one line. Gemini, OpenAI, Anthropic, Ollama, DeepSeek, OpenRouter, Qwen and more.
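
As an illustration of what a one-line provider swap can look like, here is a self-contained sketch. It does not use droidrun's actual API: `LLM`, `EchoProvider`, `PROVIDERS`, and `load_llm` are hypothetical names, and the providers are stand-ins that echo the prompt instead of calling a real service.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Protocol


class LLM(Protocol):
    """Minimal interface every provider adapter is assumed to satisfy."""

    def complete(self, prompt: str) -> str: ...


@dataclass
class EchoProvider:
    """Stand-in adapter; a real one would call the provider's API."""

    name: str
    model: str

    def complete(self, prompt: str) -> str:
        return f"[{self.name}/{self.model}] {prompt}"


# Hypothetical registry mapping a provider key to an adapter factory.
PROVIDERS: Dict[str, Callable[[str], LLM]] = {
    "gemini": lambda model: EchoProvider("gemini", model),
    "ollama": lambda model: EchoProvider("ollama", model),
}


def load_llm(provider: str, model: str) -> LLM:
    """Look up the adapter for a provider key, failing loudly on typos."""
    if provider not in PROVIDERS:
        raise ValueError(f"unknown provider: {provider!r}")
    return PROVIDERS[provider](model)


def next_action(llm: LLM, goal: str) -> str:
    """Agent code depends only on the LLM interface, never on a provider."""
    return llm.complete(f"Goal: {goal}")


# The only line that changes when swapping providers:
llm = load_llm("ollama", "any-local-model")
print(next_action(llm, "Open Settings"))
```

Because `next_action` only sees the shared interface, switching from a local model to a hosted one touches the `load_llm` call and nothing else.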

Vision Mode

Screenshot-based fallback when the accessibility tree is insufficient. Combine both for maximum accuracy.
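
One way to picture the fallback is sketched below. This is not droidrun's implementation: the `Node` and `Observation` types and the "at least one labeled, clickable node" rule are assumptions chosen to keep the example small.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Node:
    """Simplified accessibility node (hypothetical shape)."""

    text: str
    clickable: bool


@dataclass
class Observation:
    nodes: List[Node]
    screenshot: Optional[bytes] = None  # attached only when the tree is too sparse


def tree_is_sufficient(nodes: List[Node], min_labeled: int = 1) -> bool:
    """Assumed heuristic: enough clickable nodes carry a non-empty label."""
    labeled = [n for n in nodes if n.clickable and n.text.strip()]
    return len(labeled) >= min_labeled


def observe(nodes: List[Node], take_screenshot: Callable[[], bytes]) -> Observation:
    """Send the tree alone when it is usable, otherwise add a screenshot."""
    if tree_is_sufficient(nodes):
        return Observation(nodes)
    return Observation(nodes, screenshot=take_screenshot())


def fake_screenshot() -> bytes:
    return b"png-bytes"


rich = [Node("Settings", clickable=True)]
sparse = [Node("", clickable=True)]  # e.g. a canvas or game view with no labels

print(observe(rich, fake_screenshot).screenshot)    # None
print(observe(sparse, fake_screenshot).screenshot)  # b'png-bytes'
```

The screenshot is only captured on the sparse screen, so the larger payload is paid for only when the tree cannot describe the UI.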

Reasoning Mode

Multi-step planning with chain-of-thought. The agent reasons about UI state before acting.

Structured Output

Extract typed data with Pydantic models. Get real objects, not raw strings.
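
The idea of getting objects instead of strings can be sketched as follows. To stay dependency-free, this example swaps Pydantic for a standard-library dataclass with manual coercion; the `AppListing` schema and the sample reply are hypothetical.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class AppListing:
    """Hypothetical schema for one app-store entry."""

    name: str
    rating: float
    installs: int


def parse_listing(raw: str) -> AppListing:
    """Turn a JSON reply into a typed object, coercing each field."""
    data = json.loads(raw)
    missing = sorted({"name", "rating", "installs"} - data.keys())
    if missing:
        raise ValueError(f"missing fields: {missing}")
    return AppListing(
        name=str(data["name"]),
        rating=float(data["rating"]),   # "4.6" -> 4.6
        installs=int(data["installs"]),
    )


# Sample reply in which the rating arrives as a string.
reply = '{"name": "Example App", "rating": "4.6", "installs": 120000}'
listing = parse_listing(reply)
print(listing.rating > 4.5)  # True: a float comparison, not a string one
```

With Pydantic the coercion and error reporting come from the model class itself, so `parse_listing` would shrink to a single validation call.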

Custom Tools

Register your own tools the agent can call. Extend capabilities beyond built-in actions.
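
A common pattern for this kind of extension point is a decorator-based registry, sketched below. The `tool` decorator, `TOOLS` table, and `call_tool` dispatcher are hypothetical names for illustration, not droidrun's API.

```python
from typing import Callable, Dict

# Hypothetical registry: tool name -> callable the agent may invoke.
TOOLS: Dict[str, Callable[..., str]] = {}


def tool(fn: Callable[..., str]) -> Callable[..., str]:
    """Register a function under its own name and return it unchanged."""
    TOOLS[fn.__name__] = fn
    return fn


@tool
def lookup_order(order_id: str) -> str:
    """Example tool; a real one might query an internal service."""
    return f"order {order_id}: shipped"


def call_tool(name: str, **kwargs: str) -> str:
    """Dispatch a tool call that the model selected by name."""
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    return TOOLS[name](**kwargs)


print(call_tool("lookup_order", order_id="A-17"))  # order A-17: shipped
```

The agent only needs the tool's name, docstring, and parameters to decide when to call it, which is why the function is registered unchanged.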

App Cards

Pre-built interaction patterns for popular apps. Skip the boilerplate, start automating.

device setup

Use your own Android or iOS devices with the open-source Portal app. Secure, real-time, fully wireless. No root or jailbreak required.

terminal
droidrun setup
droidrun ping
[ok] device paired -- ready

start local. scale to cloud.

Same agent framework. Run locally on your own device or remotely on mobilerun.

open source

droidrun

  • pip install
  • Local devices
  • Full control
Shared Protocol
SDK / Trajectories / Prompts
cloud scale

mobilerun

  • Hosted devices
  • Parallel agents
  • Stealth mode
Start with droidrun

Run automation locally using your own Android devices or emulators. Install with pip, inspect every line of code, and stay fully in control — no cloud setup, no vendor lock-in.

Scale with Cloud

When you're ready to go beyond local, mobilerun offers hosted devices, unlimited parallel agents, global proxies, and stealth-mode execution. Production-grade infrastructure, zero maintenance.

Legacy · droidrun · mobilerun
Core Features
Can read screen content
Can interact with device
Continues to work when UI changes
Extract data using natural language
Framework Standards
Easy to set up
Self-healing
Open source
Infrastructure
Stealth & anti-detection
Hosted device fleet
No setup required

from zero to agent in 60 seconds

Three commands. One device. Full automation.

terminal
$ pip install droidrun
Successfully installed droidrun-0.5.8

$ droidrun setup
Portal is installed and accessible. You're good to go!

$ droidrun run "Open Settings and check battery level"

[agent] planning steps...
[step 1/2] tap "Settings" on home screen
[step 2/2] navigate to "Battery"

[done] task completed in 3.8s -- 2 actions, 0 retries

frequently asked questions

How is droidrun different from using Claude/GPT with computer use?

Generic computer-use agents send full screenshots to the model (~1MB per frame). droidrun sends the accessibility tree instead (~2KB): a payload about 500x smaller, which makes runs 10x faster and dramatically cheaper. droidrun also has Action Replay (record once, replay without LLM costs). A generic agent explores blindly. droidrun knows what it’s doing.
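
The record-once, replay-later idea can be sketched as follows. This is an illustration under assumptions, not droidrun's Action Replay: `CountingPlanner` stands in for an LLM-backed planner, and the JSON trajectory format is hypothetical.

```python
import json
from typing import Callable, Dict, List

Action = Dict[str, str]


class CountingPlanner:
    """Stand-in for an LLM-backed planner; counts how often it is consulted."""

    def __init__(self, plan: List[Action]) -> None:
        self._plan = plan
        self.calls = 0

    def next_action(self, step: int) -> Action:
        self.calls += 1
        return self._plan[step]


def record(planner: CountingPlanner, steps: int,
           execute: Callable[[Action], None]) -> str:
    """Run with the planner in the loop and return the trajectory as JSON."""
    trajectory: List[Action] = []
    for step in range(steps):
        action = planner.next_action(step)
        execute(action)
        trajectory.append(action)
    return json.dumps(trajectory)


def replay(trajectory_json: str, execute: Callable[[Action], None]) -> int:
    """Re-run the saved actions; the planner is never consulted."""
    actions: List[Action] = json.loads(trajectory_json)
    for action in actions:
        execute(action)
    return len(actions)


executed: List[Action] = []
planner = CountingPlanner([
    {"type": "tap", "target": "Settings"},
    {"type": "tap", "target": "Battery"},
])
saved = record(planner, 2, executed.append)
replayed = replay(saved, executed.append)
print(planner.calls, replayed, len(executed))  # 2 2 4
```

The planner is called twice during recording and not at all during replay, which is where the cost saving comes from. A plain replay like this one assumes the UI still matches the recorded run.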

How is droidrun different from Appium?

Appium requires writing test scripts with exact element selectors that break when the UI changes. droidrun is AI-native — describe what you want in natural language and the agent figures it out. It self-heals when UI changes, works on real devices, and supports both Android and iOS. Users report replacing entire Appium suites in a weekend.

Which LLMs does droidrun support?

droidrun is fully LLM-agnostic. Works with OpenAI (GPT-4, GPT-4o), Anthropic (Claude Sonnet, Opus), Google (Gemini 2, Gemini 3), DeepSeek, Ollama (local models), OpenRouter, and Qwen. Switch models with a single config change.

What is the difference between droidrun and mobilerun?

droidrun is the open-source framework (the agent brain) — install with pip, run on your own devices, full control. mobilerun is the cloud platform (the device infrastructure) — hosted real Android and iOS phones with stealth mode, residential proxies, and fleet management. Start local with droidrun, scale to the cloud with mobilerun. Same agent protocol.

Is droidrun free?

Yes — droidrun is open-source under the MIT license. pip install droidrun and you’re running in 30 seconds. For hosted cloud devices with stealth mode and residential proxies, mobilerun plans start at $5/month with a free tier for OpenClaw users.

start building today.

Clone the repo, run your first automation, and scale to production on mobilerun.ai when you're ready.