AndroidWorld Benchmark

The mobilerun framework (formerly droidrun) achieves state-of-the-art performance on AndroidWorld, with a 91.4% success rate across 116 diverse tasks. Despite setup challenges and evaluation limitations, our open-source approach, which uses GPT-5 and direct Accessibility API access, outperforms all competing agents.

AndroidWorld Benchmark Results

Success rates of leading AI agents on the 116-task AndroidWorld benchmark (03.10.2025)

mobilerun framework (fmr. droidrun): 91.4%
GUI-Owl-7B: 66.4%
JT-GUIAgent-V2: 67.2%
Mobile-Agent-v3: 73.3%
MobileUse-v2: 75%
Finalrun: 76.7%
K²-Agent: 76.7%
LX-GUIAgent: 79.3%
AutoGLM-Mobile: 80.2%
minitap-ai/mobile-use: 84.5%

Task Results

106 out of 116 tasks completed successfully
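
As a quick arithmetic check, the headline 91.4% figure follows from 106 successful tasks out of the 116 tasks in the benchmark. A minimal sketch in Python; the function name `success_rate` is illustrative and not part of mobilerun or AndroidWorld:

```python
def success_rate(successes: int, total: int) -> float:
    """Return the success rate as a percentage, rounded to one decimal place."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * successes / total, 1)


# 106 successful tasks out of the 116 tasks in the benchmark.
rate = success_rate(106, 116)
print(f"{rate}%")  # prints "91.4%"
```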

Read the methodology.

Learn how the Manager-Executor architecture with dynamic feedback loops achieves state-of-the-art results.