find_and_tap now scrolls down and rescans (up to 10 times) when the
target element isn't visible on the current screen. Stops as soon as
the element is found — no wasted scrolls. This removes the need for
LLMs to manually scroll-and-check in workflow prompts.
Also simplifies the Gemini-to-WhatsApp workflow prompts.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Multi-app workflow: ask Gemini about droidclaw.ai, copy response,
switch to WhatsApp, find contact, paste and send.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
New --flow mode executes scripted YAML steps without LLM, mapping 17
commands (tap, type, swipe, scroll, etc.) to existing actions. Element
finding uses accessibility tree text/hint/id matching.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- New actions: open_url, switch_app, notifications, pull_file, push_file, keyevent, open_settings
- Workflow system: runWorkflow() for multi-app sub-goal sequences with --workflow CLI flag
- Export runAgent() with {success, stepsUsed} return for workflow integration
- Fix clipboard_set shell escaping (single-quote wrapping matching skills.ts)
- Improve type action escaping for backticks, $, !, ?, brackets, braces
- Move parseJsonResponse to llm-providers.ts and export it
- Update SYSTEM_PROMPT and Zod schema for 22 total actions
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>