- Add device/session/step DB persistence in server agent loop
- Add goal preprocessor for compound goals (e.g., "open YouTube and search X")
- Add step-level logging to agent loop
- Fix dashboard WebSocket auth (direct DB token lookup instead of auth.api)
- Fix web layout to use locals.session.token instead of cookie
- Add dashboard-ws.svelte.ts WebSocket store with auto-reconnect
- Rewrite devices page with direct DB queries and real-time updates
- Add device detail page with live step display and session history
- Add Android companion app resources, themes, and screen capture consent
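
One way to picture the goal preprocessor for compound goals is a splitter that turns a single instruction into ordered sub-goals. The sketch below is only an illustration: the function name `splitCompoundGoal` and the connector list are assumptions, and the actual preprocessor may work quite differently (for example, by asking an LLM).

```typescript
// Hypothetical sketch: split a compound goal into ordered sub-goals.
// The connector words below are an assumption, not the project's real rules.
const CONNECTORS = /\s+(?:and then|then|and)\s+/i;

function splitCompoundGoal(goal: string): string[] {
  return goal
    .split(CONNECTORS)                    // cut at "and", "then", "and then"
    .map((part) => part.trim())           // drop stray whitespace
    .filter((part) => part.length > 0);   // ignore empty fragments
}

// Example taken from the note above.
console.log(splitCompoundGoal("open YouTube and search X"));
```

A purely lexical split like this would also break up goals such as "search for salt and pepper", so a real preprocessor likely needs more context than a connector list.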
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
find_and_tap now scrolls down and rescans (up to 10 times) when the
target element isn't visible on the current screen. It stops as soon as
the element is found, so no scrolls are wasted. This removes the need for
LLMs to manually scroll-and-check in workflow prompts.
Also simplifies the Gemini-to-WhatsApp workflow prompts.
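
The scroll-and-rescan behavior of find_and_tap described above can be sketched roughly as follows. The `DeviceScreen` interface, the `findAndTap` signature, and the fake screen are hypothetical stand-ins for illustration. Only the limit of 10 scrolls and the early stop come from the description, and the real action presumably runs asynchronously against a device.

```typescript
// Hypothetical screen abstraction; the real code talks to a device.
interface DeviceScreen {
  findElement(query: string): { x: number; y: number } | null;
  scrollDown(): void;
  tap(x: number, y: number): void;
}

const MAX_SCROLLS = 10; // "up to 10 times" per the description above

function findAndTap(screen: DeviceScreen, query: string): { found: boolean; scrolls: number } {
  for (let scrolls = 0; ; scrolls++) {
    const hit = screen.findElement(query); // scan the current screen
    if (hit) {
      screen.tap(hit.x, hit.y);            // stop as soon as the element is found
      return { found: true, scrolls };
    }
    if (scrolls === MAX_SCROLLS) return { found: false, scrolls };
    screen.scrollDown();                   // not visible yet: scroll, then rescan
  }
}

// Fake screen for illustration: the target becomes visible after N scrolls.
function makeFakeScreen(visibleAfter: number) {
  let offset = 0;
  const taps: Array<[number, number]> = [];
  const screen: DeviceScreen = {
    findElement: (_query) => (offset >= visibleAfter ? { x: 100, y: 200 } : null),
    scrollDown: () => { offset++; },
    tap: (x, y) => { taps.push([x, y]); },
  };
  return { screen, taps };
}
```

Calling `findAndTap(makeFakeScreen(3).screen, "Send")` in this sketch taps after three scrolls; a target that never appears gives up after the tenth scroll's rescan.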
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Multi-app workflow: ask Gemini about droidclaw.ai, copy response,
switch to WhatsApp, find contact, paste and send.
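
A multi-app workflow like this one can be thought of as an ordered list of per-app sub-goals handed to the agent one at a time. The sketch below is an assumption-heavy illustration: `WorkflowStep`, `runSteps`, and the stub agent are hypothetical, and the agent is synchronous here only to keep the example short. The `{success, stepsUsed}` result shape is the only part taken from the project's notes.

```typescript
// Result shape from the notes; the remaining types are hypothetical.
interface AgentResult { success: boolean; stepsUsed: number; }
type Agent = (goal: string) => AgentResult;
interface WorkflowStep { app: string; goal: string; }

function runSteps(steps: WorkflowStep[], agent: Agent) {
  let totalSteps = 0;
  for (let i = 0; i < steps.length; i++) {
    const result = agent(`In ${steps[i].app}: ${steps[i].goal}`);
    totalSteps += result.stepsUsed;
    // Abort on the first failed sub-goal; later steps depend on earlier ones.
    if (!result.success) return { success: false, failedAt: i, totalSteps };
  }
  return { success: true, failedAt: -1, totalSteps };
}

// The Gemini-to-WhatsApp flow from the note above, as two sub-goals.
const geminiToWhatsApp: WorkflowStep[] = [
  { app: "Gemini", goal: "ask about droidclaw.ai and copy the response" },
  { app: "WhatsApp", goal: "find the contact, paste the response, and send" },
];

// Stub agent for illustration: always succeeds in 4 steps.
const stubAgent: Agent = () => ({ success: true, stepsUsed: 4 });
console.log(runSteps(geminiToWhatsApp, stubAgent));
```

Stopping at the first failure is a design choice of this sketch: the paste step is meaningless if the copy step did not succeed.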
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
New --flow mode executes scripted YAML steps without an LLM, mapping 17
commands (tap, type, swipe, scroll, etc.) to existing actions. Element
finding uses accessibility tree text/hint/id matching.
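
The text/hint/id matching mentioned above might look something like the sketch below. The `UiNode` shape, the `findNode` name, and the case-insensitive substring rule are all assumptions made for illustration; the actual matcher and accessibility dump format may differ.

```typescript
// Hypothetical node shape; real accessibility dumps carry many more fields.
interface UiNode {
  text?: string;
  hint?: string;
  id?: string;
  children?: UiNode[];
}

// Depth-first, document-order search for the first node whose text, hint,
// or id contains the query (case-insensitive substring match, an assumption).
function findNode(root: UiNode, query: string): UiNode | null {
  const needle = query.toLowerCase();
  const stack: UiNode[] = [root];
  while (stack.length > 0) {
    const node = stack.pop()!;
    const fields = [node.text, node.hint, node.id];
    if (fields.some((f) => f !== undefined && f.toLowerCase().includes(needle))) {
      return node;
    }
    const kids = node.children ?? [];
    // Push in reverse so the first child is visited first.
    for (let i = kids.length - 1; i >= 0; i--) stack.push(kids[i]);
  }
  return null;
}

// Made-up tree for illustration.
const sampleTree: UiNode = {
  id: "root",
  children: [
    { id: "com.example:id/search_box", hint: "Search" },
    { id: "com.example:id/send_btn", text: "Send" },
  ],
};
console.log(findNode(sampleTree, "send"));
```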
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- New actions: open_url, switch_app, notifications, pull_file, push_file, keyevent, open_settings
- Workflow system: runWorkflow() for multi-app sub-goal sequences with --workflow CLI flag
- Export runAgent() with {success, stepsUsed} return for workflow integration
- Fix clipboard_set shell escaping (single-quote wrapping matching skills.ts)
- Improve type action escaping for backticks, $, !, ?, brackets, braces
- Move parseJsonResponse to llm-providers.ts and export it
- Update SYSTEM_PROMPT and Zod schema for 22 total actions
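
For the clipboard_set escaping fix, single-quote wrapping usually means the standard POSIX technique: wrap the value in single quotes and rewrite each embedded single quote as `'\''`. The helper below is a minimal sketch of that technique; the name `shellSingleQuote` is hypothetical, and whether skills.ts does exactly this is an assumption.

```typescript
// Hypothetical helper illustrating POSIX single-quote wrapping.
// Inside single quotes the shell treats every character literally, so only
// the single quote itself needs handling: close the quote, emit an escaped
// quote, then reopen the quote.
function shellSingleQuote(value: string): string {
  return "'" + value.replace(/'/g, "'\\''") + "'";
}

// Backticks, $, and ! pass through untouched inside single quotes.
console.log(shellSingleQuote("it's `cool` $HOME!"));
```

Because nothing is interpreted inside single quotes, this approach avoids maintaining a list of special characters, which is harder to get right with double-quote or backslash escaping.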
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>