Detect taps vs. drags in the overlay touch handler. If the pointer moves
less than 10px, launch MainActivity with REORDER_TO_FRONT so tapping
the pill from any other app brings DroidClaw back into focus.
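
A minimal sketch of the tap/drag decision described above. The real handler lives in the Android overlay; this only shows the threshold logic. The `Point` shape, the `isTap` name, and the use of Euclidean distance (rather than a per-axis check) are assumptions.

```typescript
// Hypothetical helper: names and shapes are illustrative, not from the DroidClaw source.
interface Point {
  x: number;
  y: number;
}

// Threshold taken from the commit message ("under 10px").
const TAP_SLOP_PX = 10;

/** Returns true when the pointer travelled less than the slop between down and up. */
function isTap(down: Point, up: Point, slopPx: number = TAP_SLOP_PX): boolean {
  const distance = Math.hypot(up.x - down.x, up.y - down.y);
  return distance < slopPx;
}

console.log(isTap({ x: 100, y: 200 }, { x: 103, y: 204 })); // true: moved 5px
console.log(isTap({ x: 100, y: 200 }, { x: 130, y: 240 })); // false: moved 50px
```

Anything classified as a tap would launch the activity; anything else keeps moving the pill.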
- Replace default purple Material 3 theme with logo-derived palette
(crimson red primary, dark charcoal surfaces, golden accent)
- Add Instrument Serif + Inter fonts for custom typography scale
- Add first-launch onboarding flow (API key + server URL, permissions)
- Restructure navigation: 2-tab bottom nav (Home + Settings), logs
via top bar icon, permission status indicators in top bar
- Redesign HomeScreen as chat-style interface with goal/step bubbles,
auto-scroll, bottom input bar with send/stop buttons
- Redesign SettingsScreen with Server → Connection → Permissions sections
- Redesign LogsScreen with goal banner, step badges, timestamps
- Replace default Android icon with DroidClaw logo at all densities
- Add hasOnboarded DataStore flag with auto-connect on completion
- Fix logs navigation not clearing when switching bottom tabs
Add workflow system that lets users describe automations in natural
language through the same input field. The server LLM classifies
input as either an immediate goal or a workflow rule, then:
- Parses workflow descriptions into structured trigger conditions
- Stores workflows per-user in Postgres
- Syncs workflows to device via WebSocket
- NotificationListenerService monitors notifications and triggers
matching workflows as agent goals
Also cleans up overlay text and adds network security config.
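
To make the trigger matching in the workflow list above concrete, here is a rough sketch of comparing a notification against stored trigger conditions. On the device this happens in the NotificationListenerService; the field names (`packageName`, `titleContains`, `textContains`) and the case-insensitive substring rule are assumptions, not the actual workflow schema.

```typescript
// Hypothetical shapes: the real trigger schema is not shown in the commit message.
interface NotificationEvent {
  packageName: string;
  title: string;
  text: string;
}

interface WorkflowTrigger {
  packageName?: string;
  titleContains?: string;
  textContains?: string;
}

interface Workflow {
  id: string;
  trigger: WorkflowTrigger;
  goal: string; // goal handed to the agent when the trigger matches
}

/** Case-insensitive substring test; an unset condition matches everything. */
function contains(haystack: string, needle?: string): boolean {
  return needle === undefined || haystack.toLowerCase().includes(needle.toLowerCase());
}

function matchesTrigger(trigger: WorkflowTrigger, n: NotificationEvent): boolean {
  const packageOk = trigger.packageName === undefined || trigger.packageName === n.packageName;
  return packageOk && contains(n.title, trigger.titleContains) && contains(n.text, trigger.textContains);
}

/** Returns the agent goals of every workflow whose trigger matches the notification. */
function goalsForNotification(workflows: Workflow[], n: NotificationEvent): string[] {
  return workflows.filter((w) => matchesTrigger(w.trigger, n)).map((w) => w.goal);
}
```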
CIO uses Java NIO selectors, which Android's power management freezes
when the app is backgrounded, dropping the WebSocket connection. OkHttp
is Android-native and maintains connections through the OS network stack.
The agent loop checked signal.aborted only at the top of each iteration,
but the LLM fetch() call (which takes seconds) never received the signal.
Now the signal is passed to fetch() and checked after LLM errors and
before the inter-step sleep, so aborting takes effect mid-step.
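
The three check points described above can be sketched roughly as follows. This is not the actual agent loop: `runLoopSketch`, the `LlmCall` type, the `"done"` sentinel, and the retry-on-error behavior are assumptions made for illustration.

```typescript
// Hypothetical LLM call that honors an AbortSignal, for example:
//   (signal) => fetch(LLM_URL, { method: "POST", body, signal }).then((r) => r.text())
type LlmCall = (signal: AbortSignal) => Promise<string>;

type LoopResult = "done" | "aborted" | "max_steps";

function sleep(ms: number): Promise<void> {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

async function runLoopSketch(
  callLlm: LlmCall,
  signal: AbortSignal,
  maxSteps: number,
  stepDelayMs: number,
): Promise<LoopResult> {
  for (let step = 0; step < maxSteps; step++) {
    // Check 1: top of each iteration (the only check before the fix).
    if (signal.aborted) return "aborted";

    let decision: string;
    try {
      // The signal is forwarded so an in-flight request is cancelled immediately.
      decision = await callLlm(signal);
    } catch {
      // Check 2: an aborted fetch rejects, so tell it apart from a real LLM error.
      if (signal.aborted) return "aborted";
      continue; // assumption: other errors simply retry on the next iteration
    }

    if (decision === "done") return "done";

    // Check 3: do not sit through the inter-step sleep after a stop request.
    if (signal.aborted) return "aborted";
    await sleep(stepDelayMs);
  }
  return "max_steps";
}
```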
- Add draggable agent overlay pill (status dot + step text + stop button)
that shows over other apps while connected. Fix ComposeView rendering
in service context by providing a SavedStateRegistryOwner.
- Add stop_goal protocol message so the overlay/client can abort a
running agent session; server aborts via AbortController.
- Persist screen-capture consent to SharedPreferences so it survives
process death; restore on ConnectionService connect and Settings resume.
- Query AccessibilityManager for real service state instead of relying
on in-process MutableStateFlow that resets on restart.
- Add overlay permission checklist item and SYSTEM_ALERT_WINDOW manifest
entry.
- Filter DroidClaw's own overlay nodes from the accessibility tree so the
agent never interacts with them.
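
One way the stop_goal handling above might look on the server: keep one AbortController per running session and abort it when the message arrives. Only the `stop_goal` type name comes from the commit message; the message shape, the `activeSessions` map, and the "new goal replaces the running one" rule are assumptions.

```typescript
// Hypothetical protocol subset.
type ClientMessage = { type: "stop_goal" } | { type: "goal"; text: string };

// One controller per device with a running agent session.
const activeSessions = new Map<string, AbortController>();

/** Registers a session and returns the signal to hand to the agent loop. */
function startSession(deviceId: string): AbortSignal {
  activeSessions.get(deviceId)?.abort(); // assumption: a new goal replaces the running one
  const controller = new AbortController();
  activeSessions.set(deviceId, controller);
  return controller.signal;
}

/** Returns true when a running session was aborted in response to the message. */
function handleStopGoal(deviceId: string, msg: ClientMessage): boolean {
  if (msg.type !== "stop_goal") return false;
  const controller = activeSessions.get(deviceId);
  if (!controller) return false;
  controller.abort();
  activeSessions.delete(deviceId);
  return true;
}
```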
Railway proxy closes idle DB connections after ~60s, causing
CONNECTION_CLOSED errors on stale sockets. Set idle_timeout=20s and
max_lifetime=5m so postgres-js recycles connections before they die.
Also fix sendCommand to fall back to persistent device ID on reconnect.
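
As a config fragment, the two timeouts above map onto postgres-js options like this. Both options take seconds; how the options object is wired into the client is an assumption and is shown only as a comment.

```typescript
// postgres-js connection options; both values are in seconds.
const poolOptions = {
  idle_timeout: 20, // close idle connections well before the proxy's ~60s cutoff
  max_lifetime: 60 * 5, // recycle every connection after 5 minutes
};

// Assumed wiring (not shown in the commit message):
//   import postgres from "postgres";
//   const sql = postgres(process.env.DATABASE_URL!, poolOptions);
```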
Cookie forwarding between dash.droidclaw.ai and tunnel.droidclaw.ai was
unreliable. Now the web app passes userId + shared internal secret via
headers. Also removes debug logging from device auth and session middleware.
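
A rough sketch of the header-based handoff described above, seen from the receiving side. The header names and the function names are hypothetical; the commit message only says that userId and a shared internal secret travel in headers.

```typescript
// Hypothetical header names (lowercased, as most servers normalize them).
const USER_ID_HEADER = "x-internal-user-id";
const SECRET_HEADER = "x-internal-secret";

/** Compares two strings without returning early at the first differing character. */
function constantTimeEqual(a: string, b: string): boolean {
  if (a.length !== b.length) return false;
  let diff = 0;
  for (let i = 0; i < a.length; i++) {
    diff |= a.charCodeAt(i) ^ b.charCodeAt(i);
  }
  return diff === 0;
}

/** Returns the trusted userId, or null when the secret is missing or wrong. */
function authenticateInternal(
  headers: Record<string, string | undefined>,
  expectedSecret: string,
): string | null {
  const presented = headers[SECRET_HEADER];
  const userId = headers[USER_ID_HEADER];
  if (!presented || !userId) return null;
  return constantTimeEqual(presented, expectedSecret) ? userId : null;
}
```

Because the userId is trusted as soon as the secret matches, this only makes sense for server-to-server calls where the secret never reaches the browser.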
DataStore flow re-emissions were resetting editingApiKey via remember(apiKey),
hiding the save button mid-edit. Now uses a null sentinel to track edit state
independently of the stored value.
Reverts middleware and dashboard WS to direct DB session lookups.
Replaces auth.api.verifyApiKey in device WS with direct DB query
using SHA-256 hash matching, removing dependency on BETTER_AUTH_SECRET
for auth validation.
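
A minimal sketch of SHA-256 hash matching for API keys. The hex encoding and the in-memory `Map` standing in for the DB query are assumptions; the stored hash format in the real table may differ.

```typescript
import { createHash } from "node:crypto";

/** Hashes a raw API key. Hex output is an assumption about the stored format. */
function hashApiKey(rawKey: string): string {
  return createHash("sha256").update(rawKey).digest("hex");
}

/**
 * Hypothetical lookup: `storedHashes` maps key hash -> userId and stands in
 * for a direct DB query on the hashed-key column.
 */
function findUserIdByKey(rawKey: string, storedHashes: Map<string, string>): string | null {
  return storedHashes.get(hashApiKey(rawKey)) ?? null;
}
```

Since the comparison uses only the hash of the presented key, no signing secret is needed to validate a device connection.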
Swipe coordinates were hardcoded for 1080x2400 screens, causing scrolls
to fail on devices with different resolutions. Now reads screenWidth and
screenHeight from DeviceInfo and computes coordinates proportionally.
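
The proportional computation can be sketched like this. `screenWidth` and `screenHeight` are the DeviceInfo fields named above; the `Swipe` shape and the 50%/70%/30% fractions are assumptions chosen for illustration.

```typescript
interface DeviceInfo {
  screenWidth: number;
  screenHeight: number;
}

interface Swipe {
  x1: number;
  y1: number;
  x2: number;
  y2: number;
}

/** Swipe up along the vertical centre line to scroll the content down. */
function scrollDownSwipe(info: DeviceInfo): Swipe {
  const x = Math.round(info.screenWidth * 0.5);
  return {
    x1: x,
    y1: Math.round(info.screenHeight * 0.7), // start in the lower part of the screen
    x2: x,
    y2: Math.round(info.screenHeight * 0.3), // end in the upper part
  };
}

console.log(scrollDownSwipe({ screenWidth: 1080, screenHeight: 2400 }));
// { x1: 540, y1: 1680, x2: 540, y2: 720 }
console.log(scrollDownSwipe({ screenWidth: 720, screenHeight: 1600 }));
// { x1: 360, y1: 1120, x2: 360, y2: 480 }
```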
Rewrite the README with ASCII diagrams, detailed setup instructions, and a
conversational tone. Update the root package.json with a packageManager field
for bun, a web workspace, and build/start scripts pointing to web/ for
Railway Railpack compatibility.
- Make screen capture permission and battery optimization states reactive
- Add lifecycle observer to refresh battery status on resume
- Add lifecycle-runtime-compose dependency for non-deprecated LocalLifecycleOwner
- Replace deprecated LocalLifecycleOwner import
- Remove unused REQUEST_CODE constant
- Use KTX createBitmap() and bitmap[x,y] extensions
- Add misc.xml and junie.xml to .gitignore
Skills (copy_visible_text, find_and_tap, submit_message, read_screen,
wait_for_content, compose_email) were CLI-only using direct ADB. The
server prompt advertised them but they silently failed when chosen.
Now intercepted in the agent loop before actionToCommand() and executed
server-side using existing WebSocket primitives (get_screen, tap, swipe,
clipboard_set). Each skill replaces 3-8 LLM calls with deterministic
server-side logic.
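
The interception step might be sketched as a routing decision made before an action is translated into a device command. The skill names are the ones listed above; the `Action` shape and `routeAction` are hypothetical.

```typescript
// Skill names from the commit message; everything else is an illustrative shape.
const SERVER_SIDE_SKILLS = new Set<string>([
  "copy_visible_text",
  "find_and_tap",
  "submit_message",
  "read_screen",
  "wait_for_content",
  "compose_email",
]);

interface Action {
  action: string;
}

type Route = "skill" | "command";

/**
 * Decides whether an LLM-chosen action runs as a multi-step server-side skill
 * or is translated into a single device command (the actionToCommand() path).
 */
function routeAction(decision: Action): Route {
  return SERVER_SIDE_SKILLS.has(decision.action) ? "skill" : "command";
}
```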
The UI agent had no memory of previous actions; each step was a fresh
single-shot LLM call. After typing and sending a message, the LLM saw
an empty text field and retyped the message in a loop.
- Add RECENT_ACTIONS (last 5 actions with text/result) to user prompt
- Add chat app completion detection rule to dynamic prompt
- Add send-success hints for WhatsApp and Messages apps
- Add git convention to CLAUDE.md (no co-author lines)
- Add empty goal guard in parser (returns done instead of passthrough)
- Replace `as any` casts in pipeline.ts with proper ActionDecision types
- Add runtime type guards for untrusted LLM output in classifier
- Add intent action to dynamic prompt so UI agent can fire intents
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
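
For the RECENT_ACTIONS entry above, a bounded history and its prompt rendering could look roughly like this. The record fields and the output format are assumptions; only the name RECENT_ACTIONS and the limit of five come from the commit message.

```typescript
// Hypothetical record shape for one executed step.
interface ActionRecord {
  action: string;
  text?: string;
  result: string;
}

const MAX_RECENT_ACTIONS = 5;

/** Appends an entry and keeps only the most recent MAX_RECENT_ACTIONS. */
function rememberAction(history: ActionRecord[], entry: ActionRecord): ActionRecord[] {
  return [...history, entry].slice(-MAX_RECENT_ACTIONS);
}

/** Renders the history as a block that can be appended to the user prompt. */
function formatRecentActions(history: ActionRecord[]): string {
  if (history.length === 0) return "RECENT_ACTIONS: none";
  const lines = history.map((a, i) => {
    const typed = a.text ? ` "${a.text}"` : "";
    return `${i + 1}. ${a.action}${typed} -> ${a.result}`;
  });
  return ["RECENT_ACTIONS:", ...lines].join("\n");
}
```

With the previous `type` and `tap` steps visible in the prompt, an empty text field after sending reads as success rather than as a cue to type again.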
Replace preprocessor+runAgentLoop with runPipeline in both device.ts
(WebSocket) and goals.ts (REST). The pipeline orchestrates: deterministic
parser (stage 1) -> LLM classifier (stage 2) -> lean UI agent (stage 3).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
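
The three-stage flow above can be pictured as a chain in which each stage either resolves the goal or passes it on. This is a sketch, not the real runPipeline: the `Stage` signature, the null-means-pass-through convention, and the result shape are assumptions.

```typescript
// A stage returns a result string when it handles the goal, or null to pass through.
type Stage = (goal: string) => Promise<string | null>;

interface PipelineStages {
  parser: Stage; // stage 1: deterministic, no LLM call
  classifier: Stage; // stage 2: one LLM call to classify the goal
  uiAgent: (goal: string) => Promise<string>; // stage 3: always produces a result
}

interface PipelineResult {
  stage: 1 | 2 | 3;
  output: string;
}

async function runPipelineSketch(goal: string, stages: PipelineStages): Promise<PipelineResult> {
  const parsed = await stages.parser(goal);
  if (parsed !== null) return { stage: 1, output: parsed };

  const classified = await stages.classifier(goal);
  if (classified !== null) return { stage: 2, output: classified };

  // Only goals the cheaper stages could not resolve reach the UI agent.
  return { stage: 3, output: await stages.uiAgent(goal) };
}
```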
When pipelineMode is enabled in AgentLoopOptions, the loop uses
buildDynamicPrompt() with per-screen context (editable fields,
scrollable elements, app hints, stuck state) instead of the static
mega-prompt. Legacy mode (default) is unchanged.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Android: fetch installed apps via PackageManager, send to server on connect
- Android: add QUERY_ALL_PACKAGES permission for full app visibility
- Android: fix duplicate Intent import, increase accessibility retry window
- Android: default server URL to ws:// instead of wss://
- Server: store installed apps in device metadata JSONB
- Server: inject installed apps context into LLM prompt
- Server: preprocessor resolves app names from device's actual installed apps
- Server: add POST /goals/stop endpoint with AbortController cancellation
- Server: rewrite session middleware to direct DB token lookup
- Server: goals route fetches user's saved LLM config from DB
- Web: show installed apps in device detail Overview tab with search
- Web: add Stop button for running goals
- Web: replace API routes with remote commands (submitGoal, stopGoal)
- Web: add error display for goal submission failures
- Shared: add InstalledApp type and apps message to protocol