Compare commits

24 Commits
v0.3.1 ... main

Author SHA1 Message Date
Somasundaram Mahesh
a30341516f feat(android): auto-connect, assistant invocation, suggestion cards, and onboarding assistant step
- Auto-connect to server on app open when API key is configured
- Show error card below top bar on connection errors
- Fix VoiceInteractionService registration with RecognitionService stub
- Voice session triggers overlay command panel instead of separate UI
- Add suggestion cards (recent goals + defaults) to HomeScreen empty state
- Add digital assistant setup step to onboarding with skip option
- Add ACTION_SHOW_COMMAND_PANEL to ConnectionService and AgentOverlay

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-20 06:54:00 +05:30
Somasundaram Mahesh
474395e8c4 feat(android): add overlay command panel, dismiss target, vignette, voice integration, and theme updates
- Add CommandPanelOverlay with suggestion cards and text input for goals
- Add DismissTargetView for drag-to-dismiss floating pill
- Add VignetteOverlay for crimson glow during agent execution
- Integrate voice mic button in command panel
- Add VoiceInteractionService for system assistant registration
- Store recent goals in DataStore for command panel suggestions
- Update GradientBorder and VoiceOverlayContent to DroidClaw crimson/golden theme
- Fix default assistant settings to use ACTION_VOICE_INPUT_SETTINGS
- Merge upstream voice overlay architecture with local overlay features
2026-02-20 06:23:00 +05:30
Sanju Sivalingam
2411f47914 chore: remove CLAUDE.md from tracking and add to gitignore 2026-02-20 03:01:01 +05:30
Sanju Sivalingam
a42b5b08f4 fix: address critical review issues in voice overlay
- Clean up voice sessions on WebSocket disconnect (prevents timer leak)
- Guard against missing LLM config in voice_stop send path
- Return overlay to idle on goal_failed (prevents stuck UI)
2026-02-20 02:16:39 +05:30
Sanju Sivalingam
1f47a990cc feat(android): add RECORD_AUDIO runtime permission handling 2026-02-20 02:12:38 +05:30
Sanju Sivalingam
16f581f479 feat(android): wire voice recording and transcript into ConnectionService 2026-02-20 02:11:41 +05:30
Sanju Sivalingam
07f608a901 feat(android): expand AgentOverlay with voice mode state machine 2026-02-20 02:10:00 +05:30
Sanju Sivalingam
7b685b1b0f feat(android): add voice overlay UI with transcript and action buttons 2026-02-20 02:08:00 +05:30
Sanju Sivalingam
2c10e61390 feat(android): add animated gradient border composable 2026-02-20 02:07:02 +05:30
Sanju Sivalingam
36ffb15f39 feat(android): add VoiceRecorder with AudioRecord PCM streaming 2026-02-20 02:06:10 +05:30
Sanju Sivalingam
2986766d41 feat(android): add voice protocol models and overlay mode enum 2026-02-20 02:05:06 +05:30
Sanju Sivalingam
3522b66b02 feat(server): wire voice messages into device handler 2026-02-20 01:59:50 +05:30
Sanju Sivalingam
63276d3573 feat(server): add voice session handler with Groq Whisper STT 2026-02-20 01:57:53 +05:30
Sanju Sivalingam
4a128f7719 feat(shared): add voice overlay protocol types 2026-02-20 01:55:40 +05:30
Sanju Sivalingam
669aa3d9b1 docs: add voice overlay implementation plan
11 tasks covering shared protocol types, server Groq Whisper STT handler,
Android VoiceRecorder, gradient border overlay, voice panel UI, AgentOverlay
state machine, ConnectionService wiring, and permission handling.
2026-02-20 01:52:40 +05:30
Sanju Sivalingam
eae221b904 docs: add voice overlay design document
Design for voice-activated overlay feature — tap floating pill to activate
voice mode, stream audio to server for Groq Whisper STT, show live
transcription on screen with glowing gradient border, send as goal.
2026-02-20 01:46:49 +05:30
Sanju Sivalingam
fcda17109b update readme 2026-02-19 08:23:25 +05:30
Somasundaram Mahesh
e1bc16397e Revert "feat(android): make battery optimization permission optional"
This reverts commit 795e0299fa.
2026-02-19 00:56:28 +05:30
Somasundaram Mahesh
795e0299fa feat(android): make battery optimization permission optional
Remove battery exemption from required permissions gate in onboarding
and permission status bar — it is now informational only and does not
block "Get Started" or the all-permissions indicator.

Bump download links to v0.3.2.
2026-02-19 00:34:12 +05:30
Sanju Sivalingam
0b5a447c4d feat: replace get started button with open dashboard link 2026-02-19 00:19:02 +05:30
Sanju Sivalingam
d35d685c3f feat: add APK download links and launch banner across site, web, and README 2026-02-18 23:56:38 +05:30
Sanju Sivalingam
a3a50539be chore: trigger deployment 2026-02-18 23:41:51 +05:30
Sanju Sivalingam
b3ade24e38 fix: update email sender name and switch to production Polar checkout link 2026-02-18 23:33:54 +05:30
Sanju Sivalingam
ce6d1e320b fix: handle Polar activation limit gracefully + switch checkout to command pattern
- Server: wrap licenseKeys.activate() in try/catch — if activation limit
  reached, treat as already-activated and proceed to store the key
- Web: switch activateFromCheckout from form() to command() pattern for
  programmatic invocation with proper error handling
- Activate page: auto-fires on mount, shows spinner/error states, retry button
2026-02-18 23:28:19 +05:30
38 changed files with 4047 additions and 311 deletions

.gitignore

@@ -12,3 +12,4 @@ docs/architecture-web-flow.md
docs/INTENT.md
OPTION1-IMPLEMENTATION.md
HOSTED-PLAN.md
CLAUDE.md


@@ -1,69 +0,0 @@
# CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
## Project Overview
DroidClaw — an AI agent that controls Android devices through the Accessibility API. It runs a Perception → Reasoning → Action loop: captures the screen state via `uiautomator dump`, sends it to an LLM for decision-making, and executes the chosen action via ADB.
**Runtime:** Bun (TypeScript, ES2022 modules). Bun natively loads `.env` files — no dotenv needed.
## Commands
All commands run from the project root:
```bash
bun install # Install dependencies
bun run src/kernel.ts # Start the agent (interactive, prompts for goal)
bun run build # Compile to dist/ (bun build --target bun)
bun run typecheck # Type-check only (tsc --noEmit)
```
There are no tests currently.
## Architecture
Seven source files in `src/`, no subdirectories:
- **kernel.ts** — Entry point and main agent loop. Reads goal from stdin, runs up to MAX_STEPS iterations of: capture screen → diff with previous → call LLM → execute action → track history. Handles stuck-loop detection and vision fallback when the accessibility tree is empty.
- **actions.ts** — 15 action implementations (tap, type, enter, swipe, home, back, wait, done, longpress, screenshot, launch, clear, clipboard_get, clipboard_set, shell). Each wraps ADB commands via `Bun.spawnSync()`. `runAdbCommand()` provides exponential backoff retry.
- **llm-providers.ts** — LLM abstraction with `LLMProvider` interface and factory (`getLlmProvider()`). Five providers: OpenAI, Groq (OpenAI-compatible endpoint), Ollama (local LLMs, OpenAI-compatible), AWS Bedrock (Anthropic + Meta model formats), OpenRouter (Vercel AI SDK). Contains the full SYSTEM_PROMPT with all 15 action definitions and rules.
- **sanitizer.ts** — Parses Android Accessibility XML (via `fast-xml-parser`) into `UIElement[]`. Depth-first walk extracting bounds, center coordinates, state flags (enabled, checked, focused, etc.), and parent context. `computeScreenHash()` used for stuck-loop detection.
- **config.ts** — Singleton `Config` object reading from `process.env` with defaults from constants. `Config.validate()` checks required API keys at startup.
- **constants.ts** — All magic values: ADB keycodes, swipe coordinates (hardcoded for 1080px-wide screens), default models, file paths, agent defaults.
## Key Patterns
- **Provider factory:** `getLlmProvider()` returns the appropriate `LLMProvider` based on `Config.LLM_PROVIDER`. Groq and Ollama reuse the `OpenAIProvider` class with different base URLs.
- **Screen state diffing:** Hash-based comparison (id + text + center + state). After STUCK_THRESHOLD unchanged steps, recovery hints are injected into the LLM prompt.
- **Vision fallback:** When `getInteractiveElements()` returns empty (custom UI, WebView, Flutter), a screenshot is captured and the LLM gets a fallback context suggesting coordinate-based taps.
- **LLM response parsing:** `parseJsonResponse()` handles both clean JSON and markdown-wrapped code blocks. Falls back to "wait" action on parse failure.
- **Long press via swipe:** Implemented as `input swipe x y x y 1000` (swipe from point to same point with long duration).
- **Text escaping for ADB:** Spaces become `%s`, shell metacharacters are backslash-escaped in `executeType()`.
## Adding a New LLM Provider
1. Implement `LLMProvider` interface in `llm-providers.ts`
2. Add case to `getLlmProvider()` factory
3. Add config fields to `config.ts` and env vars to `.env.example`
## Adding a New Action
1. Add fields to `ActionDecision` interface in `actions.ts`
2. Implement `executeNewAction()` function
3. Add case to `executeAction()` switch
4. Document the action JSON format in `SYSTEM_PROMPT` in `llm-providers.ts`
## Environment Setup
Requires: Bun 1.0+, ADB (Android SDK Platform Tools) in PATH, an Android device connected via USB/WiFi with accessibility enabled, and either a local Ollama install or an API key for a cloud LLM provider (Groq, OpenAI, Bedrock, or OpenRouter).
Copy `.env.example` to `.env` and configure `LLM_PROVIDER` + the corresponding API key.
## Device Assumptions
Swipe coordinates in `constants.ts` are hardcoded for 1080px-wide screens (center X=540, center Y=1200). Adjust `SWIPE_COORDS` and `SCREEN_CENTER_*` for different resolutions.
## Git Conventions
- Do NOT add `Co-Authored-By: Claude` lines to commit messages.


@@ -2,6 +2,8 @@
> an ai agent that controls your android phone. give it a goal in plain english — it figures out what to tap, type, and swipe.
**[Download Android APK (v0.3.1)](https://github.com/unitedbyai/droidclaw/releases/download/v0.3.1/app-debug.apk)** | **[Dashboard](https://app.droidclaw.ai)** | **[Discord](https://discord.gg/nRHKQ29j)**
i wanted to turn my old android devices into ai agents. after a few hours reverse engineering accessibility trees and playing with tailscale.. it worked.
think of it this way — a few years back, we could automate android with predefined flows. now imagine that automation layer has an llm brain. it can read any screen, understand what's happening, decide what to do, and execute. you don't need api's. you don't need to build integrations. just install your favourite apps and tell the agent what you want done.
@@ -497,10 +499,6 @@ built by [unitedby.ai](https://unitedby.ai) — an open ai community
- [sanju sivalingam](https://sanju.sh)
- [somasundaram mahesh](https://msomu.com)
## acknowledgements
droidclaw's workflow orchestration was influenced by [android action kernel](https://github.com/Action-State-Labs/android-action-kernel) from action state labs. we took the core idea of sub-goal decomposition and built a different system around it — with stuck recovery, 28 actions, multi-step skills, and vision fallback.
## license
mit


@@ -12,6 +12,7 @@
<uses-permission android:name="android.permission.QUERY_ALL_PACKAGES"
tools:ignore="QueryAllPackagesPermission" />
<uses-permission android:name="android.permission.SYSTEM_ALERT_WINDOW" />
<uses-permission android:name="android.permission.RECORD_AUDIO" />
<application
android:name=".DroidClawApp"
@@ -27,13 +28,17 @@
<activity
android:name=".MainActivity"
android:exported="true"
android:launchMode="singleTop"
android:label="@string/app_name"
android:theme="@style/Theme.DroidClaw">
<intent-filter>
<action android:name="android.intent.action.MAIN" />
<category android:name="android.intent.category.LAUNCHER" />
</intent-filter>
<intent-filter>
<action android:name="android.intent.action.ASSIST" />
<category android:name="android.intent.category.DEFAULT" />
</intent-filter>
</activity>
<service
@@ -53,6 +58,33 @@
android:foregroundServiceType="connectedDevice"
android:exported="false" />
<service
android:name=".voice.DroidClawVoiceInteractionService"
android:label="@string/app_name"
android:permission="android.permission.BIND_VOICE_INTERACTION"
android:exported="true">
<intent-filter>
<action android:name="android.service.voice.VoiceInteractionService" />
</intent-filter>
<meta-data
android:name="android.voice_interaction"
android:resource="@xml/voice_interaction_service" />
</service>
<service
android:name=".voice.DroidClawVoiceSessionService"
android:permission="android.permission.BIND_VOICE_INTERACTION"
android:exported="true" />
<service
android:name=".voice.DroidClawRecognitionService"
android:permission="android.permission.BIND_VOICE_INTERACTION"
android:exported="true">
<intent-filter>
<action android:name="android.speech.RecognitionService" />
</intent-filter>
</service>
</application>
</manifest>


@@ -1,9 +1,17 @@
package com.thisux.droidclaw
import android.Manifest
import android.content.Intent
import android.os.Bundle
import android.provider.Settings
import androidx.activity.ComponentActivity
import androidx.lifecycle.lifecycleScope
import kotlinx.coroutines.flow.first
import kotlinx.coroutines.launch
import androidx.activity.compose.setContent
import androidx.activity.enableEdgeToEdge
import androidx.activity.result.contract.ActivityResultContracts
import com.thisux.droidclaw.connection.ConnectionService
import androidx.compose.foundation.layout.fillMaxSize
import androidx.compose.foundation.layout.padding
import androidx.compose.material.icons.Icons
@@ -32,12 +40,22 @@ import androidx.navigation.compose.composable
import androidx.navigation.compose.currentBackStackEntryAsState
import androidx.navigation.compose.rememberNavController
import com.thisux.droidclaw.ui.components.PermissionStatusBar
import com.thisux.droidclaw.model.ConnectionState
import com.thisux.droidclaw.ui.screens.HomeScreen
import com.thisux.droidclaw.ui.screens.LogsScreen
import com.thisux.droidclaw.ui.screens.OnboardingScreen
import com.thisux.droidclaw.ui.screens.SettingsScreen
import com.thisux.droidclaw.ui.theme.DroidClawTheme
import com.thisux.droidclaw.ui.theme.InstrumentSerif
import com.thisux.droidclaw.ui.theme.StatusRed
import androidx.compose.foundation.background
import androidx.compose.foundation.layout.Box
import androidx.compose.foundation.layout.Column
import androidx.compose.foundation.layout.fillMaxWidth
import androidx.compose.foundation.layout.height
import androidx.compose.foundation.shape.RoundedCornerShape
import androidx.compose.ui.draw.clip
import androidx.compose.ui.unit.dp
sealed class Screen(val route: String, val label: String) {
data object Home : Screen("home", "Home")
@@ -47,6 +65,12 @@ sealed class Screen(val route: String, val label: String) {
}
class MainActivity : ComponentActivity() {
private val audioPermissionLauncher = registerForActivityResult(
ActivityResultContracts.RequestPermission()
) { _ ->
// Permission result handled — user can tap overlay pill again
}
override fun onCreate(savedInstanceState: Bundle?) {
super.onCreate(savedInstanceState)
enableEdgeToEdge()
@@ -55,6 +79,39 @@ class MainActivity : ComponentActivity() {
MainNavigation()
}
}
if (intent?.getBooleanExtra("request_audio_permission", false) == true) {
audioPermissionLauncher.launch(Manifest.permission.RECORD_AUDIO)
}
autoConnectIfNeeded()
}
override fun onNewIntent(intent: Intent) {
super.onNewIntent(intent)
if (intent.getBooleanExtra("request_audio_permission", false)) {
audioPermissionLauncher.launch(Manifest.permission.RECORD_AUDIO)
}
}
override fun onResume() {
super.onResume()
val service = ConnectionService.instance ?: return
if (Settings.canDrawOverlays(this)) {
service.overlay?.show()
}
}
private fun autoConnectIfNeeded() {
if (ConnectionService.connectionState.value != com.thisux.droidclaw.model.ConnectionState.Disconnected) return
val app = application as DroidClawApp
lifecycleScope.launch {
val apiKey = app.settingsStore.apiKey.first()
if (apiKey.isNotBlank()) {
val intent = Intent(this@MainActivity, ConnectionService::class.java).apply {
action = ConnectionService.ACTION_CONNECT
}
startForegroundService(intent)
}
}
}
}
@@ -154,10 +211,31 @@ fun MainNavigation() {
) { innerPadding ->
val startDestination = if (hasOnboarded) Screen.Home.route else Screen.Onboarding.route
val connectionState by ConnectionService.connectionState.collectAsState()
val errorMessage by ConnectionService.errorMessage.collectAsState()
Column(modifier = Modifier.padding(innerPadding)) {
if (showChrome && connectionState == ConnectionState.Error) {
Box(
modifier = Modifier
.fillMaxWidth()
.padding(horizontal = 16.dp, vertical = 4.dp)
.clip(RoundedCornerShape(8.dp))
.background(StatusRed.copy(alpha = 0.15f))
.padding(horizontal = 12.dp, vertical = 8.dp)
) {
Text(
text = errorMessage ?: "Connection error",
style = MaterialTheme.typography.bodySmall,
color = StatusRed
)
}
}
NavHost(
navController = navController,
startDestination = startDestination,
-modifier = Modifier.padding(innerPadding)
+modifier = Modifier.weight(1f)
) {
composable(Screen.Onboarding.route) {
OnboardingScreen(
@@ -173,4 +251,5 @@ fun MainNavigation() {
composable(Screen.Logs.route) { LogsScreen() }
}
}
}
}


@@ -12,6 +12,7 @@ import com.thisux.droidclaw.model.PongMessage
import com.thisux.droidclaw.model.ResultResponse
import com.thisux.droidclaw.model.ScreenResponse
import com.thisux.droidclaw.model.ServerMessage
import kotlinx.coroutines.delay
import kotlinx.coroutines.flow.MutableStateFlow
class CommandRouter(
@@ -27,6 +28,10 @@ class CommandRouter(
val currentGoal = MutableStateFlow("")
val currentSessionId = MutableStateFlow<String?>(null)
// Called before/after screen capture to hide/show overlays that would pollute the agent's view
var beforeScreenCapture: (() -> Unit)? = null
var afterScreenCapture: (() -> Unit)? = null
private var gestureExecutor: GestureExecutor? = null
fun updateGestureExecutor() {
@@ -62,12 +67,24 @@ class CommandRouter(
currentSteps.value = currentSteps.value + step
Log.d(TAG, "Step ${step.step}: ${step.reasoning}")
}
"transcript_partial" -> {
ConnectionService.overlayTranscript.value = msg.text ?: ""
ConnectionService.instance?.overlay?.updateTranscript(msg.text ?: "")
Log.d(TAG, "Transcript partial: ${msg.text}")
}
"transcript_final" -> {
ConnectionService.overlayTranscript.value = msg.text ?: ""
ConnectionService.instance?.overlay?.updateTranscript(msg.text ?: "")
Log.d(TAG, "Transcript final: ${msg.text}")
}
"goal_completed" -> {
currentGoalStatus.value = if (msg.success == true) GoalStatus.Completed else GoalStatus.Failed
ConnectionService.instance?.overlay?.returnToIdle()
Log.i(TAG, "Goal completed: success=${msg.success}, steps=${msg.stepsUsed}")
}
"goal_failed" -> {
currentGoalStatus.value = GoalStatus.Failed
ConnectionService.instance?.overlay?.returnToIdle()
Log.i(TAG, "Goal failed: ${msg.message}")
}
@@ -75,7 +92,7 @@ class CommandRouter(
}
}
-private fun handleGetScreen(requestId: String) {
+private suspend fun handleGetScreen(requestId: String) {
updateGestureExecutor()
val svc = DroidClawAccessibilityService.instance
val elements = svc?.getScreenTree() ?: emptyList()
@@ -85,7 +102,11 @@ class CommandRouter(
var screenshot: String? = null
if (elements.isEmpty()) {
// Hide overlays so the agent gets a clean screenshot
beforeScreenCapture?.invoke()
delay(150) // wait for virtual display to render a clean frame
val bytes = captureManager?.capture()
afterScreenCapture?.invoke()
if (bytes != null) {
screenshot = Base64.encodeToString(bytes, Base64.NO_WRAP)
}
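
The hunk above hides the overlays, waits briefly, captures, then restores them. A minimal sketch of that ordering follows; it is not the project's code. The helper name `withOverlaysHidden`, its parameters, and the use of `Thread.sleep` in place of the coroutine `delay` are assumptions made for illustration. The `try`/`finally` is also an addition of this sketch, so that the restore hook still runs if the capture throws.

```kotlin
// Hypothetical helper: run `capture` between a hide hook and a restore hook.
fun <T> withOverlaysHidden(
    before: (() -> Unit)?,
    after: (() -> Unit)?,
    settleMillis: Long = 150,
    capture: () -> T
): T {
    before?.invoke()
    try {
        // Stand-in for delay(150): give the display time to render a clean frame.
        Thread.sleep(settleMillis)
        return capture()
    } finally {
        // Runs on success and on failure, so the overlay is always restored.
        after?.invoke()
    }
}

fun main() {
    val events = mutableListOf<String>()
    val result = withOverlaysHidden(
        before = { events += "hide" },
        after = { events += "show" },
        settleMillis = 1
    ) {
        events += "capture"
        "frame"
    }
    println("$events -> $result") // [hide, capture, show] -> frame
}
```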


@@ -30,6 +30,11 @@ import android.net.Uri
import android.provider.Settings
import com.thisux.droidclaw.model.StopGoalMessage
import com.thisux.droidclaw.overlay.AgentOverlay
import com.thisux.droidclaw.model.VoiceStartMessage
import com.thisux.droidclaw.model.VoiceChunkMessage
import com.thisux.droidclaw.model.VoiceStopMessage
import com.thisux.droidclaw.model.OverlayMode
import androidx.compose.runtime.snapshotFlow
import kotlinx.coroutines.delay
import kotlinx.coroutines.flow.MutableStateFlow
import kotlinx.coroutines.flow.first
@@ -47,11 +52,13 @@ class ConnectionService : LifecycleService() {
val currentGoalStatus = MutableStateFlow(GoalStatus.Idle)
val currentGoal = MutableStateFlow("")
val errorMessage = MutableStateFlow<String?>(null)
val overlayTranscript = MutableStateFlow("")
var instance: ConnectionService? = null
const val ACTION_CONNECT = "com.thisux.droidclaw.CONNECT"
const val ACTION_DISCONNECT = "com.thisux.droidclaw.DISCONNECT"
const val ACTION_SEND_GOAL = "com.thisux.droidclaw.SEND_GOAL"
const val ACTION_SHOW_COMMAND_PANEL = "com.thisux.droidclaw.SHOW_COMMAND_PANEL"
const val EXTRA_GOAL = "goal_text"
}
@@ -59,13 +66,31 @@ class ConnectionService : LifecycleService() {
private var commandRouter: CommandRouter? = null
private var captureManager: ScreenCaptureManager? = null
private var wakeLock: PowerManager.WakeLock? = null
-private var overlay: AgentOverlay? = null
+internal var overlay: AgentOverlay? = null
override fun onCreate() {
super.onCreate()
instance = this
createNotificationChannel()
overlay = AgentOverlay(this)
overlay?.onAudioChunk = { base64 ->
webSocket?.sendTyped(VoiceChunkMessage(data = base64))
}
overlay?.onVoiceSend = { _ ->
webSocket?.sendTyped(VoiceStopMessage(action = "send"))
}
overlay?.onVoiceCancel = {
webSocket?.sendTyped(VoiceStopMessage(action = "cancel"))
}
overlay?.let { ov ->
lifecycleScope.launch {
snapshotFlow { ov.mode.value }.collect { mode ->
if (mode == OverlayMode.Listening) {
webSocket?.sendTyped(VoiceStartMessage())
}
}
}
}
}
override fun onStartCommand(intent: Intent?, flags: Int, startId: Int): Int {
@@ -84,6 +109,9 @@ class ConnectionService : LifecycleService() {
val goal = intent.getStringExtra(EXTRA_GOAL) ?: return START_NOT_STICKY
sendGoal(goal)
}
ACTION_SHOW_COMMAND_PANEL -> {
overlay?.showCommandPanel()
}
}
return START_NOT_STICKY
@@ -123,6 +151,13 @@ class ConnectionService : LifecycleService() {
webSocket = ws
val router = CommandRouter(ws, captureManager)
router.beforeScreenCapture = { overlay?.hideVignette() }
router.afterScreenCapture = {
if (currentGoalStatus.value == GoalStatus.Running &&
Settings.canDrawOverlays(this@ConnectionService)) {
overlay?.showVignette()
}
}
commandRouter = router
launch {
@@ -149,7 +184,24 @@ class ConnectionService : LifecycleService() {
}
launch { ws.errorMessage.collect { errorMessage.value = it } }
launch { router.currentSteps.collect { currentSteps.value = it } }
launch { router.currentGoalStatus.collect { currentGoalStatus.value = it } }
launch {
router.currentGoalStatus.collect { status ->
currentGoalStatus.value = status
if (status == GoalStatus.Running) {
if (Settings.canDrawOverlays(this@ConnectionService)) {
overlay?.showVignette()
}
} else {
overlay?.hideVignette()
}
if (status == GoalStatus.Completed) {
val goal = router.currentGoal.value
if (goal.isNotBlank()) {
(application as DroidClawApp).settingsStore.addRecentGoal(goal)
}
}
}
}
launch { router.currentGoal.collect { currentGoal.value = it } }
acquireWakeLock()
@@ -182,6 +234,7 @@ class ConnectionService : LifecycleService() {
}
private fun disconnect() {
overlay?.hideVignette()
overlay?.hide()
webSocket?.disconnect()
webSocket = null


@@ -9,6 +9,7 @@ import androidx.datastore.preferences.core.stringPreferencesKey
import androidx.datastore.preferences.preferencesDataStore
import kotlinx.coroutines.flow.Flow
import kotlinx.coroutines.flow.map
import org.json.JSONArray
val Context.dataStore: DataStore<Preferences> by preferencesDataStore(name = "settings")
@@ -18,6 +19,7 @@ object SettingsKeys {
val DEVICE_NAME = stringPreferencesKey("device_name")
val AUTO_CONNECT = booleanPreferencesKey("auto_connect")
val HAS_ONBOARDED = booleanPreferencesKey("has_onboarded")
val RECENT_GOALS = stringPreferencesKey("recent_goals")
}
class SettingsStore(private val context: Context) {
@@ -61,4 +63,25 @@ class SettingsStore(private val context: Context) {
suspend fun setHasOnboarded(value: Boolean) {
context.dataStore.edit { it[SettingsKeys.HAS_ONBOARDED] = value }
}
val recentGoals: Flow<List<String>> = context.dataStore.data.map { prefs ->
val json = prefs[SettingsKeys.RECENT_GOALS] ?: "[]"
try {
JSONArray(json).let { arr ->
(0 until arr.length()).map { arr.getString(it) }
}
} catch (_: Exception) { emptyList() }
}
suspend fun addRecentGoal(goal: String) {
context.dataStore.edit { prefs ->
val current = try {
JSONArray(prefs[SettingsKeys.RECENT_GOALS] ?: "[]").let { arr ->
(0 until arr.length()).map { arr.getString(it) }
}
} catch (_: Exception) { emptyList() }
val updated = (listOf(goal) + current.filter { it != goal }).take(5)
prefs[SettingsKeys.RECENT_GOALS] = JSONArray(updated).toString()
}
}
}
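
The list update inside `addRecentGoal` can be read as a small most-recently-used list: the new goal goes first, any earlier copy of it is dropped, and the list is capped at five entries. A minimal sketch of just that step, leaving out DataStore and JSON; the function name `pushRecentGoal` and the `limit` parameter are hypothetical.

```kotlin
// Hypothetical pure function mirroring the list update in addRecentGoal.
fun pushRecentGoal(current: List<String>, goal: String, limit: Int = 5): List<String> =
    (listOf(goal) + current.filter { it != goal }).take(limit)

fun main() {
    var goals = emptyList<String>()
    for (g in listOf("open maps", "send email", "open maps")) {
        goals = pushRecentGoal(goals, g)
    }
    // The repeated goal moves to the front instead of appearing twice.
    println(goals) // [open maps, send email]
}
```

Keeping this step free of I/O makes the ordering and the cap easy to check separately from the storage code.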


@@ -28,3 +28,9 @@ data class GoalSession(
val status: GoalStatus,
val timestamp: Long = System.currentTimeMillis()
)
enum class OverlayMode {
Idle,
Listening,
Executing
}
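
The `AgentOverlay` changes in this comparison move between these three modes from `startListening()`, `sendVoice()`, `cancelVoice()`, and `returnToIdle()`. A minimal sketch of those transitions as a pure function follows. The `OverlayEvent` enum and `nextMode` are hypothetical names, and the guard on `SendVoice` is an assumption of this sketch; the code in the diff sets the mode unconditionally.

```kotlin
enum class OverlayMode { Idle, Listening, Executing }

// Hypothetical event names, one per public method that changes the mode.
enum class OverlayEvent { StartListening, SendVoice, CancelVoice, GoalFinished }

fun nextMode(mode: OverlayMode, event: OverlayEvent): OverlayMode = when (event) {
    OverlayEvent.StartListening -> OverlayMode.Listening
    // Assumed guard: only a listening overlay can hand off to execution.
    OverlayEvent.SendVoice ->
        if (mode == OverlayMode.Listening) OverlayMode.Executing else mode
    OverlayEvent.CancelVoice -> OverlayMode.Idle
    OverlayEvent.GoalFinished -> OverlayMode.Idle
}

fun main() {
    var mode = OverlayMode.Idle
    val trace = mutableListOf(mode)
    for (event in listOf(
        OverlayEvent.StartListening,
        OverlayEvent.SendVoice,
        OverlayEvent.GoalFinished
    )) {
        mode = nextMode(mode, event)
        trace += mode
    }
    println(trace) // [Idle, Listening, Executing, Idle]
}
```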


@@ -76,6 +76,23 @@ data class StopGoalMessage(
val type: String = "stop_goal"
)
@Serializable
data class VoiceStartMessage(
val type: String = "voice_start"
)
@Serializable
data class VoiceChunkMessage(
val type: String = "voice_chunk",
val data: String
)
@Serializable
data class VoiceStopMessage(
val type: String = "voice_stop",
val action: String
)
@Serializable
data class ServerMessage(
val type: String,


@@ -6,20 +6,27 @@ import android.view.Gravity
import android.view.MotionEvent
import android.view.View
import android.view.WindowManager
import androidx.compose.runtime.mutableStateOf
import androidx.compose.ui.platform.ComposeView
import androidx.lifecycle.Lifecycle
import androidx.lifecycle.LifecycleService
import androidx.lifecycle.lifecycleScope
import androidx.lifecycle.setViewTreeLifecycleOwner
import androidx.savedstate.SavedStateRegistry
import androidx.savedstate.SavedStateRegistryController
import androidx.savedstate.SavedStateRegistryOwner
import androidx.savedstate.setViewTreeSavedStateRegistryOwner
import com.thisux.droidclaw.MainActivity
import com.thisux.droidclaw.connection.ConnectionService
import com.thisux.droidclaw.model.GoalStatus
import com.thisux.droidclaw.model.OverlayMode
import com.thisux.droidclaw.ui.theme.DroidClawTheme
class AgentOverlay(private val service: LifecycleService) {
private val windowManager = service.getSystemService(WindowManager::class.java)
private var composeView: ComposeView? = null
private val dismissTarget = DismissTargetView(service)
private val vignetteOverlay = VignetteOverlay(service)
private val savedStateOwner = object : SavedStateRegistryOwner {
private val controller = SavedStateRegistryController.create(this)
@@ -28,7 +35,42 @@ class AgentOverlay(private val service: LifecycleService) {
init { controller.performRestore(null) }
}
private val layoutParams = WindowManager.LayoutParams(
// ── State ───────────────────────────────────────────────
var mode = mutableStateOf(OverlayMode.Idle)
private set
var transcript = mutableStateOf("")
private set
// ── Callbacks (set by ConnectionService) ────────────────
var onVoiceSend: ((String) -> Unit)? = null
var onVoiceCancel: (() -> Unit)? = null
var onAudioChunk: ((String) -> Unit)? = null
// ── Views ───────────────────────────────────────────────
private var pillView: ComposeView? = null
private var borderView: ComposeView? = null
private var voicePanelView: ComposeView? = null
// ── Voice recorder ──────────────────────────────────────
private var voiceRecorder: VoiceRecorder? = null
// ── Command panel ───────────────────────────────────────
private val commandPanel = CommandPanelOverlay(
service = service,
onSubmitGoal = { goal ->
val intent = Intent(service, ConnectionService::class.java).apply {
action = ConnectionService.ACTION_SEND_GOAL
putExtra(ConnectionService.EXTRA_GOAL, goal)
}
service.startService(intent)
},
onStartVoice = { startListening() },
onDismiss = { show() }
)
// ── Layout params ───────────────────────────────────────
private val pillParams = WindowManager.LayoutParams(
WindowManager.LayoutParams.WRAP_CONTENT,
WindowManager.LayoutParams.WRAP_CONTENT,
WindowManager.LayoutParams.TYPE_APPLICATION_OVERLAY,
@@ -40,8 +82,110 @@ class AgentOverlay(private val service: LifecycleService) {
y = 200
}
private val borderParams = WindowManager.LayoutParams(
WindowManager.LayoutParams.MATCH_PARENT,
WindowManager.LayoutParams.MATCH_PARENT,
WindowManager.LayoutParams.TYPE_APPLICATION_OVERLAY,
WindowManager.LayoutParams.FLAG_NOT_TOUCHABLE or
WindowManager.LayoutParams.FLAG_NOT_FOCUSABLE or
WindowManager.LayoutParams.FLAG_LAYOUT_IN_SCREEN,
PixelFormat.TRANSLUCENT
)
private val voicePanelParams = WindowManager.LayoutParams(
WindowManager.LayoutParams.MATCH_PARENT,
WindowManager.LayoutParams.MATCH_PARENT,
WindowManager.LayoutParams.TYPE_APPLICATION_OVERLAY,
WindowManager.LayoutParams.FLAG_NOT_FOCUSABLE or
WindowManager.LayoutParams.FLAG_LAYOUT_IN_SCREEN,
PixelFormat.TRANSLUCENT
)
// ── Public API ──────────────────────────────────────────
fun show() {
-if (composeView != null) return
+if (pillView != null) return
showPill()
}
fun hide() {
hidePill()
hideVoiceOverlay()
dismissTarget.hide()
}
fun destroy() {
hide()
commandPanel.destroy()
vignetteOverlay.destroy()
voiceRecorder?.stop()
voiceRecorder = null
}
fun showVignette() = vignetteOverlay.show()
fun hideVignette() = vignetteOverlay.hide()
fun startListening() {
val recorder = VoiceRecorder(
scope = service.lifecycleScope,
onChunk = { base64 -> onAudioChunk?.invoke(base64) }
)
if (!recorder.hasPermission(service)) {
val intent = Intent(service, MainActivity::class.java).apply {
flags = Intent.FLAG_ACTIVITY_NEW_TASK or
Intent.FLAG_ACTIVITY_SINGLE_TOP
putExtra("request_audio_permission", true)
}
service.startActivity(intent)
return
}
mode.value = OverlayMode.Listening
transcript.value = ""
hidePill()
showVoiceOverlay()
voiceRecorder = recorder
voiceRecorder?.start()
}
fun sendVoice() {
voiceRecorder?.stop()
voiceRecorder = null
mode.value = OverlayMode.Executing
hideVoiceOverlay()
showPill()
onVoiceSend?.invoke(transcript.value)
}
fun cancelVoice() {
voiceRecorder?.stop()
voiceRecorder = null
mode.value = OverlayMode.Idle
hideVoiceOverlay()
showPill()
onVoiceCancel?.invoke()
}
fun updateTranscript(text: String) {
transcript.value = text
}
fun returnToIdle() {
mode.value = OverlayMode.Idle
}
fun showCommandPanel() {
hide()
commandPanel.show()
}
// ── Private: Pill overlay ───────────────────────────────
private fun showPill() {
if (pillView != null) return
val view = ComposeView(service).apply {
importantForAccessibility = View.IMPORTANT_FOR_ACCESSIBILITY_NO_HIDE_DESCENDANTS
@@ -51,20 +195,57 @@ class AgentOverlay(private val service: LifecycleService) {
setupDrag(this)
}
-composeView = view
-windowManager.addView(view, layoutParams)
+pillView = view
+windowManager.addView(view, pillParams)
}
fun hide() {
composeView?.let {
windowManager.removeView(it)
}
composeView = null
private fun hidePill() {
pillView?.let { windowManager.removeView(it) }
pillView = null
}
fun destroy() {
hide()
// ── Private: Voice overlay (border + panel) ─────────────
private fun showVoiceOverlay() {
if (borderView != null) return
val border = ComposeView(service).apply {
importantForAccessibility = View.IMPORTANT_FOR_ACCESSIBILITY_NO_HIDE_DESCENDANTS
setViewTreeLifecycleOwner(service)
setViewTreeSavedStateRegistryOwner(savedStateOwner)
setContent {
DroidClawTheme { GradientBorder() }
}
}
borderView = border
windowManager.addView(border, borderParams)
val panel = ComposeView(service).apply {
importantForAccessibility = View.IMPORTANT_FOR_ACCESSIBILITY_NO_HIDE_DESCENDANTS
setViewTreeLifecycleOwner(service)
setViewTreeSavedStateRegistryOwner(savedStateOwner)
setContent {
DroidClawTheme {
VoiceOverlayContent(
transcript = transcript.value,
onSend = { sendVoice() },
onCancel = { cancelVoice() }
)
}
}
}
voicePanelView = panel
windowManager.addView(panel, voicePanelParams)
}
private fun hideVoiceOverlay() {
borderView?.let { windowManager.removeView(it) }
borderView = null
voicePanelView?.let { windowManager.removeView(it) }
voicePanelView = null
}
// ── Private: Drag handling for pill ─────────────────────
private fun setupDrag(view: View) {
var initialX = 0
@@ -76,8 +257,8 @@ class AgentOverlay(private val service: LifecycleService) {
view.setOnTouchListener { _, event ->
when (event.action) {
MotionEvent.ACTION_DOWN -> {
initialX = layoutParams.x
initialY = layoutParams.y
initialX = pillParams.x
initialY = pillParams.y
initialTouchX = event.rawX
initialTouchY = event.rawY
isDragging = false
@@ -86,21 +267,37 @@ class AgentOverlay(private val service: LifecycleService) {
MotionEvent.ACTION_MOVE -> {
val dx = (event.rawX - initialTouchX).toInt()
val dy = (event.rawY - initialTouchY).toInt()
if (Math.abs(dx) > 10 || Math.abs(dy) > 10) isDragging = true
layoutParams.x = initialX + dx
layoutParams.y = initialY + dy
windowManager.updateViewLayout(view, layoutParams)
if (!isDragging && (Math.abs(dx) > 10 || Math.abs(dy) > 10)) {
isDragging = true
dismissTarget.show()
}
if (isDragging) {
pillParams.x = initialX + dx
pillParams.y = initialY + dy
windowManager.updateViewLayout(view, pillParams)
}
true
}
MotionEvent.ACTION_UP -> {
if (!isDragging) {
val intent = Intent(service, MainActivity::class.java).apply {
flags = Intent.FLAG_ACTIVITY_NEW_TASK or
Intent.FLAG_ACTIVITY_SINGLE_TOP or
Intent.FLAG_ACTIVITY_REORDER_TO_FRONT
if (isDragging) {
val dismissed = dismissTarget.isOverTarget(event.rawX, event.rawY)
dismissTarget.hide()
if (dismissed) {
// Reset position to default so next show() starts clean
pillParams.x = 0
pillParams.y = 200
hide()
}
service.startActivity(intent)
} else {
// Tap: if running, stop goal; otherwise show command panel
if (ConnectionService.currentGoalStatus.value == GoalStatus.Running) {
ConnectionService.instance?.stopGoal()
} else {
hide()
commandPanel.show()
}
}
isDragging = false
true
}
else -> false
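
Review note: the reworked touch handler only starts moving the pill once the finger has travelled more than 10 px, and it shows the dismiss target at that same moment. A minimal, Android-free sketch of that state machine follows. `DragTracker` and its method names are hypothetical stand-ins and do not appear in the diff; only the 10 px threshold and the order of operations are taken from it.

```kotlin
import kotlin.math.abs

/** Hypothetical stand-in for the pill's touch state; not part of the diff. */
class DragTracker(private val slopPx: Int = 10) {
    private var startX = 0
    private var startY = 0
    private var downRawX = 0f
    private var downRawY = 0f

    var x = 0
        private set
    var y = 0
        private set
    var isDragging = false
        private set

    /** ACTION_DOWN: remember where the view and the finger started. */
    fun onDown(viewX: Int, viewY: Int, rawX: Float, rawY: Float) {
        startX = viewX
        startY = viewY
        x = viewX
        y = viewY
        downRawX = rawX
        downRawY = rawY
        isDragging = false
    }

    /** ACTION_MOVE: returns true only on the move that first crosses the threshold. */
    fun onMove(rawX: Float, rawY: Float): Boolean {
        val dx = (rawX - downRawX).toInt()
        val dy = (rawY - downRawY).toInt()
        var started = false
        if (!isDragging && (abs(dx) > slopPx || abs(dy) > slopPx)) {
            isDragging = true
            started = true // the diff shows the dismiss target at this point
        }
        if (isDragging) {
            x = startX + dx
            y = startY + dy
        }
        return started
    }

    /** ACTION_UP: true means the gesture was a tap, false means a drag ended. */
    fun onUp(): Boolean {
        val wasTap = !isDragging
        isDragging = false
        return wasTap
    }
}

fun main() {
    val tracker = DragTracker()
    tracker.onDown(viewX = 0, viewY = 200, rawX = 100f, rawY = 300f)
    tracker.onMove(105f, 303f) // within the threshold: position stays put
    println("${tracker.x}, ${tracker.y}")
    tracker.onMove(130f, 340f) // past the threshold: the view follows the finger
    println("${tracker.x}, ${tracker.y}")
}
```

Keeping the position untouched below the threshold means a slightly shaky tap no longer nudges the pill, at the cost of a small jump when the drag begins.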

@@ -0,0 +1,349 @@
package com.thisux.droidclaw.overlay
import android.graphics.PixelFormat
import android.view.View
import android.view.WindowManager
import androidx.compose.foundation.background
import androidx.compose.foundation.clickable
import androidx.compose.foundation.interaction.MutableInteractionSource
import androidx.compose.foundation.layout.Arrangement
import androidx.compose.foundation.layout.Box
import androidx.compose.foundation.layout.Column
import androidx.compose.foundation.layout.Row
import androidx.compose.foundation.layout.Spacer
import androidx.compose.foundation.layout.WindowInsets
import androidx.compose.foundation.layout.fillMaxSize
import androidx.compose.foundation.layout.fillMaxWidth
import androidx.compose.foundation.layout.height
import androidx.compose.foundation.layout.imePadding
import androidx.compose.foundation.layout.padding
import androidx.compose.foundation.layout.width
import androidx.compose.foundation.shape.RoundedCornerShape
import androidx.compose.material.icons.Icons
import androidx.compose.material.icons.automirrored.filled.Send
import androidx.compose.material.icons.filled.Mic
import androidx.compose.material3.Card
import androidx.compose.material3.CardDefaults
import androidx.compose.material3.Icon
import androidx.compose.material3.IconButton
import androidx.compose.material3.IconButtonDefaults
import androidx.compose.material3.MaterialTheme
import androidx.compose.material3.Surface
import androidx.compose.material3.Text
import androidx.compose.material3.TextField
import androidx.compose.material3.TextFieldDefaults
import androidx.compose.runtime.Composable
import androidx.compose.runtime.LaunchedEffect
import androidx.compose.runtime.collectAsState
import androidx.compose.runtime.getValue
import androidx.compose.runtime.mutableStateOf
import androidx.compose.runtime.remember
import androidx.compose.runtime.setValue
import androidx.compose.ui.Alignment
import androidx.compose.ui.Modifier
import androidx.compose.ui.draw.clip
import androidx.compose.ui.graphics.Color
import androidx.compose.ui.platform.ComposeView
import androidx.compose.ui.platform.LocalContext
import androidx.compose.ui.text.style.TextOverflow
import androidx.compose.ui.unit.dp
import androidx.lifecycle.Lifecycle
import androidx.lifecycle.LifecycleService
import androidx.lifecycle.setViewTreeLifecycleOwner
import androidx.savedstate.SavedStateRegistry
import androidx.savedstate.SavedStateRegistryController
import androidx.savedstate.SavedStateRegistryOwner
import androidx.savedstate.setViewTreeSavedStateRegistryOwner
import com.thisux.droidclaw.DroidClawApp
import com.thisux.droidclaw.connection.ConnectionService
import com.thisux.droidclaw.model.ConnectionState
import com.thisux.droidclaw.model.GoalStatus
import com.thisux.droidclaw.ui.theme.DroidClawTheme
class CommandPanelOverlay(
private val service: LifecycleService,
private val onSubmitGoal: (String) -> Unit,
private val onStartVoice: () -> Unit,
private val onDismiss: () -> Unit
) {
private val windowManager = service.getSystemService(WindowManager::class.java)
private var composeView: ComposeView? = null
private val savedStateOwner = object : SavedStateRegistryOwner {
private val controller = SavedStateRegistryController.create(this)
override val lifecycle: Lifecycle get() = service.lifecycle
override val savedStateRegistry: SavedStateRegistry get() = controller.savedStateRegistry
init { controller.performRestore(null) }
}
private val layoutParams = WindowManager.LayoutParams(
WindowManager.LayoutParams.MATCH_PARENT,
WindowManager.LayoutParams.MATCH_PARENT,
WindowManager.LayoutParams.TYPE_APPLICATION_OVERLAY,
WindowManager.LayoutParams.FLAG_LAYOUT_IN_SCREEN,
PixelFormat.TRANSLUCENT
).apply {
softInputMode = WindowManager.LayoutParams.SOFT_INPUT_ADJUST_RESIZE
}
fun show() {
if (composeView != null) return
val view = ComposeView(service).apply {
importantForAccessibility = View.IMPORTANT_FOR_ACCESSIBILITY_NO_HIDE_DESCENDANTS
setViewTreeLifecycleOwner(service)
setViewTreeSavedStateRegistryOwner(savedStateOwner)
setContent {
CommandPanelContent(
onSubmitGoal = { goal ->
hide()
onSubmitGoal(goal)
onDismiss()
},
onStartVoice = {
hide()
onStartVoice()
},
onDismiss = {
hide()
onDismiss()
}
)
}
}
windowManager.addView(view, layoutParams)
composeView = view
}
fun hide() {
composeView?.let { windowManager.removeView(it) }
composeView = null
}
fun isShowing() = composeView != null
fun destroy() = hide()
}
private val DEFAULT_SUGGESTIONS = listOf(
"Open WhatsApp and reply to the last message",
"Take a screenshot and save it",
"Turn on Do Not Disturb",
"Search for nearby restaurants on Maps"
)
@Composable
private fun CommandPanelContent(
onSubmitGoal: (String) -> Unit,
onStartVoice: () -> Unit,
onDismiss: () -> Unit
) {
DroidClawTheme {
val context = LocalContext.current
val app = context.applicationContext as DroidClawApp
val recentGoals by app.settingsStore.recentGoals.collectAsState(initial = emptyList())
val connectionState by ConnectionService.connectionState.collectAsState()
val goalStatus by ConnectionService.currentGoalStatus.collectAsState()
val isConnected = connectionState == ConnectionState.Connected
val canSend = isConnected && goalStatus != GoalStatus.Running
var goalInput by remember { mutableStateOf("") }
// Auto-dismiss if a goal starts running
LaunchedEffect(goalStatus) {
if (goalStatus == GoalStatus.Running) {
onDismiss()
}
}
// Build suggestion list: recent goals first, fill remaining with defaults
val suggestions = remember(recentGoals) {
val combined = mutableListOf<String>()
combined.addAll(recentGoals.take(4))
for (default in DEFAULT_SUGGESTIONS) {
if (combined.size >= 4) break
if (default !in combined) combined.add(default)
}
combined.take(4)
}
Box(modifier = Modifier.fillMaxSize()) {
// Scrim - tap to dismiss
Box(
modifier = Modifier
.fillMaxSize()
.background(Color.Black.copy(alpha = 0.6f))
.clickable(
indication = null,
interactionSource = remember { MutableInteractionSource() }
) { onDismiss() }
)
// Bottom card
Surface(
modifier = Modifier
.fillMaxWidth()
.align(Alignment.BottomCenter)
.imePadding()
.clickable(
indication = null,
interactionSource = remember { MutableInteractionSource() }
) { /* consume clicks so they don't reach scrim */ },
shape = RoundedCornerShape(topStart = 24.dp, topEnd = 24.dp),
color = MaterialTheme.colorScheme.surface,
tonalElevation = 3.dp
) {
Column(
modifier = Modifier
.fillMaxWidth()
.padding(horizontal = 20.dp, vertical = 16.dp),
verticalArrangement = Arrangement.spacedBy(16.dp)
) {
// Handle bar
Box(
modifier = Modifier
.width(40.dp)
.height(4.dp)
.clip(RoundedCornerShape(2.dp))
.background(
MaterialTheme.colorScheme.onSurfaceVariant.copy(alpha = 0.3f)
)
.align(Alignment.CenterHorizontally)
)
Text(
text = "What can I help with?",
style = MaterialTheme.typography.titleLarge,
color = MaterialTheme.colorScheme.onSurface
)
// 2x2 suggestion grid
Column(verticalArrangement = Arrangement.spacedBy(8.dp)) {
for (row in suggestions.chunked(2)) {
Row(
modifier = Modifier.fillMaxWidth(),
horizontalArrangement = Arrangement.spacedBy(8.dp)
) {
for (suggestion in row) {
SuggestionCard(
text = suggestion,
enabled = canSend,
onClick = { onSubmitGoal(suggestion) },
modifier = Modifier.weight(1f)
)
}
if (row.size < 2) {
Spacer(modifier = Modifier.weight(1f))
}
}
}
}
// Text input
Row(
modifier = Modifier.fillMaxWidth(),
verticalAlignment = Alignment.CenterVertically,
horizontalArrangement = Arrangement.spacedBy(8.dp)
) {
val sendEnabled = canSend && goalInput.isNotBlank()
TextField(
value = goalInput,
onValueChange = { goalInput = it },
placeholder = {
Text(
if (!isConnected) "Not connected"
else "Enter a goal...",
style = MaterialTheme.typography.bodyMedium
)
},
modifier = Modifier.weight(1f),
enabled = canSend,
singleLine = true,
shape = RoundedCornerShape(24.dp),
colors = TextFieldDefaults.colors(
focusedContainerColor = MaterialTheme.colorScheme.surfaceVariant.copy(alpha = 0.3f),
unfocusedContainerColor = MaterialTheme.colorScheme.surfaceVariant.copy(alpha = 0.2f),
disabledContainerColor = MaterialTheme.colorScheme.surfaceVariant.copy(alpha = 0.1f),
focusedIndicatorColor = Color.Transparent,
unfocusedIndicatorColor = Color.Transparent,
disabledIndicatorColor = Color.Transparent
)
)
IconButton(
onClick = { onStartVoice() },
enabled = canSend,
colors = IconButtonDefaults.iconButtonColors(
containerColor = if (canSend)
MaterialTheme.colorScheme.secondaryContainer
else Color.Transparent
)
) {
Icon(
Icons.Filled.Mic,
contentDescription = "Voice",
tint = if (canSend)
MaterialTheme.colorScheme.onSecondaryContainer
else MaterialTheme.colorScheme.onSurfaceVariant.copy(alpha = 0.3f)
)
}
IconButton(
onClick = {
if (goalInput.isNotBlank()) onSubmitGoal(goalInput)
},
enabled = sendEnabled,
colors = IconButtonDefaults.iconButtonColors(
containerColor = if (sendEnabled)
MaterialTheme.colorScheme.primary
else Color.Transparent
)
) {
Icon(
Icons.AutoMirrored.Filled.Send,
contentDescription = "Send",
tint = if (sendEnabled)
MaterialTheme.colorScheme.onPrimary
else MaterialTheme.colorScheme.onSurfaceVariant.copy(alpha = 0.3f)
)
}
}
}
}
}
}
}
@Composable
private fun SuggestionCard(
text: String,
enabled: Boolean,
onClick: () -> Unit,
modifier: Modifier = Modifier
) {
Card(
onClick = onClick,
modifier = modifier.height(72.dp),
enabled = enabled,
shape = RoundedCornerShape(16.dp),
colors = CardDefaults.cardColors(
containerColor = MaterialTheme.colorScheme.surfaceVariant.copy(alpha = 0.4f)
)
) {
Box(
modifier = Modifier
.fillMaxSize()
.padding(12.dp),
contentAlignment = Alignment.CenterStart
) {
Text(
text = text,
style = MaterialTheme.typography.bodySmall,
color = MaterialTheme.colorScheme.onSurface,
maxLines = 3,
overflow = TextOverflow.Ellipsis
)
}
}
}
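
Review note: the suggestion list in `CommandPanelContent` puts recent goals first and tops up with defaults until four entries exist. The sketch below pulls that logic into a plain function so it can be checked without Compose. `buildSuggestions` and its `limit` parameter are hypothetical names, not part of the diff, and the "Order coffee" goal is made-up sample data.

```kotlin
/** Hypothetical helper mirroring the `remember(recentGoals)` block in the diff. */
fun buildSuggestions(
    recentGoals: List<String>,
    defaults: List<String>,
    limit: Int = 4
): List<String> {
    // Recent goals take priority, capped at the limit.
    val combined = recentGoals.take(limit).toMutableList()
    // Fill the remaining slots with defaults that are not already present.
    for (default in defaults) {
        if (combined.size >= limit) break
        if (default !in combined) combined.add(default)
    }
    return combined
}

fun main() {
    val defaults = listOf(
        "Open WhatsApp and reply to the last message",
        "Take a screenshot and save it",
        "Turn on Do Not Disturb",
        "Search for nearby restaurants on Maps"
    )
    val recent = listOf("Turn on Do Not Disturb", "Order coffee")
    buildSuggestions(recent, defaults).forEach(::println)
}
```

As in the diff, duplicates are filtered only when adding defaults; repeated entries inside `recentGoals` itself would pass through unchanged.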

@@ -0,0 +1,121 @@
package com.thisux.droidclaw.overlay
import android.graphics.PixelFormat
import android.view.Gravity
import android.view.View
import android.view.WindowManager
import androidx.compose.foundation.background
import androidx.compose.foundation.layout.Box
import androidx.compose.foundation.layout.fillMaxWidth
import androidx.compose.foundation.layout.padding
import androidx.compose.foundation.layout.size
import androidx.compose.foundation.shape.CircleShape
import androidx.compose.material.icons.Icons
import androidx.compose.material.icons.filled.Close
import androidx.compose.material3.Icon
import androidx.compose.runtime.Composable
import androidx.compose.ui.Alignment
import androidx.compose.ui.Modifier
import androidx.compose.ui.draw.clip
import androidx.compose.ui.graphics.Color
import androidx.compose.ui.platform.ComposeView
import androidx.compose.ui.unit.dp
import androidx.lifecycle.Lifecycle
import androidx.lifecycle.LifecycleService
import androidx.lifecycle.setViewTreeLifecycleOwner
import androidx.savedstate.SavedStateRegistry
import androidx.savedstate.SavedStateRegistryController
import androidx.savedstate.SavedStateRegistryOwner
import androidx.savedstate.setViewTreeSavedStateRegistryOwner
class DismissTargetView(private val service: LifecycleService) {
private val windowManager = service.getSystemService(WindowManager::class.java)
private var composeView: ComposeView? = null
private val density = service.resources.displayMetrics.density
private var targetCenterX = 0f
private var targetCenterY = 0f
private var targetRadiusPx = 36f * density
private val savedStateOwner = object : SavedStateRegistryOwner {
private val controller = SavedStateRegistryController.create(this)
override val lifecycle: Lifecycle get() = service.lifecycle
override val savedStateRegistry: SavedStateRegistry get() = controller.savedStateRegistry
init { controller.performRestore(null) }
}
private val layoutParams = WindowManager.LayoutParams(
WindowManager.LayoutParams.MATCH_PARENT,
WindowManager.LayoutParams.WRAP_CONTENT,
WindowManager.LayoutParams.TYPE_APPLICATION_OVERLAY,
WindowManager.LayoutParams.FLAG_NOT_FOCUSABLE or
WindowManager.LayoutParams.FLAG_NOT_TOUCHABLE or
WindowManager.LayoutParams.FLAG_LAYOUT_IN_SCREEN,
PixelFormat.TRANSLUCENT
).apply {
gravity = Gravity.BOTTOM or Gravity.CENTER_HORIZONTAL
}
fun show() {
if (composeView != null) return
// Compute target coordinates synchronously before showing the view
val metrics = windowManager.currentWindowMetrics
val screenWidth = metrics.bounds.width().toFloat()
val screenHeight = metrics.bounds.height().toFloat()
targetCenterX = screenWidth / 2f
// Center sits 56dp of bottom padding plus 36dp (half of the 72dp circle) above the bottom edge
targetCenterY = screenHeight - (56f + 36f) * density
val view = ComposeView(service).apply {
importantForAccessibility = View.IMPORTANT_FOR_ACCESSIBILITY_NO_HIDE_DESCENDANTS
setViewTreeLifecycleOwner(service)
setViewTreeSavedStateRegistryOwner(savedStateOwner)
setContent { DismissTargetContent() }
}
composeView = view
windowManager.addView(view, layoutParams)
}
fun hide() {
composeView?.let { windowManager.removeView(it) }
composeView = null
}
fun destroy() = hide()
fun isOverTarget(rawX: Float, rawY: Float): Boolean {
if (composeView == null) return false
val dx = rawX - targetCenterX
val dy = rawY - targetCenterY
// Use generous hit radius (1.5x visual radius) for easier targeting
val hitRadius = targetRadiusPx * 1.5f
return (dx * dx + dy * dy) <= (hitRadius * hitRadius)
}
}
@Composable
private fun DismissTargetContent() {
Box(
modifier = Modifier
.fillMaxWidth()
.padding(bottom = 56.dp),
contentAlignment = Alignment.BottomCenter
) {
Box(
modifier = Modifier
.size(72.dp)
.clip(CircleShape)
.background(Color(0xCC333333)),
contentAlignment = Alignment.Center
) {
Icon(
imageVector = Icons.Default.Close,
contentDescription = "Dismiss",
tint = Color.White,
modifier = Modifier.size(28.dp)
)
}
}
}
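
Review note: `isOverTarget` is a circle hit test with the radius enlarged by 1.5x. A minimal sketch of the same arithmetic as a pure function is shown below. The function signature, the `hitScale` parameter, and the 1080 x 2400 px screen at density 3.0 are assumptions chosen for illustration; the 56dp, 36dp, and 1.5x values come from the diff.

```kotlin
/** Hypothetical pure version of `isOverTarget`; all values are in pixels. */
fun isOverTarget(
    rawX: Float,
    rawY: Float,
    centerX: Float,
    centerY: Float,
    visualRadiusPx: Float,
    hitScale: Float = 1.5f
): Boolean {
    val dx = rawX - centerX
    val dy = rawY - centerY
    val hitRadius = visualRadiusPx * hitScale
    // Compare squared distances so no square root is needed.
    return dx * dx + dy * dy <= hitRadius * hitRadius
}

fun main() {
    // Assumed example screen: 1080 x 2400 px at density 3.0.
    val density = 3f
    val centerX = 1080f / 2f                    // 540 px
    val centerY = 2400f - (56f + 36f) * density // 2400 - 276 = 2124 px
    val radius = 36f * density                  // 108 px visual, 162 px hit radius

    println(isOverTarget(540f, 2124f, centerX, centerY, radius)) // at the center
    println(isOverTarget(690f, 2124f, centerX, centerY, radius)) // 150 px away
    println(isOverTarget(710f, 2124f, centerX, centerY, radius)) // 170 px away
}
```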

@@ -0,0 +1,85 @@
package com.thisux.droidclaw.overlay
import androidx.compose.animation.core.LinearEasing
import androidx.compose.animation.core.RepeatMode
import androidx.compose.animation.core.animateFloat
import androidx.compose.animation.core.infiniteRepeatable
import androidx.compose.animation.core.rememberInfiniteTransition
import androidx.compose.animation.core.tween
import androidx.compose.foundation.Canvas
import androidx.compose.foundation.layout.fillMaxSize
import androidx.compose.runtime.Composable
import androidx.compose.runtime.getValue
import androidx.compose.ui.Modifier
import androidx.compose.ui.geometry.Offset
import androidx.compose.ui.geometry.Size
import androidx.compose.ui.graphics.Brush
import androidx.compose.ui.graphics.Color
import androidx.compose.ui.platform.LocalDensity
import androidx.compose.ui.unit.dp
private val GradientColors = listOf(
Color(0xFFC62828), // crimson red
Color(0xFFEF5350), // crimson light
Color(0xFFFFB300), // golden accent
Color(0xFFEF5350), // crimson light
Color(0xFFC62828), // crimson red (loop)
)
@Composable
fun GradientBorder() {
val transition = rememberInfiniteTransition(label = "gradientRotation")
val offset by transition.animateFloat(
initialValue = 0f,
targetValue = 1f,
animationSpec = infiniteRepeatable(
animation = tween(durationMillis = 3000, easing = LinearEasing),
repeatMode = RepeatMode.Restart
),
label = "gradientOffset"
)
val borderWidth = with(LocalDensity.current) { 4.dp.toPx() }
Canvas(modifier = Modifier.fillMaxSize()) {
val w = size.width
val h = size.height
val shiftedColors = shiftColors(GradientColors, offset)
// Top edge
drawRect(
brush = Brush.horizontalGradient(shiftedColors),
topLeft = Offset.Zero,
size = Size(w, borderWidth)
)
// Bottom edge
drawRect(
brush = Brush.horizontalGradient(shiftedColors.reversed()),
topLeft = Offset(0f, h - borderWidth),
size = Size(w, borderWidth)
)
// Left edge
drawRect(
brush = Brush.verticalGradient(shiftedColors),
topLeft = Offset.Zero,
size = Size(borderWidth, h)
)
// Right edge
drawRect(
brush = Brush.verticalGradient(shiftedColors.reversed()),
topLeft = Offset(w - borderWidth, 0f),
size = Size(borderWidth, h)
)
}
}
private fun shiftColors(colors: List<Color>, offset: Float): List<Color> {
if (colors.size < 2) return colors
val n = colors.size
val shift = (offset * n).toInt() % n
return colors.subList(shift, n) + colors.subList(0, shift)
}
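
Review note: `shiftColors` rotates the colour list by a whole number of positions, so with five stops the gradient advances in five discrete steps per animation cycle instead of moving continuously. The sketch below shows the same rotation on a generic list. The generic `shift` function and the string labels are illustrative assumptions, not code from the diff.

```kotlin
/** Generic form of the diff's `shiftColors`; the element type is an assumption. */
fun <T> shift(items: List<T>, offset: Float): List<T> {
    if (items.size < 2) return items
    val n = items.size
    // Truncation means the shift only changes when offset crosses a multiple of 1/n.
    val shift = (offset * n).toInt() % n
    return items.subList(shift, n) + items.subList(0, shift)
}

fun main() {
    val stops = listOf("red", "light", "gold", "light2", "red2")
    for (offset in listOf(0f, 0.1f, 0.25f, 0.5f, 0.99f)) {
        println("$offset -> ${shift(stops, offset)}")
    }
}
```

If a smooth sweep is wanted, one option would be to interpolate between neighbouring stops using the fractional part of `offset * n`; whether that is worth the extra work here is a design call for the author.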

@@ -2,28 +2,16 @@ package com.thisux.droidclaw.overlay
import androidx.compose.animation.animateColorAsState
import androidx.compose.animation.core.LinearEasing
import androidx.compose.animation.core.RepeatMode
import androidx.compose.animation.core.animateFloat
import androidx.compose.animation.core.infiniteRepeatable
import androidx.compose.animation.core.rememberInfiniteTransition
import androidx.compose.animation.core.tween
import androidx.compose.foundation.Image
import androidx.compose.foundation.background
import androidx.compose.foundation.layout.Arrangement
import androidx.compose.foundation.layout.Box
import androidx.compose.foundation.layout.Row
import androidx.compose.foundation.layout.height
import androidx.compose.foundation.layout.padding
import androidx.compose.foundation.layout.size
import androidx.compose.foundation.layout.widthIn
import androidx.compose.foundation.shape.CircleShape
import androidx.compose.foundation.shape.RoundedCornerShape
import androidx.compose.material3.Icon
import androidx.compose.material3.IconButton
import androidx.compose.material3.IconButtonDefaults
import androidx.compose.material3.MaterialTheme
import androidx.compose.material3.Text
import androidx.compose.material.icons.Icons
import androidx.compose.material.icons.filled.Close
import androidx.compose.material3.CircularProgressIndicator
import androidx.compose.runtime.Composable
import androidx.compose.runtime.LaunchedEffect
import androidx.compose.runtime.collectAsState
@@ -33,12 +21,12 @@ import androidx.compose.runtime.remember
import androidx.compose.runtime.setValue
import androidx.compose.ui.Alignment
import androidx.compose.ui.Modifier
import androidx.compose.ui.draw.alpha
import androidx.compose.ui.draw.clip
import androidx.compose.ui.graphics.Color
import androidx.compose.ui.text.style.TextOverflow
import androidx.compose.ui.graphics.StrokeCap
import androidx.compose.ui.res.painterResource
import androidx.compose.ui.unit.dp
import androidx.compose.ui.unit.sp
import com.thisux.droidclaw.R
import com.thisux.droidclaw.connection.ConnectionService
import com.thisux.droidclaw.model.ConnectionState
import com.thisux.droidclaw.model.GoalStatus
@@ -49,16 +37,14 @@ private val Green = Color(0xFF4CAF50)
private val Blue = Color(0xFF2196F3)
private val Red = Color(0xFFF44336)
private val Gray = Color(0xFF9E9E9E)
private val PillBackground = Color(0xE6212121)
private val IconBackground = Color(0xFF1A1A1A)
@Composable
fun OverlayContent() {
DroidClawTheme {
val connectionState by ConnectionService.connectionState.collectAsState()
val goalStatus by ConnectionService.currentGoalStatus.collectAsState()
val steps by ConnectionService.currentSteps.collectAsState()
// Auto-reset Completed/Failed back to Idle after 3s
var displayStatus by remember { mutableStateOf(goalStatus) }
LaunchedEffect(goalStatus) {
displayStatus = goalStatus
@@ -70,102 +56,67 @@ fun OverlayContent() {
val isConnected = connectionState == ConnectionState.Connected
val dotColor by animateColorAsState(
val ringColor by animateColorAsState(
targetValue = when {
!isConnected -> Gray
displayStatus == GoalStatus.Running -> Blue
displayStatus == GoalStatus.Failed -> Red
displayStatus == GoalStatus.Running -> Red
displayStatus == GoalStatus.Completed -> Blue
displayStatus == GoalStatus.Failed -> Gray
else -> Green
},
label = "dotColor"
label = "ringColor"
)
val statusText = when {
!isConnected -> "Offline"
displayStatus == GoalStatus.Running -> {
val last = steps.lastOrNull()
if (last != null) {
val label = last.reasoning.ifBlank {
// Extract just the action name from the JSON string
Regex("""action[=:]?\s*(\w+)""").find(last.action)?.groupValues?.get(1) ?: "working"
}
"${last.step}: $label"
} else "Running..."
}
displayStatus == GoalStatus.Completed -> "Done"
displayStatus == GoalStatus.Failed -> "Stopped"
else -> "Ready"
}
val isRunning = isConnected && displayStatus == GoalStatus.Running
Row(
modifier = Modifier
.clip(RoundedCornerShape(24.dp))
.background(PillBackground)
.height(48.dp)
.widthIn(min = 100.dp, max = 220.dp)
.padding(horizontal = 12.dp),
verticalAlignment = Alignment.CenterVertically,
horizontalArrangement = Arrangement.spacedBy(8.dp)
Box(
contentAlignment = Alignment.Center,
modifier = Modifier.size(52.dp)
) {
StatusDot(
color = dotColor,
pulse = isConnected && displayStatus == GoalStatus.Running
)
Text(
text = statusText,
color = Color.White,
fontSize = 13.sp,
maxLines = 1,
overflow = TextOverflow.Ellipsis,
modifier = Modifier.weight(1f, fill = false)
)
if (isConnected && displayStatus == GoalStatus.Running) {
IconButton(
onClick = { ConnectionService.instance?.stopGoal() },
modifier = Modifier.size(28.dp),
colors = IconButtonDefaults.iconButtonColors(
contentColor = Color.White.copy(alpha = 0.8f)
)
) {
Icon(
imageVector = Icons.Default.Close,
contentDescription = "Stop goal",
modifier = Modifier.size(16.dp)
)
}
}
}
}
}
@Composable
private fun StatusDot(color: Color, pulse: Boolean) {
if (pulse) {
val transition = rememberInfiniteTransition(label = "pulse")
val alpha by transition.animateFloat(
initialValue = 1f,
targetValue = 0.3f,
animationSpec = infiniteRepeatable(
animation = tween(800, easing = LinearEasing),
repeatMode = RepeatMode.Reverse
),
label = "pulseAlpha"
)
// Background circle
Box(
modifier = Modifier
.size(10.dp)
.alpha(alpha)
.size(52.dp)
.clip(CircleShape)
.background(color)
.background(IconBackground)
)
if (isRunning) {
// Spinning progress ring: the indeterminate indicator animates on its own
CircularProgressIndicator(
modifier = Modifier.size(52.dp),
color = ringColor,
strokeWidth = 3.dp,
strokeCap = StrokeCap.Round
)
} else {
Box(
modifier = Modifier
.size(10.dp)
.clip(CircleShape)
.background(color)
// Static colored ring
CircularProgressIndicator(
progress = { 1f },
modifier = Modifier.size(52.dp),
color = ringColor,
strokeWidth = 3.dp,
strokeCap = StrokeCap.Round
)
}
// App icon
Image(
painter = painterResource(R.drawable.ic_launcher_foreground),
contentDescription = "DroidClaw",
modifier = Modifier
.size(40.dp)
.clip(CircleShape)
)
}
}
}
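
Review note: the pill's ring colour is a function of connection state and goal status. Because the +/- markers are missing from this view, the sketch below assumes that the later of each duplicated `when` branch is the new version (Running is red, Completed is blue, Failed is gray). The `Ring` enum and `ringFor` function are hypothetical, and the local `GoalStatus` enum is a stand-in for the app's model class.

```kotlin
// Stand-ins for the app's model and colour constants; names are assumptions.
enum class GoalStatus { Idle, Running, Completed, Failed }
enum class Ring { Gray, Red, Blue, Green }

/** Branch order matters: a disconnected client is gray regardless of goal status. */
fun ringFor(isConnected: Boolean, status: GoalStatus): Ring = when {
    !isConnected -> Ring.Gray
    status == GoalStatus.Running -> Ring.Red
    status == GoalStatus.Completed -> Ring.Blue
    status == GoalStatus.Failed -> Ring.Gray
    else -> Ring.Green
}

fun main() {
    for (connected in listOf(true, false)) {
        for (status in GoalStatus.values()) {
            println("connected=$connected status=$status -> ${ringFor(connected, status)}")
        }
    }
}
```

Under this reading, a failed goal and an offline client share the same gray ring, which may or may not be intended.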

@@ -0,0 +1,128 @@
package com.thisux.droidclaw.overlay
import android.graphics.PixelFormat
import android.view.View
import android.view.WindowManager
import androidx.compose.animation.core.FastOutSlowInEasing
import androidx.compose.animation.core.RepeatMode
import androidx.compose.animation.core.animateFloat
import androidx.compose.animation.core.infiniteRepeatable
import androidx.compose.animation.core.rememberInfiniteTransition
import androidx.compose.animation.core.tween
import androidx.compose.foundation.Canvas
import androidx.compose.foundation.layout.fillMaxSize
import androidx.compose.runtime.Composable
import androidx.compose.runtime.getValue
import androidx.compose.ui.Modifier
import androidx.compose.ui.graphics.Brush
import androidx.compose.ui.graphics.Color
import androidx.compose.ui.graphics.drawscope.DrawScope
import androidx.compose.ui.platform.ComposeView
import androidx.lifecycle.Lifecycle
import androidx.lifecycle.LifecycleService
import androidx.lifecycle.setViewTreeLifecycleOwner
import androidx.savedstate.SavedStateRegistry
import androidx.savedstate.SavedStateRegistryController
import androidx.savedstate.SavedStateRegistryOwner
import androidx.savedstate.setViewTreeSavedStateRegistryOwner
private val CrimsonGlow = Color(0xFFC62828)
class VignetteOverlay(private val service: LifecycleService) {
private val windowManager = service.getSystemService(WindowManager::class.java)
private var composeView: ComposeView? = null
private val savedStateOwner = object : SavedStateRegistryOwner {
private val controller = SavedStateRegistryController.create(this)
override val lifecycle: Lifecycle get() = service.lifecycle
override val savedStateRegistry: SavedStateRegistry get() = controller.savedStateRegistry
init { controller.performRestore(null) }
}
private val layoutParams = WindowManager.LayoutParams(
WindowManager.LayoutParams.MATCH_PARENT,
WindowManager.LayoutParams.MATCH_PARENT,
WindowManager.LayoutParams.TYPE_APPLICATION_OVERLAY,
WindowManager.LayoutParams.FLAG_NOT_FOCUSABLE or
WindowManager.LayoutParams.FLAG_NOT_TOUCHABLE or
WindowManager.LayoutParams.FLAG_NOT_TOUCH_MODAL or
WindowManager.LayoutParams.FLAG_LAYOUT_IN_SCREEN,
PixelFormat.TRANSLUCENT
)
fun show() {
if (composeView != null) return
val view = ComposeView(service).apply {
importantForAccessibility = View.IMPORTANT_FOR_ACCESSIBILITY_NO_HIDE_DESCENDANTS
setViewTreeLifecycleOwner(service)
setViewTreeSavedStateRegistryOwner(savedStateOwner)
setContent { VignetteContent() }
}
composeView = view
windowManager.addView(view, layoutParams)
}
fun hide() {
composeView?.let { windowManager.removeView(it) }
composeView = null
}
fun destroy() = hide()
}
@Composable
private fun VignetteContent() {
val transition = rememberInfiniteTransition(label = "vignettePulse")
val alpha by transition.animateFloat(
initialValue = 0.5f,
targetValue = 1.0f,
animationSpec = infiniteRepeatable(
animation = tween(2200, easing = FastOutSlowInEasing),
repeatMode = RepeatMode.Reverse
),
label = "vignetteAlpha"
)
Canvas(modifier = Modifier.fillMaxSize()) {
drawVignette(alpha)
}
}
private fun DrawScope.drawVignette(alpha: Float) {
val edgeColor = CrimsonGlow.copy(alpha = 0.4f * alpha)
val glowWidth = size.minDimension * 0.35f
// Top edge
drawRect(
brush = Brush.verticalGradient(
colors = listOf(edgeColor, Color.Transparent),
startY = 0f,
endY = glowWidth
)
)
// Bottom edge
drawRect(
brush = Brush.verticalGradient(
colors = listOf(Color.Transparent, edgeColor),
startY = size.height - glowWidth,
endY = size.height
)
)
// Left edge
drawRect(
brush = Brush.horizontalGradient(
colors = listOf(edgeColor, Color.Transparent),
startX = 0f,
endX = glowWidth
)
)
// Right edge
drawRect(
brush = Brush.horizontalGradient(
colors = listOf(Color.Transparent, edgeColor),
startX = size.width - glowWidth,
endX = size.width
)
)
}
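
Review note: the vignette's visible strength follows from two constants in `drawVignette`: the edge opacity is 0.4 times the pulse value (which animates between 0.5 and 1.0), and the glow depth is 35% of the shorter screen side. The sketch below works those numbers out. The function names and the 1080 x 2400 px screen are assumptions for illustration.

```kotlin
import kotlin.math.min

/** Opacity of the glow at the screen edge for a pulse value in 0.5..1.0. */
fun edgeAlpha(pulse: Float): Float = 0.4f * pulse

/** Depth of the glow in pixels: 35% of the shorter screen side. */
fun glowWidth(widthPx: Float, heightPx: Float): Float = min(widthPx, heightPx) * 0.35f

fun main() {
    // Assumed example screen of 1080 x 2400 px.
    println("glow depth: ${glowWidth(1080f, 2400f)} px")
    println("edge alpha range: ${edgeAlpha(0.5f)} .. ${edgeAlpha(1.0f)}")
}
```

On such a screen the top and bottom glows each reach roughly 378 px inward, and the edge opacity pulses between about 0.2 and 0.4.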

@@ -0,0 +1,158 @@
package com.thisux.droidclaw.overlay
import androidx.compose.animation.core.LinearEasing
import androidx.compose.animation.core.RepeatMode
import androidx.compose.animation.core.animateFloat
import androidx.compose.animation.core.infiniteRepeatable
import androidx.compose.animation.core.rememberInfiniteTransition
import androidx.compose.animation.core.tween
import androidx.compose.foundation.background
import androidx.compose.foundation.layout.Arrangement
import androidx.compose.foundation.layout.Box
import androidx.compose.foundation.layout.Column
import androidx.compose.foundation.layout.Row
import androidx.compose.foundation.layout.Spacer
import androidx.compose.foundation.layout.fillMaxSize
import androidx.compose.foundation.layout.fillMaxWidth
import androidx.compose.foundation.layout.height
import androidx.compose.foundation.layout.padding
import androidx.compose.foundation.layout.size
import androidx.compose.foundation.rememberScrollState
import androidx.compose.foundation.shape.CircleShape
import androidx.compose.foundation.shape.RoundedCornerShape
import androidx.compose.foundation.verticalScroll
import androidx.compose.material.icons.Icons
import androidx.compose.material.icons.filled.Close
import androidx.compose.material.icons.filled.Send
import androidx.compose.material3.Button
import androidx.compose.material3.ButtonDefaults
import androidx.compose.material3.Icon
import androidx.compose.material3.OutlinedButton
import androidx.compose.material3.Text
import androidx.compose.runtime.Composable
import androidx.compose.runtime.LaunchedEffect
import androidx.compose.runtime.getValue
import androidx.compose.ui.Alignment
import androidx.compose.ui.Modifier
import androidx.compose.ui.draw.alpha
import androidx.compose.ui.draw.clip
import androidx.compose.ui.graphics.Color
import androidx.compose.ui.text.font.FontWeight
import androidx.compose.ui.text.style.TextAlign
import androidx.compose.ui.unit.dp
import androidx.compose.ui.unit.sp
private val AccentCrimson = Color(0xFFC62828)
private val PanelBackground = Color(0xCC1A1A1A)
@Composable
fun VoiceOverlayContent(
transcript: String,
onSend: () -> Unit,
onCancel: () -> Unit
) {
val scrollState = rememberScrollState()
LaunchedEffect(transcript) {
scrollState.animateScrollTo(scrollState.maxValue)
}
Box(
modifier = Modifier.fillMaxSize(),
contentAlignment = Alignment.BottomCenter
) {
Column(
modifier = Modifier
.fillMaxWidth()
.clip(RoundedCornerShape(topStart = 24.dp, topEnd = 24.dp))
.background(PanelBackground)
.padding(24.dp),
horizontalAlignment = Alignment.CenterHorizontally
) {
if (transcript.isEmpty()) {
ListeningIndicator()
Spacer(modifier = Modifier.height(16.dp))
Text(
text = "Listening...",
color = Color.White.copy(alpha = 0.6f),
fontSize = 16.sp
)
} else {
Text(
text = transcript,
color = Color.White,
fontSize = 24.sp,
fontWeight = FontWeight.Medium,
textAlign = TextAlign.Center,
lineHeight = 32.sp,
modifier = Modifier
.fillMaxWidth()
.height(160.dp)
.verticalScroll(scrollState)
.padding(horizontal = 8.dp)
)
}
Spacer(modifier = Modifier.height(24.dp))
Row(
modifier = Modifier.fillMaxWidth(),
horizontalArrangement = Arrangement.spacedBy(16.dp, Alignment.CenterHorizontally)
) {
OutlinedButton(
onClick = onCancel,
colors = ButtonDefaults.outlinedButtonColors(
contentColor = Color.White.copy(alpha = 0.7f)
),
modifier = Modifier.weight(1f)
) {
Icon(
imageVector = Icons.Default.Close,
contentDescription = null,
modifier = Modifier.size(18.dp)
)
Text(" Cancel", fontSize = 15.sp)
}
Button(
onClick = onSend,
enabled = transcript.isNotEmpty(),
colors = ButtonDefaults.buttonColors(
containerColor = AccentCrimson,
contentColor = Color.White
),
modifier = Modifier.weight(1f)
) {
Icon(
imageVector = Icons.Default.Send,
contentDescription = null,
modifier = Modifier.size(18.dp)
)
Text(" Send", fontSize = 15.sp)
}
}
}
}
}
@Composable
private fun ListeningIndicator() {
val transition = rememberInfiniteTransition(label = "listening")
val alpha by transition.animateFloat(
initialValue = 0.3f,
targetValue = 1f,
animationSpec = infiniteRepeatable(
animation = tween(800, easing = LinearEasing),
repeatMode = RepeatMode.Reverse
),
label = "pulseAlpha"
)
Box(
modifier = Modifier
.size(48.dp)
.alpha(alpha)
.clip(CircleShape)
.background(AccentCrimson)
)
}

@@ -0,0 +1,109 @@
package com.thisux.droidclaw.overlay
import android.Manifest
import android.content.Context
import android.content.pm.PackageManager
import android.media.AudioFormat
import android.media.AudioRecord
import android.media.MediaRecorder
import android.util.Base64
import android.util.Log
import androidx.core.content.ContextCompat
import kotlinx.coroutines.CoroutineScope
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.Job
import kotlinx.coroutines.isActive
import kotlinx.coroutines.launch
/**
* Captures audio from the microphone and streams base64-encoded PCM chunks.
*
* Audio format: 16kHz, mono, 16-bit PCM (linear16).
* Chunks are emitted every ~100ms via the [onChunk] callback.
*/
class VoiceRecorder(
private val scope: CoroutineScope,
private val onChunk: (base64: String) -> Unit
) {
companion object {
private const val TAG = "VoiceRecorder"
private const val SAMPLE_RATE = 16000
private const val CHANNEL_CONFIG = AudioFormat.CHANNEL_IN_MONO
private const val AUDIO_FORMAT = AudioFormat.ENCODING_PCM_16BIT
private const val CHUNK_SIZE = 3200 // ~100ms at 16kHz mono 16-bit
}
private var audioRecord: AudioRecord? = null
private var recordingJob: Job? = null
val isRecording: Boolean get() = recordingJob?.isActive == true
fun hasPermission(context: Context): Boolean {
return ContextCompat.checkSelfPermission(
context, Manifest.permission.RECORD_AUDIO
) == PackageManager.PERMISSION_GRANTED
}
fun start(): Boolean {
if (isRecording) return false
val bufferSize = maxOf(
AudioRecord.getMinBufferSize(SAMPLE_RATE, CHANNEL_CONFIG, AUDIO_FORMAT),
CHUNK_SIZE * 2
)
val record = try {
AudioRecord(
MediaRecorder.AudioSource.MIC,
SAMPLE_RATE,
CHANNEL_CONFIG,
AUDIO_FORMAT,
bufferSize
)
} catch (e: SecurityException) {
Log.e(TAG, "Missing RECORD_AUDIO permission", e)
return false
}
if (record.state != AudioRecord.STATE_INITIALIZED) {
Log.e(TAG, "AudioRecord failed to initialize")
record.release()
return false
}
audioRecord = record
record.startRecording()
recordingJob = scope.launch(Dispatchers.IO) {
val buffer = ByteArray(CHUNK_SIZE)
while (isActive) {
val bytesRead = record.read(buffer, 0, CHUNK_SIZE)
if (bytesRead > 0) {
val base64 = Base64.encodeToString(
buffer.copyOf(bytesRead),
Base64.NO_WRAP
)
onChunk(base64)
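} else if (bytesRead < 0) {
// Negative values are AudioRecord error codes; stop instead of spinning on a failed read.
Log.e(TAG, "AudioRecord.read failed with code $bytesRead")
break
}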
}
}
Log.i(TAG, "Recording started")
return true
}
fun stop() {
recordingJob?.cancel()
recordingJob = null
audioRecord?.let {
try {
it.stop()
it.release()
} catch (e: Exception) {
Log.w(TAG, "Error stopping AudioRecord", e)
}
}
audioRecord = null
Log.i(TAG, "Recording stopped")
}
}

@@ -45,16 +45,26 @@ import androidx.compose.ui.graphics.Color
import androidx.compose.ui.platform.LocalContext
import androidx.compose.ui.text.style.TextOverflow
import androidx.compose.ui.unit.dp
import com.thisux.droidclaw.DroidClawApp
import com.thisux.droidclaw.connection.ConnectionService
import com.thisux.droidclaw.model.AgentStep
import com.thisux.droidclaw.model.ConnectionState
import com.thisux.droidclaw.model.GoalStatus
import com.thisux.droidclaw.ui.theme.StatusGreen
import com.thisux.droidclaw.ui.theme.StatusRed
import androidx.compose.material3.Card
import androidx.compose.material3.CardDefaults
import java.text.SimpleDateFormat
import java.util.Date
import java.util.Locale
private val DEFAULT_SUGGESTIONS = listOf(
"Open WhatsApp and reply to the last message",
"Take a screenshot and save it",
"Turn on Do Not Disturb",
"Search for nearby restaurants on Maps"
)
// Represents a message in the chat timeline
private sealed class ChatItem {
data class GoalMessage(val text: String) : ChatItem()
@@ -65,13 +75,28 @@ private sealed class ChatItem {
@Composable
fun HomeScreen() {
val context = LocalContext.current
val app = context.applicationContext as DroidClawApp
val connectionState by ConnectionService.connectionState.collectAsState()
val goalStatus by ConnectionService.currentGoalStatus.collectAsState()
val steps by ConnectionService.currentSteps.collectAsState()
val currentGoal by ConnectionService.currentGoal.collectAsState()
val recentGoals by app.settingsStore.recentGoals.collectAsState(initial = emptyList())
var goalInput by remember { mutableStateOf("") }
val isConnected = connectionState == ConnectionState.Connected
val canSend = isConnected && goalStatus != GoalStatus.Running
val suggestions = remember(recentGoals) {
val combined = mutableListOf<String>()
combined.addAll(recentGoals.take(4))
for (default in DEFAULT_SUGGESTIONS) {
if (combined.size >= 4) break
if (default !in combined) combined.add(default)
}
combined.take(4)
}
// Build chat items: goal bubble → step bubbles → status bubble
val chatItems = remember(currentGoal, steps, goalStatus) {
buildList {
@@ -105,7 +130,10 @@ fun HomeScreen() {
.fillMaxWidth(),
contentAlignment = Alignment.Center
) {
Column(horizontalAlignment = Alignment.CenterHorizontally) {
Column(
horizontalAlignment = Alignment.CenterHorizontally,
modifier = Modifier.padding(horizontal = 20.dp)
) {
Text(
text = "What should I do?",
style = MaterialTheme.typography.headlineSmall,
@@ -117,6 +145,33 @@ fun HomeScreen() {
style = MaterialTheme.typography.bodyMedium,
color = MaterialTheme.colorScheme.onSurfaceVariant.copy(alpha = 0.35f)
)
Spacer(modifier = Modifier.height(24.dp))
Column(verticalArrangement = Arrangement.spacedBy(8.dp)) {
for (row in suggestions.chunked(2)) {
Row(
modifier = Modifier.fillMaxWidth(),
horizontalArrangement = Arrangement.spacedBy(8.dp)
) {
for (suggestion in row) {
SuggestionCard(
text = suggestion,
enabled = canSend,
onClick = {
val intent = Intent(context, ConnectionService::class.java).apply {
action = ConnectionService.ACTION_SEND_GOAL
putExtra(ConnectionService.EXTRA_GOAL, suggestion)
}
context.startService(intent)
},
modifier = Modifier.weight(1f)
)
}
if (row.size < 2) {
Spacer(modifier = Modifier.weight(1f))
}
}
}
}
}
}
} else {
@@ -366,6 +421,39 @@ private fun InputBar(
}
}
@Composable
private fun SuggestionCard(
text: String,
enabled: Boolean,
onClick: () -> Unit,
modifier: Modifier = Modifier
) {
Card(
onClick = onClick,
modifier = modifier.height(72.dp),
enabled = enabled,
shape = RoundedCornerShape(16.dp),
colors = CardDefaults.cardColors(
containerColor = MaterialTheme.colorScheme.surfaceVariant.copy(alpha = 0.4f)
)
) {
Box(
modifier = Modifier
.fillMaxSize()
.padding(12.dp),
contentAlignment = Alignment.CenterStart
) {
Text(
text = text,
style = MaterialTheme.typography.bodySmall,
color = MaterialTheme.colorScheme.onSurface,
maxLines = 3,
overflow = TextOverflow.Ellipsis
)
}
}
}
private fun formatTime(timestamp: Long): String {
val sdf = SimpleDateFormat("HH:mm", Locale.getDefault())
return sdf.format(Date(timestamp))

@@ -1,10 +1,12 @@
package com.thisux.droidclaw.ui.screens
import android.app.Activity
import android.app.role.RoleManager
import android.content.Context
import android.content.Intent
import android.media.projection.MediaProjectionManager
import android.net.Uri
import android.os.Build
import android.provider.Settings
import androidx.activity.compose.rememberLauncherForActivityResult
import androidx.activity.result.contract.ActivityResultContracts
@@ -32,6 +34,7 @@ import androidx.compose.material3.MaterialTheme
import androidx.compose.material3.OutlinedButton
import androidx.compose.material3.OutlinedTextField
import androidx.compose.material3.Text
import androidx.compose.material3.TextButton
import androidx.compose.runtime.Composable
import androidx.compose.runtime.DisposableEffect
import androidx.compose.runtime.collectAsState
@@ -99,10 +102,22 @@ fun OnboardingScreen(onComplete: () -> Unit) {
}
)
1 -> OnboardingStepTwo(
onGetStarted = {
onContinue = { currentStep = 2 }
)
2 -> OnboardingStepAssistant(
onContinue = {
scope.launch {
app.settingsStore.setHasOnboarded(true)
val intent = Intent(context, ConnectionService::class.java).apply {
action = ConnectionService.ACTION_CONNECT
}
context.startForegroundService(intent)
onComplete()
}
},
onSkip = {
scope.launch {
app.settingsStore.setHasOnboarded(true)
// Auto-connect
val intent = Intent(context, ConnectionService::class.java).apply {
action = ConnectionService.ACTION_CONNECT
}
@@ -191,7 +206,7 @@ private fun OnboardingStepOne(
}
@Composable
private fun OnboardingStepTwo(onGetStarted: () -> Unit) {
private fun OnboardingStepTwo(onContinue: () -> Unit) {
val context = LocalContext.current
val isCaptureAvailable by ScreenCaptureManager.isAvailable.collectAsState()
@@ -312,14 +327,14 @@ private fun OnboardingStepTwo(onGetStarted: () -> Unit) {
Spacer(modifier = Modifier.height(32.dp))
Button(
onClick = onGetStarted,
onClick = onContinue,
enabled = allGranted,
modifier = Modifier
.fillMaxWidth()
.height(52.dp),
shape = RoundedCornerShape(12.dp)
) {
Text("Get Started", style = MaterialTheme.typography.labelLarge)
Text("Continue", style = MaterialTheme.typography.labelLarge)
}
if (!allGranted) {
@@ -335,6 +350,92 @@ private fun OnboardingStepTwo(onGetStarted: () -> Unit) {
}
}
@Composable
private fun OnboardingStepAssistant(
onContinue: () -> Unit,
onSkip: () -> Unit
) {
val context = LocalContext.current
var isDefaultAssistant by remember {
mutableStateOf(
if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.Q) {
val rm = context.getSystemService(Context.ROLE_SERVICE) as RoleManager
rm.isRoleHeld(RoleManager.ROLE_ASSISTANT)
} else false
)
}
val lifecycleOwner = LocalLifecycleOwner.current
DisposableEffect(lifecycleOwner) {
val observer = LifecycleEventObserver { _, event ->
if (event == Lifecycle.Event.ON_RESUME) {
isDefaultAssistant = if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.Q) {
val rm = context.getSystemService(Context.ROLE_SERVICE) as RoleManager
rm.isRoleHeld(RoleManager.ROLE_ASSISTANT)
} else false
}
}
lifecycleOwner.lifecycle.addObserver(observer)
onDispose { lifecycleOwner.lifecycle.removeObserver(observer) }
}
Column(
modifier = Modifier
.fillMaxSize()
.padding(horizontal = 24.dp, vertical = 48.dp)
.verticalScroll(rememberScrollState()),
horizontalAlignment = Alignment.CenterHorizontally
) {
Spacer(modifier = Modifier.height(32.dp))
Text(
text = "Digital Assistant",
style = MaterialTheme.typography.headlineMedium,
color = MaterialTheme.colorScheme.onBackground
)
Spacer(modifier = Modifier.height(8.dp))
Text(
text = "Set DroidClaw as your default digital assistant to invoke it with a long-press on the home button",
style = MaterialTheme.typography.bodyMedium,
color = MaterialTheme.colorScheme.onSurfaceVariant,
textAlign = TextAlign.Center
)
Spacer(modifier = Modifier.height(32.dp))
OnboardingChecklistItem(
label = "Default Digital Assistant",
description = "Long-press home to open DroidClaw command panel",
isOk = isDefaultAssistant,
actionLabel = "Set",
onAction = {
context.startActivity(Intent(Settings.ACTION_VOICE_INPUT_SETTINGS))
}
)
Spacer(modifier = Modifier.height(32.dp))
Button(
onClick = onContinue,
modifier = Modifier
.fillMaxWidth()
.height(52.dp),
shape = RoundedCornerShape(12.dp)
) {
Text("Get Started", style = MaterialTheme.typography.labelLarge)
}
Spacer(modifier = Modifier.height(8.dp))
TextButton(onClick = onSkip) {
Text("Skip for now")
}
}
}
@Composable
private fun OnboardingChecklistItem(
label: String,

@@ -1,10 +1,12 @@
package com.thisux.droidclaw.ui.screens
import android.app.Activity
import android.app.role.RoleManager
import android.content.Context
import android.content.Intent
import android.media.projection.MediaProjectionManager
import android.net.Uri
import android.os.Build
import android.provider.Settings
import androidx.activity.compose.rememberLauncherForActivityResult
import androidx.activity.result.contract.ActivityResultContracts
@@ -96,6 +98,14 @@ fun SettingsScreen() {
var hasOverlayPermission by remember {
mutableStateOf(Settings.canDrawOverlays(context))
}
var isDefaultAssistant by remember {
mutableStateOf(
if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.Q) {
val rm = context.getSystemService(Context.ROLE_SERVICE) as RoleManager
rm.isRoleHeld(RoleManager.ROLE_ASSISTANT)
} else false
)
}
val lifecycleOwner = LocalLifecycleOwner.current
DisposableEffect(lifecycleOwner) {
@@ -106,6 +116,10 @@ fun SettingsScreen() {
hasCaptureConsent = isCaptureAvailable || ScreenCaptureManager.hasConsent()
isBatteryExempt = BatteryOptimization.isIgnoringBatteryOptimizations(context)
hasOverlayPermission = Settings.canDrawOverlays(context)
isDefaultAssistant = if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.Q) {
val rm = context.getSystemService(Context.ROLE_SERVICE) as RoleManager
rm.isRoleHeld(RoleManager.ROLE_ASSISTANT)
} else false
}
}
lifecycleOwner.lifecycle.addObserver(observer)
@@ -300,6 +314,17 @@ fun SettingsScreen() {
}
)
ChecklistItem(
label = "Default digital assistant",
isOk = isDefaultAssistant,
actionLabel = "Set",
onAction = {
context.startActivity(
Intent(Settings.ACTION_VOICE_INPUT_SETTINGS)
)
}
)
Spacer(modifier = Modifier.height(16.dp))
}
}

@@ -0,0 +1,10 @@
package com.thisux.droidclaw.voice
import android.content.Intent
import android.speech.RecognitionService
class DroidClawRecognitionService : RecognitionService() {
override fun onStartListening(intent: Intent?, callback: Callback?) {}
override fun onCancel(callback: Callback?) {}
override fun onStopListening(callback: Callback?) {}
}

@@ -0,0 +1,5 @@
package com.thisux.droidclaw.voice
import android.service.voice.VoiceInteractionService
class DroidClawVoiceInteractionService : VoiceInteractionService()

@@ -0,0 +1,19 @@
package com.thisux.droidclaw.voice
import android.content.Context
import android.content.Intent
import android.os.Bundle
import android.service.voice.VoiceInteractionSession
import com.thisux.droidclaw.connection.ConnectionService
class DroidClawVoiceSession(context: Context) : VoiceInteractionSession(context) {
override fun onShow(args: Bundle?, showFlags: Int) {
super.onShow(args, showFlags)
val intent = Intent(context, ConnectionService::class.java).apply {
action = ConnectionService.ACTION_SHOW_COMMAND_PANEL
}
context.startService(intent)
hide()
}
}

@@ -0,0 +1,11 @@
package com.thisux.droidclaw.voice
import android.os.Bundle
import android.service.voice.VoiceInteractionSession
import android.service.voice.VoiceInteractionSessionService
class DroidClawVoiceSessionService : VoiceInteractionSessionService() {
override fun onNewSession(args: Bundle?): VoiceInteractionSession {
return DroidClawVoiceSession(this)
}
}

@@ -0,0 +1,158 @@
package com.thisux.droidclaw.voice
import androidx.compose.foundation.background
import androidx.compose.foundation.clickable
import androidx.compose.foundation.interaction.MutableInteractionSource
import androidx.compose.foundation.layout.Arrangement
import androidx.compose.foundation.layout.Box
import androidx.compose.foundation.layout.Column
import androidx.compose.foundation.layout.Row
import androidx.compose.foundation.layout.fillMaxSize
import androidx.compose.foundation.layout.fillMaxWidth
import androidx.compose.foundation.layout.height
import androidx.compose.foundation.layout.padding
import androidx.compose.foundation.layout.width
import androidx.compose.foundation.shape.RoundedCornerShape
import androidx.compose.material.icons.Icons
import androidx.compose.material.icons.automirrored.filled.Send
import androidx.compose.material3.IconButton
import androidx.compose.material3.IconButtonDefaults
import androidx.compose.material3.Icon
import androidx.compose.material3.MaterialTheme
import androidx.compose.material3.Surface
import androidx.compose.material3.Text
import androidx.compose.material3.TextField
import androidx.compose.material3.TextFieldDefaults
import androidx.compose.runtime.Composable
import androidx.compose.runtime.getValue
import androidx.compose.runtime.mutableStateOf
import androidx.compose.runtime.remember
import androidx.compose.runtime.setValue
import androidx.compose.runtime.collectAsState
import androidx.compose.ui.Alignment
import androidx.compose.ui.Modifier
import androidx.compose.ui.draw.clip
import androidx.compose.ui.graphics.Color
import androidx.compose.ui.unit.dp
import com.thisux.droidclaw.connection.ConnectionService
import com.thisux.droidclaw.model.ConnectionState
import com.thisux.droidclaw.model.GoalStatus
@Composable
fun GoalInputSheet(
onSubmit: (String) -> Unit,
onDismiss: () -> Unit
) {
val connectionState by ConnectionService.connectionState.collectAsState()
val goalStatus by ConnectionService.currentGoalStatus.collectAsState()
val isConnected = connectionState == ConnectionState.Connected
val isRunning = goalStatus == GoalStatus.Running
val canSend = isConnected && !isRunning
var text by remember { mutableStateOf("") }
Box(
modifier = Modifier
.fillMaxSize()
.background(Color.Black.copy(alpha = 0.4f))
.clickable(
indication = null,
interactionSource = remember { MutableInteractionSource() }
) { onDismiss() }
) {
Surface(
modifier = Modifier
.fillMaxWidth()
.align(Alignment.BottomCenter)
.clickable(
indication = null,
interactionSource = remember { MutableInteractionSource() }
) { /* consume clicks so they don't dismiss */ },
shape = RoundedCornerShape(topStart = 20.dp, topEnd = 20.dp),
tonalElevation = 6.dp,
color = MaterialTheme.colorScheme.surface
) {
Column(
modifier = Modifier
.padding(horizontal = 16.dp)
.padding(top = 12.dp, bottom = 24.dp),
verticalArrangement = Arrangement.spacedBy(12.dp)
) {
// Drag handle
Box(
modifier = Modifier
.width(40.dp)
.height(4.dp)
.clip(RoundedCornerShape(2.dp))
.background(
MaterialTheme.colorScheme.onSurfaceVariant.copy(alpha = 0.3f)
)
.align(Alignment.CenterHorizontally)
)
Text(
text = "What should I do?",
style = MaterialTheme.typography.titleMedium,
color = MaterialTheme.colorScheme.onSurface
)
Row(
verticalAlignment = Alignment.CenterVertically,
horizontalArrangement = Arrangement.spacedBy(8.dp)
) {
TextField(
value = text,
onValueChange = { text = it },
placeholder = {
Text(
when {
!isConnected -> "Not connected"
isRunning -> "Agent is working..."
else -> "Enter a goal..."
},
style = MaterialTheme.typography.bodyMedium
)
},
modifier = Modifier.weight(1f),
enabled = canSend,
singleLine = true,
shape = RoundedCornerShape(24.dp),
colors = TextFieldDefaults.colors(
focusedContainerColor = MaterialTheme.colorScheme.surfaceVariant
.copy(alpha = 0.3f),
unfocusedContainerColor = MaterialTheme.colorScheme.surfaceVariant
.copy(alpha = 0.2f),
disabledContainerColor = MaterialTheme.colorScheme.surfaceVariant
.copy(alpha = 0.1f),
focusedIndicatorColor = Color.Transparent,
unfocusedIndicatorColor = Color.Transparent,
disabledIndicatorColor = Color.Transparent
)
)
IconButton(
onClick = {
if (text.isNotBlank()) onSubmit(text.trim())
},
enabled = canSend && text.isNotBlank(),
colors = IconButtonDefaults.iconButtonColors(
containerColor = if (canSend && text.isNotBlank())
MaterialTheme.colorScheme.primary
else Color.Transparent
)
) {
Icon(
Icons.AutoMirrored.Filled.Send,
contentDescription = "Send goal",
tint = if (canSend && text.isNotBlank())
MaterialTheme.colorScheme.onPrimary
else
MaterialTheme.colorScheme.onSurfaceVariant.copy(alpha = 0.3f)
)
}
}
}
}
}
}

@@ -0,0 +1,7 @@
<?xml version="1.0" encoding="utf-8"?>
<voice-interaction-service xmlns:android="http://schemas.android.com/apk/res/android"
android:sessionService="com.thisux.droidclaw.voice.DroidClawVoiceSessionService"
android:recognitionService="com.thisux.droidclaw.voice.DroidClawRecognitionService"
android:settingsActivity="com.thisux.droidclaw.MainActivity"
android:supportsAssist="true"
android:supportsLaunchVoiceAssistFromKeyguard="false" />

@@ -0,0 +1,147 @@
# Voice Overlay — Design Document
**Date:** 2026-02-20
**Status:** Approved
**Approach:** Stream audio over existing WebSocket (Approach A)
## Overview
Add a voice-activated overlay to DroidClaw's Android app. User taps the floating pill → full-screen glowing gradient border appears → speech is streamed to the server for real-time transcription → live text appears on screen → tap Send to execute as a goal.
## User Flow
```
[IDLE] → tap pill → [LISTENING] → tap send → [EXECUTING] → done → [IDLE]
                         ↓
                    tap cancel
                         ↓
                      [IDLE]
```
### States
**IDLE** — Existing floating pill: `● Ready`, draggable, tappable.
**LISTENING** — Pill disappears. Full-screen overlay:
- Animated gradient border around all 4 screen edges (purple → blue → cyan → green cycle, ~3s)
- Large transcribed text in center, updating live word-by-word
- Bottom: `Send` (primary) + `Cancel` (secondary) buttons
- Audio recording starts immediately on transition
**EXECUTING** — Overlay collapses back to pill. Pill shows agent progress as today.
**IDLE (post-completion)** — Pill shows `● Done` for 3s, then `● Ready`.
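
The state flow above can be sketched as a small transition table. This is only an illustration in TypeScript, not the app's actual `AgentOverlay` code (which is Kotlin). The names `OverlayState`, `OverlayEvent`, and `nextState` are hypothetical, and the `goal_failed` transition back to idle is an assumption taken from the commit note about returning the overlay to idle on failure.

```typescript
// Hypothetical names for illustration; they do not appear in the DroidClaw sources.
type OverlayState = "idle" | "listening" | "executing";
type OverlayEvent = "tap_pill" | "tap_send" | "tap_cancel" | "goal_done" | "goal_failed";

// Allowed transitions per state. Any event not listed for a state is ignored.
const transitions: Record<OverlayState, Partial<Record<OverlayEvent, OverlayState>>> = {
  idle: { tap_pill: "listening" },
  listening: { tap_send: "executing", tap_cancel: "idle" },
  executing: { goal_done: "idle", goal_failed: "idle" },
};

// Returns the next state, or the current state when the event does not apply.
function nextState(state: OverlayState, event: OverlayEvent): OverlayState {
  return transitions[state][event] ?? state;
}
```

Keeping the table as data makes it easy to see that Send and Cancel only have an effect while listening, so a stray tap in another state cannot start or abort a goal.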
## Audio Streaming Protocol
### Android → Server
| Message | Description |
|---------|-------------|
| `{type: "voice_start"}` | Recording begun |
| `{type: "voice_chunk", data: "<base64>"}` | ~100ms PCM chunks, 16kHz mono 16-bit |
| `{type: "voice_stop", action: "send"}` | User tapped Send — finalize & execute goal |
| `{type: "voice_stop", action: "cancel"}` | User tapped Cancel — discard |
### Server → Android
| Message | Description |
|---------|-------------|
| `{type: "transcript_partial", text: "..."}` | Live streaming partial transcript |
| `{type: "transcript_final", text: "..."}` | Final complete transcript |
### Flow
1. Android sends `voice_start` → server opens streaming connection to Groq Whisper
2. Android streams `voice_chunk` every ~100ms → server pipes PCM to Groq
3. Groq sends partial transcriptions → server relays as `transcript_partial`
4. User taps Send → Android sends `voice_stop` with `action: "send"`
5. Server flushes final audio → gets `transcript_final` → sends to Android → fires goal into agent loop
6. Cancel: `voice_stop` with `action: "cancel"` → server discards Groq session, no goal
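
The message flow above can be sketched as a per-device session tracker. This is a simplified sketch under stated assumptions: the `VoiceMessage` shapes follow the tables above, while `Outcome` and `VoiceFlow` are hypothetical names, and the actual transcription call is left out entirely.

```typescript
// Device-to-server voice messages, as listed in the protocol tables.
type VoiceMessage =
  | { type: "voice_start" }
  | { type: "voice_chunk"; data: string }
  | { type: "voice_stop"; action: "send" | "cancel" };

// Hypothetical result type describing what the server should do next.
type Outcome =
  | { kind: "ignored" }
  | { kind: "buffering"; chunkCount: number }
  | { kind: "finalize"; chunks: string[] }
  | { kind: "discarded" };

// Hypothetical helper that tracks a single device's voice session.
class VoiceFlow {
  private chunks: string[] | null = null; // null means no session is open

  handle(msg: VoiceMessage): Outcome {
    switch (msg.type) {
      case "voice_start":
        this.chunks = []; // a new start replaces any stale session
        return { kind: "buffering", chunkCount: 0 };
      case "voice_chunk":
        if (this.chunks === null) return { kind: "ignored" }; // chunk without a start
        this.chunks.push(msg.data);
        return { kind: "buffering", chunkCount: this.chunks.length };
      case "voice_stop": {
        if (this.chunks === null) return { kind: "ignored" };
        const chunks = this.chunks;
        this.chunks = null; // the session ends on either action
        return msg.action === "send" ? { kind: "finalize", chunks } : { kind: "discarded" };
      }
    }
  }
}
```

Only the `finalize` outcome would lead to a `transcript_final` message and a goal; `discarded` and `ignored` produce no goal.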
### Audio Format
- Sample rate: 16kHz
- Channels: mono
- Bit depth: 16-bit PCM (linear16)
- Bandwidth: ~32 KB/s of raw PCM (~43 KB/s once base64-encoded)
- Encoding for WebSocket: base64 text frames
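
The figures in this list follow from simple arithmetic, worked through below. The constant names are illustrative, and the calculation ignores the JSON envelope around each chunk, so real traffic is slightly higher.

```typescript
// Audio parameters from the list above.
const SAMPLE_RATE_HZ = 16_000;
const CHANNEL_COUNT = 1;
const BYTES_PER_SAMPLE = 2; // 16-bit PCM
const CHUNK_MS = 100;

// Raw PCM throughput and the size of one ~100ms chunk.
const bytesPerSecond = SAMPLE_RATE_HZ * CHANNEL_COUNT * BYTES_PER_SAMPLE;
const bytesPerChunk = (bytesPerSecond * CHUNK_MS) / 1000;

// Padded base64 emits 4 characters for every 3 input bytes, rounded up.
function base64Length(byteCount: number): number {
  return 4 * Math.ceil(byteCount / 3);
}

const base64CharsPerChunk = base64Length(bytesPerChunk);
const base64CharsPerSecond = base64CharsPerChunk * (1000 / CHUNK_MS);

console.log({ bytesPerSecond, bytesPerChunk, base64CharsPerChunk, base64CharsPerSecond });
```

This gives 32,000 bytes/s of raw audio, 3,200 bytes per chunk, and 4,268 base64 characters per chunk, or roughly 43 KB/s on the wire before message framing.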
## Full-Screen Gradient Overlay
Two separate overlay layers managed by `AgentOverlay`:
### Layer 1 — Gradient Border (non-interactive)
- `TYPE_APPLICATION_OVERLAY` with `FLAG_NOT_TOUCHABLE | FLAG_NOT_FOCUSABLE`
- `MATCH_PARENT` — covers entire screen
- Compose renders animated gradient strips (~6dp) along all 4 edges
- Colors: purple → blue → cyan → green → purple, infinite rotation ~3s cycle
- Implementation: `drawBehind` modifier with 4 `LinearGradient` brushes, animated offset via `rememberInfiniteTransition`
- Center is fully transparent — pass-through to apps behind
### Layer 2 — Text + Buttons (interactive)
- `TYPE_APPLICATION_OVERLAY` with `FLAG_NOT_FOCUSABLE` (tappable, no keyboard steal)
- Positioned at bottom ~40% of screen
- Semi-transparent dark background `Color(0xCC000000)`
- Contents:
  - Transcribed text: 24-28sp, white, center-aligned, auto-scrolls
  - Subtle pulse/waveform animation while listening
  - Bottom row: `Send` button (accent) + `Cancel` button (muted)
### Why Two Layers
Android overlays cannot be partially touchable. The gradient border must be `FLAG_NOT_TOUCHABLE` (pass-through) while the text/button area must be tappable. Separate `WindowManager` views with different flags solve this.
## Server-Side STT Handler
New file: `src/voice.ts`
### Responsibilities
- On `voice_start`: open Groq Whisper streaming connection
- On `voice_chunk`: pipe decoded PCM to Groq stream
- On `voice_stop` (send): flush stream, get final transcript, trigger `runAgent()` with transcript as goal
- On `voice_stop` (cancel): close Groq stream, discard
### Fallback
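If Groq streaming is unavailable, buffer all chunks server-side and, on `voice_stop`, send the complete audio as a single Whisper API call. There are no live partial words in this mode: the final text appears all at once. Because this path needs only the standard, non-streaming transcription call, it is the baseline that keeps voice input working when streaming does not.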
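
For the buffered fallback, the raw PCM chunks have to be joined and packaged before upload. The sketch below assumes the transcription endpoint expects a WAV file, which this document does not state; `joinChunks` and `pcmToWav` are hypothetical helper names, and the header layout is the standard 44-byte RIFF/WAVE PCM header.

```typescript
// Hypothetical helper: join buffered PCM chunks into one contiguous array.
function joinChunks(chunks: Uint8Array[]): Uint8Array {
  const total = chunks.reduce((sum, c) => sum + c.length, 0);
  const out = new Uint8Array(total);
  let offset = 0;
  for (const c of chunks) {
    out.set(c, offset);
    offset += c.length;
  }
  return out;
}

// Hypothetical helper: prepend a 44-byte RIFF/WAVE header to raw PCM.
function pcmToWav(pcm: Uint8Array, sampleRate = 16_000, channels = 1, bitsPerSample = 16): Uint8Array {
  const blockAlign = (channels * bitsPerSample) / 8; // bytes per sample frame
  const byteRate = sampleRate * blockAlign;
  const wav = new Uint8Array(44 + pcm.length);
  const view = new DataView(wav.buffer);
  const writeAscii = (offset: number, text: string): void => {
    for (let i = 0; i < text.length; i++) view.setUint8(offset + i, text.charCodeAt(i));
  };

  writeAscii(0, "RIFF");
  view.setUint32(4, 36 + pcm.length, true); // file size minus the first 8 bytes
  writeAscii(8, "WAVE");
  writeAscii(12, "fmt ");
  view.setUint32(16, 16, true); // size of the fmt chunk body
  view.setUint16(20, 1, true); // audio format 1 = uncompressed PCM
  view.setUint16(22, channels, true);
  view.setUint32(24, sampleRate, true);
  view.setUint32(28, byteRate, true);
  view.setUint16(32, blockAlign, true);
  view.setUint16(34, bitsPerSample, true);
  writeAscii(36, "data");
  view.setUint32(40, pcm.length, true); // size of the sample data
  wav.set(pcm, 44);
  return wav;
}
```

A caller would run `pcmToWav(joinChunks(bufferedChunks))` once on `voice_stop` and upload the result; all multi-byte header fields are little-endian, as the WAVE format requires.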
### Goal Execution
After `transcript_final`, call existing `runAgent()` from `kernel.ts` — identical to web dashboard goals. No changes to agent loop.
## Files Changed
| File | Change | Scope |
|------|--------|-------|
| `android/.../AndroidManifest.xml` | Add `RECORD_AUDIO` permission | Minor |
| `android/.../overlay/AgentOverlay.kt` | State machine: idle/listening/executing, manage 2 overlay layers | Major |
| `android/.../overlay/OverlayContent.kt` | New composables: `GradientBorder`, `VoiceOverlayContent`, `LiveTranscriptText` | Major |
| `android/.../overlay/VoiceRecorder.kt` | **New file.** `AudioRecord` capture + chunked base64 streaming | New |
| `android/.../connection/ConnectionService.kt` | Handle voice messages, route transcript events to overlay | Medium |
| `android/.../model/Protocol.kt` | New message data classes for voice protocol | Minor |
| `src/voice.ts` | **New file.** Groq Whisper streaming STT handler | New |
| `src/kernel.ts` | Route voice WebSocket messages to `voice.ts` | Minor |
### Untouched
`actions.ts`, `skills.ts`, `workflow.ts`, `sanitizer.ts`, `llm-providers.ts`, `config.ts`, `constants.ts`
## Permissions
- `RECORD_AUDIO` — new runtime permission, requested on first voice activation
- `SYSTEM_ALERT_WINDOW` — already granted (existing overlay)
- `INTERNET` — already granted
## Difficulty Assessment
**Overall: Medium.** Estimated 3-4 days.
- Android `AudioRecord` → WebSocket streaming: well-documented, straightforward
- Full-screen gradient overlay animation: standard Compose `Canvas` + `rememberInfiniteTransition`
- Groq Whisper streaming API: documented, Bun handles WebSocket/HTTP streaming natively
- Two-layer overlay management: minor complexity in `AgentOverlay` state machine
- No risky unknowns — all components have clear precedents

File diff suppressed because it is too large.

@@ -8,7 +8,11 @@ export type DeviceMessage =
| { type: "pong" }
| { type: "heartbeat"; batteryLevel: number; isCharging: boolean }
| { type: "apps"; apps: InstalledApp[] }
| { type: "stop_goal" };
| { type: "stop_goal" }
// Voice overlay
| { type: "voice_start" }
| { type: "voice_chunk"; data: string }
| { type: "voice_stop"; action: "send" | "cancel" };
export type ServerToDeviceMessage =
| { type: "auth_ok"; deviceId: string }
@@ -35,7 +39,10 @@ export type ServerToDeviceMessage =
| { type: "intent"; requestId: string; intentAction: string; intentUri?: string; intentType?: string; intentExtras?: Record<string, string>; packageName?: string }
| { type: "ping" }
| { type: "goal_started"; sessionId: string; goal: string }
| { type: "goal_completed"; sessionId: string; success: boolean; stepsUsed: number };
| { type: "goal_completed"; sessionId: string; success: boolean; stepsUsed: number }
// Voice overlay
| { type: "transcript_partial"; text: string }
| { type: "transcript_final"; text: string };
export type DashboardMessage =
| { type: "device_online"; deviceId: string; name: string }

@@ -55,12 +55,18 @@ license.post("/activate", async (c) => {
// Determine plan from benefit ID or default to "ltd"
const plan = "ltd";
// Activate the key (tracks activation count on Polar's side)
// Activate the key (may fail if already activated from previous attempt)
try {
await polar.licenseKeys.activate({
key,
organizationId: env.POLAR_ORGANIZATION_ID,
label: `${currentUser.email}`,
});
} catch (activateErr) {
const msg = activateErr instanceof Error ? activateErr.message : String(activateErr);
if (!msg.includes("limit")) throw activateErr;
console.log(`[License] Key already activated for ${currentUser.email}, storing anyway`);
}
// Store on user record
await db
@@ -140,12 +146,19 @@ license.post("/activate-checkout", async (c) => {
return c.json({ error: "No license key found for this purchase" }, 400);
}
// 3. Activate the key
// 3. Activate the key (may fail if already activated from previous attempt)
try {
await polar.licenseKeys.activate({
key: customerKey.key,
organizationId: env.POLAR_ORGANIZATION_ID,
label: `${currentUser.email}`,
});
} catch (activateErr) {
const msg = activateErr instanceof Error ? activateErr.message : String(activateErr);
if (!msg.includes("limit")) throw activateErr;
// Limit reached = key was already activated, that's fine — proceed to store
console.log(`[License] Key already activated for ${currentUser.email}, storing anyway`);
}
// 4. Store on user record
await db
@@ -161,6 +174,14 @@ license.post("/activate-checkout", async (c) => {
} catch (err) {
const message = err instanceof Error ? err.message : String(err);
console.error(`[License] Checkout activation failed for ${currentUser.email}:`, message);
if (message.includes("limit")) {
return c.json({ error: "License key activation limit reached" }, 400);
}
if (message.includes("not found") || message.includes("invalid")) {
return c.json({ error: "Invalid or expired checkout" }, 400);
}
return c.json({ error: "Failed to activate from checkout" }, 500);
}
});

@@ -6,6 +6,12 @@ import { apikey, llmConfig, device } from "../schema.js";
import { sessions, type WebSocketData } from "./sessions.js";
import { runPipeline } from "../agent/pipeline.js";
import type { LLMConfig } from "../agent/llm.js";
import {
handleVoiceStart,
handleVoiceChunk,
handleVoiceSend,
handleVoiceCancel,
} from "./voice.js";
/**
* Hash an API key the same way better-auth does:
@@ -361,6 +367,110 @@ export async function handleDeviceMessage(
break;
}
case "voice_start": {
const deviceId = ws.data.deviceId!;
const userId = ws.data.userId!;
// Fetch user's LLM config to get API key for Groq Whisper
const configs = await db
.select()
.from(llmConfig)
.where(eq(llmConfig.userId, userId))
.limit(1);
if (configs.length === 0 || !configs[0].apiKey) {
sendToDevice(ws, { type: "transcript_final", text: "" });
break;
}
handleVoiceStart(ws, deviceId, configs[0].apiKey);
break;
}
case "voice_chunk": {
const deviceId = ws.data.deviceId!;
handleVoiceChunk(deviceId, (msg as unknown as { data: string }).data);
break;
}
case "voice_stop": {
const deviceId = ws.data.deviceId!;
const userId = ws.data.userId!;
const voiceAction = (msg as unknown as { action: string }).action;
if (voiceAction === "cancel") {
handleVoiceCancel(deviceId);
break;
}
// action === "send" — finalize and fire goal
const configs = await db
.select()
.from(llmConfig)
.where(eq(llmConfig.userId, userId))
.limit(1);
if (configs.length === 0 || !configs[0].apiKey) {
handleVoiceCancel(deviceId);
sendToDevice(ws, { type: "transcript_final", text: "" });
break;
}
const groqKey = configs[0].apiKey;
const transcript = await handleVoiceSend(ws, deviceId, groqKey);
if (transcript) {
const persistentDeviceId = ws.data.persistentDeviceId!;
if (activeSessions.has(deviceId)) {
sendToDevice(ws, { type: "goal_failed", message: "Agent already running" });
break;
}
const userLlmConfig: LLMConfig = {
provider: configs[0].provider,
apiKey: configs[0].apiKey,
model: configs[0].model ?? undefined,
};
console.log(`[Pipeline] Starting voice goal for device ${deviceId}: ${transcript}`);
const abortController = new AbortController();
activeSessions.set(deviceId, { goal: transcript, abort: abortController });
sendToDevice(ws, { type: "goal_started", sessionId: deviceId, goal: transcript });
runPipeline({
deviceId,
persistentDeviceId,
userId,
goal: transcript,
llmConfig: userLlmConfig,
signal: abortController.signal,
onStep(step) {
sendToDevice(ws, {
type: "step",
step: step.stepNumber,
action: step.action,
reasoning: step.reasoning,
});
},
onComplete(result) {
activeSessions.delete(deviceId);
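sendToDevice(ws, {
type: "goal_completed",
// ServerToDeviceMessage declares sessionId on goal_completed; match the value sent in goal_started.
sessionId: deviceId,
success: result.success,
stepsUsed: result.stepsUsed,
});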
},
}).catch((err) => {
activeSessions.delete(deviceId);
sendToDevice(ws, { type: "goal_failed", message: String(err) });
});
}
break;
}
default: {
console.warn(
`Unknown message type from device ${ws.data.deviceId}:`,
@@ -384,6 +494,7 @@ export function handleDeviceClose(
active.abort.abort();
activeSessions.delete(deviceId);
}
handleVoiceCancel(deviceId);
sessions.removeDevice(deviceId);
// Update device status in DB
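
The `voice_stop` branch above allows at most one running goal per device: it rejects a new goal while `activeSessions` holds an entry, and clears the entry on both completion and failure. A minimal sketch of that bookkeeping follows, assuming a hypothetical `SessionRegistry` class that is not part of this change set (the diff uses a bare `Map`).

```typescript
// Illustrative only: SessionRegistry, tryStart, finish, and abort are
// hypothetical names, not identifiers from the DroidClaw server.
interface ActiveSession {
  goal: string;
  abort: AbortController;
}

class SessionRegistry {
  private readonly sessions = new Map<string, ActiveSession>();

  /** Register a goal; returns null when the device already has one running. */
  tryStart(deviceId: string, goal: string): ActiveSession | null {
    if (this.sessions.has(deviceId)) return null;
    const session: ActiveSession = { goal, abort: new AbortController() };
    this.sessions.set(deviceId, session);
    return session;
  }

  /** Clear the entry on completion or failure, but only if it is still ours. */
  finish(deviceId: string, session: ActiveSession): void {
    if (this.sessions.get(deviceId) === session) {
      this.sessions.delete(deviceId);
    }
  }

  /** Abort and drop whatever is running, e.g. when the socket closes. */
  abort(deviceId: string): boolean {
    const session = this.sessions.get(deviceId);
    if (!session) return false;
    session.abort.abort();
    this.sessions.delete(deviceId);
    return true;
  }
}
```

The identity check in `finish` is a design choice of this sketch; the handler in the diff deletes by `deviceId` alone.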

server/src/ws/voice.ts Normal file

@@ -0,0 +1,233 @@
import type { ServerWebSocket } from "bun";
import type { WebSocketData } from "./sessions.js";
// ── Types ────────────────────────────────────────────────
interface VoiceSession {
chunks: Buffer[];
totalBytes: number;
partialTimer: ReturnType<typeof setInterval> | null;
lastPartialOffset: number;
}
// ── State ────────────────────────────────────────────────
const activeSessions = new Map<string, VoiceSession>();
// ── Audio constants ──────────────────────────────────────
const SAMPLE_RATE = 16_000;
const CHANNELS = 1;
const BITS_PER_SAMPLE = 16;
const PARTIAL_INTERVAL_MS = 2_000;
/** Minimum bytes before attempting first transcription (100ms of 16kHz mono 16-bit) */
const MIN_AUDIO_BYTES = 3_200;
// ── Exported handlers ────────────────────────────────────
/**
* Start a voice session for a device. Creates a buffer and starts a
* periodic timer that sends accumulated audio to Groq Whisper for
* partial transcripts every ~2s.
*/
export function handleVoiceStart(
ws: ServerWebSocket<WebSocketData>,
deviceId: string,
groqApiKey: string
): void {
// Clean up any existing session for this device
cleanupSession(deviceId);
const session: VoiceSession = {
chunks: [],
totalBytes: 0,
partialTimer: null,
lastPartialOffset: 0,
};
activeSessions.set(deviceId, session);
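// Start periodic partial transcription
let partialInFlight = false;
session.partialTimer = setInterval(async () => {
// Skip this tick if the previous request has not finished yet
if (partialInFlight) return;
// Only transcribe if there's new audio since last partial
if (session.totalBytes <= session.lastPartialOffset) return;
if (session.totalBytes < MIN_AUDIO_BYTES) return;
// Snapshot the offset now: chunks that arrive during the request are not in `pcm`
const offset = session.totalBytes;
partialInFlight = true;
try {
const pcm = concatChunks(session.chunks);
const text = await transcribeAudio(pcm, groqApiKey);
session.lastPartialOffset = offset;
// Drop the result if the session was finalized or cancelled in the meantime
if (text && activeSessions.get(deviceId) === session) {
sendToDevice(ws, { type: "transcript_partial", text });
}
} catch (err) {
console.error(`[Voice] Partial transcription failed for ${deviceId}:`, err);
} finally {
partialInFlight = false;
}
}, PARTIAL_INTERVAL_MS);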
}
/**
* Append a base64-encoded PCM audio chunk to the session buffer.
*/
export function handleVoiceChunk(deviceId: string, base64Data: string): void {
const session = activeSessions.get(deviceId);
if (!session) {
console.warn(`[Voice] Chunk received for unknown session: ${deviceId}`);
return;
}
const decoded = Buffer.from(base64Data, "base64");
session.chunks.push(decoded);
session.totalBytes += decoded.length;
}
/**
* Stop the partial timer, send the complete audio to Groq for a final
* transcript, relay it to the device, clean up, and return the text.
*/
export async function handleVoiceSend(
ws: ServerWebSocket<WebSocketData>,
deviceId: string,
groqApiKey: string
): Promise<string> {
const session = activeSessions.get(deviceId);
if (!session) {
console.warn(`[Voice] Send requested for unknown session: ${deviceId}`);
return "";
}
// Stop partial timer
if (session.partialTimer !== null) {
clearInterval(session.partialTimer);
session.partialTimer = null;
}
let transcript = "";
if (session.totalBytes >= MIN_AUDIO_BYTES) {
try {
const pcm = concatChunks(session.chunks);
transcript = await transcribeAudio(pcm, groqApiKey);
} catch (err) {
console.error(`[Voice] Final transcription failed for ${deviceId}:`, err);
}
}
sendToDevice(ws, { type: "transcript_final", text: transcript });
// Clean up session
activeSessions.delete(deviceId);
return transcript;
}
/**
* Cancel a voice session: stop the timer and discard all audio.
*/
export function handleVoiceCancel(deviceId: string): void {
cleanupSession(deviceId);
}
// ── Internal helpers ─────────────────────────────────────
/**
* Concatenate all buffered chunks into a single Buffer.
*/
function concatChunks(chunks: Buffer[]): Buffer {
return Buffer.concat(chunks);
}
/**
* Clean up and remove a voice session.
*/
function cleanupSession(deviceId: string): void {
const session = activeSessions.get(deviceId);
if (!session) return;
if (session.partialTimer !== null) {
clearInterval(session.partialTimer);
}
activeSessions.delete(deviceId);
}
/**
* Wrap raw PCM data in a WAV container and send it to Groq's
* Whisper API for transcription. Returns the transcribed text.
*/
async function transcribeAudio(pcmBuffer: Buffer, apiKey: string): Promise<string> {
const wav = pcmToWav(pcmBuffer, SAMPLE_RATE, CHANNELS, BITS_PER_SAMPLE);
const formData = new FormData();
const wavBytes = new Uint8Array(wav.buffer, wav.byteOffset, wav.byteLength) as BlobPart;
formData.append("file", new Blob([wavBytes], { type: "audio/wav" }), "audio.wav");
formData.append("model", "whisper-large-v3");
const response = await fetch("https://api.groq.com/openai/v1/audio/transcriptions", {
method: "POST",
headers: {
Authorization: `Bearer ${apiKey}`,
},
body: formData,
});
if (!response.ok) {
const body = await response.text();
throw new Error(`Groq Whisper API error ${response.status}: ${body}`);
}
const result = (await response.json()) as { text: string };
return result.text ?? "";
}
/**
* Create a 44-byte WAV header + PCM data buffer.
*/
function pcmToWav(
pcm: Buffer,
sampleRate: number,
channels: number,
bitsPerSample: number
): Buffer {
const byteRate = (sampleRate * channels * bitsPerSample) / 8;
const blockAlign = (channels * bitsPerSample) / 8;
const dataSize = pcm.length;
const headerSize = 44;
const buffer = Buffer.alloc(headerSize + dataSize);
// RIFF header
buffer.write("RIFF", 0);
buffer.writeUInt32LE(36 + dataSize, 4); // file size - 8
buffer.write("WAVE", 8);
// fmt subchunk
buffer.write("fmt ", 12);
buffer.writeUInt32LE(16, 16); // subchunk1 size (PCM = 16)
buffer.writeUInt16LE(1, 20); // audio format (PCM = 1)
buffer.writeUInt16LE(channels, 22);
buffer.writeUInt32LE(sampleRate, 24);
buffer.writeUInt32LE(byteRate, 28);
buffer.writeUInt16LE(blockAlign, 32);
buffer.writeUInt16LE(bitsPerSample, 34);
// data subchunk
buffer.write("data", 36);
buffer.writeUInt32LE(dataSize, 40);
// PCM data
pcm.copy(buffer, headerSize);
return buffer;
}
/**
* Send a JSON message to a device WebSocket (safe — catches send errors).
*/
function sendToDevice(ws: ServerWebSocket<WebSocketData>, msg: Record<string, unknown>): void {
try {
ws.send(JSON.stringify(msg));
} catch {
// device disconnected
}
}
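
The audio constants and `pcmToWav` above imply some fixed arithmetic: 16 kHz mono 16-bit PCM is 32,000 bytes per second, so `MIN_AUDIO_BYTES = 3_200` corresponds to 100 ms of audio. The sketch below reproduces the same 44-byte header layout using `DataView` instead of `Buffer`, so it does not depend on Bun or Node. `pcmBytesFor` and `buildWavHeader` are hypothetical helper names, not functions from this file.

```typescript
// Illustrative only: pcmBytesFor and buildWavHeader are hypothetical helpers.

/** Number of PCM bytes needed to hold `durationMs` of audio. */
function pcmBytesFor(
  durationMs: number,
  sampleRate: number,
  channels: number,
  bitsPerSample: number
): number {
  const bytesPerSecond = (sampleRate * channels * bitsPerSample) / 8;
  return Math.floor((bytesPerSecond * durationMs) / 1000);
}

/** Build the 44-byte RIFF/WAVE header for `dataSize` bytes of PCM. */
function buildWavHeader(
  dataSize: number,
  sampleRate: number,
  channels: number,
  bitsPerSample: number
): Uint8Array {
  const header = new Uint8Array(44);
  const view = new DataView(header.buffer);
  const writeAscii = (offset: number, text: string): void => {
    for (let i = 0; i < text.length; i++) header[offset + i] = text.charCodeAt(i);
  };

  writeAscii(0, "RIFF");
  view.setUint32(4, 36 + dataSize, true); // file size minus the first 8 bytes
  writeAscii(8, "WAVE");
  writeAscii(12, "fmt ");
  view.setUint32(16, 16, true); // fmt subchunk size for PCM
  view.setUint16(20, 1, true); // audio format 1 = PCM
  view.setUint16(22, channels, true);
  view.setUint32(24, sampleRate, true);
  view.setUint32(28, (sampleRate * channels * bitsPerSample) / 8, true); // byte rate
  view.setUint16(32, (channels * bitsPerSample) / 8, true); // block align
  view.setUint16(34, bitsPerSample, true);
  writeAscii(36, "data");
  view.setUint32(40, dataSize, true);
  return header;
}

// 100 ms at 16 kHz, mono, 16-bit: 3200 bytes of PCM.
const minAudioBytes = pcmBytesFor(100, 16_000, 1, 16);
const wavHeader = buildWavHeader(minAudioBytes, 16_000, 1, 16);
```

All multi-byte fields are little-endian, which matches the `writeUInt32LE` and `writeUInt16LE` calls in `pcmToWav`.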


@@ -422,15 +422,18 @@
<!-- ─── hero ─── -->
<section class="hero">
<div class="wrap">
<div class="badge"><iconify-icon icon="ph:flask-duotone" width="14" height="14" style="color:var(--amber)"></iconify-icon> experimental</div>
<a href="https://app.droidclaw.ai" class="badge" style="text-decoration:none;cursor:pointer;transition:border-color .15s;"><iconify-icon icon="ph:rocket-launch-duotone" width="14" height="14" style="color:var(--green)"></iconify-icon> now live &mdash; sign up &amp; start controlling your device <iconify-icon icon="ph:arrow-right" width="12" height="12" style="color:var(--text-muted)"></iconify-icon></a>
<h1>turn old phones into<br><span class="glow">ai agents</span></h1>
<p class="subtitle">
give it a goal in plain english. it reads the screen, thinks about what to do,
taps and types via adb, and repeats until the job is done.
</p>
<div class="hero-actions">
<a href="#getting-started" class="btn-primary">
<iconify-icon icon="ph:rocket-launch-duotone" width="16" height="16"></iconify-icon> get started
<a href="https://github.com/unitedbyai/droidclaw/releases/download/v0.3.1/app-debug.apk" class="btn-primary">
<iconify-icon icon="ph:android-logo-duotone" width="16" height="16"></iconify-icon> download apk
</a>
<a href="https://app.droidclaw.ai" class="btn-secondary">
<iconify-icon icon="ph:cloud-duotone" width="16" height="16"></iconify-icon> open dashboard
</a>
<a href="https://github.com/unitedbyai/droidclaw" class="btn-secondary">
<iconify-icon icon="ph:github-logo-duotone" width="16" height="16"></iconify-icon> view source
@@ -816,13 +819,23 @@ GROQ_API_KEY=gsk_your_key_here
</div>
<div class="stepper-step">
<span class="stepper-num">3</span>
<h3>install the android app</h3>
<p>download and install the companion app on your android device.</p>
<div style="margin-top: 12px;">
<a href="https://github.com/unitedbyai/droidclaw/releases/download/v0.3.1/app-debug.apk" class="btn-primary" style="display: inline-flex;">
<iconify-icon icon="ph:android-logo-duotone" width="16" height="16"></iconify-icon> download apk (v0.3.1)
</a>
</div>
</div>
<div class="stepper-step">
<span class="stepper-num">4</span>
<h3>connect your phone</h3>
<p>enable usb debugging in developer options, plug in via usb.</p>
<pre>adb devices # should show your device
cd droidclaw && bun run src/kernel.ts</pre>
</div>
<div class="stepper-step">
<span class="stepper-num">4</span>
<span class="stepper-num">5</span>
<h3>tune (optional)</h3>
<table>
<thead><tr><th>key</th><th>default</th><th>what</th></tr></thead>


@@ -1,69 +1,44 @@
import { form, query, getRequestEvent } from '$app/server';
import { form, command, getRequestEvent } from '$app/server';
import { redirect } from '@sveltejs/kit';
import { db } from '$lib/server/db';
import { user } from '$lib/server/db/schema';
import { eq } from 'drizzle-orm';
import { env } from '$env/dynamic/private';
import * as v from 'valibot';
import { activateLicenseSchema, activateCheckoutSchema } from '$lib/schema/license';
export const getLicenseStatus = query(async () => {
/** Forward a request to the DroidClaw server with internal auth */
async function serverFetch(path: string, body: Record<string, unknown>) {
const { locals } = getRequestEvent();
if (!locals.user) return null;
if (!locals.user) throw new Error('Not authenticated');
const rows = await db
.select({ plan: user.plan, polarLicenseKey: user.polarLicenseKey })
.from(user)
.where(eq(user.id, locals.user.id))
.limit(1);
const serverUrl = env.SERVER_URL || 'http://localhost:8080';
const internalSecret = env.INTERNAL_SECRET || '';
const row = rows[0];
return {
activated: !!row?.plan,
plan: row?.plan ?? null,
licenseKey: row?.polarLicenseKey ?? null
};
});
const res = await fetch(`${serverUrl}${path}`, {
method: 'POST',
headers: {
'Content-Type': 'application/json',
'x-internal-secret': internalSecret,
'x-internal-user-id': locals.user.id
},
body: JSON.stringify(body)
});
const data = await res.json().catch(() => ({ error: 'Unknown error' }));
if (!res.ok) throw new Error(data.error ?? `Error ${res.status}`);
return data;
}
export const activateLicense = form(activateLicenseSchema, async (data) => {
const { locals } = getRequestEvent();
if (!locals.user) return;
const serverUrl = env.SERVER_URL || 'http://localhost:8080';
const internalSecret = env.INTERNAL_SECRET || '';
const res = await fetch(`${serverUrl}/license/activate`, {
method: 'POST',
headers: {
'Content-Type': 'application/json',
'x-internal-secret': internalSecret,
'x-internal-user-id': locals.user.id
},
body: JSON.stringify({ key: data.key })
});
if (res.ok) {
await serverFetch('/license/activate', { key: data.key });
redirect(303, '/dashboard');
}
});
export const activateFromCheckout = form(activateCheckoutSchema, async (data) => {
const { locals } = getRequestEvent();
if (!locals.user) return;
const serverUrl = env.SERVER_URL || 'http://localhost:8080';
const internalSecret = env.INTERNAL_SECRET || '';
const res = await fetch(`${serverUrl}/license/activate-checkout`, {
method: 'POST',
headers: {
'Content-Type': 'application/json',
'x-internal-secret': internalSecret,
'x-internal-user-id': locals.user.id
},
body: JSON.stringify({ checkoutId: data.checkoutId })
});
if (res.ok) {
redirect(303, '/dashboard');
export const activateFromCheckout = command(
v.object({ checkoutId: v.string() }),
async ({ checkoutId }) => {
const result = await serverFetch('/license/activate-checkout', { checkoutId });
return result;
}
});
);
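
The `serverFetch` helper introduced above does two things: it attaches the internal auth headers to a JSON POST, and it turns a non-OK response into a thrown `Error` carrying the server's `error` field. The sketch below splits those two steps into pure functions so each can be checked without a network call. `buildInternalRequest`, `unwrapResponse`, and `ServerPayload` are hypothetical names, not part of this change.

```typescript
// Illustrative only: these helpers are hypothetical and mirror the shape of
// serverFetch without calling fetch or reading SvelteKit's request event.
interface InternalRequest {
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

type ServerPayload = { error?: string; [key: string]: unknown };

/** Build the request init that serverFetch passes to fetch. */
function buildInternalRequest(
  userId: string,
  secret: string,
  body: Record<string, unknown>
): InternalRequest {
  return {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      "x-internal-secret": secret,
      "x-internal-user-id": userId,
    },
    body: JSON.stringify(body),
  };
}

/** Return the payload on success; throw with the server's message otherwise. */
function unwrapResponse(ok: boolean, status: number, data: ServerPayload): ServerPayload {
  if (!ok) throw new Error(data.error ?? `Error ${status}`);
  return data;
}
```

Because failures surface as exceptions, callers such as `activateLicense` no longer need an `if (res.ok)` branch: reaching the next statement already means the call succeeded.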


@@ -1,7 +1,7 @@
import { env } from '$env/dynamic/private';
import { UseSend } from 'usesend-js';
const EMAIL_FROM = 'noreply@app.droidclaw.ai';
const EMAIL_FROM = 'DroidClaw <noreply@app.droidclaw.ai>';
function getClient() {
if (!env.USESEND_API_KEY) throw new Error('USESEND_API_KEY is not set');


@@ -1,17 +1,39 @@
<script lang="ts">
import { activateLicense, activateFromCheckout } from '$lib/api/license.remote';
import { goto } from '$app/navigation';
import { page } from '$app/state';
import { onMount } from 'svelte';
import Icon from '@iconify/svelte';
import { LICENSE_ACTIVATE_CHECKOUT, LICENSE_ACTIVATE_MANUAL, LICENSE_PURCHASE_CLICK } from '$lib/analytics/events';
const checkoutId = page.url.searchParams.get('checkout_id');
let showKeyInput = $state(false);
let checkoutStatus = $state<'activating' | 'error' | 'idle'>('idle');
let checkoutError = $state('');
async function activateCheckout() {
if (!checkoutId) return;
checkoutStatus = 'activating';
checkoutError = '';
try {
await activateFromCheckout({ checkoutId });
goto('/dashboard');
} catch (e: any) {
checkoutError = e.message ?? 'Failed to activate from checkout';
checkoutStatus = 'error';
}
}
onMount(() => {
if (checkoutId) activateCheckout();
});
</script>
{#if checkoutId}
<!-- Auto-activate from Polar checkout -->
<div class="mx-auto max-w-md pt-20">
{#if checkoutStatus === 'activating'}
<div class="mb-8 text-center">
<div class="mx-auto mb-4 flex h-12 w-12 items-center justify-center rounded-xl bg-neutral-100">
<Icon icon="ph:spinner-duotone" class="h-6 w-6 animate-spin text-neutral-600" />
@@ -21,18 +43,24 @@
We're setting up your account. This will only take a moment.
</p>
</div>
{:else if checkoutStatus === 'error'}
<div class="mb-8 text-center">
<div class="mx-auto mb-4 flex h-12 w-12 items-center justify-center rounded-xl bg-red-50">
<Icon icon="ph:warning-duotone" class="h-6 w-6 text-red-500" />
</div>
<h2 class="text-2xl font-bold">Activation failed</h2>
<p class="mt-1 text-neutral-500">{checkoutError}</p>
</div>
<form {...activateFromCheckout} class="space-y-4">
<input type="hidden" {...activateFromCheckout.fields.checkoutId.as('text')} value={checkoutId} />
<button
type="submit"
onclick={activateCheckout}
data-umami-event={LICENSE_ACTIVATE_CHECKOUT}
class="flex w-full items-center justify-center gap-2 rounded-xl bg-neutral-900 px-4 py-2.5 font-medium text-white hover:bg-neutral-800"
>
<Icon icon="ph:seal-check-duotone" class="h-4 w-4" />
Activate Now
<Icon icon="ph:arrow-clockwise-duotone" class="h-4 w-4" />
Retry
</button>
</form>
{/if}
<div class="mt-6 rounded-xl border border-neutral-200 bg-neutral-50 p-4">
<p class="text-center text-sm text-neutral-500">
@@ -84,7 +112,7 @@
</div>
<a
href="https://sandbox-api.polar.sh/v1/checkout-links/polar_cl_5pGavRIJJhM8ge6p0UaeaadT2bCiqL04CYXgW3bwVac/redirect"
href="https://buy.polar.sh/polar_cl_jCKKpL4dSdvLZr9H6JYeeCiXjTH98Rf9b4kKM2VqvG2"
data-umami-event={LICENSE_PURCHASE_CLICK}
class="flex w-full items-center justify-center gap-2 rounded-xl bg-neutral-900 px-4 py-3 text-base font-medium text-white hover:bg-neutral-800"
>
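
The script block of this page drives checkout activation through three statuses (`idle`, `activating`, `error`), with a redirect to `/dashboard` on success and a Retry button that re-enters `activating`. A minimal sketch of those transitions as a pure function follows. `CheckoutState`, `CheckoutEvent`, `nextCheckoutState`, and the extra `done` status are assumptions made for illustration; the page itself keeps this state in two `$state` variables.

```typescript
// Illustrative only: these types and the "done" status are hypothetical.
type CheckoutState =
  | { status: "idle" }
  | { status: "activating" }
  | { status: "error"; message: string }
  | { status: "done" };

type CheckoutEvent =
  | { type: "start" }
  | { type: "succeeded" }
  | { type: "failed"; message?: string };

function nextCheckoutState(state: CheckoutState, event: CheckoutEvent): CheckoutState {
  switch (event.type) {
    case "start":
      // The first attempt starts from idle; Retry starts from error.
      return state.status === "idle" || state.status === "error"
        ? { status: "activating" }
        : state;
    case "succeeded":
      return state.status === "activating" ? { status: "done" } : state;
    case "failed":
      return state.status === "activating"
        ? { status: "error", message: event.message ?? "Failed to activate from checkout" }
        : state;
  }
}
```

Ignoring `start` while already `activating` is a choice of this sketch; it would prevent a double submit if the button were clicked during an in-flight request.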


@@ -103,14 +103,23 @@
<p class="mt-1 text-sm text-neutral-400">
Install the Android app, paste your API key, and your device will appear here.
</p>
<div class="mt-4 flex flex-col items-center gap-3">
<a
href="https://github.com/unitedbyai/droidclaw/releases/download/v0.3.1/app-debug.apk"
class="inline-flex items-center gap-1.5 rounded-lg bg-neutral-900 px-4 py-2 text-sm font-medium text-white hover:bg-neutral-800"
>
<Icon icon="ph:android-logo-duotone" class="h-4 w-4" />
Download APK
</a>
<a
href="/dashboard/api-keys"
class="mt-4 inline-flex items-center gap-1.5 text-sm font-medium text-neutral-700 hover:text-neutral-900"
class="inline-flex items-center gap-1.5 text-sm font-medium text-neutral-500 hover:text-neutral-700"
>
<Icon icon="ph:key-duotone" class="h-4 w-4" />
Create an API key
</a>
</div>
</div>
{:else}
<div class="grid grid-cols-2 gap-6 sm:grid-cols-3 lg:grid-cols-4 xl:grid-cols-5">
{#each devices as d (d.deviceId)}