Compare commits

6 Commits

Author SHA1 Message Date
Roberto
bff26c07db chore: bump submodules — post-review fixes
Fixes from final code review:
- Cloud scout CRUD URLs corrected (/api/v1/scouts/cloud)
- Note summarization URL corrected (/api/v1/scouts/notes/summarize)
- GmailConnector.fetch_content now uses single-message Gmail get instead of bulk fetch
- scout_proposal ack now sent only after successful local persist

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-16 05:40:22 +02:00
Roberto
f5fc25ebce chore: bump submodules — Phase 3 Gmail scout end-to-end
Phase 3 ships:
- GmailConnector implementation (list_new, fetch_metadata, fetch_content, archive, setup_watch, renew_watch)
- Connector registration at app startup
- Real triage LLM via scout-triage-system Langfuse prompt
- Pub/Sub webhook with JWT verification (dev-mode skip when GMAIL_PUBSUB_AUDIENCE empty)
- Cron-fallback poll + Gmail watch renewal in APScheduler lifespan
- Settings UI: Connect Gmail OAuth flow with separate gmail.readonly+modify scopes
- Deep-link callback handler adiuvai://scout/oauth/gmail/callback
- i18n keys scouts.connectGmail + toast.scout.gmailConnected in all 5 languages

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-16 04:55:48 +02:00
Roberto
6eadd61f62 chore: bump submodules — Phase 2 connector skeleton
Phase 2 ships:
- scout_triage_queue Postgres table + cloud_scout_configs gmail fields
- ScoutTriageQueue SQLAlchemy model
- SourceConnector Protocol + connector registry
- ScoutEngine: trigger_scout, _process_item (stub _triage_llm), deliver_pending, ack_proposal
- WS frame contract: scout_proposal + scout_proposal_ack
- Electron scout_suggestions SQLite table
- Electron scout-suggestion-handler

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-16 03:48:54 +02:00
Roberto
596e5551f8 chore: bump submodules — Phase 1 scouts rename complete
api: agent_ids → scout_ids in device_hello WS frame + tests
adiuvAI: CloudAgentConfig → CloudScoutConfig, agentIds → scoutIds
.claude/CLAUDE.md: update all scout-subsystem doc references

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-16 01:50:48 +02:00
Roberto
445a4cbbf9 docs: add scouts refactor + gmail scout implementation plan
Covers Phases 1-3 (rename, connector skeleton, Gmail end-to-end) as
28 TDD tasks. Phase 4 (Stage 2 categorization + brief HITL) deferred
to separate spec.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-15 23:32:40 +02:00
Roberto
732a4c42f8 docs: add scouts refactor + gmail scout design spec
Phases 1-3 in scope: rename agents → scouts (UI/code/Postgres/SQLite/
Langfuse), Gmail cloud scout w/ two-stage pipeline, SourceConnector
abstraction. Phase 4 (Stage 2 categorization + HITL surface in brief)
deferred to task-brief rework.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-15 23:15:46 +02:00
5 changed files with 3423 additions and 16 deletions


@@ -76,7 +76,7 @@ Main Process (Node.js)
**Local-first storage, cloud AI.** All user data (clients, projects, tasks, notes, timelines) in local SQLite. AI lives entirely on the FastAPI backend — Electron orchestrator is a thin delegation shell that forwards to `/api/v1/device` WS and dispatches v3 typed stream frames + tool-call ↔ DrizzleExecutor round-trips back to renderer. **Local-first storage, cloud AI.** All user data (clients, projects, tasks, notes, timelines) in local SQLite. AI lives entirely on the FastAPI backend — Electron orchestrator is a thin delegation shell that forwards to `/api/v1/device` WS and dispatches v3 typed stream frames + tool-call ↔ DrizzleExecutor round-trips back to renderer.
**IPC channels**: **IPC channels**:
- `'trpc'` — bidirectional tRPC request/response (all CRUD + auth + agent + memory proxy) - `'trpc'` — bidirectional tRPC request/response (all CRUD + auth + scout + memory proxy)
- `'ai:stream'` — one-way v3 stream frames main → renderer - `'ai:stream'` — one-way v3 stream frames main → renderer
- `'ai:action'` — AI side-effects (e.g. agent auto-creates task) - `'ai:action'` — AI side-effects (e.g. agent auto-creates task)
@@ -85,19 +85,19 @@ Main Process (Node.js)
- `ipc.ts` — Custom tRPC↔IPC bridge - `ipc.ts` — Custom tRPC↔IPC bridge
- `store.ts` — electron-store for `FormatPrefs` + `uiLanguage`; exports `getUiLanguage()` - `store.ts` — electron-store for `FormatPrefs` + `uiLanguage`; exports `getUiLanguage()`
- `router/index.ts` — All tRPC sub-routers (~1627 LOC) - `router/index.ts` — All tRPC sub-routers (~1627 LOC)
- `db/schema.ts` — 10 tables: clients, projects, tasks, timelineEvents, timelineEventDependencies, notes, noteEdits, taskComments, agentRuns, agentRunActions - `db/schema.ts` — 10 tables: clients, projects, tasks, timelineEvents, timelineEventDependencies, notes, noteEdits, taskComments, scoutRuns, scoutRunActions
- `db/index.ts` — Drizzle + better-sqlite3 (WAL), singleton `getDb()`, `initDb()` migrations - `db/index.ts` — Drizzle + better-sqlite3 (WAL), singleton `getDb()`, `initDb()` migrations
- `db/notes-backfill.ts` — Startup backfill: generates `aiSummary` for notes with null summary - `db/notes-backfill.ts` — Startup backfill: generates `aiSummary` for notes with null summary
- `ai/orchestrator.ts` — Thin backend-delegation layer (~304 LOC). Connectivity/auth guard → `BackendClient.sendHomeRequest()` / `sendFloatingRequest()` → forwards v3 stream frames to renderer. Also schedules daily-brief regeneration. - `ai/orchestrator.ts` — Thin backend-delegation layer (~304 LOC). Connectivity/auth guard → `BackendClient.sendHomeRequest()` / `sendFloatingRequest()` → forwards v3 stream frames to renderer. Also schedules daily-brief regeneration.
- `ai/token.ts` — Two-tier token storage (safeStorage + electron-store fallback) - `ai/token.ts` — Two-tier token storage (safeStorage + electron-store fallback)
- `agents/agent-scheduler.ts` — Local agent scheduling (filesystem agents) - `scouts/scout-scheduler.ts` — Local scout scheduling (filesystem scouts)
- `api/backend-client.ts` — WS client to FastAPI: handles tool-call round-trips, v3 stream frame dispatch, journey + agent proxies - `api/backend-client.ts` — WS client to FastAPI: handles tool-call round-trips, v3 stream frame dispatch, journey + scout proxies
- `api/drizzle-executor.ts` — Executes backend-issued tool calls against local SQLite. Wraps results through `formatRow()`/`formatRows()` using user FormatPrefs - `api/drizzle-executor.ts` — Executes backend-issued tool calls against local SQLite. Wraps results through `formatRow()`/`formatRows()` using user FormatPrefs
- `auth/auth-manager.ts` — Login, register, logout, OAuth flow (singleton) - `auth/auth-manager.ts` — Login, register, logout, OAuth flow (singleton)
- `auth/backup-key.ts` — Device-specific AES-256 backup key (safeStorage, not password-derived) - `auth/backup-key.ts` — Device-specific AES-256 backup key (safeStorage, not password-derived)
- `auth/locale-defaults.ts` — Detects timezone, date/time format, language from OS locale - `auth/locale-defaults.ts` — Detects timezone, date/time format, language from OS locale
**tRPC routers** (in `appRouter`): `health`, `settings`, `clients`, `projects`, `tasks`, `timelineEvents`, `timelineEventDependencies`, `notes`, `noteEdits`, `taskComments`, `ai`, `auth`, `agent` (with `local` / `cloud` / `journey` sub-routers), `memory`. **tRPC routers** (in `appRouter`): `health`, `settings`, `clients`, `projects`, `tasks`, `timelineEvents`, `timelineEventDependencies`, `notes`, `noteEdits`, `taskComments`, `ai`, `auth`, `scout` (with `local` / `cloud` / `journey` sub-routers), `memory`.
**Renderer** (`src/renderer/`): file-based routing via TanStack Router (`routeTree.gen.ts` auto-generated). shadcn/ui new-york theme, neutral colors. Path alias `@/*` → `src/renderer/*`. Notes editor: Milkdown (`@milkdown/crepe`). **Renderer** (`src/renderer/`): file-based routing via TanStack Router (`routeTree.gen.ts` auto-generated). shadcn/ui new-york theme, neutral colors. Path alias `@/*` → `src/renderer/*`. Notes editor: Milkdown (`@milkdown/crepe`).
@@ -107,16 +107,16 @@ Main Process (Node.js)
- `forge.config.ts` has cross-compilation hooks (downloads platform-specific native binaries for better-sqlite3) - `forge.config.ts` has cross-compilation hooks (downloads platform-specific native binaries for better-sqlite3)
- DB has no foreign key constraints — cascade deletes in tRPC procedures - DB has no foreign key constraints — cascade deletes in tRPC procedures
- Timestamps are milliseconds (`Date.getTime()`), not ISO strings - Timestamps are milliseconds (`Date.getTime()`), not ISO strings
- Notes use `aiSummary` (≤250 char, backend `gpt-4o-mini` via `POST /api/v1/agents/notes/summarize`) for AI navigation — LanceDB fully removed - Notes use `aiSummary` (≤250 char, backend `gpt-4o-mini` via `POST /api/v1/scouts/notes/summarize`) for AI navigation — LanceDB fully removed
- AI note edits go through `noteEdits` HITL table (`type: append|insert|replace`, `status: pending|approved|rejected`); backend tool `propose_note_edit` → drizzle-executor inserts row; user approves/rejects in UI; auto-reject on missing anchor - AI note edits go through `noteEdits` HITL table (`type: append|insert|replace`, `status: pending|approved|rejected`); backend tool `propose_note_edit` → drizzle-executor inserts row; user approves/rejects in UI; auto-reject on missing anchor
- `checkpoints` table replaced by `timelineEvents` + `timelineEventDependencies` (events are typed `milestone|checkpoint|activity`, with optional dep edges) - `checkpoints` table replaced by `timelineEvents` + `timelineEventDependencies` (events are typed `milestone|checkpoint|activity`, with optional dep edges)
- `agentRuns` + `agentRunActions` populated by backend-client on tool_call/run_complete frames; UI reads via `agent.runs` / `agent.runActions` - `scoutRuns` + `scoutRunActions` populated by backend-client on tool_call/run_complete frames; UI reads via `scout.runs` / `scout.runActions`
**Settings Page (shared Electron + Web)**: **Settings Page (shared Electron + Web)**:
- Settings page runs in **both** Electron and standalone web SPA. Same React components — no duplication. - Settings page runs in **both** Electron and standalone web SPA. Same React components — no duplication.
- **Platform Adapter**: `PlatformProvider` context (`src/renderer/lib/platform.tsx`) exposes `isElectron`/`isWeb`/`hasLocalAgents`/`hasFileDialog`. Components use `usePlatform()` to gate Electron-only features. - **Platform Adapter**: `PlatformProvider` context (`src/renderer/lib/platform.tsx`) exposes `isElectron`/`isWeb`/`hasLocalAgents`/`hasFileDialog`. Components use `usePlatform()` to gate Electron-only features.
- **Web build**: `vite.web.config.mts` → `dist-web/`. Entry: `web.html` → `src/renderer/web-main.tsx` (uses `httpBatchLink` via `lib/httpLink.ts` instead of `ipcLink`). - **Web build**: `vite.web.config.mts` → `dist-web/`. Entry: `web.html` → `src/renderer/web-main.tsx` (uses `httpBatchLink` via `lib/httpLink.ts` instead of `ipcLink`).
- **Electron-only gating**: Device ID card and local agent filesystem gated behind `platform.isElectron`. On web: visible but disabled, not hidden. - **Electron-only gating**: Device ID card and local scout filesystem gated behind `platform.isElectron`. On web: visible but disabled, not hidden.
- **Gotcha**: Do NOT add Electron-specific settings (server URL, native file pickers) without wrapping in `platform.isElectron`. Same component tree renders on web. - **Gotcha**: Do NOT add Electron-specific settings (server URL, native file pickers) without wrapping in `platform.isElectron`. Same component tree renders on web.
**Onboarding Wizard**: **Onboarding Wizard**:
@@ -134,7 +134,7 @@ Main Process (Node.js)
**i18n (Internationalization)**: **i18n (Internationalization)**:
- `i18next` + `react-i18next` with bundled JSON translations (no lazy loading). - `i18next` + `react-i18next` with bundled JSON translations (no lazy loading).
- Config in `src/renderer/i18n.ts`. 5 languages: EN, IT, ES, FR, DE. `SUPPORTED_LANGUAGES` exported for UI selectors. - Config in `src/renderer/i18n.ts`. 5 languages: EN, IT, ES, FR, DE. `SUPPORTED_LANGUAGES` exported for UI selectors.
- Translation files: `src/renderer/locales/{en,it,es,fr,de}/translation.json`. Namespaces: `nav`, `auth`, `tasks`, `settings`, `common`, `errors`, `home`, `timeline`, `projects`, `agents`. - Translation files: `src/renderer/locales/{en,it,es,fr,de}/translation.json`. Namespaces: `nav`, `auth`, `tasks`, `settings`, `common`, `errors`, `home`, `timeline`, `projects`, `scouts`.
- **`common.*` namespace** holds shared labels (`save`, `cancel`, `delete`, `edit`, `add`, `rename`, `saving`, `deleting`, `creating`, `renameDescription`, `deleteTitle`). Check `common.*` before adding new key. - **`common.*` namespace** holds shared labels (`save`, `cancel`, `delete`, `edit`, `add`, `rename`, `saving`, `deleting`, `creating`, `renameDescription`, `deleteTitle`). Check `common.*` before adding new key.
- Pluralization uses i18next `_one`/`_other` suffixes. - Pluralization uses i18next `_one`/`_other` suffixes.
- `LanguageSync` component in `src/renderer/index.tsx` reads persisted `uiLanguage` from electron-store via tRPC on startup, syncs to i18next. - `LanguageSync` component in `src/renderer/index.tsx` reads persisted `uiLanguage` from electron-store via tRPC on startup, syncs to i18next.
@@ -191,8 +191,8 @@ FastAPI app (app/main.py)
├── HTTP Routes (app/api/routes/) — all under /api/v1 ├── HTTP Routes (app/api/routes/) — all under /api/v1
│ ├── auth.py — register, login, refresh, profile, OAuth, onboarding, password │ ├── auth.py — register, login, refresh, profile, OAuth, onboarding, password
│ ├── chat.py — POST /chat, /chat/brief, /chat/embed │ ├── chat.py — POST /chat, /chat/brief, /chat/embed
│ ├── agents.py — catalog, can-create, trigger, notes/summarize │ ├── scouts.py — catalog, can-create, trigger, notes/summarize
│ ├── agent_setup.py — guided agent setup (journey) │ ├── scout_setup.py — guided scout setup (journey)
│ ├── billing.py — Stripe checkout, webhook, subscription, invoices │ ├── billing.py — Stripe checkout, webhook, subscription, invoices
│ ├── device_ws.py — WS /device (unified streaming endpoint: home, floating, brief, journey) │ ├── device_ws.py — WS /device (unified streaming endpoint: home, floating, brief, journey)
│ └── memory.py — core / relational / forget-all │ └── memory.py — core / relational / forget-all
@@ -225,9 +225,9 @@ FastAPI app (app/main.py)
└── Models (app/models.py) — SQLAlchemy 2.0 ORM └── Models (app/models.py) — SQLAlchemy 2.0 ORM
``` ```
**HTTP route prefix**: every router included with `prefix="/api/v1"`. So `/api/v1/auth/...`, `/api/v1/chat`, `/api/v1/agents/...`, `/api/v1/memory/...`, `/api/v1/device` (WS). **HTTP route prefix**: every router included with `prefix="/api/v1"`. So `/api/v1/auth/...`, `/api/v1/chat`, `/api/v1/scouts/...`, `/api/v1/memory/...`, `/api/v1/device` (WS).
**ORM models** (`app/models.py`): `User`, `RefreshToken`, `OAuthAccount`, `Subscription`, `LocalAgentConfig`, `CloudAgentConfig`, `AgentRunLog`, `MemoryCore`, `MemoryAssociative`, `MemoryEpisodic`, `MemoryProactive`, `ExtractionQueue`, `MemoryRelation`, `Plugin`. PostgreSQL (asyncpg + SQLAlchemy 2.0 async). Alembic migrations in `alembic/versions/`. **ORM models** (`app/models.py`): `User`, `RefreshToken`, `OAuthAccount`, `Subscription`, `LocalScoutConfig`, `CloudScoutConfig`, `ScoutRunLog`, `MemoryCore`, `MemoryAssociative`, `MemoryEpisodic`, `MemoryProactive`, `ExtractionQueue`, `MemoryRelation`, `Plugin`. PostgreSQL (asyncpg + SQLAlchemy 2.0 async). Alembic migrations in `alembic/versions/`.
**Lifespan crons** (only if `settings.SCHEDULER_ENABLED`): **Lifespan crons** (only if `settings.SCHEDULER_ENABLED`):
- `_memory_cron_tick` — hourly: drains Free-tier extraction queue + mines proactive patterns for Power+ users - `_memory_cron_tick` — hourly: drains Free-tier extraction queue + mines proactive patterns for Power+ users
@@ -235,7 +235,7 @@ FastAPI app (app/main.py)
**LLM routing**: backend agents own all intelligence. Tool calls describe client-side ops (JSON) → Electron `drizzle-executor` runs them against local SQLite → result returned to backend over WS. Tool loop cap inside agent runner prevents runaway iteration. **LLM routing**: backend agents own all intelligence. Tool calls describe client-side ops (JSON) → Electron `drizzle-executor` runs them against local SQLite → result returned to backend over WS. Tool loop cap inside agent runner prevents runaway iteration.
**Zero-trust data model**: backend never stores raw user content. PostgreSQL holds auth, billing, plugin metadata, encrypted memory (Core/Associative/Episodic/Proactive/Relational), agent configs, run logs. **Zero-trust data model**: backend never stores raw user content. PostgreSQL holds auth, billing, plugin metadata, encrypted memory (Core/Associative/Episodic/Proactive/Relational), scout configs, run logs.
**Config**: `app/config/settings.py` — all env vars via Pydantic Settings. Copy `.env.example` to `.env` for local dev. **Config**: `app/config/settings.py` — all env vars via Pydantic Settings. Copy `.env.example` to `.env` for local dev.

Submodule adiuvAI updated: c1b1b289c1...1a4cfb07a5

2
api

Submodule api updated: 70c19d3064...0833db239c


@@ -0,0 +1,327 @@
# Scouts Refactor + Gmail Integration — Design
**Date:** 2026-05-15
**Status:** Draft, awaiting user review
**Owner:** Roberto
## Summary
Rename the existing "Agents" subsystem to "Scouts" across the entire stack (UI, code, Postgres, SQLite, Langfuse), then add the first cloud scout — Gmail — using a two-stage pipeline that respects zero-trust (no email content stored on backend) and human-in-the-loop (no entities created autonomously).
The implementation is split into four phases. Phases 1-3 ship now. Phase 4 (Stage 2 categorization, HITL surface in the brief, conversion-to-entity mutations) is deferred to the planned task-brief rework.
## Goals
- Unify the user-facing "data source watchers" concept under one name: **Scout**.
- Land a `SourceConnector` abstraction so future cloud scouts (Slack/Teams/Outlook/RSS/...) reuse the same engine, queue, delivery channel, and HITL surface — only the per-source connector is new.
- Ship a Gmail scout end-to-end with: OAuth, push (`users.watch`) + cron-fallback polling, BE-side spam triage, encrypted token storage, opt-in spam auto-trash.
- Preserve zero-trust: Gmail bodies are fetched transiently for the triage LLM call and discarded; only `{message_id, scout_id, verdict, status}` is persisted on BE.
- Preserve HITL on the cloud path: scouts never create tasks/projects/events/notes autonomously; they accumulate proposals that the user resolves later from the brief.
## Non-Goals (Phase 4, separate spec)
- Stage 2 categorization agent prompt + tool palette.
- HITL UI in the task brief (suggestion cards, approve/reject controls, convert-to-entity mutations, `list_pending_scout_suggestions` brief tool).
- Local scout behavior change. Local directory monitor keeps current "auto-create" semantics. HITL is opt-in for local scouts in a future migration.
- Schema unification of `LocalScoutConfig` + `CloudScoutConfig`. They have different behaviors; keep separate tables.
- Connectors other than Gmail (Slack/Teams/Outlook).
- Stripe/billing changes (existing tier checks suffice).
## Constraints
- **Pre-1.0 dev**: no production users, no backwards-compatibility shims, no Alembic data migrations beyond rename. Drop-and-recreate is acceptable where simpler.
- **Zero-trust**: BE never persists user content. Gmail bodies are read transiently for the triage LLM call only.
- **HITL (cloud path)**: scouts produce proposals, never entities.
- **Spam auto-trash**: off by default per scout; opt-in via UI toggle. Action is "move to Trash" (Gmail's 30d recovery), never permanent delete.
- **Reusability**: cloud-scout pipeline (connector → triage → queue → deliver-on-connect → HITL) is shared infra; Gmail is just the first connector.
## Architecture
### Two-stage pipeline (cloud scouts only)
```
[Gmail] --push/cron--> [BE Stage 1: Triage]                [Electron Stage 2: Categorize]
                              |                                        |
                              v                                        v
                    fetch body (transient)                drain queue on WS reconnect
                              |                                        |
                              v                                        v
                    LLM relevance call                    fetch metadata for each msg
                              |                                        |
                              +-- spam + auto_trash_spam: archive      v
                              |                           insert scout_suggestions row
                              +-- relevant: insert queue row    (category='unprocessed' stub
                                                                 until Phase 4)
```
**Stage 1 (BE, always-on):** verdict only. Stores `{msg_id, verdict, status}`. No content.
**Stage 2 (Electron, on connect):** Phase 3 ships a stub that simply mirrors the queue into a local SQLite table with `category='unprocessed'`. Phase 4 swaps in the real categorization agent.
### Local scouts (unchanged behaviorally)
Local directory monitor keeps current Electron-side scheduling and auto-creation. Only renames apply.
### SourceConnector abstraction
A `SourceConnector` Protocol owns all source-specific I/O. The shared `ScoutEngine` owns triage, queueing, delivery, and ack handling. To add a new cloud scout: implement one connector class + register it.
```python
# app/scouts/connectors/base.py
from typing import Protocol


class SourceConnector(Protocol):
    source_type: str  # "gmail"

    async def list_new(self, scout: CloudScoutConfig) -> list[ItemRef]: ...
    async def fetch_metadata(self, scout: CloudScoutConfig, ref: ItemRef) -> ItemMetadata: ...
    async def fetch_content(self, scout: CloudScoutConfig, ref: ItemRef) -> ItemContent: ...
    async def archive(self, scout: CloudScoutConfig, ref: ItemRef) -> None: ...
    async def setup_watch(self, scout: CloudScoutConfig) -> None: ...
    async def renew_watch(self, scout: CloudScoutConfig) -> None: ...
```
`ItemContent.body_text` is in-memory only; never persisted.
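To make the "implement one connector class + register it" step concrete, here is a minimal sketch of a registry keyed by `source_type`. The names `register_connector`, `get_connector`, `FakeConnector`, and the shape of `ItemRef` are assumptions for illustration; the spec only states that a registry exists.

```python
import asyncio
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class ItemRef:
    # Hypothetical shape: the spec does not define ItemRef's fields.
    source_msg_ref: str


class SourceConnector(Protocol):
    source_type: str

    async def list_new(self, scout: object) -> list[ItemRef]: ...


# Hypothetical registry: maps source_type to a connector instance.
_REGISTRY: dict[str, SourceConnector] = {}


def register_connector(connector: SourceConnector) -> None:
    """Register a connector once per source_type."""
    if connector.source_type in _REGISTRY:
        raise ValueError(f"connector already registered: {connector.source_type}")
    _REGISTRY[connector.source_type] = connector


def get_connector(source_type: str) -> SourceConnector:
    """Look up the connector the engine should use for a scout."""
    return _REGISTRY[source_type]


class FakeConnector:
    """In-memory stand-in used only to exercise the registry."""

    source_type = "fake"

    async def list_new(self, scout: object) -> list[ItemRef]:
        return [ItemRef("msg-1"), ItemRef("msg-2")]


register_connector(FakeConnector())
refs = asyncio.run(get_connector("fake").list_new(scout=None))
```

Because `SourceConnector` is a `Protocol`, `FakeConnector` needs no inheritance; matching the method signatures is enough, which keeps test doubles cheap.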
### ScoutEngine
```python
class ScoutEngine:
    async def trigger_scout(self, scout_id: UUID) -> None: ...
    async def _process_item(self, scout, connector, ref) -> None: ...
    async def deliver_pending(self, user_id: UUID, ws: DeviceWS) -> None: ...
```
Both webhook and cron-fallback entry points call `trigger_scout`.
## Data Model
### Postgres (BE)
#### Renames (Phase 1, single Alembic migration)
| Before | After |
|--------------------------------|--------------------------------|
| Table `local_agent_configs` | `local_scout_configs` |
| Table `cloud_agent_configs` | `cloud_scout_configs` |
| Table `agent_run_logs` | `scout_run_logs` |
| Column `agent_config` | `scout_config` |
| Column `agent_id` (FKs) | `scout_id` |
| Column `agent_run_id` | `scout_run_id` |
| Class `LocalAgentConfig` | `LocalScoutConfig` |
| Class `CloudAgentConfig` | `CloudScoutConfig` |
| Class `AgentRunLog` | `ScoutRunLog` |
#### New (Phase 2)
```sql
CREATE TABLE scout_triage_queue (
    id              uuid PRIMARY KEY,
    user_id         uuid NOT NULL REFERENCES users(id),
    scout_id        uuid NOT NULL REFERENCES cloud_scout_configs(id),
    source_type     text NOT NULL,                   -- "gmail"
    source_msg_ref  text NOT NULL,                   -- gmail message id
    triage_verdict  text NOT NULL,                   -- "relevant"
    triage_reason   text,                            -- short LLM reason for debug
    status          text NOT NULL DEFAULT 'queued',  -- queued | delivered | acked | expired
    triaged_at      timestamptz NOT NULL DEFAULT now(),
    delivered_at    timestamptz,
    acked_at        timestamptz,
    expires_at      timestamptz NOT NULL,            -- triaged_at + 30d
    UNIQUE (scout_id, source_msg_ref)                -- idempotent webhook retries
);
CREATE INDEX ON scout_triage_queue (user_id, status);
CREATE INDEX ON scout_triage_queue (expires_at) WHERE status != 'acked';
```
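The `UNIQUE (scout_id, source_msg_ref)` constraint is what makes webhook retries idempotent. The sketch below shows the idea with an in-memory SQLite database as a stand-in for Postgres (an assumption made only so the example runs anywhere); the table is trimmed to the relevant columns and `enqueue` is a hypothetical helper name.

```python
import sqlite3
import uuid
from datetime import datetime, timedelta, timezone

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE scout_triage_queue (
        id              TEXT PRIMARY KEY,
        scout_id        TEXT NOT NULL,
        source_msg_ref  TEXT NOT NULL,
        triage_verdict  TEXT NOT NULL,
        status          TEXT NOT NULL DEFAULT 'queued',
        expires_at      TEXT NOT NULL,
        UNIQUE (scout_id, source_msg_ref)
    )
    """
)


def enqueue(scout_id: str, msg_ref: str, now: datetime) -> bool:
    """Insert a queue row; return False when the row already exists."""
    expires_at = now + timedelta(days=30)  # triaged_at + 30d, per the schema comment
    cur = conn.execute(
        "INSERT INTO scout_triage_queue "
        "(id, scout_id, source_msg_ref, triage_verdict, expires_at) "
        "VALUES (?, ?, ?, 'relevant', ?) "
        "ON CONFLICT (scout_id, source_msg_ref) DO NOTHING",
        (str(uuid.uuid4()), scout_id, msg_ref, expires_at.isoformat()),
    )
    return cur.rowcount == 1


now = datetime(2026, 5, 16, tzinfo=timezone.utc)
first = enqueue("scout-1", "gmail-msg-42", now)
replay = enqueue("scout-1", "gmail-msg-42", now)  # simulated webhook retry
row_count = conn.execute("SELECT COUNT(*) FROM scout_triage_queue").fetchone()[0]
```

The same `ON CONFLICT ... DO NOTHING` clause is valid in Postgres, so the replayed delivery is absorbed by the database rather than by application-level locking.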
#### Alterations to `cloud_scout_configs` (Phase 2)
```sql
ALTER TABLE cloud_scout_configs ADD COLUMN auto_trash_spam boolean NOT NULL DEFAULT false;
ALTER TABLE cloud_scout_configs ADD COLUMN gmail_history_id text;
ALTER TABLE cloud_scout_configs ADD COLUMN gmail_watch_expires_at timestamptz;
ALTER TABLE cloud_scout_configs ADD COLUMN device_inactivity_pause_days int NOT NULL DEFAULT 14;
```
OAuth tokens continue to live in the existing `cloud_scout_configs.oauth_token_encrypted` column. Encryption mechanism (key derivation, rotation) is reused unchanged. A pre-implementation investigation step will document the current key-management story so we know the threat model; hardening, if needed, is out of scope.
### SQLite (Electron, Drizzle)
#### Renames (Phase 1)
| Before | After |
|---------------------|---------------------|
| `agent_runs` | `scout_runs` |
| `agent_run_actions` | `scout_run_actions` |
| Col `agent_id` | `scout_id` |
#### New (Phase 2)
```typescript
export const scoutSuggestions = sqliteTable('scout_suggestions', {
  id: text().primaryKey(),
  scoutId: text().notNull(),
  sourceType: text().notNull(),      // "gmail"
  sourceMsgRef: text().notNull(),
  category: text().notNull(),        // "unprocessed" until Phase 4
  payload: text(),                   // JSON, populated by Phase 4
  rawSubject: text(),                // populated on delivery
  rawSnippet: text(),                // populated on delivery
  status: text().notNull(),          // pending | approved | rejected | expired
  proposedAt: integer().notNull(),   // ms epoch
  resolvedAt: integer(),
  resolvedEntityType: text(),        // "task" | "project" | ... after Phase 4 approval
  resolvedEntityId: text(),
});
```
`rawSubject` + `rawSnippet` are stored locally to render the HITL card without re-hitting Gmail every render. Body is still NOT stored — fetched on-demand via a tool call when the user explicitly opens the suggestion.
## WebSocket Frame Contract
Existing `/api/v1/device` channel. Two new frame types.
```typescript
// BE → Electron
{
  type: 'scout_proposal',
  proposal: {
    id: string,
    scoutId: string,
    sourceType: 'gmail',
    sourceMsgRef: string,
    rawSubject: string | null,
    rawSnippet: string | null,
    category: 'unprocessed',
    payload: null
  }
}
// Electron → BE
{ type: 'scout_proposal_ack', proposalId: string }
```
On WS reconnect, BE's `ScoutEngine.deliver_pending(user_id, ws)` selects all `status='queued'` rows for the user, calls `connector.fetch_metadata` per row (subject + snippet only), sends one `scout_proposal` frame each, and flips `status='delivered'` + sets `delivered_at` upon ack.
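On the BE side, building and parsing these frames could look roughly like the sketch below. `build_proposal_frame` and `parse_ack` are hypothetical helper names, and reusing the queue row id as the proposal `id` is an assumption; the spec does not say where that id comes from.

```python
import json
from typing import Optional


def build_proposal_frame(
    queue_id: str,
    scout_id: str,
    source_msg_ref: str,
    raw_subject: Optional[str],
    raw_snippet: Optional[str],
) -> str:
    """Serialize one scout_proposal frame; carries metadata only, never the body."""
    frame = {
        "type": "scout_proposal",
        "proposal": {
            "id": queue_id,  # assumption: proposal id == queue row id
            "scoutId": scout_id,
            "sourceType": "gmail",
            "sourceMsgRef": source_msg_ref,
            "rawSubject": raw_subject,
            "rawSnippet": raw_snippet,
            "category": "unprocessed",  # stub until Phase 4
            "payload": None,
        },
    }
    return json.dumps(frame)


def parse_ack(raw: str) -> str:
    """Return the acknowledged proposal id, rejecting any other frame type."""
    frame = json.loads(raw)
    if frame.get("type") != "scout_proposal_ack":
        raise ValueError(f"unexpected frame type: {frame.get('type')!r}")
    return frame["proposalId"]


sent = json.loads(build_proposal_frame("q-1", "scout-1", "gmail-msg-42", "Invoice", None))
acked_id = parse_ack('{"type": "scout_proposal_ack", "proposalId": "q-1"}')
```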
## Stage 1 Triage Detail
```
Webhook (Pub/Sub) or cron tick
  -> ScoutEngine.trigger_scout(scout_id)
       -> if device inactive > N days: skip (pause)
       -> connector.list_new(scout) -> [ItemRef]
       -> for each ref:
            - if (scout_id, source_msg_ref) already in queue: skip (idempotent)
            - content = await connector.fetch_content(scout, ref)      # transient
            - verdict = await ScoutEngine._triage_llm(scout, content)  # gpt-4o-mini
            - if verdict == spam:
                - if scout.auto_trash_spam: connector.archive(...)
                - return                                               # not queued
            - INSERT scout_triage_queue row
       -> UPDATE cloud_scout_configs.last_run_at
       -> INSERT scout_run_logs row
```
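The per-item part of this flow can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the shipped `ScoutEngine`: `FakeScout`, `FakeConnector`, and `fake_triage` are in-memory stand-ins, a dict replaces the `scout_triage_queue` table, and the device-inactivity check and run logging are omitted.

```python
import asyncio
from dataclasses import dataclass, field

BODY_LIMIT = 2000  # the "2k chars" cost guard


@dataclass
class FakeScout:
    id: str
    auto_trash_spam: bool = False


@dataclass
class FakeConnector:
    bodies: dict  # message ref -> body text, held in memory only
    archived: list = field(default_factory=list)

    async def list_new(self, scout):
        return list(self.bodies)

    async def fetch_content(self, scout, ref):
        return self.bodies[ref]

    async def archive(self, scout, ref):
        self.archived.append(ref)


async def fake_triage(body: str) -> str:
    # Stand-in for the LLM call; a real verdict comes from the triage prompt.
    return "spam" if "WIN" in body else "relevant"


async def trigger_scout(scout, connector, queue: dict, triage) -> None:
    for ref in await connector.list_new(scout):
        key = (scout.id, ref)
        if key in queue:
            continue  # idempotent: already triaged
        body = await connector.fetch_content(scout, ref)  # transient
        verdict = await triage(body[:BODY_LIMIT])
        if verdict == "spam":
            if scout.auto_trash_spam:
                await connector.archive(scout, ref)
            continue  # spam is never queued
        queue[key] = verdict  # only the ref and verdict are kept, never the body


scout = FakeScout(id="s1", auto_trash_spam=True)
connector = FakeConnector({"m1": "Invoice attached", "m2": "WIN a prize now"})
queue: dict = {}
asyncio.run(trigger_scout(scout, connector, queue, fake_triage))
```

Note that the body only ever lives in a local variable, which is the property the zero-trust constraint asks reviewers to verify.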
### Triage LLM contract
- **Prompt name (Langfuse):** `scout-triage-system` — source-agnostic, parameterized by `source_type`.
- **Input:** `{source_type, scout_name, scout_purpose, item_subject, item_sender, item_body_truncated_2k}`.
- **Output (structured, Pydantic `TriageVerdict`):** `{verdict: "relevant" | "spam", reason: str, confidence: float}`.
- **Cost guard:** body truncated at 2k chars before LLM call.
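The contract above can be sketched with the standard library as shown below. The spec calls for a Pydantic `TriageVerdict`; a frozen dataclass is used here only to keep the example dependency-free. `build_triage_input` is a hypothetical helper, and the 0 to 1 range for `confidence` is an assumption.

```python
from dataclasses import dataclass

BODY_LIMIT = 2000  # cost guard: body truncated at 2k chars


@dataclass(frozen=True)
class TriageVerdict:
    verdict: str
    reason: str
    confidence: float

    def __post_init__(self) -> None:
        if self.verdict not in ("relevant", "spam"):
            raise ValueError(f"invalid verdict: {self.verdict!r}")
        # Assumption: confidence is a probability-like score in [0, 1].
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError(f"confidence out of range: {self.confidence}")


def build_triage_input(
    source_type: str,
    scout_name: str,
    scout_purpose: str,
    subject: str,
    sender: str,
    body: str,
) -> dict:
    """Assemble the prompt variables, truncating the body before the LLM call."""
    return {
        "source_type": source_type,
        "scout_name": scout_name,
        "scout_purpose": scout_purpose,
        "item_subject": subject,
        "item_sender": sender,
        "item_body_truncated_2k": body[:BODY_LIMIT],
    }


payload = build_triage_input(
    "gmail", "Inbox scout", "find client requests", "Hello", "a@example.com", "x" * 5000
)
ok = TriageVerdict(verdict="relevant", reason="client request", confidence=0.9)
```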
### Failure modes
- LLM call fails: log error, leave message unprocessed, retry on next webhook/cron.
- Gmail 401 (refresh exhausted): mark scout `enabled=false`, surface re-auth prompt to user via WS frame on next device connect.
- Pub/Sub webhook unverified JWT: 401.
## Gmail Push Setup
- On scout enable: `GmailConnector.setup_watch(scout)` calls `users.watch` against a single project-wide Pub/Sub topic.
- `gmail_watch_expires_at` stored. Watches expire after 7 days.
- Weekly cron `_scout_watch_renewal_tick` re-issues `watch` for any scout whose expiry is within 24h.
- Webhook route: `POST /api/v1/scouts/webhooks/gmail`. Verifies Pub/Sub-signed JWT, resolves user via the email address in the payload, enqueues triage job.
- Cron fallback (`_scout_cron_tick`, runs each scout's `schedule_cron`): polls `users.history.list` since `gmail_history_id`, updates `gmail_history_id` after.
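Two pieces of the push setup lend themselves to a short sketch: decoding the Pub/Sub envelope to get the email address and history id, and deciding whether a watch is due for renewal. `decode_gmail_push` and `watch_needs_renewal` are hypothetical helper names; JWT verification is deliberately left out and would have to run before any of this.

```python
import base64
import json
from datetime import datetime, timedelta, timezone


def decode_gmail_push(envelope: dict) -> tuple[str, str]:
    """Extract (emailAddress, historyId) from a Pub/Sub push envelope.

    The Gmail notification is base64-encoded JSON in message.data.
    """
    data = envelope["message"]["data"]
    padded = data + "=" * (-len(data) % 4)  # tolerate missing padding
    payload = json.loads(base64.urlsafe_b64decode(padded))
    return payload["emailAddress"], str(payload["historyId"])


def watch_needs_renewal(
    expires_at: datetime,
    now: datetime,
    window: timedelta = timedelta(hours=24),
) -> bool:
    """True when the stored gmail_watch_expires_at falls within the renewal window."""
    return expires_at - now <= window


notification = {"emailAddress": "user@example.com", "historyId": 9876543210}
envelope = {
    "message": {"data": base64.b64encode(json.dumps(notification).encode()).decode()}
}
email, history_id = decode_gmail_push(envelope)

now = datetime(2026, 5, 16, tzinfo=timezone.utc)
due = watch_needs_renewal(now + timedelta(hours=23), now)
not_due = watch_needs_renewal(now + timedelta(days=3), now)
```

The history id is normalized to a string so it can be stored directly in the `gmail_history_id text` column.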
## Terminology Refactor (Detail)
### Renamed
| Surface | Before | After |
|-------------------|-----------------------------------------------------|-----------------------------------------------------|
| Settings nav | `settings.agents` "Agents" | `settings.scouts` "Scouts" |
| Subtitle/desc | `settings.agentsSubtitle`, `agentsDescription` | `settings.scoutsSubtitle`, `scoutsDescription` |
| `agents.*` keys | `noAgentsYet`, `createAgent`, `yourAgents`, etc. | `scouts.noScoutsYet`, `createScout`, `yourScouts` |
| `toast.agent.*` | `created`, `runStarted`, etc. | `toast.scout.*` |
| Components | `AgentsSection`, `AgentRow`, `LocalAgentConfigPanel`, `CloudAgentConfigPanel`, `InlineAgentCreationStepper` | `ScoutsSection`, `ScoutRow`, `LocalScoutConfigPanel`, `CloudScoutConfigPanel`, `InlineScoutCreationStepper` |
| TS types | `LocalAgentConfig`, `CloudAgentConfig` | `LocalScoutConfig`, `CloudScoutConfig` |
| tRPC router | `agent.local`, `agent.cloud`, `agent.journey`, `agent.runs`, `agent.runActions` | `scout.local`, `scout.cloud`, `scout.journey`, `scout.runs`, `scout.runActions` |
| Drizzle tables | `agent_runs`, `agent_run_actions` | `scout_runs`, `scout_run_actions` |
| Main process | `src/main/agents/agent-scheduler.ts` | `src/main/scouts/scout-scheduler.ts` |
| BE routes | `/api/v1/agents/*`, `/api/v1/agent-setup` | `/api/v1/scouts/*`, `/api/v1/scout-setup` |
| BE modules | `routes/agents.py`, `routes/agent_setup.py`, `core/agent_runner.py`, `core/agent_session_buffer.py`, `core/agent_registry.py` | `routes/scouts.py`, `routes/scout_setup.py`, `core/scout_runner.py`, `core/scout_session_buffer.py`, `core/scout_registry.py` |
| Postgres tables | `local_agent_configs`, `cloud_agent_configs`, `agent_run_logs` | `local_scout_configs`, `cloud_scout_configs`, `scout_run_logs` |
| Postgres columns | `agent_config`, `agent_id`, `agent_run_id` | `scout_config`, `scout_id`, `scout_run_id` |
| SQLAlchemy models | `LocalAgentConfig`, `CloudAgentConfig`, `AgentRunLog` | `LocalScoutConfig`, `CloudScoutConfig`, `ScoutRunLog` |
| Langfuse prompts | user-facing scout prompts named `agent-*` | recreate as `scout-*`; delete old |
| i18n | 5 langs (en/it/es/fr/de) | all updated atomically |
### Kept as-is
- `app/agents/*` Python module — these are LLM helper agents (task_agent, project_agent, note_agent, timeline_agent, filesystem_agent) invoked internally by `deep_agent`. Different concept from user-facing scouts. Renaming would create semantic clash with LLM-agent terminology.
- `/api/v1/device` WS endpoint name (already source-neutral).
- All `tool_call`, `run_complete`, etc. WS frame types unrelated to scouts.
## Phasing
### Phase 1 — Rename only
- Single PR. Single Alembic migration. Single Drizzle migration.
- All renames listed above land together. App still works, existing local scout still runs. No new behavior.
### Phase 2 — Connector abstraction skeleton
- New module `app/scouts/connectors/{base,registry,gmail}.py`.
- New module `app/scouts/engine.py`.
- New table `scout_triage_queue` + alterations to `cloud_scout_configs`.
- New SQLite table `scout_suggestions` (Drizzle).
- New WS frame types `scout_proposal` + `scout_proposal_ack`.
- No user-facing change yet.
### Phase 3 — Gmail scout end-to-end
- Settings UI: "Add Gmail scout" → OAuth consent (separate scope set: `gmail.readonly` + `gmail.modify`) → encrypted token stored in `cloud_scout_configs.oauth_token_encrypted` → save scout config.
- Pub/Sub topic + webhook route + JWT verify.
- `setup_watch` on enable; weekly `renew_watch` cron.
- Cron-fallback `_scout_cron_tick` per scout.
- Triage LLM (gpt-4o-mini, Langfuse `scout-triage-system`).
- Spam auto-trash toggle (default off) per scout.
- Device-inactivity pause logic.
- WS deliver-on-reconnect drains queue → `scout_proposal` frames → ack handler → SQLite `scout_suggestions` insert with `category='unprocessed'` (Phase 4 swaps real categorization in).
- "Read full email" tool call: Electron requests body for a suggestion → BE `GmailConnector.fetch_content` → returns body transiently in tool result.
### Phase 4 — Deferred (separate spec, with task-brief rework)
- Stage 2 categorization agent (prompt + tool palette: `list_projects`, `list_tasks`, `search_notes`, memory).
- HITL UI surface in the brief: suggestion cards, approve/reject controls, "convert to task | event | note | project | actionable-only" actions.
- `list_pending_scout_suggestions` brief tool.
- Convert-to-entity mutations.
- Future connectors (Slack/Teams/Outlook/...).
## Testing Surface
- **Phase 1:** existing pytest suite still green with renamed identifiers (auth, ws_unified, schemas, models, etc.). UI smoke: settings page renders, existing local scout runs.
- **Phase 2:** unit tests for `ScoutEngine` w/ mocked `SourceConnector`. Idempotency test (replay same `source_msg_ref`).
- **Phase 3:** integration tests for Gmail webhook → triage → queue insertion (mocked `GmailConnector` for content fetch and LLM). E2E (manual): connect a real Gmail account on dev, send an email, observe queue row appear, reconnect device, observe `scout_suggestions` row land with subject/snippet.
## Open Questions (none blocking)
- OAuth-token encryption key derivation (app-global vs user-derived) — investigation step in implementation plan; document current state, security hardening is out of scope.
- Pub/Sub topic naming and IAM setup (one topic project-wide vs per-environment) — operational detail to decide during Phase 3.
## Risks
- Pub/Sub setup is per-Google-Cloud-project and requires console IAM grants — first-time setup friction.
- Gmail `users.watch` quota: 1 watch per user. We use one watch per scout, but a user has only one Gmail scout per Gmail account, so this is fine.
- `_pending_states` dict pattern in existing OAuth flow is in-memory — Pub/Sub webhook can run on any worker, so any cross-request state must be in DB, not in-memory. This design uses no in-memory state; safe.
## Acceptance
- All renames land atomically; app boots; existing local scout still operates.
- A user can connect Gmail through the Scouts settings page, see the scout marked enabled, send themselves a test email, and observe a `scout_suggestions` row appear in their local DB with `category='unprocessed'`, `rawSubject`, and `rawSnippet` populated, after the next WS reconnect.
- Spam emails (per LLM triage) are not queued; if `auto_trash_spam=true` they appear in Gmail Trash.
- BE never persists email bodies. Verified by code review of triage flow + grep for `body_text` writes.

File diff suppressed because it is too large