# Cloud Scout Creation Flow (Gmail): Design

**Date:** 2026-05-16
**Status:** Draft, awaiting user review
**Owner:** Roberto
**Predecessor:** [2026-05-15-scouts-refactor-and-gmail-integration-design.md](2026-05-15-scouts-refactor-and-gmail-integration-design.md) (Phases 1–3 shipped)

## Summary

The scout creation stepper (`InlineScoutCreationStepper`) and the cloud config panel (`CloudScoutConfigPanel`) still expose the pre-refactor local-agent config shape (a data-type picker, a batch-interval select, and a user-authored extraction-prompt builder) for **all** scouts, including Gmail. These fields contradict the two-stage HITL pipeline shipped in Phases 1–3:

- **Categorization is automatic and deferred to Phase 4.** The scout categorizes every relevant email itself (task / event / note / project) and proposes via the brief. Users do not pick extraction types.
- **The triage prompt is server-side and IP-protected** (Langfuse `scout-triage-system`, zero-trust). Users must not author it.
- **Gmail is push-primary** (Pub/Sub `watch`). The cron schedule is a fallback only; surfacing it as a "batch interval" implies emails arrive every N hours, which is false.

This design branches the creation flow so cloud scouts (Gmail) get a slim flow fitting the new pipeline (name, focus text, spam auto-trash, and a label/sender filter, with OAuth performed during creation), while local-directory scouts keep their current full flow untouched. The cloud config panel is rewritten to match (full edit parity).

## Goals

- Cloud (Gmail) creation collects only fields the new pipeline actually uses.
- OAuth happens during creation; the scout is live immediately on connect.
- A label + sender filter lets the user scope which emails the scout watches.
- A free-text "focus" field steers triage (`scout_purpose`) without exposing the prompt.
- The cloud config panel offers full edit parity (focus, filter, auto-trash, connection management).
- Fix the Phase-3 follow-up: the BE cloud serializer now returns `oauthConnected`, `filterConfig`, and `gmail_address`.

## Non-Goals

- Stage-2 categorization agent and the brief HITL surface (Phase 4, separate spec).
- Teams / Outlook slim flows. They share the cloud branch path but their connectors don't exist yet; their catalog cards are disabled with a "coming soon" marker.
- Changes to the local-directory creation flow (untouched).
- `date_range` filter (a watch is ongoing, not time-bounded).
- New pending-token OAuth machinery. We reuse the existing scout-id-bound OAuth by creating the scout at the connect step.

## Constraints

- Pre-1.0 dev: no production users, no migration shims beyond what Alembic needs.
- Zero-trust + IP-protected triage prompt preserved: the focus field maps to `scout_purpose`, never to the raw prompt.
- Reuse existing OAuth (`startGmailOAuth` / `completeGmailOAuth` + deep-link callback) and existing `GmailClient._build_gmail_query` (already reads `labels` + `senders`).

## Architecture

### Stepper branch

`InlineScoutCreationStepper` becomes a thin router. The template-pick step (Step 1) stays shared. After a template is chosen, the stepper delegates by `selectedTemplate.type`:

- `local_directory` → `LocalScoutCreationFlow` (the current 3-step body, extracted verbatim).
- cloud (Gmail) → `CloudScoutCreationFlow` (new).

This extraction keeps each flow in its own focused component rather than piling `if (cloud)` branches through the existing local logic.

### Cloud (Gmail) flow

```
Step 1 (shared): Choose template → Gmail

Step 2 (Connect & Basics):
  - Name (required)
  - Focus text (optional)          → prompt_template → scout_purpose
  - Auto-trash spam toggle (off)   → auto_trash_spam
  - [Connect Gmail] button:
      1. scout.cloud.create({ name, provider:'gmail', dataTypes:[],
           promptTemplate, autoTrashSpam, filterConfig:{} })
           → returns scout id (dormant, no token yet)
      2. startGmailOAuth({ scoutId }) → browser consent
      3. deep-link callback → completeGmailOAuth({ code, state })
           → token stored, setup_watch fires, gmail_address persisted
           → scout live

Step 3 (Filter, post-connect):
  - scout.cloud.gmailLabels({ scoutId }) → populate label multi-select
  - Labels multi-select + sender/domain allowlist chips
  - [Save]   → scout.cloud.update({ id, filterConfig:{ labels, senders } })
  - [Skip]   → leaves filter empty (watch all INBOX)
  - [Finish] → closes stepper, invalidates scout lists
```

**Why create at the connect step:** the existing BE OAuth flow binds the token to an existing `scout_id` (`/scouts/oauth/gmail/authorize?scout_id=…`). Creating the dormant scout at connect reuses that flow with zero new machinery. The filter is applied as an `update` after connect; within the same stepper, it reads as one continuous flow.

### Abandon handling

- Bail **before** connect → dormant unconnected scout row remains; its row shows the same "Connect Gmail" CTA (identical to today's post-create connect path).
- Bail **after** connect, before filter → live INBOX-wide scout (empty filter). Functional; editable later in the config panel.

Both are acceptable pre-1.0; neither leaves corrupt state.

## Fields & Data Mapping

| UI field | Step | Stored as | Notes |
|----------|------|-----------|-------|
| Name | 2 | `name` | required |
| Focus text | 2 | `prompt_template` | free-text → `scout_purpose` in triage; optional |
| Auto-trash spam | 2 | `auto_trash_spam` | toggle, default **off** |
| Labels | 3 | `filter_config.labels: string[]` | multi-select of fetched Gmail labels; empty = all INBOX |
| Senders | 3 | `filter_config.senders: string[]` | chips (`alice@x.com` or `@client.co`); optional |

**Dropped from cloud:** data-types picker (`dataTypes` sent as `[]`), batch interval (`scheduleCron` omitted → BE default), extraction-prompt builder (`PromptBuilderChat` not rendered for cloud).
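The field mapping above can be sketched as a small normalization step. This is a minimal illustration, not the project's code: `normalize_senders` and `to_stored_fields` are hypothetical helpers, and the chip validation rule, the lowercasing, and storing an empty focus as `None` are all assumptions made for the example.

```python
import re
from typing import Any, Dict, List, Optional

# Assumed chip shapes, taken from the table's examples:
# a full address (alice@x.com) or a bare domain (@client.co).
_SENDER_CHIP = re.compile(r"^(?:[^@\s]+)?@[^@\s]+\.[^@\s]+$")


def normalize_senders(chips: List[str]) -> List[str]:
    """Trim, lowercase, and de-duplicate sender chips (order preserved)."""
    seen: List[str] = []
    for chip in chips:
        value = chip.strip().lower()
        if not value:
            continue  # ignore empty chips
        if not _SENDER_CHIP.match(value):
            raise ValueError(f"not an address or @domain: {chip!r}")
        if value not in seen:
            seen.append(value)
    return seen


def to_stored_fields(
    name: str,
    focus: str = "",
    auto_trash_spam: bool = False,  # toggle defaults to off
    labels: Optional[List[str]] = None,
    senders: Optional[List[str]] = None,
) -> Dict[str, Any]:
    """Map the UI fields to the stored column names from the table."""
    if not name.strip():
        raise ValueError("name is required")
    filter_config: Dict[str, List[str]] = {}
    if labels:
        filter_config["labels"] = list(labels)
    normalized = normalize_senders(senders or [])
    if normalized:
        filter_config["senders"] = normalized
    return {
        "name": name.strip(),
        # Assumption: an empty focus is stored as None rather than "".
        "prompt_template": focus.strip() or None,
        "auto_trash_spam": bool(auto_trash_spam),
        # An empty dict means "watch all INBOX".
        "filter_config": filter_config,
    }
```

Leaving absent keys out of `filter_config`, rather than writing empty lists, keeps a skipped filter step identical to the `{}` sent at creation.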
`filter_config = { labels?: string[], senders?: string[] }` matches the existing `GmailClient._build_gmail_query`, so no query-builder change is needed; we just populate the config from the UI instead of sending `{}`.

## Backend Changes

### 1. Gmail label listing

- `GmailConnector.list_labels(scout) -> list[dict]`: calls `users().labels().list()`, returns `[{id, name}]` (user + system labels), wrapped in `asyncio.to_thread`. Returns `[]` if no token.
- Route `GET /api/v1/scouts/cloud/{scout_id}/gmail-labels`: auth-guarded, loads scout (ownership check), calls connector.
- tRPC `scout.cloud.gmailLabels({ scoutId })` → `proxyGet`.

### 2. Gmail disconnect / stop watch

- `GmailConnector.stop_watch(scout)`: calls `users().stop()`, swallows errors (watch may already be expired).
- Route `POST /api/v1/scouts/cloud/{scout_id}/gmail-disconnect`: clears `oauth_token_encrypted`, nulls `gmail_history_id` + `gmail_watch_expires_at` + `gmail_address`, sets `enabled=false`, calls `stop_watch`.
- tRPC `scout.cloud.disconnectGmail({ scoutId })`.

### 3. `scout.cloud.create` input: loosen + extend

- `scheduleCron`: required → **optional** (BE applies its default when omitted).
- `dataTypes`: stays; cloud sends `[]`.
- Add `autoTrashSpam?: boolean` (default `false`).
- `promptTemplate`: already present (carries focus text).
- `filterConfig`: already present (now populated).
- BE `POST /scouts/cloud` must accept + persist `auto_trash_spam`.

### 4. `scout.cloud.update` input: extend for config-panel parity

- Add `autoTrashSpam?`, `promptTemplate?`, `filterConfig?` (all optional, partial update).
- BE `PUT/PATCH /scouts/cloud/{id}` must apply these columns.

### 5. Cloud serializer: return the new fields

The BE cloud list/get serializer must return:

- `auto_trash_spam`
- `filter_config`
- `prompt_template`
- `gmail_address`
- computed `oauthConnected = oauth_token_encrypted is not None`

(`oauthConnected` was added to the TS type in Phase 3 but never populated by the BE; fixed here.)

### 6. `gmail_address` column (Alembic 009)

- `ALTER TABLE cloud_scout_configs ADD COLUMN gmail_address VARCHAR(320) NULL`.
- Populated on OAuth callback from the Gmail profile (`users().getProfile().emailAddress` or OIDC `userinfo` email).
- SQLAlchemy model field `gmail_address: Mapped[str | None]`.

### Shared TS type `CloudScoutConfig`

Add: `autoTrashSpam: boolean`, `filterConfig?: { labels?: string[]; senders?: string[] }`, `promptTemplate?: string`, `gmailAddress?: string | null`. (`oauthConnected` already present.)

## Config Panel Parity (`CloudScoutConfigPanel`)

Rewrite the expanded edit view to the slim model:

- **Connection status block:**
  - Not connected → amber "Connect Gmail" CTA (existing `startGmailOAuth`).
  - Connected → "Connected as `<gmail_address>`" + "Reconnect" (re-run `startGmailOAuth`) + "Disconnect" (`disconnectGmail`).
- **Focus text:** editable textarea bound to `prompt_template`.
- **Filter:** label multi-select (via `scout.cloud.gmailLabels`) + sender chips, bound to `filter_config`.
- **Auto-trash spam:** toggle bound to `auto_trash_spam`.
- **Save changes:** single `scout.cloud.update({ id, promptTemplate, filterConfig, autoTrashSpam })`.

**Removed:** data-types checkboxes, schedule select, "Customize AI prompt" journey button.

## Catalog Gating (Teams / Outlook)

The catalog currently shows Local Directory, Gmail, Teams, Outlook cards. The cloud branch only implements Gmail. Teams and Outlook cards are rendered **disabled** with a "coming soon" marker until their connectors exist, preventing a user from entering a half-built cloud flow for an unimplemented provider.
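The gating rule reduces to a membership check against the set of providers that have a working connector. The sketch below is written in Python for illustration only, since the catalog itself lives in the renderer. The provider identifiers `teams` and `outlook`, the card keys, and both function names are assumptions, not existing code.

```python
from typing import Dict, List, Optional, Set

# Providers with a working creation flow in this design.
IMPLEMENTED_PROVIDERS: Set[str] = {"local_directory", "gmail"}

# Display order of the catalog cards (identifiers are hypothetical).
CATALOG_ORDER: List[str] = ["local_directory", "gmail", "teams", "outlook"]


def catalog_cards(
    implemented: Optional[Set[str]] = None,
) -> List[Dict[str, object]]:
    """Return one card per provider; unimplemented ones are disabled."""
    ready = IMPLEMENTED_PROVIDERS if implemented is None else implemented
    cards: List[Dict[str, object]] = []
    for provider in CATALOG_ORDER:
        enabled = provider in ready
        cards.append(
            {
                "provider": provider,
                "disabled": not enabled,
                "badge": None if enabled else "coming soon",
            }
        )
    return cards


def can_enter_flow(provider: str, implemented: Optional[Set[str]] = None) -> bool:
    """Guard for the stepper router: refuse providers without a connector."""
    ready = IMPLEMENTED_PROVIDERS if implemented is None else implemented
    return provider in ready
```

Checking the same set in both the card renderer and the stepper router means that enabling a provider later is a one-line change, and a disabled card cannot be bypassed by reaching the flow another way.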
## i18n

New keys in all 5 languages (`en/it/es/fr/de`): `scouts.focusLabel`, `scouts.focusPlaceholder`, `scouts.autoTrashSpam`, `scouts.autoTrashHint`, `scouts.filterLabels`, `scouts.filterSenders`, `scouts.filterSendersPlaceholder`, `scouts.watchAllInbox`, `scouts.connectedAs`, `scouts.reconnect`, `scouts.disconnect`, `scouts.skipFilter`, `scouts.finish`, plus the cloud stepper step headers (currently hardcoded English in the stepper; extracted to keys during the branch).

## Testing

- **BE unit:** `GmailConnector.list_labels` + `stop_watch` (mocked Gmail service). `scout.cloud.create` with omitted `scheduleCron` applies the default and persists `auto_trash_spam`. Cloud serializer returns `oauthConnected` + `filterConfig` + `gmail_address`.
- **BE migration:** Alembic 009 revision-graph check (head = 009, parent = 008).
- **Electron:** no test suite; rely on `tsc --noEmit` + manual smoke: create a Gmail scout end-to-end, connect, pick labels, confirm a `cloud_scout_configs` row with the focus/filter/auto-trash values and a populated `gmail_address`.

## Acceptance

- Creating a Gmail scout shows only name + focus + auto-trash, then a Connect step, then a label/sender filter step, with no data-type picker, no batch interval, and no extraction-prompt builder.
- After connect, the scout row shows "Connected as `<gmail_address>`".
- The config panel edits focus, filter, and auto-trash, and can disconnect/reconnect.
- Local-directory scout creation is unchanged.
- Teams/Outlook cards are visibly disabled.
- BE cloud list returns `oauthConnected`, `filterConfig`, `gmail_address`.

## Open Questions

None blocking.

## Risks

- **Label fetch latency:** `users().labels().list()` is one extra round-trip after OAuth. Acceptable; show a loading state on the multi-select.
- **Dormant-scout litter:** abandoned flows leave dormant/unfiltered scouts. Pre-1.0 acceptable; a future cleanup job could prune never-connected scouts older than N days.
- **`gmail_address` PII:** stored plaintext (it's the user's own address, already in their JWT identity). Not sensitive beyond existing storage.
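
For the dormant-scout risk, the selection step of a future cleanup job could look like the sketch below. It is a minimal illustration under stated assumptions: `ScoutRow` is a stand-in for the real model, `created_at` is assumed to exist on the row, and `stale_dormant_scouts` is a hypothetical name. Because the disconnect route also clears `oauth_token_encrypted`, a missing token alone cannot distinguish "never connected" from "disconnected", so a real job would need an additional signal before deleting anything.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import List, Optional


@dataclass
class ScoutRow:
    """Stand-in for a cloud scout row (fields assumed for the example)."""

    id: str
    created_at: datetime
    oauth_token_encrypted: Optional[bytes]  # None = no token stored


def stale_dormant_scouts(
    rows: List[ScoutRow],
    max_age_days: int,
    now: Optional[datetime] = None,
) -> List[str]:
    """Return ids of scouts with no token that are older than the cutoff."""
    current = now or datetime.now(timezone.utc)
    cutoff = current - timedelta(days=max_age_days)
    return [
        row.id
        for row in rows
        if row.oauth_token_encrypted is None and row.created_at < cutoff
    ]
```

Passing `now` explicitly keeps the function deterministic in tests, for example `stale_dormant_scouts(rows, max_age_days=14, now=fixed_time)`.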