Cloud Scout Creation Flow (Gmail) — Design
Date: 2026-05-16
Status: Draft, awaiting user review
Owner: Roberto
Predecessor: 2026-05-15-scouts-refactor-and-gmail-integration-design.md (Phases 1–3 shipped)
Summary
The scout creation stepper (InlineScoutCreationStepper) and the cloud config panel (CloudScoutConfigPanel) still expose the pre-refactor local-agent config shape — a data-type picker, a batch-interval select, and a user-authored extraction-prompt builder — for all scouts, including Gmail. These fields contradict the two-stage HITL pipeline shipped in Phases 1–3:
- Categorization is automatic and deferred to Phase 4. The scout categorizes every relevant email itself (task / event / note / project) and proposes via the brief. Users do not pick extraction types.
- The triage prompt is server-side and IP-protected (Langfuse scout-triage-system, zero-trust). Users must not author it.
- Gmail is push-primary (Pub/Sub watch). The cron schedule is a fallback only; surfacing it as a "batch interval" implies emails arrive every N hours, which is false.
This design branches the creation flow so cloud scouts (Gmail) get a slim flow fitting the new pipeline — name, focus text, spam auto-trash, and a label/sender filter, with OAuth performed during creation — while local-directory scouts keep their current full flow untouched. The cloud config panel is rewritten to match (full edit parity).
Goals
- Cloud (Gmail) creation collects only fields the new pipeline actually uses.
- OAuth happens during creation; the scout is live immediately on connect.
- A label + sender filter lets the user scope which emails the scout watches.
- A free-text "focus" field steers triage (scout_purpose) without exposing the prompt.
- The cloud config panel offers full edit parity (focus, filter, auto-trash, connection management).
- Fix the Phase-3 follow-up: the BE cloud serializer now returns oauthConnected, filterConfig, and gmail_address.
Non-Goals
- Stage-2 categorization agent and the brief HITL surface (Phase 4, separate spec).
- Teams / Outlook slim flows. They share the cloud branch path but their connectors don't exist yet; their catalog cards are disabled with a "coming soon" marker.
- Changes to the local-directory creation flow (untouched).
- date_range filter (a watch is ongoing, not time-bounded).
- New pending-token OAuth machinery: we reuse the existing scout-id-bound OAuth by creating the scout at the connect step.
Constraints
- Pre-1.0 dev — no production users, no migration shims beyond what Alembic needs.
- Zero-trust + IP-protected triage prompt preserved: the focus field maps to scout_purpose, never to the raw prompt.
- Reuse existing OAuth (startGmailOAuth / completeGmailOAuth + deep-link callback) and existing GmailClient._build_gmail_query (already reads labels + senders).
Architecture
Stepper branch
InlineScoutCreationStepper becomes a thin router. The template-pick step (Step 1) stays shared. After a template is chosen, the stepper delegates by selectedTemplate.type:
- local_directory → LocalScoutCreationFlow (the current 3-step body, extracted verbatim).
- cloud (Gmail) → CloudScoutCreationFlow (new).
This extraction keeps each flow in its own focused component rather than piling if (cloud) branches through the existing local logic.
Cloud (Gmail) flow
Step 1 (shared): Choose template → Gmail
Step 2 — Connect & Basics:
- Name (required)
- Focus text (optional) → prompt_template → scout_purpose
- Auto-trash spam toggle (off) → auto_trash_spam
- [Connect Gmail] button:
1. scout.cloud.create({ name, provider:'gmail', dataTypes:[],
promptTemplate, autoTrashSpam, filterConfig:{} })
→ returns scout id (dormant, no token yet)
2. startGmailOAuth({ scoutId }) → browser consent
3. deep-link callback → completeGmailOAuth({ code, state })
→ token stored, setup_watch fires, gmail_address persisted → scout live
Step 3 — Filter (post-connect):
- scout.cloud.gmailLabels({ scoutId }) → populate label multi-select
- Labels multi-select + sender/domain allowlist chips
- [Save] → scout.cloud.update({ id, filterConfig:{ labels, senders } })
- [Skip] → leaves filter empty (watch all INBOX)
- [Finish] → closes stepper, invalidates scout lists
Why create at the connect step: the existing BE OAuth flow binds the token to an existing scout_id (/scouts/oauth/gmail/authorize?scout_id=…). Creating the dormant scout at connect reuses that flow with zero new machinery. The filter is applied as an update after connect — within the same stepper, it reads as one continuous flow.
Abandon handling
- Bail before connect → dormant unconnected scout row remains; its row shows the same "Connect Gmail" CTA (identical to today's post-create connect path).
- Bail after connect, before filter → live INBOX-wide scout (empty filter). Functional; editable later in the config panel.
Both are acceptable pre-1.0; neither leaves corrupt state.
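The three resting states described above (dormant, live but unfiltered, live and filtered) can be derived from the row itself. The following is a minimal sketch, not the shipped code: `ScoutRow`, `creation_state`, and the state names are hypothetical, and only the column names `oauth_token_encrypted` and `filter_config` come from this design.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ScoutRow:
    """Hypothetical subset of a cloud_scout_configs row."""
    name: str
    oauth_token_encrypted: Optional[bytes] = None
    filter_config: dict = field(default_factory=dict)


def creation_state(row: ScoutRow) -> str:
    """Classify where a (possibly abandoned) creation flow left the scout."""
    if row.oauth_token_encrypted is None:
        # Bailed before connect: dormant row, the UI shows the "Connect Gmail" CTA.
        return "dormant"
    has_filter = bool(
        row.filter_config.get("labels") or row.filter_config.get("senders")
    )
    # Connected with an empty filter means the scout watches all of INBOX.
    return "live_filtered" if has_filter else "live_inbox_wide"
```

Every reachable combination maps to a usable state, which is the sense in which neither abandon path leaves corrupt data.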
Fields & Data Mapping
| UI field | Step | Stored as | Notes |
|---|---|---|---|
| Name | 2 | name | required |
| Focus text | 2 | prompt_template | free-text → scout_purpose in triage; optional |
| Auto-trash spam | 2 | auto_trash_spam | toggle, default off |
| Labels | 3 | filter_config.labels: string[] | multi-select of fetched Gmail labels; empty = all INBOX |
| Senders | 3 | filter_config.senders: string[] | chips (alice@x.com or @client.co); optional |
Dropped from cloud: data-types picker (dataTypes sent as []), batch interval (scheduleCron omitted → BE default), extraction-prompt builder (PromptBuilderChat not rendered for cloud).
filter_config = { labels?: string[], senders?: string[] } — matches the existing GmailClient._build_gmail_query, so no query-builder change is needed; we just populate the config from the UI instead of sending {}.
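To illustrate how a `{ labels, senders }` config could translate into a Gmail search string, here is a rough sketch. It is not the existing GmailClient._build_gmail_query, whose body is not reproduced in this document; the function name `build_gmail_query`, the `in:inbox` fallback, and the choice to OR values within a field and AND across fields are all assumptions. It relies only on Gmail's documented `label:` and `from:` operators and on `{ }` grouping for OR.

```python
from typing import TypedDict


class FilterConfig(TypedDict, total=False):
    labels: list[str]
    senders: list[str]


def build_gmail_query(filter_config: FilterConfig) -> str:
    """Hypothetical stand-in: turn a filter_config into a Gmail search string."""
    labels = filter_config.get("labels") or []
    senders = filter_config.get("senders") or []
    parts: list[str] = []
    if labels:
        # Braces group terms with OR in Gmail search: any selected label matches.
        parts.append("{" + " ".join(f"label:{name}" for name in labels) + "}")
    else:
        # Assumption: an empty label list means "watch all INBOX".
        parts.append("in:inbox")
    if senders:
        # A sender chip is either a full address or a domain such as @client.co.
        parts.append("{" + " ".join(f"from:{s}" for s in senders) + "}")
    # Space-separated groups are ANDed together.
    return " ".join(parts)
```

Label names that contain spaces would need extra handling before being placed in a query; the sketch leaves that out.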
Backend Changes
1. Gmail label listing
- GmailConnector.list_labels(scout) -> list[dict]: calls users().labels().list(), returns [{id, name}] (user + system labels), wrapped in asyncio.to_thread. Returns [] if no token.
- Route GET /api/v1/scouts/cloud/{scout_id}/gmail-labels: auth-guarded, loads scout (ownership check), calls connector.
- tRPC scout.cloud.gmailLabels({ scoutId }) → proxyGet.
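A minimal sketch of the label-listing call, under stated assumptions: the function is written standalone rather than as a GmailConnector method, the Gmail service object is passed in (how the real connector builds it from the stored token is not shown here), and `fake_service` is a test helper invented for this example. The `users().labels().list(userId="me").execute()` chain is the standard Google API Python client call shape.

```python
import asyncio
from types import SimpleNamespace
from unittest.mock import MagicMock


async def list_labels(scout, service) -> list[dict]:
    """Return [{id, name}] for every Gmail label, or [] when no token is stored."""
    if scout.oauth_token_encrypted is None:
        return []

    def _fetch() -> list[dict]:
        # The Google client is blocking, so it runs in a worker thread.
        response = service.users().labels().list(userId="me").execute()
        return [
            {"id": label["id"], "name": label["name"]}
            for label in response.get("labels", [])
        ]

    return await asyncio.to_thread(_fetch)


def fake_service(labels: list[dict]) -> MagicMock:
    """Hypothetical test double mimicking the Gmail service call chain."""
    service = MagicMock()
    chain = service.users.return_value.labels.return_value.list.return_value
    chain.execute.return_value = {"labels": labels}
    return service
```

A usage example with the mocked service:

```python
scout = SimpleNamespace(oauth_token_encrypted=b"ciphertext")
service = fake_service([{"id": "Label_1", "name": "Clients", "type": "user"}])
labels = asyncio.run(list_labels(scout, service))
```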
2. Gmail disconnect / stop watch
- GmailConnector.stop_watch(scout): calls users().stop(), swallows errors (watch may already be expired).
- Route POST /api/v1/scouts/cloud/{scout_id}/gmail-disconnect: clears oauth_token_encrypted, nulls gmail_history_id + gmail_watch_expires_at + gmail_address, sets enabled=false, calls stop_watch.
- tRPC scout.cloud.disconnectGmail({ scoutId }).
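The disconnect behaviour can be sketched as follows. This is an illustration only: `disconnect_gmail` is a hypothetical helper, the service object is injected, and the sketch calls stop_watch before clearing the token on the assumption that the stop call needs valid credentials. The column names are the ones listed above.

```python
from types import SimpleNamespace
from unittest.mock import MagicMock


def stop_watch(service) -> None:
    """Best-effort stop of the Gmail push watch; failures are ignored."""
    try:
        service.users().stop(userId="me").execute()
    except Exception:
        # The watch may already be expired or revoked; nothing to undo.
        pass


def disconnect_gmail(scout, service) -> None:
    """Hypothetical handler body: stop the watch, then wipe connection state."""
    stop_watch(service)
    scout.oauth_token_encrypted = None
    scout.gmail_history_id = None
    scout.gmail_watch_expires_at = None
    scout.gmail_address = None
    scout.enabled = False
```

Because stop_watch swallows errors, a disconnect still clears local state when Gmail rejects the stop call.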
3. scout.cloud.create input — loosen + extend
- scheduleCron: required → optional (BE applies its default when omitted).
- dataTypes: stays; cloud sends [].
- Add autoTrashSpam?: boolean (default false).
- promptTemplate: already present (carries focus text).
- filterConfig: already present (now populated).
- BE POST /scouts/cloud must accept + persist auto_trash_spam.
4. scout.cloud.update input — extend for config-panel parity
- Add autoTrashSpam?, promptTemplate?, filterConfig? (all optional, partial update).
- BE PUT/PATCH /scouts/cloud/{id} must apply these columns.
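One way to read "partial update" is that only keys present in the request body touch their column, so an omitted key leaves the stored value alone. The sketch below assumes that interpretation; `UPDATABLE_FIELDS` and `apply_update` are hypothetical names, and the camelCase to snake_case mapping follows the field table in this document.

```python
from types import SimpleNamespace

# API input key -> cloud_scout_configs column (per the mapping in this design).
UPDATABLE_FIELDS = {
    "autoTrashSpam": "auto_trash_spam",
    "promptTemplate": "prompt_template",
    "filterConfig": "filter_config",
}


def apply_update(scout, payload: dict) -> list[str]:
    """Apply only the keys present in payload; return the columns changed."""
    changed: list[str] = []
    for api_key, column in UPDATABLE_FIELDS.items():
        if api_key in payload:
            # Presence, not truthiness: False and {} are valid new values.
            setattr(scout, column, payload[api_key])
            changed.append(column)
    return changed
```

Checking key presence rather than truthiness matters here, since turning auto-trash off or clearing the filter are both legitimate edits from the config panel.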
5. Cloud serializer — return the new fields
The BE cloud list/get serializer must return:
- auto_trash_spam
- filter_config
- prompt_template
- gmail_address
- computed oauthConnected = oauth_token_encrypted is not None
(oauthConnected was added to the TS type in Phase 3 but never populated by the BE — fixed here.)
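A sketch of what the serializer change could look like. The function name `serialize_cloud_scout` and the output key casing are assumptions (camelCase is used here to line up with the shared TS type; the real serializer may emit snake_case and convert elsewhere). The point it illustrates is that oauthConnected is computed from the token column and the token itself is never returned.

```python
from types import SimpleNamespace


def serialize_cloud_scout(scout) -> dict:
    """Hypothetical serializer for one cloud scout row."""
    return {
        "id": scout.id,
        "name": scout.name,
        "autoTrashSpam": scout.auto_trash_spam,
        "filterConfig": scout.filter_config or {},
        "promptTemplate": scout.prompt_template,
        "gmailAddress": scout.gmail_address,
        # Computed flag; the encrypted token never leaves the backend.
        "oauthConnected": scout.oauth_token_encrypted is not None,
    }
```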
6. gmail_address column — Alembic 009
- ALTER TABLE cloud_scout_configs ADD COLUMN gmail_address VARCHAR(320) NULL.
- Populated on OAuth callback from the Gmail profile (users().getProfile().emailAddress or OIDC userinfo email).
- SQLAlchemy model field gmail_address: Mapped[str | None].
Shared TS type CloudScoutConfig
Add: autoTrashSpam: boolean, filterConfig?: { labels?: string[]; senders?: string[] }, promptTemplate?: string, gmailAddress?: string | null. (oauthConnected already present.)
Config Panel Parity (CloudScoutConfigPanel)
Rewrite the expanded edit view to the slim model:
- Connection status block:
  - Not connected → amber "Connect Gmail" CTA (existing startGmailOAuth).
  - Connected → "Connected as <gmailAddress>" + "Reconnect" (re-run startGmailOAuth) + "Disconnect" (disconnectGmail).
- Focus text: editable textarea bound to prompt_template.
- Filter: label multi-select (via scout.cloud.gmailLabels) + sender chips, bound to filter_config.
- Auto-trash spam: toggle bound to auto_trash_spam.
- Save changes: single scout.cloud.update({ id, promptTemplate, filterConfig, autoTrashSpam }).
Removed: data-types checkboxes, schedule select, "Customize AI prompt" journey button.
Catalog Gating (Teams / Outlook)
The catalog currently shows Local Directory, Gmail, Teams, Outlook cards. The cloud branch only implements Gmail. Teams and Outlook cards are rendered disabled with a "coming soon" marker until their connectors exist, preventing a user from entering a half-built cloud flow for an unimplemented provider.
i18n
New keys in all 5 languages (en/it/es/fr/de):
scouts.focusLabel, scouts.focusPlaceholder, scouts.autoTrashSpam, scouts.autoTrashHint, scouts.filterLabels, scouts.filterSenders, scouts.filterSendersPlaceholder, scouts.watchAllInbox, scouts.connectedAs, scouts.reconnect, scouts.disconnect, scouts.skipFilter, scouts.finish, plus the cloud stepper step headers (currently hardcoded English in the stepper — extracted to keys during the branch).
Testing
- BE unit: GmailConnector.list_labels + stop_watch (mocked Gmail service). scout.cloud.create with omitted scheduleCron applies the default and persists auto_trash_spam. Cloud serializer returns oauthConnected + filterConfig + gmail_address.
- BE migration: Alembic 009 revision-graph check (head = 009, parent = 008).
- Electron: no test suite; tsc --noEmit + manual smoke: create a Gmail scout end-to-end, connect, pick labels, confirm a cloud_scout_configs row with the focus/filter/auto-trash values and a populated gmail_address.
Acceptance
- Creating a Gmail scout shows only name + focus + auto-trash, then a Connect step, then a label/sender filter step — no data-type picker, no batch interval, no extraction-prompt builder.
- After connect, the scout row shows "Connected as <email>".
- The config panel edits focus, filter, and auto-trash, and can disconnect/reconnect.
- Local-directory scout creation is unchanged.
- Teams/Outlook cards are visibly disabled.
- BE cloud list returns oauthConnected, filterConfig, gmail_address.
Open Questions
None blocking.
Risks
- Label fetch latency: users().labels().list() is one extra round-trip after OAuth. Acceptable; show a loading state on the multi-select.
- Dormant-scout litter: abandoned flows leave dormant/unfiltered scouts. Pre-1.0 acceptable; a future cleanup job could prune never-connected scouts older than N days.
- gmail_address PII: stored plaintext (it's the user's own address, already in their JWT identity). Not sensitive beyond existing storage.