step 3.1 complete: agent config tables + schemas + migration

2026-03-05 15:14:43 +01:00
parent c6e1e4e7fd
commit 1dfd088e18
4 changed files with 634 additions and 0 deletions
--- a/AI_REFACTOR_PLAN.md
+++ b/AI_REFACTOR_PLAN.md
@@ -240,4 +240,262 @@ Tools must use **camelCase** field names (Drizzle maps them to snake_case intern
 - **Step 2.1 is the point of no return** — after removing LangChain, there's no local AI fallback.
 - **Phase B (backend changes) must land before Phase 1.3–1.5** — Electron needs the bidirectional WS to talk to.
 - **Phase 3 and Phase 4 are independent** — can be parallelized after Phase 2.
+
+---
+
+## Phase 3 — Agent System: Config, Orchestration & Cloud Connectors
+
+> **Objective:** The backend manages all agent configuration, scheduling, orchestration, and cloud data fetching. There are two agent types: the **Local Directory Agent** (the backend triggers Electron to read files, then the AI analyzes them) and the **Cloud Connector Agent** (the backend fetches Gmail/Outlook/Teams data directly, the AI analyzes it, and the results are pushed to Electron via WS `tool_call`). All extracted items are inserted into Electron's local DB through the existing WS tool infrastructure, flagged as AI-suggested (`isAiSuggested=1`).
+>
+> **Electron Phase 3 plan:** `../adiuva/AI_REFACTOR_PLAN.md` Phase 3 section.
+
+### Architecture
+
+```
+Local Agent:
+  Scheduler/manual trigger ──► check device online ──► WS agent_run → Electron
+    ──► Electron reads files ──► WS agent_data → Backend
+    ──► Backend AI (prompt_template + file content) ──► WS tool_call(insert) → Electron
+    ──► Electron persists with isAiSuggested=1
+
+Cloud Agent:
+  Scheduler/manual trigger ──► check device online ──► Backend fetches Gmail/Outlook/Teams (OAuth)
+    ──► Backend AI analyzes ──► WS tool_call(insert) → Electron ──► Electron persists
+```
+
+**New WS frame types:**
+
+| Direction | `type` | Payload |
+|---|---|---|
+| Server → Client | `agent_run` | `{ run_id, agent_id, config: { paths, file_extensions, prompt_template, data_types } }` |
+| Client → Server | `agent_data` | `{ run_id, files: [{ path, name, content, metadata }] }` |
+| Client → Server | `agent_complete` | `{ run_id, files_read, errors }` |
+| Client → Server | `device_hello` | `{ device_id, agent_ids }` |
+
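The four frame types in the table above could be mirrored as typed structures on the backend. The following is a minimal sketch using stdlib dataclasses, not the Pydantic `Ws*` schemas named in Step 3.1. The `FRAME_TYPES` registry, the `parse_frame` helper, the flat layout (payload fields sitting next to `type`), and the field types (`files_read` as an int, `errors` as a list of strings) are all assumptions made for illustration.

```python
import json
from dataclasses import dataclass, field
from typing import Any


@dataclass
class AgentRun:  # Server -> Client
    run_id: str
    agent_id: str
    config: dict[str, Any]  # paths, file_extensions, prompt_template, data_types


@dataclass
class AgentData:  # Client -> Server
    run_id: str
    files: list[dict[str, Any]]  # each entry: path, name, content, metadata


@dataclass
class AgentComplete:  # Client -> Server
    run_id: str
    files_read: int  # assumed to be a count
    errors: list[str] = field(default_factory=list)  # assumed list of strings


@dataclass
class DeviceHello:  # Client -> Server
    device_id: str
    agent_ids: list[str]


# Hypothetical registry mapping the wire-level `type` to a structure.
FRAME_TYPES = {
    "agent_run": AgentRun,
    "agent_data": AgentData,
    "agent_complete": AgentComplete,
    "device_hello": DeviceHello,
}


def parse_frame(raw: str):
    """Decode one JSON frame; assumes payload fields sit next to `type`."""
    body = json.loads(raw)
    frame_cls = FRAME_TYPES.get(body.pop("type", None))
    if frame_cls is None:
        raise ValueError("unknown frame type")
    return frame_cls(**body)
```

Unknown or missing `type` values raise `ValueError`, so the message loop can reject malformed frames in one place.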
+### Step 3.1 — Agent config tables
+- [x] Add to `app/models.py`:
+  - **`LocalAgentConfig`**:
+    - `id` UUID PK
+    - `user_id` FK → users
+    - `device_id` str — identifies which Electron install this config belongs to
+    - `name` str
+    - `directory_paths` JSON — list of absolute paths on the device
+    - `data_types` JSON — which tables to extract to: `["tasks", "notes", "checkpoints", "projects"]`
+    - `prompt_template` text — user-configured via Chatbot Journey
+    - `file_extensions` JSON — e.g. `[".eml", ".txt", ".pdf", ".md"]`
+    - `schedule_cron` str — e.g. `"0 */6 * * *"` (every 6h)
+    - `enabled` bool (default True)
+    - `last_run_at` datetime nullable
+    - `created_at`, `updated_at` timestamps
+  - **`CloudAgentConfig`**:
+    - `id` UUID PK
+    - `user_id` FK → users
+    - `provider` str — enum: `gmail`, `teams`, `outlook`
+    - `name` str
+    - `data_types` JSON — same format as local
+    - `prompt_template` text
+    - `oauth_token_encrypted` text — Fernet-encrypted OAuth2 credentials
+    - `schedule_cron` str
+    - `enabled` bool (default True)
+    - `last_run_at` datetime nullable
+    - `filter_config` JSON — provider-specific: `{ labels: [], date_range: {from, to}, senders: [] }`
+    - `created_at`, `updated_at` timestamps
+  - **`AgentRunLog`**:
+    - `id` UUID PK
+    - `agent_id` str — references LocalAgentConfig.id or CloudAgentConfig.id
+    - `agent_type` str — `local` or `cloud`
+    - `user_id` FK → users
+    - `status` str — `running`, `success`, `error`, `partial`
+    - `items_processed` int (default 0)
+    - `items_created` int (default 0)
+    - `errors` JSON — list of error strings
+    - `started_at` datetime
+    - `completed_at` datetime nullable
+- [x] Add Pydantic schemas to `app/schemas.py`:
+  - `LocalAgentConfigCreate`, `LocalAgentConfigUpdate`, `LocalAgentConfigResponse`
+  - `CloudAgentConfigCreate`, `CloudAgentConfigUpdate`, `CloudAgentConfigResponse`
+  - `AgentRunLogResponse`
+  - `AgentCatalogItem` — `{ type, name, description, config_schema }`
+  - `WsAgentRun`, `WsAgentData`, `WsAgentComplete`, `WsDeviceHello`
+- [x] Generate Alembic migration
+- **Files:** `app/models.py`, `app/schemas.py`, `alembic/versions/`
+- **Outcome:** Agent config and run tracking tables in PostgreSQL.
+
+### Step 3.2 — Agent CRUD API routes
+- [ ] Create `app/api/routes/agents.py`:
+  - `GET /api/v1/agents/catalog` — returns hardcoded agent type catalog:
+    - `local_directory`: "Watches local directories, extracts data from files using AI"
+    - `gmail`: "Scans Gmail inbox, extracts tasks/notes from emails"
+    - `teams`: "Monitors Teams messages, extracts action items"
+    - `outlook`: "Scans Outlook inbox, extracts tasks/notes"
+  - `GET /api/v1/agents/local` — list user's local agent configs
+  - `POST /api/v1/agents/local` — create local agent config
+    - Body: `{ name, device_id, directory_paths, data_types, prompt_template, file_extensions, schedule_cron }`
+    - Tier check: reject the request if the user's enabled agent count has already reached the `batch_active` limit
+  - `PUT /api/v1/agents/local/{id}` — update config (ownership check)
+  - `DELETE /api/v1/agents/local/{id}` — delete config + associated run logs
+  - `GET /api/v1/agents/cloud` — list user's cloud agent configs
+  - `POST /api/v1/agents/cloud` — create cloud connector config
+    - Body: `{ provider, name, data_types, prompt_template, oauth_token_encrypted, schedule_cron, filter_config }`
+    - Tier check: same `batch_active` limit (local + cloud count together)
+  - `PUT /api/v1/agents/cloud/{id}` — update config
+  - `DELETE /api/v1/agents/cloud/{id}` — delete config + run logs
+  - `GET /api/v1/agents/runs` — query params: `agent_id`, `page`, `limit` → paginated run logs
+  - `POST /api/v1/agents/{id}/run` — manual trigger (dispatches to agent runner)
+  - All routes require JWT auth; ownership enforced on all mutations
+- [ ] Register router in `app/main.py`
+- **Files:** `app/api/routes/agents.py`, `app/main.py`
+- **Outcome:** Full CRUD for agent configs with tier-gated creation limits.
+
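The tier check described for the two `POST` routes can be sketched as a pure function. This assumes the `batch_active` limits are the `free=2, pro=10` values quoted in the Phase 3 verification table, and that unknown tiers get no agents. The names `BATCH_ACTIVE_LIMITS` and `can_create_agent` are hypothetical.

```python
# Limits assumed from the Phase 3 verification table (free=2, pro=10).
BATCH_ACTIVE_LIMITS = {"free": 2, "pro": 10}


def can_create_agent(tier: str, enabled_local: int, enabled_cloud: int) -> bool:
    """Return True if one more enabled agent still fits under the tier limit.

    Local and cloud agents share a single budget, as the plan specifies.
    """
    limit = BATCH_ACTIVE_LIMITS.get(tier, 0)  # unknown tier: no agents (assumption)
    return enabled_local + enabled_cloud < limit
```

A route handler would presumably call this before inserting the new config and answer with an error status when it returns `False`.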
+### Step 3.3 — Device WS endpoint
+- [ ] Create `app/api/routes/device_ws.py`:
+  - `WebSocket /api/v1/ws/device?token=<jwt>` — persistent connection from Electron
+  - On connect:
+    - Authenticate JWT
+    - Receive `device_hello` frame → extract `device_id`, `agent_ids`
+    - Store connection in `DeviceConnectionManager` (in-memory dict: `user_id → { ws, device_id }`)
+    - Check for overdue agent runs → trigger them immediately
+  - Message loop:
+    - `agent_data` → route to active agent run handler
+    - `agent_complete` → finalize agent run
+    - `tool_result` → route to pending tool call (same pattern as chat WS)
+    - `pong` → heartbeat ack
+  - On disconnect:
+    - Remove from `DeviceConnectionManager`
+    - Mark any in-progress agent runs as `error` with "device disconnected"
+  - Heartbeat: send `ping` every 30s, disconnect if no `pong` within 10s
+- [ ] Create `app/core/device_manager.py`:
+  - `DeviceConnectionManager` (singleton):
+    - `register(user_id, device_id, ws)` — stores active connection
+    - `unregister(user_id)` — removes connection
+    - `get_ws(user_id) -> WebSocket | None` — returns active WS if device is online
+    - `is_online(user_id, device_id=None) -> bool` — optionally checks specific device
+    - `send_frame(user_id, frame: dict)` — sends JSON frame to device
+- **Files:** `app/api/routes/device_ws.py`, `app/core/device_manager.py`, `app/main.py`
+- **Outcome:** Backend maintains persistent WS connections to Electron devices for agent triggers.
+
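A minimal in-memory sketch of the `DeviceConnectionManager` interface listed above, assuming one active device per user (as the `user_id → { ws, device_id }` dict implies). The `ws` object is only assumed to expose an async `send_text(str)` method, which Starlette's `WebSocket` does. The boolean return value of `send_frame` and the `FakeSocket` stand-in are illustrative additions, not part of the plan.

```python
import json
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class _Connection:
    ws: Any  # anything with an async send_text(str) method
    device_id: str


class DeviceConnectionManager:
    """In-memory registry: one active device connection per user."""

    def __init__(self) -> None:
        self._connections: dict[str, _Connection] = {}

    def register(self, user_id: str, device_id: str, ws: Any) -> None:
        self._connections[user_id] = _Connection(ws, device_id)

    def unregister(self, user_id: str) -> None:
        self._connections.pop(user_id, None)

    def get_ws(self, user_id: str) -> Optional[Any]:
        conn = self._connections.get(user_id)
        return conn.ws if conn else None

    def is_online(self, user_id: str, device_id: Optional[str] = None) -> bool:
        conn = self._connections.get(user_id)
        if conn is None:
            return False
        return device_id is None or conn.device_id == device_id

    async def send_frame(self, user_id: str, frame: dict) -> bool:
        """Send a JSON frame; returns False when the user has no device online."""
        ws = self.get_ws(user_id)
        if ws is None:
            return False
        await ws.send_text(json.dumps(frame))
        return True


class FakeSocket:
    """Stand-in for a WebSocket, used only to exercise the sketch."""

    def __init__(self) -> None:
        self.sent: list[str] = []

    async def send_text(self, data: str) -> None:
        self.sent.append(data)
```

Because the registry is a plain dict in process memory, it only sees devices connected to the same backend instance; a multi-instance deployment would need a shared layer on top.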
+### Step 3.4 — Agent run orchestrator
+- [ ] Create `app/core/agent_runner.py`:
+  - `async run_local_agent(user_id, config: LocalAgentConfig, device_mgr: DeviceConnectionManager)`:
+    1. Check device is online with matching `device_id` → abort if offline
+    2. Create `AgentRunLog` with `status=running`
+    3. Send `WsAgentRun` frame to Electron with config (paths, extensions, prompt)
+    4. Await `WsAgentData` frames — collect file contents
+    5. Await `WsAgentComplete` frame — Electron signals done reading
+    6. For each file: call LLM with `prompt_template` + file content → extract structured items
+    7. For each extracted item: send `WsToolCall(insert, table, data)` to Electron → await `WsToolResult`
+       - All inserts set `isAiSuggested` to true and `isApproved` to false (camelCase in the tool payload; the columns are `is_ai_suggested` and `is_approved`)
+    8. Update `AgentRunLog`: `status=success`, `items_processed`, `items_created`
+  - `async run_cloud_agent(user_id, config: CloudAgentConfig, device_mgr: DeviceConnectionManager)`:
+    1. Check device is online → abort if offline (results must push to Electron)
+    2. Create `AgentRunLog` with `status=running`
+    3. Decrypt OAuth credentials from `config.oauth_token_encrypted`
+    4. Fetch data from cloud provider (Step 3.6):
+       - Gmail: `google-api-python-client` + `filter_config` label/date filters
+       - Teams: `msgraph-sdk` + channel/date filters
+       - Outlook: `msgraph-sdk` + folder/date filters
+    5. For each item: call LLM with `prompt_template` + email/message content → extract structured items
+    6. For each extracted item: send `WsToolCall(insert)` to Electron → await `WsToolResult`
+    7. Update `AgentRunLog`
+  - `async trigger_pending_runs(user_id, device_id, device_mgr)`:
+    - Called when Electron connects (after `device_hello`)
+    - Queries all enabled agent configs that are overdue: the next `schedule_cron` fire time after `last_run_at` is already in the past, or `last_run_at` is null (never run)
+    - For local agents: only triggers if `config.device_id == device_id`
+    - For cloud agents: triggers regardless of device (any connected device can receive results)
+    - Executes runs sequentially (one at a time to avoid overwhelming the WS)
+  - Error handling: on any failure, update `AgentRunLog` with `status=error` + error details
+- **Files:** `app/core/agent_runner.py`
+- **Outcome:** Backend drives all agent execution — both local (via WS file request) and cloud (direct API calls).
+
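Steps 4 and 5 of `run_local_agent` (await `agent_data` frames, then `agent_complete`) could look roughly like this. The sketch assumes the device WS message loop pushes decoded frames as dicts onto an `asyncio.Queue`; that hand-off, the `collect_files` name, and the 30-second default timeout are assumptions, not part of the plan.

```python
import asyncio


async def collect_files(frames: asyncio.Queue, run_id: str, timeout: float = 30.0):
    """Gather files from `agent_data` frames until `agent_complete` arrives."""
    files, errors = [], []
    while True:
        # Raises asyncio.TimeoutError if the device goes quiet for too long.
        frame = await asyncio.wait_for(frames.get(), timeout)
        if frame.get("run_id") != run_id:
            continue  # frame belongs to another run; ignored in this sketch
        if frame.get("type") == "agent_data":
            files.extend(frame["files"])
        elif frame.get("type") == "agent_complete":
            errors.extend(frame.get("errors", []))
            return files, errors


async def demo():
    """Feed two hand-written frames through the collector."""
    queue: asyncio.Queue = asyncio.Queue()
    queue.put_nowait({
        "type": "agent_data",
        "run_id": "r1",
        "files": [{"path": "/tmp/a.txt", "name": "a.txt", "content": "hi", "metadata": {}}],
    })
    queue.put_nowait({"type": "agent_complete", "run_id": "r1", "files_read": 1, "errors": []})
    return await collect_files(queue, "r1")
```

A timeout here would be caught by the orchestrator's error handling and recorded on the `AgentRunLog` as `status=error`.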
+### Step 3.5 — Chatbot Journey endpoint
+- [ ] Create `app/api/routes/agent_setup.py`:
+  - `POST /api/v1/agents/journey/start`:
+    - Body: `{ agent_type: "local"|"cloud", data_types: ["tasks", "notes", ...] }`
+    - Creates a journey session (in-memory or Redis-backed)
+    - Returns first AI message: contextual question based on agent type
+      - Local: "What kind of files are in the directories you want to monitor? (emails, documents, logs, etc.)"
+      - Cloud: "What kind of emails/messages should I look for? (client communications, invoices, meeting notes, etc.)"
+    - Response: `{ session_id, message, done: false }`
+  - `POST /api/v1/agents/journey/message`:
+    - Body: `{ session_id, message }`
+    - AI processes user's answer, asks follow-up questions (max 5 turns)
+    - System prompt: "You are configuring a data extraction agent for a freelancer. Ask about file format, what data to extract (tasks, notes, checkpoints), naming conventions, priority rules, and any special mapping. After 3-5 questions, generate a detailed prompt_template."
+    - When AI determines enough context: `{ session_id, message: "Here's your configuration...", done: true, prompt_template: "..." }`
+    - The `prompt_template` is a structured instruction for the extraction LLM (e.g. "Extract tasks from email. Subject becomes task title. If body contains 'urgent' or 'ASAP', set priority to 'high'. Extract due dates if mentioned.")
+- **Files:** `app/api/routes/agent_setup.py`, `app/main.py`
+- **Outcome:** Users configure AI prompts through guided conversation, not manual text editing.
+
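The request/response shapes of the two journey endpoints can be sketched with an in-memory session store. This is a simplification: the plan has the LLM decide when it has enough context (after 3-5 questions), whereas this sketch always stops at the fifth answer and uses placeholder text where the LLM would write the follow-up question and the final `prompt_template`. `SESSIONS`, `JourneySession`, `start_journey`, and `journey_message` are hypothetical names.

```python
import uuid
from dataclasses import dataclass, field

MAX_TURNS = 5  # "max 5 turns" from the step above

FIRST_QUESTION = {
    "local": "What kind of files are in the directories you want to monitor?",
    "cloud": "What kind of emails/messages should I look for?",
}


@dataclass
class JourneySession:
    agent_type: str
    data_types: list[str]
    answers: list[str] = field(default_factory=list)


SESSIONS: dict[str, JourneySession] = {}  # in-memory; Redis is the alternative


def start_journey(agent_type: str, data_types: list[str]) -> dict:
    session_id = str(uuid.uuid4())
    SESSIONS[session_id] = JourneySession(agent_type, data_types)
    return {"session_id": session_id, "message": FIRST_QUESTION[agent_type], "done": False}


def journey_message(session_id: str, message: str) -> dict:
    session = SESSIONS[session_id]
    session.answers.append(message)
    if len(session.answers) < MAX_TURNS:
        # Placeholder: the real follow-up question would come from the LLM.
        return {"session_id": session_id, "message": "Tell me more.", "done": False}
    # Placeholder template; the plan has the LLM write this from the answers.
    template = "Extract " + ", ".join(session.data_types) + ". Notes: " + " | ".join(session.answers)
    del SESSIONS[session_id]
    return {
        "session_id": session_id,
        "message": "Here's your configuration...",
        "done": True,
        "prompt_template": template,
    }
```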
+### Step 3.6 — Cloud provider integrations
+- [ ] Create `app/integrations/gmail.py`:
+  - `GmailClient`:
+    - `__init__(oauth_token)` — initializes Google API client
+    - `async fetch_messages(filter_config, since: datetime) -> list[EmailMessage]`
+    - `EmailMessage`: `{ id, subject, sender, body_text, date, labels }`
+    - Handles token refresh via Google OAuth2 refresh flow
+    - Respects `filter_config.labels`, `filter_config.date_range`, `filter_config.senders`
+- [ ] Create `app/integrations/ms_graph.py`:
+  - `MSGraphClient`:
+    - `__init__(oauth_token)` — initializes MS Graph client
+    - `async fetch_emails(filter_config, since: datetime) -> list[EmailMessage]` (Outlook)
+    - `async fetch_messages(filter_config, since: datetime) -> list[ChatMessage]` (Teams)
+    - `ChatMessage`: `{ id, content, sender, channel, date }`
+    - Handles token refresh via MSAL
+- [ ] Create `app/integrations/__init__.py` — factory: `get_provider(provider_name) -> GmailClient | MSGraphClient`
+- **Dependencies:** `google-api-python-client`, `google-auth-oauthlib`, `msgraph-sdk`, `msal`
+- **Files:** `app/integrations/gmail.py`, `app/integrations/ms_graph.py`, `app/integrations/__init__.py`
+- **Outcome:** Backend can fetch emails/messages from Gmail, Outlook, and Teams.
+
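The `EmailMessage` shape and the `filter_config` handling can be illustrated without any provider SDK. The sketch below applies the filters client-side, under assumed semantics: an empty or missing list means "no restriction", and a message matches if it carries any of the listed labels. A real client would more likely push these filters into the provider query. `matches_filter` is a hypothetical helper, and `date_range` is left out for brevity.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class EmailMessage:
    id: str
    subject: str
    sender: str
    body_text: str
    date: datetime
    labels: list[str] = field(default_factory=list)


def matches_filter(msg: EmailMessage, filter_config: dict, since: datetime) -> bool:
    """Check one message against `since`, `labels`, and `senders`."""
    if msg.date < since:
        return False
    labels = filter_config.get("labels") or []
    if labels and not set(labels) & set(msg.labels):
        return False  # none of the wanted labels is present
    senders = filter_config.get("senders") or []
    if senders and msg.sender not in senders:
        return False
    return True
```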
+### Step 3.7 — Agent scheduler
+- [ ] Create `app/core/agent_scheduler.py`:
+  - Uses `APScheduler` (or simple asyncio loop) to check agent schedules
+  - Every 60s: query enabled agents that are due, i.e. the next `schedule_cron` fire time after `last_run_at` has passed (a null `last_run_at` counts as due)
+  - For each due agent:
+    - Check if user's device is online via `DeviceConnectionManager`
+    - If online: dispatch to `agent_runner`
+    - If offline: skip (will trigger on next `device_hello`)
+  - Locks: use PostgreSQL advisory locks to prevent duplicate runs in multi-instance deployments
+- [ ] Integrate with FastAPI lifespan (start scheduler on app startup, shutdown gracefully)
+- **Dependencies:** `apscheduler>=4.0`
+- **Files:** `app/core/agent_scheduler.py`, `app/main.py`
+- **Outcome:** Agents run automatically on their configured schedules.
+
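Two pieces of the scheduler loop lend themselves to a small sketch: the due check and the advisory lock key. The due check below takes a fixed `timedelta` instead of parsing `schedule_cron`, which is a deliberate simplification (a cron library would supply the real next fire time). PostgreSQL advisory lock functions such as `pg_try_advisory_lock` accept a 64-bit integer key, so the second helper derives one from the agent id. Both function names are hypothetical.

```python
import hashlib
from datetime import datetime, timedelta
from typing import Optional


def is_due(last_run_at: Optional[datetime], interval: timedelta, now: datetime) -> bool:
    """An agent that has never run (last_run_at is None) is treated as due."""
    if last_run_at is None:
        return True
    return last_run_at + interval <= now


def advisory_lock_key(agent_id: str) -> int:
    """Map an agent id to a signed 64-bit integer usable as an advisory lock key."""
    digest = hashlib.sha256(agent_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big", signed=True)
```

An instance that fails to obtain the lock for a given key would skip that agent, leaving the run to whichever instance holds the lock.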
+### Step 3.8 — OAuth flow endpoints
+- [ ] Create `app/api/routes/oauth.py`:
+  - `GET /api/v1/oauth/{provider}/authorize` — returns OAuth authorization URL
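+    - Gmail: Google OAuth2 with `gmail.readonly` scope and `access_type=offline` (needed to receive a refresh token)
+    - Outlook/Teams: MS identity platform with `Mail.Read`, `ChannelMessage.Read.All` scopes, plus `offline_access` so that a refresh token is issued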
+  - `GET /api/v1/oauth/{provider}/callback` — handles OAuth redirect
+    - Exchanges auth code for access + refresh tokens
+    - Encrypts tokens with Fernet (server-side key from settings)
+    - Returns encrypted token blob for storage in `CloudAgentConfig.oauth_token_encrypted`
+  - `POST /api/v1/oauth/{provider}/refresh` — refresh expired OAuth token
+- **Files:** `app/api/routes/oauth.py`, `app/main.py`
+- **Outcome:** Users can connect Gmail/Teams/Outlook accounts securely.
+
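For the Gmail `authorize` endpoint, the returned URL could be assembled as below. This is a stdlib-only sketch; in practice `google-auth-oauthlib` (listed in the dependencies) would probably build it. The endpoint and scope URLs are Google's published values, while `build_gmail_authorize_url` and the idea of returning the `state` value alongside the URL are assumptions.

```python
import secrets
from urllib.parse import urlencode

GOOGLE_AUTH_ENDPOINT = "https://accounts.google.com/o/oauth2/v2/auth"
GMAIL_READONLY_SCOPE = "https://www.googleapis.com/auth/gmail.readonly"


def build_gmail_authorize_url(client_id: str, redirect_uri: str) -> tuple[str, str]:
    """Return (url, state); the caller must store `state` to verify the callback."""
    state = secrets.token_urlsafe(16)  # random value to protect against CSRF
    params = {
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "response_type": "code",
        "scope": GMAIL_READONLY_SCOPE,
        "access_type": "offline",  # ask Google for a refresh token
        "state": state,
    }
    return f"{GOOGLE_AUTH_ENDPOINT}?{urlencode(params)}", state
```

The callback handler would compare the `state` query parameter it receives against the stored value before exchanging the auth code for tokens.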
+---
+
+### Phase 3 — Verification
+
+| # | Scenario | Expected |
+|---|---|---|
+| 1 | **Agent CRUD** | Create/read/update/delete local and cloud configs; tier limits enforced (free=2, pro=10) |
+| 2 | **WS device connect** | Electron connects → `device_hello` → backend stores connection → triggers overdue runs |
+| 3 | **Local agent run** | Backend sends `agent_run` → Electron reads files → `agent_data` → backend AI extracts → `tool_call(insert)` → Electron persists with `isAiSuggested=1` |
+| 4 | **Cloud agent run** | Backend fetches Gmail → AI extracts tasks → `tool_call(insert)` → Electron persists |
+| 5 | **Device binding** | Local agent config with `device_id=A` only triggers when device A is connected |
+| 6 | **Chatbot Journey** | Start journey → 3-5 Q&A turns → produces valid `prompt_template` |
+| 7 | **Schedule** | Agent with `schedule_cron="0 */6 * * *"` runs every 6h when device is online |
+| 8 | **Offline resilience** | Device offline → runs skipped → device reconnects → overdue runs trigger immediately |
+| 9 | **OAuth flow** | Gmail authorize → callback → token encrypted → stored in config → fetch emails works |
+
+### Phase 3 — New Dependencies
+
+| Package | Purpose |
+|---|---|
+| `google-api-python-client` | Gmail API access |
+| `google-auth-oauthlib` | Gmail OAuth2 flow |
+| `msgraph-sdk` | Outlook + Teams API access |
+| `msal` | MS identity platform auth |
+| `apscheduler>=4.0` | Agent scheduling |
+| `cryptography` (Fernet) | OAuth token encryption at rest |
 - **One step at a time.** Mark `[x]` and commit with `step N.N complete: <outcome>`.