# Dedicated Brief Agent (Home + Project) > Ralph-loop plan. Execute one phase per iteration. Each phase is self-contained: > its **Files**, **Tasks**, **Acceptance**, and **Verify** blocks are everything > the agent needs to finish that phase without re-reading earlier phases. > Mark `- [x]` as you complete tasks. Do not start phase N+1 until phase N's > **Acceptance** is fully met. ## Environment All Python commands in this plan (`pytest`, `python`, `ruff`, `alembic`) must be run inside the `api/` project's virtualenv at `api/.venv`. - **bash / WSL / macOS / Linux**: `source api/.venv/bin/activate` (or prefix commands with `api/.venv/bin/python -m ...`) - **Windows PowerShell**: `api\.venv\Scripts\Activate.ps1` - **Windows bash shell (this repo's default)**: `source api/.venv/Scripts/activate` Do **not** use the system Python or a globally installed `pytest`/`ruff` — dependencies are pinned inside the venv. Every `Verify` step below assumes the venv is active. --- ## Context Today the daily brief is produced by the `home-agent` with a prompt stuffed into `sendHomeRequest()` at [adiuvAI/src/main/ai/orchestrator.ts:160](adiuvAI/src/main/ai/orchestrator.ts#L160). This couples two very different jobs (chat vs summarisation) into one agent and one prompt. Worse: the home agent is wired to emit XML tag wrappers (``, ``, ``) for the UI component renderer — wrappers the brief does not want. We filter them out in post, but the LLM still pays tokens for them and sometimes leaks malformed tags. We want a **dedicated brief agent** that: - Runs in **two modes** — `home` (daily brief) and `project` (per-project status brief). - Produces **plain text only** — no XML/HTML tag wrappers, no bracketed id lists. - Uses the **same infra pattern** as the other agents: `get_agent_llm(...)`, `.env` override, Langfuse prompt via `get_prompt_or_fallback()`, Langfuse tracing via `langfuse_context` + generation observations. - Is **memory-aware** — core memory and relational memory are injected into the system prompt so the brief can say "Client X usually pays late — your invoice is still out" instead of a generic list. - Is **read-only** — no create/update/delete tools. Tool surface is the minimum needed to answer "what needs attention right now?". --- ## Architecture ``` Electron ─ WsFrame{type:"brief_request", mode:"home"|"project", project_id?} ─► device_ws │ ▼ core/brief_agent.py ├── run_home_brief() └── run_project_brief() │ read-only tool subset │ (tasks, projects, notes, │ timelines, memory get) │ ▼ plain-text stream ▲ back to renderer ─┘ ``` **Key decisions:** - New WS frame type `brief_request` — *not* a reuse of `home_request` — so the frame payload stays small and typed, and the server can pick the right agent without sniffing the prompt. - Read-only tools only. Give the LLM access to the same data the UI sees (tasks, projects, notes, timelines, memory **get**). No mutating tools, no memory-write tools — a brief should never change state. - `resolved_project_id` is passed explicitly in the request payload for the `project` mode (no LLM-side resolution). The `home` mode omits it. - Both modes stream — the UI already has streaming rendering for the home brief; we reuse the same pattern for the project brief card. --- ## Improved prompts The current prompt tries to do everything in one paragraph. These split the job into role, data rules, voice, and output contract — the structure the model actually follows. ### `home_brief` (Langfuse prompt, label `production`) ``` You are the user's personal assistant producing a short daily brief. ROLE Act like a calm, attentive secretary writing a stand-up note for your boss. Warm and human, never breezy. Never cheerful filler, never emojis, never "here is your brief" meta-text. The user is opening the app mid-workday and is probably stressed — your job is to lower cognitive load, not add noise. TOOLS — always call before writing Pull fresh data every run. Do not invent counts or titles. Use at minimum: - list_tasks_due_today — tasks the user owes today - list_timeline_events_today — events starting or ending today - list_active_projects — projects currently in progress or at risk - memory_list_blocks / memory_get — personal context about people, clients, payment habits, working preferences If a tool returns nothing, simply omit that topic. Never report zeros. WHAT TO INCLUDE 1. Tasks due today (title + priority; group the 1–2 most important). 2. Timeline events starting or ending today (and anything that starts/ends tomorrow if the user has a very light day). 3. Active projects that need a nudge — stalled, blocked, or awaiting input. 4. Memory-aware colour where it sharpens the brief. Examples: - "Client Rossi tends to pay late — the Acme invoice is 6 days out." - "You usually dislike meetings before 10:00 — the call at 09:30 is unusual." Only add a memory line when it changes what the user does. Do not pad. WHAT TO OMIT - Zero-counts ("no overdue items", "0 meetings today"). - Statistics ("2 active projects, 3 completed tasks"). - Headers, titles, greetings, sign-offs, dates, emojis, slang. - Meta-phrases ("here is", "let me know if", "hope this helps"). - XML/HTML tags of any kind. Plain prose only. LIGHT-DAY CLAUSE If tasks + events + active-project-nudges together produce fewer than two sentences of content, also list 1–2 projects in status `on_hold` or `waiting` and ask a single, specific question about them — e.g. "Is the Bianchi redesign still paused, or ready to pick back up?" One question max, grounded in a real project name. VOICE - Calm. Concise. Human. Short sentences. - Use **bold** sparingly for task titles, project names, and people's names. - No bullet lists. Flow as 2–4 sentences of prose. LENGTH 2–4 sentences total. Hard cap 4. If the day is truly empty, one sentence. Respond in the user's language ({{language}}). Today is {{today}}. ``` Variables: `{{language}}` (e.g. "Italian"), `{{today}}` (ISO date). ### `project_brief` (Langfuse prompt, label `production`) ``` You are the project assistant producing a short status brief for ONE project. ROLE A senior project manager summarising state-of-play for the owner. Factual, sharp, forward-looking. Never reassuring filler, never emojis. SCOPE Work only with project_id = {{project_id}}. Do not mention or pull data from other projects. Use tools to fetch fresh data: - get_project — current status, dates, description - list_tasks(project_id) — open work, split by status - list_timeline_events(project_id) — milestones hit, upcoming, overdue - list_project_notes(project_id) — any recent decisions or blockers - memory_get — relevant context about the client, collaborators, constraints STRUCTURE — follow exactly, one short paragraph per section, no headers 1. **State.** One sentence: current phase, health (on track / at risk / blocked), and why. Cite the concrete signal (overdue milestone, stalled tasks, recent blocker note). 2. **What's moving.** What was completed or progressed recently. Name specific tasks or milestones. 3. **Next steps.** The 1–3 most important things the user should do next, in priority order. Be concrete — task name, who owns it, when due if known. If waiting on someone else, name them and what the ask is. 4. **Risks / memory-flagged items.** One line max. Only include when there is a real risk or a relevant memory (e.g. late-paying client, tight deadline, scope change). Omit the section entirely if nothing to say. WHAT TO OMIT - Zero-counts ("no overdue tasks"). - Generic advice ("keep up the good work"). - Greetings, headers, bullet lists, emojis, sign-offs, meta-phrases. - XML/HTML tags or bracketed id lists. Plain prose only. VOICE - Direct. Factual. No fluff. - Use **bold** sparingly for task titles, milestone names, and the owner's name. - Short sentences. Prefer verbs over nouns ("Client review is blocking release" not "There is a blocker which is the client review"). LENGTH 4–8 sentences total across the 3–4 sections. Hard cap 8. Respond in the user's language ({{language}}). Today is {{today}}. ``` Variables: `{{project_id}}`, `{{language}}`, `{{today}}`. --- ## Phase 1 — Backend config scaffolding **Goal:** `LLM_MODEL_BRIEF_AGENT` resolvable via `get_agent_llm("brief-agent")`. **Files** - `api/app/config/settings.py` - `api/app/core/llm.py` - `api/.env.example` **Tasks** - [ ] Add field `LLM_MODEL_BRIEF_AGENT: str = ""` to `Settings` after `LLM_MODEL_CLOUD_PROCESSOR`. - [ ] Add `"brief-agent": lambda: settings.LLM_MODEL_BRIEF_AGENT or settings.LLM_MODEL` entry to `_AGENT_MODEL_SETTINGS` in `llm.py`. - [ ] Add a commented-out `LLM_MODEL_BRIEF_AGENT=` block in `.env.example`, with a 2-line description mirroring the existing style ("Brief-agent — produces home and project text briefs. A small model (e.g. gpt-4o-mini) is sufficient."). **Acceptance** - `python -c "from app.core.llm import model_for_agent; print(model_for_agent('brief-agent'))"` prints the default model (matches `LLM_MODEL`) when the override is empty; prints the override when set. - `ruff check .` passes. **Verify** - `cd api && source .venv/Scripts/activate && python -c "from app.core.llm import model_for_agent; print(model_for_agent('brief-agent'))"` - `cd api && source .venv/Scripts/activate && ruff check .` --- ## Phase 2 — Brief-agent module (read-only tool subset) **Goal:** `run_home_brief()` and `run_project_brief()` callables exist and work end-to-end against a live backend, producing plain-text streams. No WS wiring yet — exercised via a `scripts/smoke_brief.py` one-liner. **Files (new)** - `api/app/core/brief_agent.py` **Files (touched)** - `api/app/agents/task_agent.py` — export a `TASK_READ_TOOLS` list (`list_tasks`, `list_tasks_due_today`, `list_task_comments`). - `api/app/agents/project_agent.py` — export a `PROJECT_READ_TOOLS` list (`list_projects`, `list_all_projects`, `get_project`). - `api/app/agents/timeline_agent.py` — export a `TIMELINE_READ_TOOLS` list (`list_timelines`, plus a new `list_timelines_today` that filters by today — add it alongside the existing tools) and a `list_timeline_events` alias scoped by `project_id`. - `api/app/agents/note_agent.py` — export a `NOTE_READ_TOOLS` list (`list_notes`, `get_note`). **Tasks** - [x] Add the four `*_READ_TOOLS` exports in the agent files. Do not remove the existing `*_TOOLS` exports — the chat agents still use them. - [x] Add `list_timelines_today` in `timeline_agent.py`: returns only timelines whose `date` falls on today (UTC). Mirror the shape of `list_tasks_due_today`. - [x] Create `brief_agent.py` with: - Module-level fallback prompt constants `_HOME_BRIEF_FALLBACK` and `_PROJECT_BRIEF_FALLBACK` — copy the prompts from the plan above verbatim, using `{language}` / `{today}` / `{project_id}` (single-brace) so `.format()` works when Langfuse is unavailable. - Read-only memory tools subset: reuse `_memory_tools()` from `deep_agent.py` but filter to `memory_list_blocks`, `memory_get`, `archival_memory_search`, `conversation_search`. Factor out a small helper `_read_only_memory_tools()` in `deep_agent.py` (or duplicate locally — keep it simple). - `async def run_home_brief(user_id, context) -> AsyncGenerator[tuple[str, Any], None]` - `async def run_project_brief(user_id, project_id, context) -> AsyncGenerator[tuple[str, Any], None]` - Both reuse `_run_single_agent_stream` from `deep_agent.py` by passing `agent_name="brief-agent"` and the relevant prompt. Tool list is the read-only subset. - Inject `_language_instruction`, `_relational_memory_injection`, and `_proactive_hints_injection` into the system prompt — same pattern as `run_home_stream`. - After rendering the system prompt with `compile_prompt`, append a line `"\nToday is YYYY-MM-DD."` only if the Langfuse template did not already include `{{today}}` substitution (safe fallback). - [x] Do **not** call `_normalize_tagged_list_lines` on the output — the brief prompt forbids tags, so skipping the post-processor is a deliberate signal of correctness. **Acceptance** - Importing `from app.core.brief_agent import run_home_brief, run_project_brief` succeeds. - A smoke script `scripts/smoke_brief.py` (create it; git-ignore it; OK to delete afterwards) runs `run_home_brief` against a seeded test user and streams text to stdout. Output contains no `<` or `[uuid]` substrings. - `ruff check .` passes. **Verify** - `cd api && source .venv/Scripts/activate && python scripts/smoke_brief.py home` - `cd api && source .venv/Scripts/activate && python scripts/smoke_brief.py project ` - `cd api && source .venv/Scripts/activate && ruff check .` --- ## Phase 3 — WS frame + REST fallback **Goal:** Electron can send `{type:"brief_request", mode, project_id?}` over the device WS and receive a plain-text stream. REST `POST /chat/brief` exists as fallback. **Files** - `api/app/schemas.py` - `api/app/api/routes/device_ws.py` - `api/app/api/routes/chat.py` **Tasks** - [ ] In `schemas.py`: add `brief_request = "brief_request"` to `WsFrameType`, and a `WsBriefRequest` model with fields `type: Literal[WsFrameType.brief_request]`, `request_id: str | None`, `session_id: str | None`, `mode: Literal["home", "project"]`, `project_id: str | None`. - [ ] In `device_ws.py`: add an `elif frame_type == WsFrameType.brief_request:` branch that dispatches to a new `_handle_brief_request` task. - [ ] Implement `_handle_brief_request` by mirroring `_handle_home_request` but: - Call `run_home_brief(user_id, context)` when `mode == "home"`, `run_project_brief(user_id, project_id, context)` when `mode == "project"` (validate `project_id` is a UUID; send `stream_end` with error frame otherwise). - **Skip** episode storage — briefs are not conversations. - Still run `memory.enrich_context(...)` so relational/proactive memory is injected. - [ ] In `chat.py`: add `POST /chat/brief` that accepts `{mode, project_id?}` and returns the full text (collects stream). This is the offline fallback path used when the WS is not ready. **Acceptance** - Electron smoke client opens the WS, sends a `brief_request` with `mode:"home"`, and receives `stream_start` → N × `stream_text` → `stream_end` frames. - `POST /chat/brief` returns `{response: "..."}`. - Malformed `project_id` → WS frame `stream_end` with an error message (no server crash). **Verify** - Run pytest existing suite: `cd api && source .venv/Scripts/activate && pytest -q`. - Add one unit test `tests/test_brief_agent.py` covering: home mode returns non-empty text; project mode with bogus UUID returns an error without crashing; tools called are from the read-only subset (monkeypatch `run_home_brief` to assert the tool list). - Then: `cd api && source .venv/Scripts/activate && pytest tests/test_brief_agent.py -v`. --- ## Phase 4 — Langfuse prompts **Goal:** `home_brief` and `project_brief` prompts exist in Langfuse at label `production`, matching the content in this plan. **Files** - None (external config via MCP). **Tasks** - [ ] Use `mcp__langfuse-docs__searchLangfuseDocs` to confirm the text-prompt variable syntax (`{{variable}}`) and that `label="production"` is the label read by `get_prompt_or_fallback`. - [ ] Use `mcp__langfuse__createTextPrompt` to create `home_brief` with the content from the "Improved prompts → home_brief" section above. Set label to `production`. Variables: `language`, `today`. - [ ] Use `mcp__langfuse__createTextPrompt` to create `project_brief` with the content from the "Improved prompts → project_brief" section above. Set label to `production`. Variables: `language`, `today`, `project_id`. - [ ] Use `mcp__langfuse__getPrompt` to round-trip both prompts and verify the raw template matches what was sent. **Acceptance** - Both prompts resolve via `get_prompt_or_fallback("home_brief", "")` and `get_prompt_or_fallback("project_brief", "")` in a Python shell against the real Langfuse instance — return a non-empty `raw_template` and a non-None `prompt_obj`. - `prompt_obj.compile(language="Italian", today="2026-04-17")` returns text containing the Italian directive and the date. **Verify** - `cd api && source .venv/Scripts/activate && python -c "from app.core.langfuse_client import get_prompt_or_fallback; t,p = get_prompt_or_fallback('home_brief', ''); print(len(t), p is not None)"` --- ## Phase 5 — Electron client: home brief uses new agent **Goal:** The existing home-brief UI flow (toast + full card) calls the new brief agent over WS, and the `DAILY_BRIEF_PROMPT` constant is deleted. **Files** - `adiuvAI/src/shared/api-types.ts` (or wherever WS types live) - `adiuvAI/src/main/api/backend-client.ts` - `adiuvAI/src/main/ai/orchestrator.ts` - `adiuvAI/src/main/router/index.ts` **Tasks** - [ ] Add `WsBriefRequest` frame shape to shared types, mirroring the API `WsBriefRequest` schema. - [ ] In `backend-client.ts`, add `sendBriefRequest(mode, projectId?, callbacks, requestId?)` modeled on `sendHomeRequest`. It sends `{type:"brief_request", mode, project_id}`. - [ ] In `orchestrator.ts`: - Delete the `DAILY_BRIEF_PROMPT` constant (and the `langSuffix` hack — the backend now owns language injection). - `generateAndCacheBrief()` → call `client.sendBriefRequest("home", undefined, {...})`. - `dailyBrief()` → call `client.sendBriefRequest("home", undefined, {...}, requestId)`. - [ ] `router/index.ts`: no signature change — only the underlying orchestrator was rewired. Leave `ai.dailyBrief` mutation as-is. **Acceptance** - Launch the Electron app, open Home, brief renders within 10s. No ``/`` markers appear in the output. Italian UI user gets Italian prose. - Grep confirms `DAILY_BRIEF_PROMPT` no longer exists in the repo. **Verify** - `cd adiuvAI && npm run lint` - Manual: Home page renders a fresh brief. Toggle `isHomePage` nav away/back to check the cache path still works. --- ## Phase 6 — Project brief UI card **Goal:** Each project page has a "Brief" card that calls `sendBriefRequest("project", id, ...)` and renders streaming plain text. **Files** - `adiuvAI/src/renderer/components/projects/ProjectDetail.tsx` - `adiuvAI/src/renderer/components/projects/ProjectBriefCard.tsx` (new) - `adiuvAI/src/main/router/index.ts` (add `ai.projectBrief` mutation) - `adiuvAI/src/main/ai/orchestrator.ts` (add `projectBrief(sender, projectId, requestId)`) - Locale files (all 5). **Tasks** - [ ] Add `projectBrief(sender, projectId, requestId)` in `orchestrator.ts` mirroring `dailyBrief` but with `mode:"project"` and no cache (cheap enough to regenerate on demand; add a simple in-memory TTL of 5 minutes keyed by `projectId` only if the UX feels laggy). - [ ] Add `ai.projectBrief` mutation in the tRPC router, input `{projectId: z.string().uuid(), requestId: z.string().optional()}`. - [ ] `ProjectBriefCard`: shadcn Card, Sparkles icon, `text-sm` body. States: `idle` (button "Generate brief") → `streaming` (skeleton + partial text) → `ready` (full text + "Refresh" button). Stream via `window.electronAI.onStreamChunk()` by request id. - [ ] Mount `ProjectBriefCard` at the top of `ProjectDetail`, above existing content. - [ ] Add i18n keys under `projects.brief.*`: `title`, `generate`, `refresh`, `generating`, `error`. Add to all 5 locale files. **Acceptance** - On a project with tasks/timelines, the card streams a coherent 4–8 sentence brief with state / what's moving / next steps sections, no XML tags. - On an empty project (no tasks/timelines), the brief is still coherent and does not hallucinate. - Refresh button produces a fresh generation (new `request_id`). **Verify** - `cd adiuvAI && npm run lint` - Manual: navigate `/projects?projectId=`, click Generate. --- ## Phase 7 — Observability + cleanup **Goal:** The brief agent is visible in Langfuse as its own generation name, and the old hard-coded prompt is fully removed. **Files** - `api/app/core/brief_agent.py` - `adiuvAI/.claude/CLAUDE.md` — document the new agent - `.claude/CLAUDE.md` (root) — add a short line under "api" section **Tasks** - [ ] Verify that `_run_single_agent_stream` uses `agent_name="brief-agent"` so the Langfuse span/generation is named accordingly. Spot-check one trace in the Langfuse UI. - [ ] In `adiuvAI/.claude/CLAUDE.md`, add under the "AI Subsystem" section a new bullet for `brief-agent` in the agents table (Scope: "Daily home brief and per-project status brief", Tools: read-only subset). - [ ] In the root `.claude/CLAUDE.md`, under the `api/` architecture section, add `brief_agent.py` to the "Orchestration" list with a one-line purpose. - [ ] Delete `scripts/smoke_brief.py` if it was committed. **Acceptance** - A Langfuse trace for a home brief shows span `brief-agent-stream` containing a generation `brief-agent-llm` linked to the `home_brief` prompt version. - `rg -n "DAILY_BRIEF_PROMPT" adiuvAI/` returns no matches. - `rg -n "home_brief|project_brief" api/` shows usages only in `brief_agent.py`. **Verify** - Trigger one home brief and one project brief, open Langfuse, confirm traces. --- ## Phase 8 — Regression + doc polish **Goal:** existing home chat behavior unchanged; brief behavior documented for future contributors. **Tasks** - [ ] Open the home chat, send a normal message ("what are my tasks today?"). Response must still include `` tag lines (the chat agent still uses the tag contract; only the brief agent does not). - [ ] Open the floating panel on a task. Response must still be plain text with no tags (existing contract). - [ ] Add a short "Daily Brief" paragraph to the user-facing docs (if any marketing/help doc exists — otherwise skip). **Acceptance** - No regressions in home chat or floating chat. - Plan document (`docs/plan-brief-agent.md`) is marked complete: every phase's task checklist is `- [x]`. **Verify** - Manual QA pass of: home brief, project brief, home chat, floating chat on task, floating chat on project. --- ## Out of scope (explicitly) - Push notifications / proactive brief delivery (already a separate plan). - Weekly / monthly brief variants. - Brief export to PDF / email. - Per-client brief (clients are currently a lightweight table, not a UI page). - Writing tools in the brief agent — it stays read-only. If the user acts on the brief ("create a task for X"), they send that as a normal home-chat message, and the home agent handles it.