Commit Graph

41 Commits

Author SHA1 Message Date
Roberto Musso
1ce1d492b0 Add Langfuse observability: traces, prompt management, prompt-to-generation linking
- New app/core/langfuse_client.py: lazy singleton client, get_prompt_or_fallback()
  helper (returns raw template + prompt obj for linking), extract_usage() for token
  counts. No-ops when LANGFUSE_* env vars are not set.
- deep_agent.py: home-agent and floating-agent runs wrapped in spans; each ainvoke
  wrapped in a generation with model/input/output/usage; prompts fetched from
  Langfuse (adiuva-home-agent, adiuva-floating-agent, adiuva-floating-classifier)
  with hardcoded fallback.
- agent_runner.py: step1-classifier and step2-processor LLM calls traced; batch
  agent _run_agent_with_tools spans + generations; cloud-processor included.
  Prompts: adiuva-step1-classifier, adiuva-step2-processor, adiuva-cloud-processor.
- agent_setup.py: journey-setup span + generation per ainvoke; prompt_obj stored
  on JourneySession and reused across turns. Prompt: journey_system.
- settings.py: LANGFUSE_SECRET_KEY, LANGFUSE_PUBLIC_KEY, LANGFUSE_HOST added.
- .env.example: Langfuse section with EU/US/self-hosted host comments.
- requirements.txt: langfuse>=2.0.0.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-07 00:19:20 +02:00
Roberto Musso
552b8eb305 Fix project creation: code-based in runner, not delegated to Step 2 LLM
Root causes fixed:
1. PROJECT_TOOLS removed from Step 2 tool set — project assignment is now
   exclusively handled by the runner in code, never by the LLM.
2. When Step 1 returns "new", runner calls execute_on_client insert/projects
   directly (before Step 2), gets the created id, and passes it as context.
3. Newly created projects are appended to the local `projects` list so that
   subsequent files in the same run can match to them via Step 1 — prevents
   one project per file when multiple files share the same topic.

Also add tests/test_classify_file.py with pytest cases for _classify_file
and a CLI runner: python -m tests.test_classify_file <file> [project...]

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-21 23:40:38 +01:00
Roberto Musso
e7cdce8287 Improve Step 1 project matching and Step 2 update-first enforcement
- Rewrite _STEP1_SYSTEM_PROMPT: lower matching threshold (no longer requires
  "clear" match), strongly prefer existing projects over creating new ones,
  use structured id=|name=|status= format with aiSummary for richer context
- Add code-level UUID validation: reject hallucinated ids not in the fetched
  projects list, fall back to "new" instead of creating a bad link
- Rewrite _PROCESSING_SYSTEM_PROMPT: enforce explicit scan-before-create
  process (read existing → search → update if found → create only if not)
  with hard rule against calling create_* without checking existing records

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-20 23:45:29 +01:00
Roberto Musso
58bc6efd4b Rewrite run_local_agent: code-based flow, concurrency guard, remove isApproved
- Replace LLM-driven triage with code-based directory scan and project fetch
- Two-step LLM approach: Step 1 classifies file→project+domains, Step 2 processes with tools
- Add domain descriptions to Step 1 prompt for better extraction accuracy
- Add _running_agents set for per-agent concurrency guard (one running instance per agent)
- Return 409 from route before DB write when agent already running
- Remove is_approved from task_agent create/update tools and system prompt
- Remove is_approved from timeline_agent create/update tools and system prompt

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-20 22:21:30 +01:00
Roberto Musso
725cece5c1 Add run_context to agent tool calls for FE run logging
- AgentTriggerRequest accepts optional agent_id (FE's stable electron-store UUID)
- _make_agent_executor injects run_context into every tool_call frame
  so Electron can attribute actions to the correct agent run
- run_local_agent accepts run_context and sends a run_complete WS frame
  when the run finishes so the FE can close the run record
- trigger_agent_run builds run_context with run_id=run_log.id and the
  stable agent_id, passes it through to run_local_agent

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-20 09:46:17 +01:00
Roberto Musso
5a03bd1cfb Clean up agent catalog and improve extraction agent prompts
- Remove unused config_schema from AgentCatalogItem (schema + route)
- Fix agent_setup system prompt: add extraction agent base behaviour
  context so journey LLM knows what is already handled and focuses on
  field mappings only; remove redundant data-types question (already
  known from user selection); derive data types list dynamically
- Rewrite processing base prompt to use actual tool names
  (list_tasks, update_task, add_task_comment, list_notes, update_note,
  list_timelines, update_timeline, list_all_projects, create_project)
  and enforce update-first strategy before falling back to creation

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-17 23:52:54 +01:00
Roberto Musso
826f64d6bb refactor local directory agent to two-phase LLM-with-tools architecture
Replace the single-pass FE-driven agent_run/agent_data flow with a
BE-orchestrated two-phase execution using LangChain tool-calling:
- Phase 1 (Triage): explores directory via new filesystem tools, matches
  files to existing projects using PROJECT_TOOLS
- Phase 2 (Processing): reads files and performs CRUD per project group
  with clean LLM context windows

Key changes:
- Add filesystem_agent.py with list_directory, read_file_content,
  get_file_metadata tools using execute_on_client()
- Move setup journey from REST to WebSocket (journey_start/message frames)
- Add batch_runs_per_day billing limit and enforce in /trigger
- Remove deprecated agent_data/agent_complete frame handlers and queues

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-17 08:50:46 +01:00
5faa6b1d7c refactor agents to client-owned config flow 2026-03-16 22:35:46 +01:00
02a9684cd6 scope episodic memory enrichment by session_id 2026-03-16 00:33:11 +01:00
30b062dd4a fix floating stream empty responses with sanitizer-safe fallbacks 2026-03-13 16:57:30 +01:00
2a0331d7ce refactor floating_domain to structured object-only payload 2026-03-13 16:09:24 +01:00
13fd8677c1 fix: normalize home task/timeline responses to tag-only lines 2026-03-13 12:16:58 +01:00
9bd629cb59 chore: add interaction tracing and remove personal fields from logs 2026-03-13 10:23:47 +01:00
9c97702daa feat: add letta-style memory tools with request/user debug tracing 2026-03-13 09:34:23 +01:00
a1e364c9c0 refactor: switch to single-agent deep runner and add mock memory/tool tests 2026-03-13 08:20:42 +01:00
5b55f1292a make a single agent 2026-03-13 07:42:36 +01:00
5bc9ea6cd6 fix: make planner schema copilot-compatible and silence usage warning 2026-03-12 23:17:31 +01:00
f7404b6f66 refactor: move memory updates from synthesizer to orchestrator node 2026-03-12 23:03:38 +01:00
d667e43c73 refactor: use native LangGraph streaming and enforce structured summary on workers 2026-03-12 22:50:32 +01:00
fe085a7951 feat: migrate chat orchestration to deep langgraph workers 2026-03-12 22:25:36 +01:00
2de67213f8 rename from checkpoint to timeline agent 2026-03-10 23:17:38 +01:00
9332e29e53 bug fix sending component 2026-03-10 09:11:24 +01:00
34f01234c9 rename popup chat to floating chat 2026-03-08 22:53:31 +01:00
e6b5bc2e7d step-7: add memory middleware (memory_middleware.py, device_ws.py)
MemoryMiddleware class:
- enrich_context(): loads core prefs, associative (top-k), episodic (last-N),
  and proactive hints (above 0.6 confidence) — all decrypted in-memory only
- store_episode(): encrypts and persists interaction summary to memory_episodic
- update_core(): upserts encrypted key/value to memory_core

device_ws.py home_request + popup_request handlers:
- enrich_context() called before orchestrate_v3_stream (memory injected into context)
- store_episode() called after stream completes (non-blocking)

10 unit + integration tests pass; pre-existing test_agents.py failures unrelated.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-08 22:14:28 +01:00
393b3befd6 step-4: add output formatting layer (output_formatter.py)
HomeFormatter parses JSON block stream from orchestrator tokens and emits
stream_start / stream_text / stream_block / stream_end frames.
PopupFormatter emits popup_domain then plain stream_text.
All 13 unit tests pass.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-08 21:51:20 +01:00
2c08275934 step-3: add router refactor with streaming support (orchestrator.py)
- orchestrate_v3(user_id, message, context): classifies intent, returns
  (agent_name, agent_instance) — caller drives execution
- orchestrate_v3_stream(user_id, message, context): yields (agent_name, token)
  pairs; first yield is always (agent_name, "") as a domain-detection signal
- ChatAgent.handle_stream(): default implementation yields handle() result as
  one chunk; subclasses override for true token-level streaming
- Fix stale test_orchestrator.py assertions that expected a JSON final frame
  that orchestrate_stream never emitted

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-08 21:42:46 +01:00
7cb384fa63 step-2: add agent streaming and tool result capture (agent_registry.py)
- ChatAgent.__init__: adds tool_results: list[dict] = []
- _tool_loop: wraps execution in a result collector; populates
  self.tool_results with raw execute_on_client dicts after each run
- _tool_loop_stream: streaming variant — uses ainvoke for tool-call
  iterations, llm.astream() for the final answer; same result capture
- ws_context.py: adds _tool_result_collector ContextVar +
  set/clear helpers; execute_on_client appends to collector when set

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-08 21:37:15 +01:00
ac71d99f9a add cerebras models 2026-03-08 00:53:25 +01:00
a775a2da18 feat(step-3.6): cloud provider integrations (Gmail, Outlook, Teams)
- Add app/integrations/__init__.py: Fernet token encryption helpers,
  EmailMessage/ChatMessage dataclasses, get_provider() factory
- Add app/integrations/gmail.py: GmailClient with async fetch_messages(),
  token refresh, configurable label/sender/date filters
- Add app/integrations/ms_graph.py: MSGraphClient with fetch_emails()
  (Outlook) and fetch_messages() (Teams), MSAL token refresh, OData filters
- Update app/core/agent_runner.py: replace run_cloud_agent() stub with
  full 8-step implementation; extend _finalize_run() for cloud config type
- Update app/config/settings.py: add OAuth + Fernet encryption settings
- Update requirements.txt: google-api-python-client, google-auth-*,
  msal, cryptography
- Add tests/test_integrations.py: 47 tests covering all integration code
- Update tests/test_agent_runner.py: replace stub test with 7 real tests

All 76 new/updated tests pass.
2026-03-05 18:05:07 +01:00
914f70bd85 step 3.4 complete: agent run orchestrator — local/cloud runner + trigger_pending_runs + 23 tests 2026-03-05 16:13:21 +01:00
608d6c784f step 3.3 complete: device WS endpoint + DeviceConnectionManager 2026-03-05 15:51:58 +01:00
6d9a16e513 steps B.3/B.4/B.5 complete: bidirectional WS handler, _tool_loop verified, clean final frame 2026-03-05 00:06:11 +01:00
27c087d5d8 step B.2 complete: all 23 tools use execute_on_client(); add embed() to llm 2026-03-05 00:03:01 +01:00
rmusso
4d7fd519c5 step B.1 complete: WS context + frame schemas 2026-03-04 23:59:31 +01:00
314780d59a Add LLM configuration options and update deployment workflow
- Introduced new API keys for Anthropic and Google in .env.example and settings.py
- Updated llm.py to retrieve API keys directly from settings
- Modified deploy.yaml to streamline code checkout and improve deployment process
2026-03-03 16:52:56 +01:00
8bfce9da00 Refactor LLM instantiation across agents and orchestrator
- Replaced direct instantiation of ChatOpenAI with a centralized get_llm function in CheckpointAgent, NoteAgent, ProjectAgent, and TaskAgent.
- Introduced a new llm.py module to handle LLM model instantiation and API key management.
- Updated settings.py to include LLM_MODEL and LLM_ROUTER_MODEL configurations.
- Modified orchestrator.py to use get_router_llm for intent classification.
- Updated requirements.txt to include litellm for LLM management.
- Adjusted tests to mock get_llm instead of ChatOpenAI directly.
2026-03-03 15:46:44 +01:00
c8ef7b119b Refactor tests for execution plan and add comprehensive storage tests
- Updated `TestModuleSingletons` in `test_execution_plan.py` to reflect new agent templates and playbook names.
- Changed assertions in playbook tests to match updated templates and agents.
- Introduced `test_storage.py` to cover the storage layer, including encryption, BlobStore, and VectorStore functionalities.
- Added tests for S3 interactions, ensuring upload, download, delete, and list operations work as expected.
- Implemented mock tests for Pinecone and Qdrant vector stores to validate upsert, search, and delete operations.
2026-03-02 15:36:09 +01:00
14d1a7351d step 5 complete: execution plan builder, template registry, and LRU plan cache
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-02 13:13:02 +01:00
68955d2fc2 step 4 complete: intelligent routing with single-agent and pipeline modes
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-02 13:03:54 +01:00
0d16729036 step 3 complete: pluggable agent framework
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-02 00:03:42 +01:00
4d0917f5df step 1 complete: runnable FastAPI skeleton
- Full directory structure with all __init__.py stubs
- requirements.txt with all pinned dependencies
- app/config/settings.py (BaseSettings, env-based)
- app/main.py (CORS, lifespan, /api/v1/health)
- Dockerfile (multi-stage, Python 3.12-slim, non-root user)
- docker-compose.yml (app + postgres:16 with healthcheck)
- .env.example
- BACKEND_PLAN.md: mark step 1 done, add one-step-at-a-time rule

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-01 23:51:37 +01:00