adiuvAI/api

Author	SHA1	Message	Date
Roberto Musso	3cf067faea	feat: enhance agent configuration and model management with per-agent overrides	2026-04-10 08:45:14 +02:00
Roberto Musso	41db3a7089	update env variables	2026-04-08 23:52:52 +02:00
Roberto Musso	e672b58b6f	fix(langfuse): remove invalid user_id/session_id kwargs from start_as_current_observation Langfuse V3 does not accept user_id/session_id on observation-level calls. Moved to metadata dict in agent_runner, deep_agent, and agent_setup. refactor(tests): fixture-based pattern for agent_runner_v2 eval tests - cases.yaml + data/ fixtures under tests/fixtures/agent_runner_v2/ - pytest_generate_tests parametrizes test_eval_runner from YAML - _resolve_projects() handles symbolic names and inline dicts - _evaluate_case() centralizes all assertion logic - --runner-dir CLI option for custom fixture folders Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-08 00:45:15 +02:00
Roberto Musso	3aa0b36a6c	fix(langfuse): use compile() instead of .format() for prompt variable injection Langfuse uses {{variable}} syntax in its prompt management UI, while the hardcoded fallbacks use {variable} (Python str.format). The previous code always called .format() which silently failed/errored when a real Langfuse prompt was fetched. - langfuse_client.py: add compile_prompt(template, prompt_obj, vars) → uses prompt_obj.compile(vars) when Langfuse is available → falls back to template.format(**vars) when using the hardcoded fallback - agent_runner.py: replace .format() with compile_prompt() for unified_processing (V2 local) and batch_cloud_processing (cloud agent) - agent_setup.py: replace .format() with compile_prompt() for journey_system deep_agent.py prompts have no variables, so no change needed there. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-07 16:49:26 +02:00
Roberto Musso	fa231a3642	feat(local-agent-v2): step 2+3 — unified runner + AgentConfig schema Step 3 (prerequisite): - app/schemas.py: add ContentTypeConfig + AgentConfig Pydantic models - app/models.py: add agent_config (JSON, nullable) to LocalAgentConfig - alembic migration a3b9c0d1e2f3: ADD COLUMN agent_config Step 2 (runner refactor): - Remove _classify_file() and _BATCH_FILE_CLASSIFIER_PROMPT (LLM classification step) - Add Phase A: detect_content_type + preprocess (zero LLM, per file) - Add _UNIFIED_PROCESSING_PROMPT (hot-swappable via Langfuse "unified_processing") - Add helper functions: _format_projects, _format_metadata, _get_extraction_rules, _get_no_match_behavior - Single LLM call per file with tools (classify + extract + create) - Fix items_created: count create_* tool calls via _tool_calls_out param - test_agent_runner_v2.py: 10 cases (2.1-2.10) with Langfuse eval scoring Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-07 15:00:32 +02:00
Roberto Musso	a2d6d689e4	feat: add preprocessor system (Step 1 — Local Agent V2) - app/core/preprocessors/__init__.py: detect_content_type + preprocess dispatcher - app/core/preprocessors/base.py: PreprocessResult dataclass - app/core/preprocessors/email_html.py: BeautifulSoup HTML stripping, metadata extraction, thread splitting - requirements.txt: add beautifulsoup4 and lxml - tests/test_preprocessors.py: 10 tests with Langfuse scoring (preprocess.* scores) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-07 10:19:02 +02:00
Roberto Musso	aa8bcbf0d8	Refactor system prompt variables for clarity and consistency across agent setup and runner modules	2026-04-07 00:23:41 +02:00
Roberto Musso	1ce1d492b0	Add Langfuse observability: traces, prompt management, prompt-to-generation linking - New app/core/langfuse_client.py: lazy singleton client, get_prompt_or_fallback() helper (returns raw template + prompt obj for linking), extract_usage() for token counts. No-ops when LANGFUSE_* env vars are not set. - deep_agent.py: home-agent and floating-agent runs wrapped in spans; each ainvoke wrapped in a generation with model/input/output/usage; prompts fetched from Langfuse (adiuva-home-agent, adiuva-floating-agent, adiuva-floating-classifier) with hardcoded fallback. - agent_runner.py: step1-classifier and step2-processor LLM calls traced; batch agent _run_agent_with_tools spans + generations; cloud-processor included. Prompts: adiuva-step1-classifier, adiuva-step2-processor, adiuva-cloud-processor. - agent_setup.py: journey-setup span + generation per ainvoke; prompt_obj stored on JourneySession and reused across turns. Prompt: journey_system. - settings.py: LANGFUSE_SECRET_KEY, LANGFUSE_PUBLIC_KEY, LANGFUSE_HOST added. - .env.example: Langfuse section with EU/US/self-hosted host comments. - requirements.txt: langfuse>=2.0.0. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-07 00:19:20 +02:00
Roberto Musso	552b8eb305	Fix project creation: code-based in runner, not delegated to Step 2 LLM Root causes fixed: 1. PROJECT_TOOLS removed from Step 2 tool set — project assignment is now exclusively handled by the runner in code, never by the LLM. 2. When Step 1 returns "new", runner calls execute_on_client insert/projects directly (before Step 2), gets the created id, and passes it as context. 3. Newly created projects are appended to the local `projects` list so that subsequent files in the same run can match to them via Step 1 — prevents one project per file when multiple files share the same topic. Also add tests/test_classify_file.py with pytest cases for _classify_file and a CLI runner: python -m tests.test_classify_file <file> [project...] Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-21 23:40:38 +01:00
Roberto Musso	e7cdce8287	Improve Step 1 project matching and Step 2 update-first enforcement - Rewrite _STEP1_SYSTEM_PROMPT: lower matching threshold (no longer requires "clear" match), strongly prefer existing projects over creating new ones, use structured id=|name=|status= format with aiSummary for richer context - Add code-level UUID validation: reject hallucinated ids not in the fetched projects list, fall back to "new" instead of creating a bad link - Rewrite _PROCESSING_SYSTEM_PROMPT: enforce explicit scan-before-create process (read existing → search → update if found → create only if not) with hard rule against calling create_* without checking existing records Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-20 23:45:29 +01:00
Roberto Musso	58bc6efd4b	Rewrite run_local_agent: code-based flow, concurrency guard, remove isApproved - Replace LLM-driven triage with code-based directory scan and project fetch - Two-step LLM approach: Step 1 classifies file→project+domains, Step 2 processes with tools - Add domain descriptions to Step 1 prompt for better extraction accuracy - Add _running_agents set for per-agent concurrency guard (one running instance per agent) - Return 409 from route before DB write when agent already running - Remove is_approved from task_agent create/update tools and system prompt - Remove is_approved from timeline_agent create/update tools and system prompt Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-20 22:21:30 +01:00
Roberto Musso	725cece5c1	Add run_context to agent tool calls for FE run logging - AgentTriggerRequest accepts optional agent_id (FE's stable electron-store UUID) - _make_agent_executor injects run_context into every tool_call frame so Electron can attribute actions to the correct agent run - run_local_agent accepts run_context and sends a run_complete WS frame when the run finishes so the FE can close the run record - trigger_agent_run builds run_context with run_id=run_log.id and the stable agent_id, passes it through to run_local_agent Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-20 09:46:17 +01:00
Roberto Musso	5a03bd1cfb	Clean up agent catalog and improve extraction agent prompts - Remove unused config_schema from AgentCatalogItem (schema + route) - Fix agent_setup system prompt: add extraction agent base behaviour context so journey LLM knows what is already handled and focuses on field mappings only; remove redundant data-types question (already known from user selection); derive data types list dynamically - Rewrite processing base prompt to use actual tool names (list_tasks, update_task, add_task_comment, list_notes, update_note, list_timelines, update_timeline, list_all_projects, create_project) and enforce update-first strategy before falling back to creation Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-17 23:52:54 +01:00
Roberto Musso	826f64d6bb	refactor local directory agent to two-phase LLM-with-tools architecture Replace the single-pass FE-driven agent_run/agent_data flow with a BE-orchestrated two-phase execution using LangChain tool-calling: - Phase 1 (Triage): explores directory via new filesystem tools, matches files to existing projects using PROJECT_TOOLS - Phase 2 (Processing): reads files and performs CRUD per project group with clean LLM context windows Key changes: - Add filesystem_agent.py with list_directory, read_file_content, get_file_metadata tools using execute_on_client() - Move setup journey from REST to WebSocket (journey_start/message frames) - Add batch_runs_per_day billing limit and enforce in /trigger - Remove deprecated agent_data/agent_complete frame handlers and queues Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-17 08:50:46 +01:00
roberto	5faa6b1d7c	refactor agents to client-owned config flow	2026-03-16 22:35:46 +01:00
roberto	02a9684cd6	scope episodic memory enrichment by session_id	2026-03-16 00:33:11 +01:00
roberto	30b062dd4a	fix floating stream empty responses with sanitizer-safe fallbacks	2026-03-13 16:57:30 +01:00
roberto	2a0331d7ce	refactor floating_domain to structured object-only payload	2026-03-13 16:09:24 +01:00
roberto	13fd8677c1	fix: normalize home task/timeline responses to tag-only lines	2026-03-13 12:16:58 +01:00
roberto	9bd629cb59	chore: add interaction tracing and remove personal fields from logs	2026-03-13 10:23:47 +01:00
roberto	9c97702daa	feat: add letta-style memory tools with request/user debug tracing	2026-03-13 09:34:23 +01:00
roberto	a1e364c9c0	refactor: switch to single-agent deep runner and add mock memory/tool tests	2026-03-13 08:20:42 +01:00
roberto	5b55f1292a	make a single agent	2026-03-13 07:42:36 +01:00
roberto	5bc9ea6cd6	fix: make planner schema copilot-compatible and silence usage warning	2026-03-12 23:17:31 +01:00
roberto	f7404b6f66	refactor: move memory updates from synthesizer to orchestrator node	2026-03-12 23:03:38 +01:00
roberto	d667e43c73	refactor: use native LangGraph streaming and enforce structured summary on workers	2026-03-12 22:50:32 +01:00
roberto	fe085a7951	feat: migrate chat orchestration to deep langgraph workers	2026-03-12 22:25:36 +01:00
roberto	2de67213f8	rename from checkpoint to timeline agent	2026-03-10 23:17:38 +01:00
roberto	9332e29e53	bug fix sending component	2026-03-10 09:11:24 +01:00
roberto	34f01234c9	rename popup chat to floating chat	2026-03-08 22:53:31 +01:00
roberto	e6b5bc2e7d	step-7: add memory middleware (memory_middleware.py, device_ws.py) MemoryMiddleware class: - enrich_context(): loads core prefs, associative (top-k), episodic (last-N), and proactive hints (above 0.6 confidence) — all decrypted in-memory only - store_episode(): encrypts and persists interaction summary to memory_episodic - update_core(): upserts encrypted key/value to memory_core device_ws.py home_request + popup_request handlers: - enrich_context() called before orchestrate_v3_stream (memory injected into context) - store_episode() called after stream completes (non-blocking) 10 unit + integration tests pass; pre-existing test_agents.py failures unrelated. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-08 22:14:28 +01:00
roberto	393b3befd6	step-4: add output formatting layer (output_formatter.py) HomeFormatter parses JSON block stream from orchestrator tokens and emits stream_start / stream_text / stream_block / stream_end frames. PopupFormatter emits popup_domain then plain stream_text. All 13 unit tests pass. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-08 21:51:20 +01:00
roberto	2c08275934	step-3: add router refactor with streaming support (orchestrator.py) - orchestrate_v3(user_id, message, context): classifies intent, returns (agent_name, agent_instance) — caller drives execution - orchestrate_v3_stream(user_id, message, context): yields (agent_name, token) pairs; first yield is always (agent_name, "") as a domain-detection signal - ChatAgent.handle_stream(): default implementation yields handle() result as one chunk; subclasses override for true token-level streaming - Fix stale test_orchestrator.py assertions that expected a JSON final frame that orchestrate_stream never emitted Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-08 21:42:46 +01:00
roberto	7cb384fa63	step-2: add agent streaming and tool result capture (agent_registry.py) - ChatAgent.__init__: adds tool_results: list[dict] = [] - _tool_loop: wraps execution in a result collector; populates self.tool_results with raw execute_on_client dicts after each run - _tool_loop_stream: streaming variant — uses ainvoke for tool-call iterations, llm.astream() for the final answer; same result capture - ws_context.py: adds _tool_result_collector ContextVar + set/clear helpers; execute_on_client appends to collector when set Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-08 21:37:15 +01:00
roberto	ac71d99f9a	add cerebras models	2026-03-08 00:53:25 +01:00
roberto	a775a2da18	feat(step-3.6): cloud provider integrations (Gmail, Outlook, Teams) - Add app/integrations/__init__.py: Fernet token encryption helpers, EmailMessage/ChatMessage dataclasses, get_provider() factory - Add app/integrations/gmail.py: GmailClient with async fetch_messages(), token refresh, configurable label/sender/date filters - Add app/integrations/ms_graph.py: MSGraphClient with fetch_emails() (Outlook) and fetch_messages() (Teams), MSAL token refresh, OData filters - Update app/core/agent_runner.py: replace run_cloud_agent() stub with full 8-step implementation; extend _finalize_run() for cloud config type - Update app/config/settings.py: add OAuth + Fernet encryption settings - Update requirements.txt: google-api-python-client, google-auth-*, msal, cryptography - Add tests/test_integrations.py: 47 tests covering all integration code - Update tests/test_agent_runner.py: replace stub test with 7 real tests All 76 new/updated tests pass.	2026-03-05 18:05:07 +01:00
roberto	914f70bd85	step 3.4 complete: agent run orchestrator — local/cloud runner + trigger_pending_runs + 23 tests	2026-03-05 16:13:21 +01:00
roberto	608d6c784f	step 3.3 complete: device WS endpoint + DeviceConnectionManager	2026-03-05 15:51:58 +01:00
roberto	6d9a16e513	steps B.3/B.4/B.5 complete: bidirectional WS handler, _tool_loop verified, clean final frame	2026-03-05 00:06:11 +01:00
roberto	27c087d5d8	step B.2 complete: all 23 tools use execute_on_client(); add embed() to llm	2026-03-05 00:03:01 +01:00
rmusso	4d7fd519c5	step B.1 complete: WS context + frame schemas	2026-03-04 23:59:31 +01:00
roberto	314780d59a	Add LLM configuration options and update deployment workflow - Introduced new API keys for Anthropic and Google in .env.example and settings.py - Updated llm.py to retrieve API keys directly from settings - Modified deploy.yaml to streamline code checkout and improve deployment process	2026-03-03 16:52:56 +01:00
roberto	8bfce9da00	Refactor LLM instantiation across agents and orchestrator - Replaced direct instantiation of ChatOpenAI with a centralized get_llm function in CheckpointAgent, NoteAgent, ProjectAgent, and TaskAgent. - Introduced a new llm.py module to handle LLM model instantiation and API key management. - Updated settings.py to include LLM_MODEL and LLM_ROUTER_MODEL configurations. - Modified orchestrator.py to use get_router_llm for intent classification. - Updated requirements.txt to include litellm for LLM management. - Adjusted tests to mock get_llm instead of ChatOpenAI directly.	2026-03-03 15:46:44 +01:00
roberto	c8ef7b119b	Refactor tests for execution plan and add comprehensive storage tests - Updated `TestModuleSingletons` in `test_execution_plan.py` to reflect new agent templates and playbook names. - Changed assertions in playbook tests to match updated templates and agents. - Introduced `test_storage.py` to cover the storage layer, including encryption, BlobStore, and VectorStore functionalities. - Added tests for S3 interactions, ensuring upload, download, delete, and list operations work as expected. - Implemented mock tests for Pinecone and Qdrant vector stores to validate upsert, search, and delete operations.	2026-03-02 15:36:09 +01:00
roberto	14d1a7351d	step 5 complete: execution plan builder, template registry, and LRU plan cache Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-02 13:13:02 +01:00
roberto	68955d2fc2	step 4 complete: intelligent routing with single-agent and pipeline modes Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-02 13:03:54 +01:00
roberto	0d16729036	step 3 complete: pluggable agent framework Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-02 00:03:42 +01:00
roberto	4d0917f5df	step 1 complete: runnable FastAPI skeleton - Full directory structure with all __init__.py stubs - requirements.txt with all pinned dependencies - app/config/settings.py (BaseSettings, env-based) - app/main.py (CORS, lifespan, /api/v1/health) - Dockerfile (multi-stage, Python 3.12-slim, non-root user) - docker-compose.yml (app + postgres:16 with healthcheck) - .env.example - BACKEND_PLAN.md: mark step 1 done, add one-step-at-a-time rule Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-01 23:51:37 +01:00

48 Commits
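
Commit 3aa0b36a6c describes a compile_prompt(template, prompt_obj, vars) helper that calls prompt_obj.compile(...) when a Langfuse prompt was fetched and falls back to template.format(**vars) for the hardcoded template. The sketch below illustrates that dispatch only and is not the repository's code. FakeLangfusePrompt is a hypothetical stand-in so the example runs without the langfuse package. The keyword-argument form of compile() and the name `variables` (used here instead of `vars` to avoid shadowing the builtin) are assumptions.

```python
from typing import Any, Optional


class FakeLangfusePrompt:
    """Hypothetical stand-in for a fetched Langfuse prompt object."""

    def __init__(self, template: str) -> None:
        self.template = template

    def compile(self, **variables: Any) -> str:
        # Substitute each {{name}} placeholder with its value.
        text = self.template
        for name, value in variables.items():
            text = text.replace("{{" + name + "}}", str(value))
        return text


def compile_prompt(template: str, prompt_obj: Optional[Any], variables: dict) -> str:
    """Render a prompt with whichever placeholder syntax applies."""
    if prompt_obj is not None:
        # A managed prompt was fetched: it uses {{variable}} placeholders.
        return prompt_obj.compile(**variables)
    # Hardcoded fallback: it uses {variable} placeholders (str.format).
    return template.format(**variables)


remote = FakeLangfusePrompt("Classify the file {{filename}}.")
fallback = "Classify the file {filename}."

from_remote = compile_prompt(remote.template, remote, {"filename": "a.eml"})
from_fallback = compile_prompt(fallback, None, {"filename": "a.eml"})
print(from_remote)
print(from_fallback)
```

Both calls render the same text. Keeping the branch in one helper means callers such as the runner and setup modules do not need to know which placeholder syntax the active template uses.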