adiuvAI/api - api - Gitea: Git with a cup of tea

Author	SHA1	Message	Date
Roberto Musso	57b5648915	feat(billing): extract Billing Service (Step 4) - stripe_service: checkout sessions, webhook handling, subscription CRUD - tier_manager: feature matrix (4 tiers), quota enforcement, rate limits - routes: checkout, webhook (no auth), subscription, tier query, features - Traefik header auth (X-User-Id) replaces get_current_user dependency - /tier/{user_id} endpoint for internal service-to-service lookups - /features and /features/{tier} for feature matrix queries - Dockerfile: single worker, 30s timeout (lightweight service)	2026-04-06 23:07:46 +02:00
Roberto Musso	7e4374c69b	feat(eval): add custom system prompt support for step-1 classification	2026-04-06 22:56:30 +02:00
Roberto Musso	fe0dd038ee	fix: Langfuse SDK v4 migration, tracing improvements, and LLM config - Langfuse SDK v4: fix prompt-to-trace linking (as_type=generation) - tracing: compile_prompt with Langfuse managed prompt fallback - journey: remove journey CLI subcommand (keep only interactive) - LLM: add service-specific llm modules for batch-agent and chat - gitignore: exclude eval private test data - config: add LANGFUSE settings to shared config	2026-03-24 16:25:51 +01:00
Roberto Musso	d3f7099d93	refactor(eval): 3-mode eval harness (step1/step2/full) with Langfuse fixes - Rewrite eval config with EvalMode (step1, step2, full) replacing prompt_variants - Rewrite runner with _run_step1, _run_step2, _run_full dispatch - CLI: replace --variants with --mode flag - Add 3 fixture YAMLs: classify_invoices (step1), process_invoices (step2), full_invoices (full) - Remove old freelance_invoices fixture - Langfuse: mode-aware dataset items (classifications for step1, extraction for step2, both for full) - Langfuse: link both prompts (batch_file_classifier + batch_processing) in full mode - Langfuse: post separate classification_precision/recall/f1 scores for full mode - Langfuse: skip misleading field_accuracy=0 when field_scores is empty (step1) - Langfuse: include step1_results in trace output - MockExecutor: mock async_session to bypass DB in full mode - Journey fixture: remove user_messages (only interactive test kept)	2026-03-24 16:18:51 +01:00
Roberto Musso	63fa119543	feat(batch-agent): add journey eval to E2E harness - journey_runner.py: orchestrates journey start → simulated user messages → template extraction → LLM judge scoring - config.py: JourneyFixture dataclass with user_messages and expected_template_criteria, discover_journey_fixtures() - langfuse_eval.py: sync_journey_fixture_to_dataset() - cli.py: new 'journey' subcommand (python -m eval journey) with --fixture, --models, --judge-model flags - fixtures/journey_invoice_setup.yaml: example journey fixture with 4 user messages and 8 quality criteria	2026-03-23 23:16:41 +01:00
Roberto Musso	d856dfd28c	refactor: deduplicate shared code into shared/ module Move duplicated files from chat + batch-agent into shared/: - shared/ws_context.py — Redis-based tool call round-trip - shared/llm.py — LiteLLM factory (get_llm, embed) - shared/agents/ — 4 domain agents (task, note, project, timeline) Update all service imports to use shared.* instead of app.*. Delete 12 duplicated files across both services.	2026-03-23 23:01:45 +01:00
Roberto Musso	ccba54ac24	fix(tracing): use Langfuse compile_prompt with {{variable}} syntax - tracing.py: add compile_prompt() that uses Langfuse .compile(**vars) for {{variable}} substitution, falls back to Python .format() for hardcoded {variable} templates - agent_runner.py: replace _get_system_prompt().format() with tracing.compile_prompt() for batch_file_classifier, batch_processing, batch_cloud_processing prompts - journey.py: replace get_prompt + .format() with compile_prompt() for journey_system prompt - chat tracing.py: add compile_prompt() for parity (chat prompts currently have no variables, but ready for future use) - Remove unused _get_system_prompt helper	2026-03-23 22:39:27 +01:00
Roberto Musso	55500cc818	feat(batch-agent): add Langfuse prompt management - _get_system_prompt helper: fetches managed prompts from Langfuse with hardcoded fallback (same pattern as chat service) - journey.py: journey_system prompt manageable via Langfuse - agent_runner.py: batch_file_classifier, batch_processing, batch_cloud_processing prompts all manageable via Langfuse - redis_consumer.py: link_prompt_to_trace for all three handlers	2026-03-23 22:30:36 +01:00
Roberto Musso	75a826c9d8	feat(batch-agent): add E2E evaluation harness with Langfuse integration - eval/mock_executor.py: intercepts execute_on_client, serves fixture files from disk, records all mutations (insert/update/delete) - eval/config.py: YAML fixture loader with prompt variants, expected results, seed records, model overrides - eval/scorer.py: FieldMatchScorer (fuzzy title match, per-field accuracy, precision/recall/F1) + LLMJudgeScorer (semantic eval) - eval/langfuse_eval.py: sync fixtures to Langfuse datasets, create dataset runs, post scores, link traces to runs - eval/runner.py: orchestrates fixture → mock → agent pipeline → scoring → Langfuse reporting - eval/cli.py: CLI (python -m eval run/list/sync) with --models, --variants, --fixture, --no-judge flags - eval/fixtures/: example Italian freelance scenario with 3 prompt variants (baseline, detailed_italian, minimal)	2026-03-23 08:54:19 +01:00
Roberto Musso	971f1dd84f	feat(batch-agent): integrate Langfuse tracing - tracing.py: init/shutdown, trace_span, get_langfuse_callback, prompt mgmt - main.py: init_langfuse at startup, shutdown on teardown - redis_consumer.py: trace_span around journey_start/message/agent_trigger - agent_runner.py: thread langfuse_handler through classify + processing LLM - journey.py: thread langfuse_handler through _call_llm_with_tools - llm.py: accept callbacks param, forward to LLM constructors - requirements.txt: add langfuse>=3.0.0	2026-03-23 08:43:15 +01:00
Roberto Musso	333bba6fdd	feat(batch-agent): extract Batch Agent Service (Step 3) - agent_runner: local directory + cloud agent orchestration via Redis - 5 domain agents: filesystem, task, note, project, timeline - integrations: Gmail, MS Graph (Outlook + Teams) - journey: guided chatbot conversation to build prompt_template - routes: REST endpoints (catalog, can-create, trigger) - redis_consumer: subscribes to batch:request:* pattern - ws_context: Redis-based execute_on_client for tool round-trip - Dockerfile with 300s timeout for long-running batch jobs	2026-03-23 07:19:02 +01:00
Roberto Musso	229e20d073	docs: add Langfuse integration TODO for batch-agent service	2026-03-23 00:25:42 +01:00
Roberto Musso	0b491b3643	fix: langfuse v4 SDK compatibility and pass user message as trace input	2026-03-23 00:23:59 +01:00
Roberto Musso	0d5fa3e569	feat(chat): integrate Langfuse tracing, prompt management & generation tracking - shared/config.py: add LANGFUSE_SECRET_KEY, LANGFUSE_PUBLIC_KEY, LANGFUSE_HOST - services/chat/app/tracing.py: new module — Langfuse client singleton, create_trace(), get_langfuse_callback(), get_prompt(), link_prompt_to_trace(), score_trace(), flush/shutdown helpers. Gracefully no-ops when keys are missing. - services/chat/app/llm.py: add callbacks param to get_llm() for LangChain callback handler injection - services/chat/app/deep_agent.py: accept langfuse_handler in all run_* and _run_single_agent* functions, pipe callbacks to LLM calls, fetch managed prompts from Langfuse with fallback to hardcoded system prompts - services/chat/app/redis_consumer.py: create Langfuse trace per request (home_request/floating_request), pass callback handler to deep_agent, link prompt name to trace, attach output preview, flush after each request - services/chat/app/main.py: shutdown Langfuse client in lifespan teardown - services/chat/requirements.txt: add langfuse>=2.0.0 Langfuse prompt names: 'home_system', 'floating_system' — create these in the Langfuse dashboard to manage prompts. Without them, hardcoded defaults are used transparently.	2026-03-22 23:15:04 +01:00
Roberto Musso	aff68a9051	fix: shared config loads root .env as fallback for microservices	2026-03-22 22:42:54 +01:00
Roberto Musso	5e9ef2809e	fix: add extra=ignore to monolith Settings for strangler fig compat	2026-03-22 22:28:50 +01:00
Roberto Musso	90018af311	feat: add WS Gateway and Chat Service (Step 2) WS Gateway: - WebSocket lifecycle handler with RS256 JWT auth - Redis bridge: device registry, frame publishing, tool_result routing - Inbound routing: tool_result→LPUSH, home/floating→chat pub/sub - Outbound: subscribes to ws:out:{user_id}, forwards to Electron - Single-worker Dockerfile (long-lived WS connections) Chat Service: - Redis consumer: subscribes to chat:request:* pattern - Redis-based ws_context: tool_call→publish, BRPOP tool_result (30s timeout) - deep_agent: single-agent runner with home/floating/stream variants - memory_middleware: core/associative/episodic/proactive memory with Fernet - Domain agents: task (8 tools), note (5), project (6), timeline (4) - LLM factory via LiteLLM (100+ providers) - Output formatter (StreamFormatter) - POST /chat REST fallback with Traefik header auth - Multi-worker Dockerfile with 120s timeout for LLM calls	2026-03-22 01:20:11 +01:00
Roberto Musso	1e2e395676	fix: PEM newline parsing + shared config extra=ignore - Add field_validator to expand literal \n in PEM keys (auth config + shared config) - Set extra='ignore' on shared Settings so service-specific .env vars don't cause ValidationError - Add *.pem to .gitignore	2026-03-22 01:03:28 +01:00
Roberto Musso	59d3a53980	chore: update .env.example files for RS256 + Redis - Root .env.example: replace JWT_SECRET/JWT_ALGORITHM with JWT_PUBLIC_KEY, add REDIS_URL - Auth Service .env.example: JWT_PRIVATE_KEY + JWT_PUBLIC_KEY with generation instructions	2026-03-22 00:51:54 +01:00
Roberto Musso	9feeaa79c8	feat(auth): migrate JWT from HS256 to RS256 - Add services/auth/app/config.py with JWT_PRIVATE_KEY and JWT_PUBLIC_KEY (Auth Service local config - private key never leaves this service) - Update routes.py: sign tokens with RS256 private key - Update deps.py + verify.py: verify tokens with RS256 public key - Update shared/config.py: replace JWT_SECRET/JWT_ALGORITHM with JWT_PUBLIC_KEY (for optional local verification by other services) - Add sys.path fix in main.py for local dev without PYTHONPATH	2026-03-22 00:50:36 +01:00
Roberto Musso	aa219a4d08	feat: microservices scaffold + Auth Service (Step 1) - Add shared/ module: config, db, models, schemas, redis utilities - Add Auth Service (services/auth/): register, login, refresh, me, ForwardAuth /verify endpoint for Traefik - Add Traefik config: ACME/Cloudflare DNS-01, dynamic routing, ForwardAuth middleware, sticky sessions for WS Gateway - Add service scaffolds: ws-gateway, chat, batch-agent, billing (READMEs) - Add redis>=5.0.0 to requirements.txt - Monolith app/ is untouched — strangler fig migration	2026-03-22 00:29:51 +01:00
Roberto Musso	552b8eb305	Fix project creation: code-based in runner, not delegated to Step 2 LLM Root causes fixed: 1. PROJECT_TOOLS removed from Step 2 tool set — project assignment is now exclusively handled by the runner in code, never by the LLM. 2. When Step 1 returns "new", runner calls execute_on_client insert/projects directly (before Step 2), gets the created id, and passes it as context. 3. Newly created projects are appended to the local `projects` list so that subsequent files in the same run can match to them via Step 1 — prevents one project per file when multiple files share the same topic. Also add tests/test_classify_file.py with pytest cases for _classify_file and a CLI runner: python -m tests.test_classify_file <file> [project...] Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-21 23:40:38 +01:00
Roberto Musso	0d93b3960d	Exclude project/projectId questions from agent setup journey - Add explicit MUST NOT instruction: never ask about projects, projectId, or how to link records; project assignment is handled by the agent runner - Remove projectId from template field list; remove projects from entity types - Remove stale isApproved=0 reference (already removed from the data model) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-21 22:58:05 +01:00
Roberto Musso	f07580574b	Replace max_turns cap with 90% confidence stopping criterion in agent setup - Remove fixed _MAX_TURNS=5 instruction from system prompt; LLM now decides when to stop based on self-assessed confidence (>= 90%) - Add _MIN_TURNS_BEFORE_NUDGE=3 and raise safety cap to _MAX_TURNS=15 - Nudge message and hard cap still act as a safety net for infinite loops Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-21 22:54:34 +01:00
Roberto Musso	1a8bf11f90	update migration plan	2026-03-20 23:48:36 +01:00
Roberto Musso	e7cdce8287	Improve Step 1 project matching and Step 2 update-first enforcement - Rewrite _STEP1_SYSTEM_PROMPT: lower matching threshold (no longer requires "clear" match), strongly prefer existing projects over creating new ones, use structured id=\|name=\|status= format with aiSummary for richer context - Add code-level UUID validation: reject hallucinated ids not in the fetched projects list, fall back to "new" instead of creating a bad link - Rewrite _PROCESSING_SYSTEM_PROMPT: enforce explicit scan-before-create process (read existing → search → update if found → create only if not) with hard rule against calling create_* without checking existing records Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-20 23:45:29 +01:00
Roberto Musso	58bc6efd4b	Rewrite run_local_agent: code-based flow, concurrency guard, remove isApproved - Replace LLM-driven triage with code-based directory scan and project fetch - Two-step LLM approach: Step 1 classifies file→project+domains, Step 2 processes with tools - Add domain descriptions to Step 1 prompt for better extraction accuracy - Add _running_agents set for per-agent concurrency guard (one running instance per agent) - Return 409 from route before DB write when agent already running - Remove is_approved from task_agent create/update tools and system prompt - Remove is_approved from timeline_agent create/update tools and system prompt Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-20 22:21:30 +01:00
Roberto Musso	6c450805cb	possible evolution	2026-03-20 20:57:03 +01:00
Roberto Musso	f340d0fa3e	Fix dev tier: default to power when no subscription exists The tier is resolved live from the subscriptions table in get_current_user. Previously fell back to 'free' unconditionally, hitting the 5 runs/day limit immediately in dev. Now falls back to 'power' (unlimited) when ENV=dev and no subscription row exists. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-20 12:32:36 +01:00
Roberto Musso	edc53cb6eb	Default to power tier (unlimited) in dev when no subscription exists Users without a subscription row in dev get power tier so rate limits and quota checks don't block local development. In prod the fallback remains free tier as before. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-20 12:12:43 +01:00
Roberto Musso	725cece5c1	Add run_context to agent tool calls for FE run logging - AgentTriggerRequest accepts optional agent_id (FE's stable electron-store UUID) - _make_agent_executor injects run_context into every tool_call frame so Electron can attribute actions to the correct agent run - run_local_agent accepts run_context and sends a run_complete WS frame when the run finishes so the FE can close the run record - trigger_agent_run builds run_context with run_id=run_log.id and the stable agent_id, passes it through to run_local_agent Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-20 09:46:17 +01:00
Roberto Musso	297e20ce8d	Fix 422 on agent trigger: accept plural data type names AgentTriggerRequest.what_to_extract now accepts list[str] instead of strict Literal values. _to_data_types normalises all FE variants (tasks/task, notes/note, timelines/timeline/timelineEvents, projects/project) to the canonical plural form the runner expects, with deduplication. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-18 00:04:29 +01:00
Roberto Musso	5a03bd1cfb	Clean up agent catalog and improve extraction agent prompts - Remove unused config_schema from AgentCatalogItem (schema + route) - Fix agent_setup system prompt: add extraction agent base behaviour context so journey LLM knows what is already handled and focuses on field mappings only; remove redundant data-types question (already known from user selection); derive data types list dynamically - Rewrite processing base prompt to use actual tool names (list_tasks, update_task, add_task_comment, list_notes, update_note, list_timelines, update_timeline, list_all_projects, create_project) and enforce update-first strategy before falling back to creation Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-17 23:52:54 +01:00
Roberto Musso	87b7a1c6c9	fix journey setup: honor FE session_id, seed LLM history, and force template on max turns - Use session_id from the FE frame so replies match the listener key - Seed conversation with a user message for LLM provider compatibility - On max turns, nudge the LLM and immediately re-invoke to force prompt_template generation instead of deferring to next message - Fix display_message extraction to safely check for template markers Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-17 16:25:53 +01:00
Roberto Musso	826f64d6bb	refactor local directory agent to two-phase LLM-with-tools architecture Replace the single-pass FE-driven agent_run/agent_data flow with a BE-orchestrated two-phase execution using LangChain tool-calling: - Phase 1 (Triage): explores directory via new filesystem tools, matches files to existing projects using PROJECT_TOOLS - Phase 2 (Processing): reads files and performs CRUD per project group with clean LLM context windows Key changes: - Add filesystem_agent.py with list_directory, read_file_content, get_file_metadata tools using execute_on_client() - Move setup journey from REST to WebSocket (journey_start/message frames) - Add batch_runs_per_day billing limit and enforce in /trigger - Remove deprecated agent_data/agent_complete frame handlers and queues Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-17 08:50:46 +01:00
roberto	5faa6b1d7c	refactor agents to client-owned config flow	2026-03-16 22:35:46 +01:00
roberto	02a9684cd6	scope episodic memory enrichment by session_id	2026-03-16 00:33:11 +01:00
roberto	fae9efee0d	removed old plan files	2026-03-13 16:58:43 +01:00
roberto	30b062dd4a	fix floating stream empty responses with sanitizer-safe fallbacks	2026-03-13 16:57:30 +01:00
roberto	2a0331d7ce	refactor floating_domain to structured object-only payload	2026-03-13 16:09:24 +01:00
roberto	13fd8677c1	fix: normalize home task/timeline responses to tag-only lines	2026-03-13 12:16:58 +01:00
roberto	9bd629cb59	chore: add interaction tracing and remove personal fields from logs	2026-03-13 10:23:47 +01:00
roberto	9c97702daa	feat: add letta-style memory tools with request/user debug tracing	2026-03-13 09:34:23 +01:00
roberto	a1e364c9c0	refactor: switch to single-agent deep runner and add mock memory/tool tests	2026-03-13 08:20:42 +01:00
roberto	5b55f1292a	make a single agent	2026-03-13 07:42:36 +01:00
roberto	5bc9ea6cd6	fix: make planner schema copilot-compatible and silence usage warning	2026-03-12 23:17:31 +01:00
roberto	f7404b6f66	refactor: move memory updates from synthesizer to orchestrator node	2026-03-12 23:03:38 +01:00
roberto	d667e43c73	refactor: use native LangGraph streaming and enforce structured summary on workers	2026-03-12 22:50:32 +01:00
roberto	fe085a7951	feat: migrate chat orchestration to deep langgraph workers	2026-03-12 22:25:36 +01:00
roberto	2de67213f8	rename from checkpoint to timeline agent	2026-03-10 23:17:38 +01:00
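
Commit 1e2e395676 mentions expanding literal \n sequences in PEM keys loaded from .env files. A minimal sketch of that idea follows. The function name `expand_pem_newlines` is hypothetical, and the commit itself wires the logic into a field_validator, which is not reproduced here.

```python
def expand_pem_newlines(value: str) -> str:
    """Turn literal backslash-n sequences into real newlines.

    Assumption: the key arrives as a single-line string, as is common when a
    PEM block is stored in a .env file. Values that already contain real
    newlines pass through unchanged.
    """
    return value.replace("\\n", "\n")


if __name__ == "__main__":
    raw = "-----BEGIN PUBLIC KEY-----\\nabc\\n-----END PUBLIC KEY-----"
    print(expand_pem_newlines(raw))
```

In a settings class, a function like this would typically run before validation so that the JWT library receives a correctly formatted multi-line key.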


103 Commits
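
Commit 297e20ce8d describes normalising the frontend's data type variants to a canonical plural form with deduplication. The sketch below illustrates one way that could look. The alias table, the function name `to_data_types`, the choice of "timelines" as the canonical form for timelineEvents, and the ValueError on unknown names are all assumptions, not the repository's actual code.

```python
# Hypothetical alias table built only from the variants named in the commit message.
_ALIASES = {
    "task": "tasks",
    "tasks": "tasks",
    "note": "notes",
    "notes": "notes",
    "timeline": "timelines",
    "timelines": "timelines",
    "timelineEvents": "timelines",  # assumed canonical form
    "project": "projects",
    "projects": "projects",
}


def to_data_types(values: list[str]) -> list[str]:
    """Map each variant to its canonical plural, keeping first-seen order."""
    seen: set[str] = set()
    result: list[str] = []
    for value in values:
        canonical = _ALIASES.get(value.strip())
        if canonical is None:
            # Assumption: unknown names are rejected rather than silently dropped.
            raise ValueError(f"unknown data type: {value!r}")
        if canonical not in seen:
            seen.add(canonical)
            result.append(canonical)
    return result


if __name__ == "__main__":
    print(to_data_types(["task", "tasks", "timelineEvents", "note"]))
```

Accepting a plain list of strings and normalising in code, rather than validating against strict literal values, is what lets the endpoint avoid the 422 response the commit refers to.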