Roberto Musso
ccba54ac24
fix(tracing): use Langfuse compile_prompt with {{variable}} syntax
...
- tracing.py: add compile_prompt() that uses Langfuse .compile(**vars)
for {{variable}} substitution, falls back to Python .format() for
hardcoded {variable} templates
- agent_runner.py: replace _get_system_prompt().format() with
tracing.compile_prompt() for batch_file_classifier, batch_processing,
batch_cloud_processing prompts
- journey.py: replace get_prompt + .format() with compile_prompt()
for journey_system prompt
- chat tracing.py: add compile_prompt() for parity (chat prompts
currently have no variables, but ready for future use)
- Remove unused _get_system_prompt helper
2026-03-23 22:39:27 +01:00
Roberto Musso
55500cc818
feat(batch-agent): add Langfuse prompt management
...
- _get_system_prompt helper: fetches managed prompts from Langfuse
with hardcoded fallback (same pattern as chat service)
- journey.py: journey_system prompt manageable via Langfuse
- agent_runner.py: batch_file_classifier, batch_processing,
batch_cloud_processing prompts all manageable via Langfuse
- redis_consumer.py: link_prompt_to_trace for all three handlers
2026-03-23 22:30:36 +01:00
Roberto Musso
75a826c9d8
feat(batch-agent): add E2E evaluation harness with Langfuse integration
...
- eval/mock_executor.py: intercepts execute_on_client, serves fixture
files from disk, records all mutations (insert/update/delete)
- eval/config.py: YAML fixture loader with prompt variants, expected
results, seed records, model overrides
- eval/scorer.py: FieldMatchScorer (fuzzy title match, per-field
accuracy, precision/recall/F1) + LLMJudgeScorer (semantic eval)
- eval/langfuse_eval.py: sync fixtures to Langfuse datasets, create
dataset runs, post scores, link traces to runs
- eval/runner.py: orchestrates fixture → mock → agent pipeline →
scoring → Langfuse reporting
- eval/cli.py: CLI (python -m eval run/list/sync) with --models,
--variants, --fixture, --no-judge flags
- eval/fixtures/: example Italian freelance scenario with 3 prompt
variants (baseline, detailed_italian, minimal)
2026-03-23 08:54:19 +01:00
Roberto Musso
971f1dd84f
feat(batch-agent): integrate Langfuse tracing
...
- tracing.py: init/shutdown, trace_span, get_langfuse_callback, prompt mgmt
- main.py: init_langfuse at startup, shutdown on teardown
- redis_consumer.py: trace_span around journey_start/message/agent_trigger
- agent_runner.py: thread langfuse_handler through classify + processing LLM
- journey.py: thread langfuse_handler through _call_llm_with_tools
- llm.py: accept callbacks param, forward to LLM constructors
- requirements.txt: add langfuse>=3.0.0
2026-03-23 08:43:15 +01:00
Roberto Musso
333bba6fdd
feat(batch-agent): extract Batch Agent Service (Step 3)
...
- agent_runner: local directory + cloud agent orchestration via Redis
- 5 domain agents: filesystem, task, note, project, timeline
- integrations: Gmail, MS Graph (Outlook + Teams)
- journey: guided chatbot conversation to build prompt_template
- routes: REST endpoints (catalog, can-create, trigger)
- redis_consumer: subscribes to batch:request:* pattern
- ws_context: Redis-based execute_on_client for tool round-trip
- Dockerfile with 300s timeout for long-running batch jobs
2026-03-23 07:19:02 +01:00
Roberto Musso
229e20d073
docs: add Langfuse integration TODO for batch-agent service
2026-03-23 00:25:42 +01:00
Roberto Musso
aa219a4d08
feat: microservices scaffold + Auth Service (Step 1)
...
- Add shared/ module: config, db, models, schemas, redis utilities
- Add Auth Service (services/auth/): register, login, refresh, me,
ForwardAuth /verify endpoint for Traefik
- Add Traefik config: ACME/Cloudflare DNS-01, dynamic routing,
ForwardAuth middleware, sticky sessions for WS Gateway
- Add service scaffolds: ws-gateway, chat, batch-agent, billing (READMEs)
- Add redis>=5.0.0 to requirements.txt
- Monolith app/ is untouched — strangler fig migration
2026-03-22 00:29:51 +01:00