Go to file

Roberto Musso 0d5fa3e569 feat(chat): integrate Langfuse tracing, prompt management & generation tracking

- shared/config.py: add LANGFUSE_SECRET_KEY, LANGFUSE_PUBLIC_KEY, LANGFUSE_HOST
- services/chat/app/tracing.py: new module — Langfuse client singleton,
  create_trace(), get_langfuse_callback(), get_prompt(), link_prompt_to_trace(),
  score_trace(), flush/shutdown helpers. Gracefully no-ops when keys are missing.
- services/chat/app/llm.py: add callbacks param to get_llm() for LangChain
  callback handler injection
- services/chat/app/deep_agent.py: accept langfuse_handler in all run_* and
  _run_single_agent* functions, pipe callbacks to LLM calls, fetch managed
  prompts from Langfuse with fallback to hardcoded system prompts
- services/chat/app/redis_consumer.py: create Langfuse trace per request
  (home_request/floating_request), pass callback handler to deep_agent,
  link prompt name to trace, attach output preview, flush after each request
- services/chat/app/main.py: shutdown Langfuse client in lifespan teardown
- services/chat/requirements.txt: add langfuse>=2.0.0

Langfuse prompt names: 'home_system', 'floating_system' — create these in
the Langfuse dashboard to manage prompts. Without them, hardcoded defaults
are used transparently.

2026-03-22 23:15:04 +01:00

.gitea/workflows

feat: deploy via SSH with port 8080, idempotent migrations

2026-03-03 22:10:03 +01:00

.github/workflows

Step 13 - completed

2026-03-03 15:14:04 +01:00

alembic

refactor agents to client-owned config flow

2026-03-16 22:35:46 +01:00

app

fix: add extra=ignore to monolith Settings for strangler fig compat

2026-03-22 22:28:50 +01:00

docs

update migration plan

2026-03-20 23:48:36 +01:00

services

feat(chat): integrate Langfuse tracing, prompt management & generation tracking

2026-03-22 23:15:04 +01:00

shared

feat(chat): integrate Langfuse tracing, prompt management & generation tracking

2026-03-22 23:15:04 +01:00

tests

Fix project creation: code-based in runner, not delegated to Step 2 LLM

2026-03-21 23:40:38 +01:00

traefik

feat: add WS Gateway and Chat Service (Step 2)

2026-03-22 01:20:11 +01:00

.env.example

chore: update .env.example files for RS256 + Redis

2026-03-22 00:51:54 +01:00

.gitignore

fix: PEM newline parsing + shared config extra=ignore

2026-03-22 01:03:28 +01:00

alembic.ini

step 12

2026-03-03 12:39:32 +01:00

docker-compose.yml

update alembic

2026-03-08 23:17:01 +01:00

Dockerfile

Step 13 - completed

2026-03-03 15:14:04 +01:00

logging.conf

bug fix sending component

2026-03-10 09:11:24 +01:00

README.md

rename from checkpoint to timeline agent

2026-03-10 23:17:38 +01:00

requirements.txt

feat: microservices scaffold + Auth Service (Step 1)

2026-03-22 00:29:51 +01:00

README.md

Adiuva Cloud API

AI-powered project management backend with E2E encrypted cloud storage, LLM orchestration, and a plugin marketplace.

Built with FastAPI · Python 3.12 · PostgreSQL · LangChain · Stripe · AWS S3

Overview
Architecture
Key Features
Tech Stack
Getting Started
Docker Deployment
Environment Variables
API Reference
Data Model
AI Agent System
Orchestration & Execution Plans
Middleware
Storage Layer
Billing & Tiers
Plugin Marketplace
Testing
Project Structure
License

Overview

Adiuva Cloud API is the FastAPI backend that powers the Adiuva Electron desktop app. It provides LLM-powered chat orchestration, end-to-end encrypted cloud storage, a vector search engine, an encrypted backup system, a plugin marketplace with revenue sharing, and Stripe-based subscription billing across four tiers.

Design Principles

Never persist user data in plaintext — the database stores only auth, billing, storage metadata, and marketplace data. All user content is E2E encrypted by the client before reaching the server.
Never expose prompts — system prompts stay server-side; responses are sanitized to strip any leaked prompt fragments.
Never decrypt user blobs — the backend performs only checksum verification; no decryption keys ever reach the server.
Stateless request handling — all context comes from the client and JWT; no server-side session state.
Tier gates enforced server-side — the server always reads the current tier from the database, never trusting client-reported values.

Architecture

┌──────────────┐      ┌────────────────────────────────────────────────────────┐
│  Electron    │      │  FastAPI  (Uvicorn / Gunicorn)                         │
│  Desktop App │────▶│                                                        │
│  (Client)    │◀────│  Middleware: RateLimit → Sanitizer → CORS → Router     │
└──────────────┘      │                                                        │
                      │  ┌──────────────────┐  ┌────────────────────────────┐  │
                      │  │  Auth Routes     │  │  Chat Routes               │  │
                      │  │  Billing Routes  │  │    ↓                       │  │
                      │  │  Storage Routes  │  │  Orchestrator (GPT-4o-mini)│  │
                      │  │  Backup Routes   │  │    ↓ classify intent       │  │
                      │  │  Plugin Routes   │  │  Agent Registry            │  │
                      │  │  Vector Routes   │  │    ↓                       │  │
                      │  │  Plans Routes    │  │  TaskAgent  | ProjectAgent │  │
                      │  └──────────────────┘  │  NoteAgent  | CheckptAgent │  │
                      │                        │  (GPT-4o + LangChain)      │  │
                      │                        └────────────────────────────┘  │
                      └────────────────────────────────────────────────────────┘
                               │              │              │
                      ┌────────▼───┐  ┌───────▼───────┐  ┌──▼─────────────┐
                      │ PostgreSQL │  │  AWS S3       │  │ Pinecone /     │
                      │ (Auth,     │  │  (E2E blobs,  │  │ Qdrant         │
                      │  Billing,  │  │   backups)    │  │ (Vectors)      │
                      │  Metadata) │  └───────────────┘  └────────────────┘
                      └────────────┘
                               │
                      ┌────────▼───┐
                      │  Stripe    │
                      │  (Billing, │
                      │   Connect) │
                      └────────────┘

Key Features

LLM-powered orchestration — GPT-4o-mini classifies user intent and routes to the appropriate domain agent.
4 specialized AI agents — Tasks (8 tools), Projects (6 tools), Timelines (4 tools), Notes (5 tools), all powered by GPT-4o via LangChain.
Execution plans & playbooks — Server-side prompt template registry; clients receive only opaque template IDs, never raw prompts.
E2E encrypted cloud storage — The backend never decrypts user data; SHA-256 checksum verification uses constant-time comparison to prevent timing attacks.
Cloud vector store — Pinecone or Qdrant with user-isolated namespaces and encrypted blob payloads.
Encrypted backup system — Tiered storage limits with If-Modified-Since support for efficient syncing.
Plugin marketplace — Catalog, admin review/approval workflow, security checklist, and 70/30 revenue sharing via Stripe Connect.
Stripe billing — Four-tier subscription model (Free / Pro / Power / Team) with checkout sessions and full webhook lifecycle handling.
JWT authentication — Access + refresh tokens with bcrypt password hashing, SHA-256 token hashing, and automatic rotation.
Prompt IP protection — Sanitizer middleware strips system prompts, reasoning markers, tool schemas, and agent routing metadata from all chat responses.
Tier-based rate limiting — Sliding-window per-user limiter scaling from 20 to 200 requests/min by subscription tier.
Zero-trust data model — User content is never stored in plaintext; the database holds only authentication, billing, and metadata records.
WebSocket streaming — Real-time chat with 30-second heartbeat keep-alive and chunked text delivery.
Alembic migrations — Versioned schema management with seed data for the plugin marketplace.
Comprehensive test suite — In-memory SQLite + moto S3 mocks, per-tier test fixtures, and full API coverage without external dependencies.

Tech Stack

Package	Version	Purpose
`fastapi`	≥ 0.115.0	Web framework
`uvicorn[standard]`	≥ 0.34.0	ASGI development server
`gunicorn`	≥ 22.0.0	Production process manager
`langchain`	≥ 0.3.0	LLM orchestration framework
`langchain-openai`	≥ 0.3.0	OpenAI LLM provider integration
`litellm`	≥ 1.50.0	Universal LLM gateway (100+ providers)
`pydantic`	≥ 2.10.0	Data validation and serialization
`pydantic-settings`	≥ 2.7.0	Environment-based configuration
`python-jose[cryptography]`	≥ 3.3.0	JWT encoding and decoding
`stripe`	≥ 11.0.0	Billing and payment integration
`boto3`	≥ 1.35.0	AWS S3 client
`slowapi`	≥ 0.1.9	Rate limiting utilities
`sqlalchemy`	≥ 2.0.0	Async ORM and query builder
`asyncpg`	≥ 0.30.0	PostgreSQL async driver
`alembic`	≥ 1.14.0	Database migration management
`bcrypt`	≥ 4.2.0	Password hashing
`python-dotenv`	≥ 1.0.0	`.env` file loading
`httpx`	≥ 0.28.0	Async HTTP client (used in tests)
`websockets`	≥ 14.0	WebSocket protocol support
`psycopg2-binary`	≥ 2.9.0	Synchronous PostgreSQL driver (Alembic)
`pinecone`	≥ 5.0.0	Pinecone vector store client
`qdrant-client`	≥ 1.7.0	Qdrant vector store client
`pytest`	≥ 8.0.0	Test framework
`pytest-asyncio`	≥ 0.24.0	Async test support
`aiosqlite`	≥ 0.20.0	In-memory SQLite for tests
`moto[s3]`	≥ 5.0.0	AWS S3 mock for tests
`ruff`	≥ 0.8.0	Linter and formatter

Getting Started

Prerequisites

Python 3.12+
PostgreSQL 16+
An OpenAI API key (for LLM features)
Stripe API keys (optional — billing stubs gracefully when unconfigured)
AWS credentials (optional — needed for S3 storage in production)

Installation

# Clone the repository
git clone <repo-url> && cd adiuva-api

# Create a virtual environment
python -m venv .venv && source .venv/bin/activate

# Install dependencies
pip install -r requirements.txt

# Configure environment
cp .env.example .env
# Edit .env with your DATABASE_URL, OPENAI_API_KEY, etc.

Database Setup

# Start PostgreSQL (or use the Docker Compose database)
docker compose up db -d

# Run migrations
alembic upgrade head

Run the Development Server

uvicorn app.main:app --reload --host 0.0.0.0 --port 8000

Interactive API docs are available at http://localhost:8000/docs in development mode (ENV=dev). The /docs endpoint is disabled in production.

Docker Deployment

Quick Start

docker compose up --build

This starts two services:

app — FastAPI server on port 8000
db — PostgreSQL 16 (Alpine) on port 5432 with a persistent volume and health checks

The compose file also includes optional services for fully local deployments:

minio — S3-compatible object storage on ports 9000 (API) and 9001 (console)
qdrant — Vector search engine on ports 6333 (HTTP) and 6334 (gRPC)

Dockerfile Details

The Dockerfile uses a multi-stage build:

Builder stage — Installs Python dependencies into a virtual environment.
Runtime stage — Copies only the venv, app source, and Alembic migrations. Runs as a non-root user (appuser).
Production server — Gunicorn with 4 Uvicorn workers, 120-second timeout, listening on port 8000.

# Production command (run by the container)
gunicorn app.main:app -k uvicorn.workers.UvicornWorker -w 4 --timeout 120 -b 0.0.0.0:8000

Homelab / Self-Hosted Deployment

You can run the entire stack locally on a homelab with no cloud dependencies except the LLM provider. The compose file includes MinIO (S3 replacement) and Qdrant (vector store) out of the box.

1. Start all services

docker compose up -d

This starts PostgreSQL, MinIO, and Qdrant alongside the app.

2. Create the MinIO bucket

Open the MinIO console at http://localhost:9001 (login: minioadmin / minioadmin) and create a bucket named adiuva, or use the CLI:

docker compose exec minio mc alias set local http://localhost:9000 minioadmin minioadmin
docker compose exec minio mc mb local/adiuva

3. Configure your `.env`

# Database (uses the compose PostgreSQL)
DATABASE_URL=postgresql+asyncpg://postgres:postgres@db:5432/adiuva

# S3 → MinIO
S3_BUCKET=adiuva
S3_REGION=us-east-1
S3_ENDPOINT_URL=http://minio:9000
AWS_ACCESS_KEY_ID=minioadmin
AWS_SECRET_ACCESS_KEY=minioadmin

# Vector store → local Qdrant (leave PINECONE_API_KEY empty)
QDRANT_URL=http://qdrant:6333
QDRANT_API_KEY=
PINECONE_API_KEY=

# Billing — leave empty to stub (no Stripe needed)
STRIPE_SECRET_KEY=
STRIPE_WEBHOOK_SECRET=

# LLM — the only external service
OPENAI_API_KEY=sk-...
LLM_MODEL=gpt-4o
LLM_ROUTER_MODEL=gpt-4o-mini

# Auth
JWT_SECRET=your-secret-here
ENV=dev

4. Run migrations

docker compose exec app alembic upgrade head

What runs where

Service	Runs on	Port	Notes
FastAPI app	Docker	8000	API server
PostgreSQL	Docker	5432	Auth, billing, metadata
MinIO	Docker	9000 / 9001	S3-compatible blob & backup storage
Qdrant	Docker	6333 / 6334	Vector search (replaces Pinecone)
Stripe	—	—	Stubbed when keys are empty
OpenAI / LLM	Cloud	—	Only external dependency

Want fully offline AI too? Set LLM_MODEL=ollama/llama3 and LLM_ROUTER_MODEL=ollama/llama3, then add an Ollama container or point at a local Ollama instance. See the LLM provider switching section.

Environment Variables

All variables are loaded from a .env file via Pydantic Settings. Source: app/config/settings.py

Variable	Type	Default	Description
`DATABASE_URL`	`str`	`postgresql+asyncpg://postgres:postgres@localhost:5432/adiuva`	Async SQLAlchemy connection string
`JWT_SECRET`	`str`	`change-me-in-production`	HMAC secret for JWT signing
`JWT_ALGORITHM`	`str`	`HS256`	JWT signing algorithm
`JWT_ACCESS_TOKEN_EXPIRE_MINUTES`	`int`	`30`	Access token time-to-live
`JWT_REFRESH_TOKEN_EXPIRE_DAYS`	`int`	`30`	Refresh token time-to-live
`STRIPE_SECRET_KEY`	`str`	`""`	Stripe API key (empty = stub mode)
`STRIPE_WEBHOOK_SECRET`	`str`	`""`	Stripe webhook signature secret
`S3_BUCKET`	`str`	`""`	S3 bucket for encrypted blobs and backups
`S3_REGION`	`str`	`us-east-1`	AWS region
`S3_ENDPOINT_URL`	`str`	`""`	Custom S3 endpoint (e.g. `http://minio:9000` for MinIO). Leave empty for AWS.
`AWS_ACCESS_KEY_ID`	`str`	`""`	AWS credentials
`AWS_SECRET_ACCESS_KEY`	`str`	`""`	AWS credentials
`PINECONE_API_KEY`	`str`	`""`	Pinecone API key (if set, Pinecone is used for vectors)
`PINECONE_INDEX`	`str`	`adiuva`	Pinecone index name
`QDRANT_URL`	`str`	`""`	Qdrant URL (used when Pinecone is not configured)
`QDRANT_API_KEY`	`str`	`""`	Qdrant API key
`OPENAI_API_KEY`	`str`	`""`	OpenAI key for LLM agent calls
`LLM_MODEL`	`str`	`gpt-4o`	LiteLLM model identifier for agents (e.g. `anthropic/claude-3.5-sonnet`, `gemini/gemini-pro`, `ollama/llama3`)
`LLM_ROUTER_MODEL`	`str`	`gpt-4o-mini`	Lighter model used for intent classification / routing
`CORS_ORIGINS`	`list[str]`	`["app://.", "http://localhost:3000", "http://localhost:5173"]`	Allowed CORS origins
`ENV`	`Literal`	`dev`	`dev` or `prod` — controls `/docs` visibility and SQL echo

API Reference

All routes are prefixed with /api/v1. 27 endpoints total (25 REST + 1 WebSocket + 1 health check).

Health

Method	Path	Auth	Description
`GET`	`/api/v1/health`	No	Returns `{"status": "ok", "version": "0.1.0"}`

Auth

Method	Path	Auth	Description
`POST`	`/api/v1/auth/register`	No	Create account with bcrypt-hashed password, returns `AuthTokens`
`POST`	`/api/v1/auth/login`	No	Validate credentials, returns `AuthTokens`
`POST`	`/api/v1/auth/refresh`	No	Rotate refresh token, returns new `AuthTokens`
`GET`	`/api/v1/auth/me`	JWT	Returns `UserProfile` for the authenticated user

Chat

Method	Path	Auth	Description
`POST`	`/api/v1/chat`	JWT	Route message through the orchestrator; returns `ChatResponse` or `ExecutionPlan` depending on execution mode
`WS`	`/api/v1/chat/stream`	JWT (query param `?token=`)	Streaming chat — first frame is a `ChatRequest`, server yields text chunks, final frame is `{"done": true, "response": "...", "actions": [...]}`. 30-second heartbeat ping.

Plans

Method	Path	Auth	Description
`GET`	`/api/v1/plans/playbook`	JWT	List all cached execution plan playbooks
`GET`	`/api/v1/plans/playbook/{plan_id}`	JWT	Retrieve a specific playbook by ID

Storage (Cloud Records)

Method	Path	Auth	Description
`POST`	`/api/v1/storage/records`	JWT	Upload an E2E encrypted record (verifies checksum, enforces storage quota)
`GET`	`/api/v1/storage/records`	JWT	List record metadata with pagination (`?table`, `?page`, `?limit`); no blob bytes returned
`GET`	`/api/v1/storage/records/{id}`	JWT	Download encrypted blob with `X-Checksum` response header
`PUT`	`/api/v1/storage/records/{id}`	JWT	Replace an existing blob (verifies checksum, enforces quota)
`DELETE`	`/api/v1/storage/records/{id}`	JWT	Delete a record and its S3 blob

Vectors (Cloud Vector Store)

Method	Path	Auth	Description
`POST`	`/api/v1/storage/vectors/upsert`	JWT	Verify checksums and upsert encrypted vectors
`POST`	`/api/v1/storage/vectors/search`	JWT	Search user-scoped vector namespace
`DELETE`	`/api/v1/storage/vectors`	JWT	Delete vectors by ID list

Backup

Method	Path	Auth	Description
`PUT`	`/api/v1/backup`	JWT	Upload encrypted backup blob with custom headers (`X-Backup-Version`, `X-Backup-Timestamp`, `X-Backup-Checksum`). Tier quota enforced.
`GET`	`/api/v1/backup`	JWT	Download latest backup blob. Supports `If-Modified-Since`.
`GET`	`/api/v1/backup/history`	JWT	List backup metadata (no blob content)
`DELETE`	`/api/v1/backup/{backup_id}`	JWT	Delete a specific backup

Plugins (Marketplace)

Method	Path	Auth	Description
`GET`	`/api/v1/plugins`	JWT (Power+)	Browse the marketplace (`?category`, `?q`, `?page`, `?sort=rating\|installs\|newest`)
`GET`	`/api/v1/plugins/{id}`	JWT (Power+)	Plugin detail with install count and ratings
`POST`	`/api/v1/plugins/{id}/install`	JWT (Power+)	Install plugin; triggers Stripe Connect revenue split for paid plugins
`DELETE`	`/api/v1/plugins/{id}/install`	JWT	Uninstall plugin

Billing

Method	Path	Auth	Description
`POST`	`/api/v1/billing/checkout`	JWT	Create a Stripe checkout session, returns `{"checkout_url": "..."}`
`POST`	`/api/v1/billing/webhook`	Stripe signature	Handle Stripe events: `checkout.session.completed`, `customer.subscription.updated`, `customer.subscription.deleted`, `invoice.payment_failed`
`GET`	`/api/v1/billing/subscription`	JWT	Get current subscription information
`DELETE`	`/api/v1/billing/subscription`	JWT	Cancel subscription and revert to free tier

Data Model

9 tables managed by Alembic migrations. Source: app/models.py

Tables

Table	Primary Key	Key Columns	Purpose
`users`	`id` (UUID)	`email` (unique), `password_hash`, `tier`, `stripe_customer_id`, timestamps	User accounts
`refresh_tokens`	`id` (UUID)	`user_id` (FK), `token_hash` (SHA-256, unique), `expires_at`	Hashed refresh tokens for rotation
`subscriptions`	`id` (UUID)	`user_id` (FK, unique), `stripe_subscription_id`, `tier`, `status`, `current_period_end`	Stripe subscription records
`storage_records`	`id` (UUID)	`user_id` (FK), `table_name`, `s3_key`, `checksum`, `size_bytes`, timestamps	S3 blob metadata (no plaintext content)
`backup_metadata`	`id` (UUID)	`user_id` (FK), `s3_key`, `version`, `timestamp`, `checksum`, `size_bytes`	Backup manifests
`plugins`	`id` (String)	`name`, `description`, `version`, `author_id` (FK), `category`, `price_cents`, `permissions` (JSON), `status`, `s3_package_key`, `install_count`, `avg_rating`	Marketplace plugin catalog
`plugin_installations`	`id` (UUID)	`plugin_id` (FK), `user_id` (FK), unique constraint on (`plugin_id`, `user_id`)	Per-user install tracking
`plugin_reviews`	`id` (UUID)	`plugin_id` (FK), `reviewer_id` (FK), `decision`, `notes`, `reviewed_at`	Admin review decisions
`revenue_events`	`id` (UUID)	`plugin_id` (FK), `user_id` (FK), `amount_cents`, `developer_share_cents`, `stripe_transfer_id`	70/30 revenue split ledger

Enum Types

Enum	Values
`billing_tier`	`free`, `pro`, `power`, `team`
`plugin_status`	`pending_review`, `approved`, `rejected`
`review_decision`	`approved`, `rejected`

Migrations

Version	Description
`001_initial_schema`	Creates all 9 tables with indexes and foreign key constraints
`002_seed_plugins`	Seeds 3 approved plugins: GitHub Sync (free), Slack Notifier (€4.99), Time Tracker (€9.99)

AI Agent System

The agent system uses a registry pattern with LangChain tool-calling agents powered by GPT-4o. Source: app/agents/, app/core/agent_registry.py

Architecture

BaseAgent — Abstract base with user_id, shared_memory, and vector_store_context.
ChatAgent(BaseAgent) — Abstract handle(query, context) and get_tools() methods, plus a shared _tool_loop(llm, messages, tools, max_iter=5) for iterative tool calling.
AgentRegistry — Singleton registry with @register decorator, get(name), list_agents(), and call_agent(name, query, context).

Registered Agents

Agent	Registry Name	Tools	Description
TaskAgent	`task_agent`	8	Full task and comment CRUD. Status: `todo` / `in_progress` / `done`. Priority: `high` / `medium` / `low`. Tools: `list_tasks`, `create_task`, `update_task`, `delete_task`, `list_tasks_due_today`, `list_task_comments`, `add_task_comment`, `delete_task_comment`
ProjectAgent	`project_agent`	6	Project lifecycle management. Status: `active` / `archived`. Prefers archiving over deletion. Tools: `list_projects`, `list_all_projects`, `get_project`, `create_project`, `update_project`, `delete_project`
TimelineAgent	`timeline_agent`	4	Project milestones. Requires `project_id` for creation. Supports AI-suggestion and approval workflows. Tools: `list_timelines`, `create_timeline`, `update_timeline`, `delete_timeline`
NoteAgent	`note_agent`	5	Markdown note management. Optionally linked to projects. Tools: `list_notes`, `get_note`, `create_note`, `update_note`, `delete_note`

All agents use the model configured by LLM_MODEL (default: GPT-4o) with temperature=0 via LiteLLM. Tools return JSON action descriptors that the Electron client interprets and applies locally.

Switching LLM Providers

The backend uses LiteLLM as a universal LLM gateway. All agents and the orchestrator instantiate models through a centralized factory in app/core/llm.py. To switch providers, change environment variables — no code changes required:

# OpenAI (default)
LLM_MODEL=gpt-4o
LLM_ROUTER_MODEL=gpt-4o-mini

# Anthropic
LLM_MODEL=anthropic/claude-3.5-sonnet
LLM_ROUTER_MODEL=anthropic/claude-3-haiku

# Google Gemini
LLM_MODEL=gemini/gemini-pro
LLM_ROUTER_MODEL=gemini/gemini-flash

# Local Ollama
LLM_MODEL=ollama/llama3
LLM_ROUTER_MODEL=ollama/llama3

# AWS Bedrock
LLM_MODEL=bedrock/anthropic.claude-v2
LLM_ROUTER_MODEL=bedrock/anthropic.claude-instant-v1

See the LiteLLM provider docs for the full list of 100+ supported providers and model naming conventions.

Orchestration & Execution Plans

Source: app/core/orchestrator.py, app/core/execution_plan.py

Orchestrator

classify_intent(message, context, registry) — Uses the router model (LLM_ROUTER_MODEL, default: GPT-4o-mini) to determine which agent should handle a message. Falls back to task_agent when classification is ambiguous.
route_single(agent_name, message, context) — Routes to a single agent and returns a ChatResponse.
route_pipeline(agent_names, message, context) — Executes agents sequentially; each receives previous_results from earlier agents. A final LLM synthesis step merges all results.
orchestrate(request) — Main entry point. In direct mode, returns a ChatResponse. In plan mode, returns an ExecutionPlan.
orchestrate_stream(request) — Streaming variant that yields 50-character text chunks with a final JSON frame.

Execution Plans

PromptTemplateRegistry — Maps template IDs to server-side prompt text. Clients only ever see opaque IDs, never raw prompts.
ExecutionPlanBuilder — Fluent builder API: add_step(), add_llm_step(template_id, vars), add_data_step(action, data_from_step). Validates step references on build().
PlanCache — LRU cache (maxsize 1000) for storing plans as reusable playbooks.

Built-in Templates (6)

tpl_task_agent_default, tpl_timeline_agent_default, tpl_project_agent_default, tpl_note_agent_default, tpl_task_extract_from_project, tpl_note_weekly_summary

Built-in Playbooks (2)

Playbook	Description
`create_tasks_from_project`	LLM extracts actionable tasks from project context, then creates task records
`generate_weekly_note`	LLM generates a weekly summary, then creates a note record

Middleware

Middleware executes in this order on each request: TierRateLimit → Sanitizer → CORS → Router

JWT Authentication

Source: app/api/middleware/auth.py

FastAPI dependency get_current_user validates the Bearer JWT and extracts user_id and email.
Live tier lookup — The current tier is fetched from the subscriptions table on every request (not cached in the JWT), so upgrades and downgrades take immediate effect.
Falls back to free when no subscription row exists.
Raises 401 Unauthorized on invalid or expired tokens.
Exempt paths: /api/v1/auth/register, /api/v1/auth/login, /api/v1/billing/webhook

Tier-Based Rate Limiter

Source: app/api/middleware/rate_limit.py

TierRateLimitMiddleware — Sliding-window in-process rate limiter (no Redis dependency).
Per-user 60-second window sized by subscription tier:

Tier	Requests / Minute
Free	20
Pro	60
Power	120
Team	200

Returns 429 Too Many Requests with a Retry-After header when the limit is exceeded.
Exempt paths: register, login, webhook, health

Response Sanitizer

Source: app/api/middleware/sanitizer.py

Runs only on /api/v1/chat endpoints.
Scans JSON response bodies and replaces leaked prompt IP fragments with [REDACTED].
Detects: system prompt openers, agent routing metadata, LangChain tool schemas, internal reasoning markers (<thinking>, [INST]), and known prompt fingerprints.
Logs sanitization events as WARNING.
Binary responses (storage, backup) are never touched.

Storage Layer

Blob Store

Source: app/storage/blob_store.py

S3-backed storage for E2E encrypted blobs.
Object keys follow the pattern: {user_id}/{table}/{record_id}
Server-side SSE-S3 encryption at rest (additional layer on top of client-side E2E encryption).
Methods: upload(), download(), delete() (idempotent), list_keys()
The backend never inspects or decrypts blob content.

Vector Store

Source: app/storage/vector_store.py

Runtime-configurable: Pinecone (when PINECONE_API_KEY is set) or Qdrant (fallback).
User isolation: Pinecone uses namespace=user_id; Qdrant filters by user_id payload field.
32-dimensional SHA-256-derived float vectors (deterministic, not semantically meaningful on encrypted data — a documented trade-off for privacy).
Encrypted blobs are stored as base64 in metadata/payload for verbatim retrieval.
Methods: upsert(), search(), delete()

Encryption Utilities

Source: app/storage/encryption.py

verify_checksum(blob, checksum) — SHA-256 hash comparison using hmac.compare_digest (constant-time to prevent timing attacks).
reject_if_tampered(blob, checksum) — Raises HTTP 400 on checksum mismatch.
No decryption key ever reaches the backend.

Billing & Tiers

Source: app/billing/stripe_service.py, app/billing/tier_manager.py

Feature Matrix

Feature	Free	Pro	Power	Team
AI Agents	3	Unlimited	Unlimited	Unlimited
Batch Active	2	10	Unlimited	Unlimited
Cloud Storage	0 GB	5 GB	25 GB	Unlimited
Backup Storage	0 GB	5 GB	25 GB	Unlimited
LLM Providers	1	Unlimited	Unlimited	Unlimited
Batch Builder	—	—	✓	✓
Plugin Marketplace	—	—	✓	✓
SSO	—	—	—	✓
Rate Limit	20 req/min	60 req/min	120 req/min	200 req/min

Stripe Integration

Checkout — create_checkout_session(user_id, tier) creates a Stripe Checkout session. Returns a stub URL when Stripe is not configured.
Webhooks — Handles checkout.session.completed, customer.subscription.updated, customer.subscription.deleted, and invoice.payment_failed.
Subscription management — get_subscription() returns the current subscription record; cancel_subscription() cancels via the Stripe API and reverts the user to the free tier.
Price IDs: price_pro_monthly, price_power_monthly, price_team_monthly

Tier Manager

get_tier(user_id) — Returns the user's current billing tier.
check_feature(tier, feature) — Boolean feature gate check.
require_feature(tier, feature) — Raises HTTP 403 if the feature is not available.
enforce_quota(user_id, tier) / enforce_backup_quota(user_id, tier) — Raises HTTP 402 if storage limits are exceeded.

Plugin Marketplace

Source: app/marketplace/

Plugin Registry

PostgreSQL-backed catalog of submitted and approved plugins.
list_plugins(db, category, query, page, sort) — Paginated listing (page size: 20) with optional filtering by category, text search, and sorting by rating, installs, or newest.
get_plugin(db, plugin_id) — Full manifest with install count and ratings.
submit_plugin(db, manifest, s3_key) — Submits a plugin with pending_review status.
approve_plugin() / reject_plugin(reason) — Admin workflow for plugin approval.
record_install() / record_uninstall() — Tracks per-user installations and updates install counts.

Review Queue

Automated security checklist before human review:
- Plugin ID must match ^[a-z0-9-]+$
- Permissions must be from the allowed set only
- No binary blobs in the manifest
Allowed permissions: read:tasks, write:tasks, read:projects, write:projects, read:notes, write:notes, read:timelines, write:timelines, read:calendar, write:calendar
get_pending(db) — Lists plugins awaiting review.
submit_review(db, plugin_id, reviewer_id, decision, notes) — Records the review decision.

70% developer / 30% platform split on all paid plugin sales.
record_install(db, plugin_id, user_id, amount_cents) — Records the revenue event and triggers a Stripe Connect transfer for the developer share.
get_earnings(db, developer_id, period) — Aggregated earnings report for plugin developers.
Gracefully stubs transfers when Stripe is not configured.

Seed Plugins

Plugin	Category	Price
GitHub Sync	Productivity	Free
Slack Notifier	Communication	€4.99
Time Tracker	Productivity	€9.99

Testing

Running Tests

# Run all tests
pytest

# Run a specific test file
pytest tests/test_auth.py

# Run with verbose output
pytest -v

Test Infrastructure

Database: Async SQLite in-memory via aiosqlite + StaticPool — fast, no PostgreSQL needed.
S3 mock: moto[s3] with a fixture that patches BlobStore settings.
Auth helpers: make_jwt(tier) and auth_header(tier) generate per-tier test tokens.
Seed data: Auto-creates one User + Subscription per tier (free/pro/power/team) before each test.
Plugin seeds: Fixture adds 3 approved plugins for marketplace tests.
FK enforcement: SQLite PRAGMA foreign_keys=ON.
No external dependencies — all tests run fully offline.

Test Coverage

File	Coverage
`test_auth.py`	Register, login, token access, refresh, expiration
`test_orchestrator.py`	Intent classification, single agent routing, pipeline, plan mode
`test_agents.py`	Each agent with mocked LLM: registration, tools, handle method
`test_storage.py`	Create, list, download, update, delete records; checksum rejection; quota enforcement
`test_backup.py`	Upload, download, history, delete; tier-based storage limits
`test_plugins.py`	List, install, uninstall, revenue events, tier gate enforcement
`test_agent_registry.py`	Registry singleton, registration, lookup, listing
`test_execution_plan.py`	Plan builder, template registry, plan cache
`test_middleware.py`	Rate limiting by tier, sanitizer prompt leak detection

Project Structure

adiuva-api/
├── alembic.ini                  # Alembic configuration
├── BACKEND_PLAN.md              # Architecture & design decisions
├── docker-compose.yml           # Docker Compose (app + PostgreSQL)
├── Dockerfile                   # Multi-stage production build
├── requirements.txt             # Python dependencies
│
├── alembic/                     # Database migrations
│   ├── env.py                   # Alembic environment config
│   ├── script.py.mako           # Migration template
│   └── versions/
│       ├── 001_initial_schema.py    # Tables, indexes, FKs
│       └── 002_seed_plugins.py      # Seed marketplace plugins
│
├── app/                         # Application source
│   ├── main.py                  # FastAPI app factory, middleware, routes
│   ├── db.py                    # Async SQLAlchemy engine & session
│   ├── models.py                # SQLAlchemy ORM models (9 tables)
│   ├── schemas.py               # Pydantic request/response schemas
│   │
│   ├── config/
│   │   └── settings.py          # Pydantic Settings (env vars)
│   │
│   ├── agents/                  # LLM-powered domain agents
│   │   ├── task_agent.py        # Task & comment CRUD (8 tools)
│   │   ├── project_agent.py     # Project lifecycle (6 tools)
│   │   ├── timeline_agent.py  # Milestones (4 tools)
│   │   └── note_agent.py        # Markdown notes (5 tools)
│   │
│   ├── core/                    # Orchestration engine
│   │   ├── agent_registry.py    # BaseAgent, ChatAgent, AgentRegistry
│   │   ├── llm.py               # LiteLLM factory (get_llm, get_router_llm)
│   │   ├── orchestrator.py      # Intent classification & routing
│   │   └── execution_plan.py    # Plan builder, templates, cache
│   │
│   ├── api/                     # HTTP layer
│   │   ├── deps.py              # Shared FastAPI dependencies
│   │   ├── middleware/
│   │   │   ├── auth.py          # JWT validation, live tier lookup
│   │   │   ├── rate_limit.py    # Sliding-window tier rate limiter
│   │   │   └── sanitizer.py     # Prompt IP leak protection
│   │   └── routes/
│   │       ├── auth.py          # Register, login, refresh, me
│   │       ├── chat.py          # Chat + WebSocket streaming
│   │       ├── plans.py         # Execution plan playbooks
│   │       ├── storage.py       # E2E encrypted record CRUD
│   │       ├── vectors.py       # Vector upsert, search, delete
│   │       ├── backup.py        # Encrypted backup management
│   │       ├── plugins.py       # Marketplace browse & install
│   │       └── billing.py       # Stripe checkout & webhooks
│   │
│   ├── storage/                 # Storage backends
│   │   ├── blob_store.py        # S3 blob storage
│   │   ├── vector_store.py      # Pinecone / Qdrant vector store
│   │   └── encryption.py        # Checksum verification utilities
│   │
│   ├── billing/                 # Subscription management
│   │   ├── stripe_service.py    # Stripe API integration
│   │   └── tier_manager.py      # Feature matrix & quota enforcement
│   │
│   └── marketplace/             # Plugin ecosystem
│       ├── plugin_registry.py   # Catalog CRUD & search
│       ├── plugin_review.py     # Security checklist & review queue
│       └── revenue_share.py     # 70/30 split & Stripe Connect
│
└── tests/                       # Test suite
    ├── conftest.py              # Fixtures: DB, S3, auth, seeds
    ├── test_auth.py
    ├── test_orchestrator.py
    ├── test_agents.py
    ├── test_storage.py
    ├── test_backup.py
    ├── test_plugins.py
    ├── test_agent_registry.py
    ├── test_execution_plan.py
    └── test_middleware.py

License

To be determined.

README.md

Adiuva Cloud API

Table of Contents

Overview

Design Principles

Architecture

Key Features

Tech Stack

Getting Started

Prerequisites

Installation

Database Setup

Run the Development Server

Docker Deployment

Quick Start

Dockerfile Details

Homelab / Self-Hosted Deployment

1. Start all services

2. Create the MinIO bucket

3. Configure your .env

4. Run migrations

What runs where

Environment Variables

API Reference

Health

Auth

Chat

Plans

Storage (Cloud Records)

Vectors (Cloud Vector Store)

Backup

Plugins (Marketplace)

Billing

Data Model

Tables

Enum Types

Migrations

AI Agent System

Architecture

Registered Agents

Switching LLM Providers

Orchestration & Execution Plans

Orchestrator

Execution Plans

Built-in Templates (6)

Built-in Playbooks (2)

Middleware

JWT Authentication

Tier-Based Rate Limiter

Response Sanitizer

Storage Layer

Blob Store

Vector Store

Encryption Utilities

Billing & Tiers

Feature Matrix

Stripe Integration

Tier Manager

Plugin Marketplace

Plugin Registry

Review Queue

Revenue Sharing

Seed Plugins

Testing

Running Tests

Test Infrastructure

Test Coverage

Project Structure

License

3. Configure your `.env`