Roberto Musso 5753f8def9 refactor: remove storage, backup, plugin/marketplace features
- Delete app/storage/ (blob_store, vector_store, encryption)
- Delete app/marketplace/ (plugin_registry, plugin_review, revenue_share)
- Delete routes: backup.py, plugins.py, storage.py, vectors.py
- Relocate embed endpoint to POST /chat/embed
- Rewrite migration 001 (remove storage/plugin tables)
- Delete migration 002 (seed_plugins)
- Remove S3/Pinecone/Qdrant env vars from settings
- Remove storage/backup quotas from tier_manager
- Remove MinIO and Qdrant from docker-compose
- Delete tests: test_backup, test_plugins, test_storage
- Update README.md and clean .env.example
2026-04-08 00:47:37 +02:00
2026-03-03 15:14:04 +01:00
2026-03-20 23:48:36 +01:00
2026-03-03 12:39:32 +01:00
2026-03-03 15:14:04 +01:00
2026-03-10 09:11:24 +01:00

Adiuva Cloud API

AI-powered project management backend with LLM orchestration and subscription billing.

Built with FastAPI · Python 3.12 · PostgreSQL · LangChain · Stripe


Table of Contents


Overview

Adiuva Cloud API is the FastAPI backend that powers the Adiuva Electron desktop app. It provides LLM-powered chat orchestration, text embedding generation, and Stripe-based subscription billing across four tiers.

Design Principles

  1. Never expose prompts — system prompts stay server-side; responses are sanitized to strip any leaked prompt fragments.
  2. Stateless request handling — all context comes from the client and JWT; no server-side session state.
  3. Tier gates enforced server-side — the server always reads the current tier from the database, never trusting client-reported values.

Architecture

┌──────────────┐      ┌────────────────────────────────────────────────────────┐
│  Electron    │      │  FastAPI  (Uvicorn / Gunicorn)                         │
│  Desktop App │────▶│                                                        │
│  (Client)    │◀────│  Middleware: RateLimit → Sanitizer → CORS → Router     │
└──────────────┘      │                                                        │
                      │  ┌──────────────────┐  ┌────────────────────────────┐  │
                      │  │  Auth Routes     │  │  Chat Routes               │  │
                      │  │  Billing Routes  │  │    ↓                       │  │
                      │  │  Agent Routes    │  │  Orchestrator (GPT-4o-mini)│  │
                      │  │  Device WS       │  │    ↓ classify intent       │  │
                      │  └──────────────────┘  │  Agent Registry            │  │
                      │                        │    ↓                       │  │
                      │                        │  TaskAgent  | ProjectAgent │  │
                      │                        │  NoteAgent  | CheckptAgent │  │
                      │                        │  (GPT-4o + LangChain)      │  │
                      │                        └────────────────────────────┘  │
                      └────────────────────────────────────────────────────────┘
                               │
                      ┌────────▼───┐
                      │ PostgreSQL │
                      │ (Auth,     │
                      │  Billing,  │
                      │  Agents)   │
                      └────────────┘
                               │
                      ┌────────▼───┐
                      │  Stripe    │
                      │  (Billing) │
                      └────────────┘

Key Features

  1. LLM-powered orchestration — GPT-4o-mini classifies user intent and routes to the appropriate domain agent.
  2. 4 specialized AI agents — Tasks (8 tools), Projects (6 tools), Timelines (4 tools), Notes (5 tools), all powered by GPT-4o via LangChain.
  3. Execution plans & playbooks — Server-side prompt template registry; clients receive only opaque template IDs, never raw prompts.
  4. Text embeddings — Generates text-embedding-3-small vectors for local client-side note search.
  5. Stripe billing — Four-tier subscription model (Free / Pro / Power / Team) with checkout sessions and full webhook lifecycle handling.
  6. JWT authentication — Access + refresh tokens with bcrypt password hashing, SHA-256 token hashing, and automatic rotation.
  7. Prompt IP protection — Sanitizer middleware strips system prompts, reasoning markers, tool schemas, and agent routing metadata from all chat responses.
  8. Tier-based rate limiting — Sliding-window per-user limiter scaling from 20 to 200 requests/min by subscription tier.
  9. WebSocket streaming — Real-time chat with 30-second heartbeat keep-alive and chunked text delivery.
  10. Alembic migrations — Versioned schema management.
  11. Comprehensive test suite — In-memory SQLite, per-tier test fixtures, and full API coverage without external dependencies.

Tech Stack

Package Version Purpose
fastapi ≥ 0.115.0 Web framework
uvicorn[standard] ≥ 0.34.0 ASGI development server
gunicorn ≥ 22.0.0 Production process manager
langchain ≥ 0.3.0 LLM orchestration framework
langchain-openai ≥ 0.3.0 OpenAI LLM provider integration
litellm ≥ 1.50.0 Universal LLM gateway (100+ providers)
pydantic ≥ 2.10.0 Data validation and serialization
pydantic-settings ≥ 2.7.0 Environment-based configuration
python-jose[cryptography] ≥ 3.3.0 JWT encoding and decoding
stripe ≥ 11.0.0 Billing and payment integration
slowapi ≥ 0.1.9 Rate limiting utilities
sqlalchemy ≥ 2.0.0 Async ORM and query builder
asyncpg ≥ 0.30.0 PostgreSQL async driver
alembic ≥ 1.14.0 Database migration management
bcrypt ≥ 4.2.0 Password hashing
python-dotenv ≥ 1.0.0 .env file loading
httpx ≥ 0.28.0 Async HTTP client (used in tests)
websockets ≥ 14.0 WebSocket protocol support
psycopg2-binary ≥ 2.9.0 Synchronous PostgreSQL driver (Alembic)
pytest ≥ 8.0.0 Test framework
pytest-asyncio ≥ 0.24.0 Async test support
aiosqlite ≥ 0.20.0 In-memory SQLite for tests
ruff ≥ 0.8.0 Linter and formatter

Getting Started

Prerequisites

  • Python 3.12+
  • PostgreSQL 16+
  • An OpenAI API key (for LLM features)
  • Stripe API keys (optional — billing stubs gracefully when unconfigured)

Installation

# Clone the repository
git clone <repo-url> && cd adiuva-api

# Create a virtual environment
python -m venv .venv && source .venv/bin/activate

# Install dependencies
pip install -r requirements.txt

# Configure environment
cp .env.example .env
# Edit .env with your DATABASE_URL, OPENAI_API_KEY, etc.

Database Setup

# Start PostgreSQL (or use the Docker Compose database)
docker compose up db -d

# Run migrations
alembic upgrade head

Run the Development Server

uvicorn app.main:app --reload --host 0.0.0.0 --port 8000

Interactive API docs are available at http://localhost:8000/docs in development mode (ENV=dev). The /docs endpoint is disabled in production.


Docker Deployment

Quick Start

docker compose up --build

This starts two services:

  • app — FastAPI server on port 8000
  • db — PostgreSQL 16 (Alpine) on port 5432 with a persistent volume and health checks

Dockerfile Details

The Dockerfile uses a multi-stage build:

  1. Builder stage — Installs Python dependencies into a virtual environment.
  2. Runtime stage — Copies only the venv, app source, and Alembic migrations. Runs as a non-root user (appuser).
  3. Production server — Gunicorn with 4 Uvicorn workers, 120-second timeout, listening on port 8000.
# Production command (run by the container)
gunicorn app.main:app -k uvicorn.workers.UvicornWorker -w 4 --timeout 120 -b 0.0.0.0:8000

Homelab / Self-Hosted Deployment

You can run the entire stack locally on a homelab with no cloud dependencies except the LLM provider.

1. Start all services

docker compose up -d

This starts PostgreSQL alongside the app.

2. Configure your .env

# Database (uses the compose PostgreSQL)
DATABASE_URL=postgresql+asyncpg://postgres:postgres@db:5432/adiuva

# Billing — leave empty to stub (no Stripe needed)
STRIPE_SECRET_KEY=
STRIPE_WEBHOOK_SECRET=

# LLM — the only external service
OPENAI_API_KEY=sk-...
LLM_MODEL=gpt-4o
LLM_ROUTER_MODEL=gpt-4o-mini

# Auth
JWT_SECRET=your-secret-here
ENV=dev

3. Run migrations

docker compose exec app alembic upgrade head

What runs where

Service Runs on Port Notes
FastAPI app Docker 8000 API server
PostgreSQL Docker 5432 Auth, billing, agents
Stripe Stubbed when keys are empty
OpenAI / LLM Cloud Only external dependency

Want fully offline AI too? Set LLM_MODEL=ollama/llama3 and LLM_ROUTER_MODEL=ollama/llama3, then add an Ollama container or point at a local Ollama instance. See the LLM provider switching section.


Environment Variables

All variables are loaded from a .env file via Pydantic Settings. Source: app/config/settings.py

Variable Type Default Description
DATABASE_URL str postgresql+asyncpg://postgres:postgres@localhost:5432/adiuva Async SQLAlchemy connection string
JWT_SECRET str change-me-in-production HMAC secret for JWT signing
JWT_ALGORITHM str HS256 JWT signing algorithm
JWT_ACCESS_TOKEN_EXPIRE_MINUTES int 30 Access token time-to-live
JWT_REFRESH_TOKEN_EXPIRE_DAYS int 30 Refresh token time-to-live
STRIPE_SECRET_KEY str "" Stripe API key (empty = stub mode)
STRIPE_WEBHOOK_SECRET str \"\" Stripe webhook signature secret
LLM_MODEL str gpt-4o LiteLLM model identifier for agents (e.g. anthropic/claude-3.5-sonnet, gemini/gemini-pro, ollama/llama3)
LLM_ROUTER_MODEL str gpt-4o-mini Lighter model used for intent classification / routing
CORS_ORIGINS list[str] ["app://.", "http://localhost:3000", "http://localhost:5173"] Allowed CORS origins
ENV Literal dev dev or prod — controls /docs visibility and SQL echo

API Reference

All routes are prefixed with /api/v1. 27 endpoints total (25 REST + 1 WebSocket + 1 health check).

Health

Method Path Auth Description
GET /api/v1/health No Returns {"status": "ok", "version": "0.1.0"}

Auth

Method Path Auth Description
POST /api/v1/auth/register No Create account with bcrypt-hashed password, returns AuthTokens
POST /api/v1/auth/login No Validate credentials, returns AuthTokens
POST /api/v1/auth/refresh No Rotate refresh token, returns new AuthTokens
GET /api/v1/auth/me JWT Returns UserProfile for the authenticated user

Chat

Method Path Auth Description
POST /api/v1/chat JWT Route message through the orchestrator; returns ChatResponse or ExecutionPlan depending on execution mode
POST /api/v1/chat/embed JWT Generate a 1536-dim text embedding vector (text-embedding-3-small). Used by Electron for local note search.
WS /api/v1/chat/stream JWT (query param ?token=) Streaming chat — first frame is a ChatRequest, server yields text chunks, final frame is {"done": true, "response": "...", "actions": [...]}. 30-second heartbeat ping.

Plans

Method Path Auth Description
GET /api/v1/plans/playbook JWT List all cached execution plan playbooks
GET /api/v1/plans/playbook/{plan_id} JWT Retrieve a specific playbook by ID

Billing

Method Path Auth Description
POST /api/v1/billing/checkout JWT Create a Stripe checkout session, returns {"checkout_url": "..."}
POST /api/v1/billing/webhook Stripe signature Handle Stripe events: checkout.session.completed, customer.subscription.updated, customer.subscription.deleted, invoice.payment_failed
GET /api/v1/billing/subscription JWT Get current subscription information
DELETE /api/v1/billing/subscription JWT Cancel subscription and revert to free tier

Data Model

3 tables managed by Alembic migrations. Source: app/models.py

Tables

Table Primary Key Key Columns Purpose
users id (UUID) email (unique), password_hash, tier, stripe_customer_id, timestamps User accounts
refresh_tokens id (UUID) user_id (FK), token_hash (SHA-256, unique), expires_at Hashed refresh tokens for rotation
subscriptions id (UUID) user_id (FK, unique), stripe_subscription_id, tier, status, current_period_end Stripe subscription records

Enum Types

Enum Values
billing_tier free, pro, power, team

Migrations

Version Description
001_initial_schema Creates core auth and billing tables with indexes and foreign key constraints

AI Agent System

The agent system uses a registry pattern with LangChain tool-calling agents powered by GPT-4o. Source: app/agents/, app/core/agent_registry.py

Architecture

  • BaseAgent — Abstract base with user_id and shared_memory.
  • ChatAgent(BaseAgent) — Abstract handle(query, context) and get_tools() methods, plus a shared _tool_loop(llm, messages, tools, max_iter=5) for iterative tool calling.
  • AgentRegistry — Singleton registry with @register decorator, get(name), list_agents(), and call_agent(name, query, context).

Registered Agents

Agent Registry Name Tools Description
TaskAgent task_agent 8 Full task and comment CRUD. Status: todo / in_progress / done. Priority: high / medium / low. Tools: list_tasks, create_task, update_task, delete_task, list_tasks_due_today, list_task_comments, add_task_comment, delete_task_comment
ProjectAgent project_agent 6 Project lifecycle management. Status: active / archived. Prefers archiving over deletion. Tools: list_projects, list_all_projects, get_project, create_project, update_project, delete_project
TimelineAgent timeline_agent 4 Project milestones. Requires project_id for creation. Supports AI-suggestion and approval workflows. Tools: list_timelines, create_timeline, update_timeline, delete_timeline
NoteAgent note_agent 5 Markdown note management. Optionally linked to projects. Tools: list_notes, get_note, create_note, update_note, delete_note

All agents use the model configured by LLM_MODEL (default: GPT-4o) with temperature=0 via LiteLLM. Tools return JSON action descriptors that the Electron client interprets and applies locally.

Switching LLM Providers

The backend uses LiteLLM as a universal LLM gateway. All agents and the orchestrator instantiate models through a centralized factory in app/core/llm.py. To switch providers, change environment variables — no code changes required:

# OpenAI (default)
LLM_MODEL=gpt-4o
LLM_ROUTER_MODEL=gpt-4o-mini

# Anthropic
LLM_MODEL=anthropic/claude-3.5-sonnet
LLM_ROUTER_MODEL=anthropic/claude-3-haiku

# Google Gemini
LLM_MODEL=gemini/gemini-pro
LLM_ROUTER_MODEL=gemini/gemini-flash

# Local Ollama
LLM_MODEL=ollama/llama3
LLM_ROUTER_MODEL=ollama/llama3

# AWS Bedrock
LLM_MODEL=bedrock/anthropic.claude-v2
LLM_ROUTER_MODEL=bedrock/anthropic.claude-instant-v1

See the LiteLLM provider docs for the full list of 100+ supported providers and model naming conventions.


Orchestration & Execution Plans

Source: app/core/orchestrator.py, app/core/execution_plan.py

Orchestrator

  1. classify_intent(message, context, registry) — Uses the router model (LLM_ROUTER_MODEL, default: GPT-4o-mini) to determine which agent should handle a message. Falls back to task_agent when classification is ambiguous.
  2. route_single(agent_name, message, context) — Routes to a single agent and returns a ChatResponse.
  3. route_pipeline(agent_names, message, context) — Executes agents sequentially; each receives previous_results from earlier agents. A final LLM synthesis step merges all results.
  4. orchestrate(request) — Main entry point. In direct mode, returns a ChatResponse. In plan mode, returns an ExecutionPlan.
  5. orchestrate_stream(request) — Streaming variant that yields 50-character text chunks with a final JSON frame.

Execution Plans

  • PromptTemplateRegistry — Maps template IDs to server-side prompt text. Clients only ever see opaque IDs, never raw prompts.
  • ExecutionPlanBuilder — Fluent builder API: add_step(), add_llm_step(template_id, vars), add_data_step(action, data_from_step). Validates step references on build().
  • PlanCache — LRU cache (maxsize 1000) for storing plans as reusable playbooks.

Built-in Templates (6)

tpl_task_agent_default, tpl_timeline_agent_default, tpl_project_agent_default, tpl_note_agent_default, tpl_task_extract_from_project, tpl_note_weekly_summary

Built-in Playbooks (2)

Playbook Description
create_tasks_from_project LLM extracts actionable tasks from project context, then creates task records
generate_weekly_note LLM generates a weekly summary, then creates a note record

Middleware

Middleware executes in this order on each request: TierRateLimit → Sanitizer → CORS → Router

JWT Authentication

Source: app/api/middleware/auth.py

  • FastAPI dependency get_current_user validates the Bearer JWT and extracts user_id and email.
  • Live tier lookup — The current tier is fetched from the subscriptions table on every request (not cached in the JWT), so upgrades and downgrades take immediate effect.
  • Falls back to free when no subscription row exists.
  • Raises 401 Unauthorized on invalid or expired tokens.
  • Exempt paths: /api/v1/auth/register, /api/v1/auth/login, /api/v1/billing/webhook

Tier-Based Rate Limiter

Source: app/api/middleware/rate_limit.py

  • TierRateLimitMiddleware — Sliding-window in-process rate limiter (no Redis dependency).
  • Per-user 60-second window sized by subscription tier:
Tier Requests / Minute
Free 20
Pro 60
Power 120
Team 200
  • Returns 429 Too Many Requests with a Retry-After header when the limit is exceeded.
  • Exempt paths: register, login, webhook, health

Response Sanitizer

Source: app/api/middleware/sanitizer.py

  • Runs only on /api/v1/chat endpoints.
  • Scans JSON response bodies and replaces leaked prompt IP fragments with [REDACTED].
  • Detects: system prompt openers, agent routing metadata, LangChain tool schemas, internal reasoning markers (<thinking>, [INST]), and known prompt fingerprints.
  • Logs sanitization events as WARNING.

Billing & Tiers

Source: app/billing/stripe_service.py, app/billing/tier_manager.py

Feature Matrix

Feature Free Pro Power Team
AI Agents 3 Unlimited Unlimited Unlimited
Batch Active 2 10 Unlimited Unlimited
LLM Providers 1 Unlimited Unlimited Unlimited
Batch Builder
SSO
Rate Limit 20 req/min 60 req/min 120 req/min 200 req/min

Stripe Integration

  • Checkoutcreate_checkout_session(user_id, tier) creates a Stripe Checkout session. Returns a stub URL when Stripe is not configured.
  • Webhooks — Handles checkout.session.completed, customer.subscription.updated, customer.subscription.deleted, and invoice.payment_failed.
  • Subscription managementget_subscription() returns the current subscription record; cancel_subscription() cancels via the Stripe API and reverts the user to the free tier.
  • Price IDs: price_pro_monthly, price_power_monthly, price_team_monthly

Tier Manager

  • get_tier(user_id) — Returns the user's current billing tier.
  • check_feature(tier, feature) — Boolean feature gate check.
  • require_feature(tier, feature) — Raises HTTP 403 if the feature is not available.

Testing

Running Tests

# Run all tests
pytest

# Run a specific test file
pytest tests/test_auth.py

# Run with verbose output
pytest -v

Test Infrastructure

  • Database: Async SQLite in-memory via aiosqlite + StaticPool — fast, no PostgreSQL needed.
  • Auth helpers: make_jwt(tier) and auth_header(tier) generate per-tier test tokens.
  • Seed data: Auto-creates one User + Subscription per tier (free/pro/power/team) before each test.
  • FK enforcement: SQLite PRAGMA foreign_keys=ON.
  • No external dependencies — all tests run fully offline.

Test Coverage

File Coverage
test_auth.py Register, login, token access, refresh, expiration
test_middleware.py Rate limiting by tier, sanitizer prompt leak detection

Project Structure

adiuva-api/
├── alembic.ini                  # Alembic configuration
├── docker-compose.yml           # Docker Compose (app + PostgreSQL)
├── Dockerfile                   # Multi-stage production build
├── requirements.txt             # Python dependencies
│
├── alembic/                     # Database migrations
│   ├── env.py                   # Alembic environment config
│   ├── script.py.mako           # Migration template
│   └── versions/
│       └── 001_initial_schema.py    # Tables, indexes, FKs
│
├── app/                         # Application source
│   ├── main.py                  # FastAPI app factory, middleware, routes
│   ├── db.py                    # Async SQLAlchemy engine & session
│   ├── models.py                # SQLAlchemy ORM models
│   ├── schemas.py               # Pydantic request/response schemas
│   │
│   ├── config/
│   │   └── settings.py          # Pydantic Settings (env vars)
│   │
│   ├── agents/                  # LLM-powered domain agents
│   │   ├── task_agent.py        # Task & comment CRUD (8 tools)
│   │   ├── project_agent.py     # Project lifecycle (6 tools)
│   │   ├── timeline_agent.py    # Milestones (4 tools)
│   │   └── note_agent.py        # Markdown notes (5 tools)
│   │
│   ├── core/                    # Orchestration engine
│   │   ├── agent_registry.py    # BaseAgent, ChatAgent, AgentRegistry
│   │   ├── llm.py               # LiteLLM factory (get_llm, get_router_llm)
│   │   └── deep_agent.py        # Deep agent orchestration
│   │
│   ├── api/                     # HTTP layer
│   │   ├── deps.py              # Shared FastAPI dependencies
│   │   ├── middleware/
│   │   │   ├── rate_limit.py    # Sliding-window tier rate limiter
│   │   │   └── sanitizer.py     # Prompt IP leak protection
│   │   └── routes/
│   │       ├── auth.py          # Register, login, refresh, me
│   │       ├── chat.py          # Chat + embed endpoint
│   │       ├── billing.py       # Stripe checkout, webhooks, subscription
│   │       ├── agents.py        # Agent catalog, config, runs
│   │       └── device_ws.py     # Persistent device WebSocket
│   │
│   └── billing/
│       ├── stripe_service.py    # Stripe API wrapper
│       └── tier_manager.py      # Feature matrix, rate limits
│
└── tests/                       # Test suite
    ├── conftest.py              # Fixtures: DB, auth, seeds
    ├── test_auth.py
    ├── test_orchestrator.py
    ├── test_agents.py
    ├── test_agent_registry.py
    ├── test_execution_plan.py
    └── test_middleware.py

License

To be determined.

Description
No description provided
Readme 2.7 MiB
Languages
Python 99.4%
Dockerfile 0.4%
Mako 0.2%