- shared/config.py: add LANGFUSE_SECRET_KEY, LANGFUSE_PUBLIC_KEY, LANGFUSE_HOST - services/chat/app/tracing.py: new module — Langfuse client singleton, create_trace(), get_langfuse_callback(), get_prompt(), link_prompt_to_trace(), score_trace(), flush/shutdown helpers. Gracefully no-ops when keys are missing. - services/chat/app/llm.py: add callbacks param to get_llm() for LangChain callback handler injection - services/chat/app/deep_agent.py: accept langfuse_handler in all run_* and _run_single_agent* functions, pipe callbacks to LLM calls, fetch managed prompts from Langfuse with fallback to hardcoded system prompts - services/chat/app/redis_consumer.py: create Langfuse trace per request (home_request/floating_request), pass callback handler to deep_agent, link prompt name to trace, attach output preview, flush after each request - services/chat/app/main.py: shutdown Langfuse client in lifespan teardown - services/chat/requirements.txt: add langfuse>=2.0.0 Langfuse prompt names: 'home_system', 'floating_system' — create these in the Langfuse dashboard to manage prompts. Without them, hardcoded defaults are used transparently.
Adiuva Cloud API
AI-powered project management backend with E2E encrypted cloud storage, LLM orchestration, and a plugin marketplace.
Built with FastAPI · Python 3.12 · PostgreSQL · LangChain · Stripe · AWS S3
Table of Contents
- Overview
- Architecture
- Key Features
- Tech Stack
- Getting Started
- Docker Deployment
- Environment Variables
- API Reference
- Data Model
- AI Agent System
- Orchestration & Execution Plans
- Middleware
- Storage Layer
- Billing & Tiers
- Plugin Marketplace
- Testing
- Project Structure
- License
Overview
Adiuva Cloud API is the FastAPI backend that powers the Adiuva Electron desktop app. It provides LLM-powered chat orchestration, end-to-end encrypted cloud storage, a vector search engine, an encrypted backup system, a plugin marketplace with revenue sharing, and Stripe-based subscription billing across four tiers.
Design Principles
- Never persist user data in plaintext — the database stores only auth, billing, storage metadata, and marketplace data. All user content is E2E encrypted by the client before reaching the server.
- Never expose prompts — system prompts stay server-side; responses are sanitized to strip any leaked prompt fragments.
- Never decrypt user blobs — the backend performs only checksum verification; no decryption keys ever reach the server.
- Stateless request handling — all context comes from the client and JWT; no server-side session state.
- Tier gates enforced server-side — the server always reads the current tier from the database, never trusting client-reported values.
Architecture
┌──────────────┐ ┌────────────────────────────────────────────────────────┐
│ Electron │ │ FastAPI (Uvicorn / Gunicorn) │
│ Desktop App │────▶│ │
│ (Client) │◀────│ Middleware: RateLimit → Sanitizer → CORS → Router │
└──────────────┘ │ │
│ ┌──────────────────┐ ┌────────────────────────────┐ │
│ │ Auth Routes │ │ Chat Routes │ │
│ │ Billing Routes │ │ ↓ │ │
│ │ Storage Routes │ │ Orchestrator (GPT-4o-mini)│ │
│ │ Backup Routes │ │ ↓ classify intent │ │
│ │ Plugin Routes │ │ Agent Registry │ │
│ │ Vector Routes │ │ ↓ │ │
│ │ Plans Routes │ │ TaskAgent | ProjectAgent │ │
│ └──────────────────┘ │ NoteAgent | CheckptAgent │ │
│ │ (GPT-4o + LangChain) │ │
│ └────────────────────────────┘ │
└────────────────────────────────────────────────────────┘
│ │ │
┌────────▼───┐ ┌───────▼───────┐ ┌──▼─────────────┐
│ PostgreSQL │ │ AWS S3 │ │ Pinecone / │
│ (Auth, │ │ (E2E blobs, │ │ Qdrant │
│ Billing, │ │ backups) │ │ (Vectors) │
│ Metadata) │ └───────────────┘ └────────────────┘
└────────────┘
│
┌────────▼───┐
│ Stripe │
│ (Billing, │
│ Connect) │
└────────────┘
Key Features
- LLM-powered orchestration — GPT-4o-mini classifies user intent and routes to the appropriate domain agent.
- 4 specialized AI agents — Tasks (8 tools), Projects (6 tools), Timelines (4 tools), Notes (5 tools), all powered by GPT-4o via LangChain.
- Execution plans & playbooks — Server-side prompt template registry; clients receive only opaque template IDs, never raw prompts.
- E2E encrypted cloud storage — The backend never decrypts user data; SHA-256 checksum verification uses constant-time comparison to prevent timing attacks.
- Cloud vector store — Pinecone or Qdrant with user-isolated namespaces and encrypted blob payloads.
- Encrypted backup system — Tiered storage limits with
If-Modified-Sincesupport for efficient syncing. - Plugin marketplace — Catalog, admin review/approval workflow, security checklist, and 70/30 revenue sharing via Stripe Connect.
- Stripe billing — Four-tier subscription model (Free / Pro / Power / Team) with checkout sessions and full webhook lifecycle handling.
- JWT authentication — Access + refresh tokens with bcrypt password hashing, SHA-256 token hashing, and automatic rotation.
- Prompt IP protection — Sanitizer middleware strips system prompts, reasoning markers, tool schemas, and agent routing metadata from all chat responses.
- Tier-based rate limiting — Sliding-window per-user limiter scaling from 20 to 200 requests/min by subscription tier.
- Zero-trust data model — User content is never stored in plaintext; the database holds only authentication, billing, and metadata records.
- WebSocket streaming — Real-time chat with 30-second heartbeat keep-alive and chunked text delivery.
- Alembic migrations — Versioned schema management with seed data for the plugin marketplace.
- Comprehensive test suite — In-memory SQLite + moto S3 mocks, per-tier test fixtures, and full API coverage without external dependencies.
Tech Stack
| Package | Version | Purpose |
|---|---|---|
fastapi |
≥ 0.115.0 | Web framework |
uvicorn[standard] |
≥ 0.34.0 | ASGI development server |
gunicorn |
≥ 22.0.0 | Production process manager |
langchain |
≥ 0.3.0 | LLM orchestration framework |
langchain-openai |
≥ 0.3.0 | OpenAI LLM provider integration |
litellm |
≥ 1.50.0 | Universal LLM gateway (100+ providers) |
pydantic |
≥ 2.10.0 | Data validation and serialization |
pydantic-settings |
≥ 2.7.0 | Environment-based configuration |
python-jose[cryptography] |
≥ 3.3.0 | JWT encoding and decoding |
stripe |
≥ 11.0.0 | Billing and payment integration |
boto3 |
≥ 1.35.0 | AWS S3 client |
slowapi |
≥ 0.1.9 | Rate limiting utilities |
sqlalchemy |
≥ 2.0.0 | Async ORM and query builder |
asyncpg |
≥ 0.30.0 | PostgreSQL async driver |
alembic |
≥ 1.14.0 | Database migration management |
bcrypt |
≥ 4.2.0 | Password hashing |
python-dotenv |
≥ 1.0.0 | .env file loading |
httpx |
≥ 0.28.0 | Async HTTP client (used in tests) |
websockets |
≥ 14.0 | WebSocket protocol support |
psycopg2-binary |
≥ 2.9.0 | Synchronous PostgreSQL driver (Alembic) |
pinecone |
≥ 5.0.0 | Pinecone vector store client |
qdrant-client |
≥ 1.7.0 | Qdrant vector store client |
pytest |
≥ 8.0.0 | Test framework |
pytest-asyncio |
≥ 0.24.0 | Async test support |
aiosqlite |
≥ 0.20.0 | In-memory SQLite for tests |
moto[s3] |
≥ 5.0.0 | AWS S3 mock for tests |
ruff |
≥ 0.8.0 | Linter and formatter |
Getting Started
Prerequisites
- Python 3.12+
- PostgreSQL 16+
- An OpenAI API key (for LLM features)
- Stripe API keys (optional — billing stubs gracefully when unconfigured)
- AWS credentials (optional — needed for S3 storage in production)
Installation
# Clone the repository
git clone <repo-url> && cd adiuva-api
# Create a virtual environment
python -m venv .venv && source .venv/bin/activate
# Install dependencies
pip install -r requirements.txt
# Configure environment
cp .env.example .env
# Edit .env with your DATABASE_URL, OPENAI_API_KEY, etc.
Database Setup
# Start PostgreSQL (or use the Docker Compose database)
docker compose up db -d
# Run migrations
alembic upgrade head
Run the Development Server
uvicorn app.main:app --reload --host 0.0.0.0 --port 8000
Interactive API docs are available at http://localhost:8000/docs in development mode (ENV=dev). The /docs endpoint is disabled in production.
Docker Deployment
Quick Start
docker compose up --build
This starts two services:
- app — FastAPI server on port
8000 - db — PostgreSQL 16 (Alpine) on port
5432with a persistent volume and health checks
The compose file also includes optional services for fully local deployments:
- minio — S3-compatible object storage on ports
9000(API) and9001(console) - qdrant — Vector search engine on ports
6333(HTTP) and6334(gRPC)
Dockerfile Details
The Dockerfile uses a multi-stage build:
- Builder stage — Installs Python dependencies into a virtual environment.
- Runtime stage — Copies only the venv, app source, and Alembic migrations. Runs as a non-root user (
appuser). - Production server — Gunicorn with 4 Uvicorn workers, 120-second timeout, listening on port 8000.
# Production command (run by the container)
gunicorn app.main:app -k uvicorn.workers.UvicornWorker -w 4 --timeout 120 -b 0.0.0.0:8000
Homelab / Self-Hosted Deployment
You can run the entire stack locally on a homelab with no cloud dependencies except the LLM provider. The compose file includes MinIO (S3 replacement) and Qdrant (vector store) out of the box.
1. Start all services
docker compose up -d
This starts PostgreSQL, MinIO, and Qdrant alongside the app.
2. Create the MinIO bucket
Open the MinIO console at http://localhost:9001 (login: minioadmin / minioadmin) and create a bucket named adiuva, or use the CLI:
docker compose exec minio mc alias set local http://localhost:9000 minioadmin minioadmin
docker compose exec minio mc mb local/adiuva
3. Configure your .env
# Database (uses the compose PostgreSQL)
DATABASE_URL=postgresql+asyncpg://postgres:postgres@db:5432/adiuva
# S3 → MinIO
S3_BUCKET=adiuva
S3_REGION=us-east-1
S3_ENDPOINT_URL=http://minio:9000
AWS_ACCESS_KEY_ID=minioadmin
AWS_SECRET_ACCESS_KEY=minioadmin
# Vector store → local Qdrant (leave PINECONE_API_KEY empty)
QDRANT_URL=http://qdrant:6333
QDRANT_API_KEY=
PINECONE_API_KEY=
# Billing — leave empty to stub (no Stripe needed)
STRIPE_SECRET_KEY=
STRIPE_WEBHOOK_SECRET=
# LLM — the only external service
OPENAI_API_KEY=sk-...
LLM_MODEL=gpt-4o
LLM_ROUTER_MODEL=gpt-4o-mini
# Auth
JWT_SECRET=your-secret-here
ENV=dev
4. Run migrations
docker compose exec app alembic upgrade head
What runs where
| Service | Runs on | Port | Notes |
|---|---|---|---|
| FastAPI app | Docker | 8000 | API server |
| PostgreSQL | Docker | 5432 | Auth, billing, metadata |
| MinIO | Docker | 9000 / 9001 | S3-compatible blob & backup storage |
| Qdrant | Docker | 6333 / 6334 | Vector search (replaces Pinecone) |
| Stripe | — | — | Stubbed when keys are empty |
| OpenAI / LLM | Cloud | — | Only external dependency |
Want fully offline AI too? Set
LLM_MODEL=ollama/llama3andLLM_ROUTER_MODEL=ollama/llama3, then add an Ollama container or point at a local Ollama instance. See the LLM provider switching section.
Environment Variables
All variables are loaded from a .env file via Pydantic Settings. Source: app/config/settings.py
| Variable | Type | Default | Description |
|---|---|---|---|
DATABASE_URL |
str |
postgresql+asyncpg://postgres:postgres@localhost:5432/adiuva |
Async SQLAlchemy connection string |
JWT_SECRET |
str |
change-me-in-production |
HMAC secret for JWT signing |
JWT_ALGORITHM |
str |
HS256 |
JWT signing algorithm |
JWT_ACCESS_TOKEN_EXPIRE_MINUTES |
int |
30 |
Access token time-to-live |
JWT_REFRESH_TOKEN_EXPIRE_DAYS |
int |
30 |
Refresh token time-to-live |
STRIPE_SECRET_KEY |
str |
"" |
Stripe API key (empty = stub mode) |
STRIPE_WEBHOOK_SECRET |
str |
"" |
Stripe webhook signature secret |
S3_BUCKET |
str |
"" |
S3 bucket for encrypted blobs and backups |
S3_REGION |
str |
us-east-1 |
AWS region |
S3_ENDPOINT_URL |
str |
"" |
Custom S3 endpoint (e.g. http://minio:9000 for MinIO). Leave empty for AWS. |
AWS_ACCESS_KEY_ID |
str |
"" |
AWS credentials |
AWS_SECRET_ACCESS_KEY |
str |
"" |
AWS credentials |
PINECONE_API_KEY |
str |
"" |
Pinecone API key (if set, Pinecone is used for vectors) |
PINECONE_INDEX |
str |
adiuva |
Pinecone index name |
QDRANT_URL |
str |
"" |
Qdrant URL (used when Pinecone is not configured) |
QDRANT_API_KEY |
str |
"" |
Qdrant API key |
OPENAI_API_KEY |
str |
"" |
OpenAI key for LLM agent calls |
LLM_MODEL |
str |
gpt-4o |
LiteLLM model identifier for agents (e.g. anthropic/claude-3.5-sonnet, gemini/gemini-pro, ollama/llama3) |
LLM_ROUTER_MODEL |
str |
gpt-4o-mini |
Lighter model used for intent classification / routing |
CORS_ORIGINS |
list[str] |
["app://.", "http://localhost:3000", "http://localhost:5173"] |
Allowed CORS origins |
ENV |
Literal |
dev |
dev or prod — controls /docs visibility and SQL echo |
API Reference
All routes are prefixed with /api/v1. 27 endpoints total (25 REST + 1 WebSocket + 1 health check).
Health
| Method | Path | Auth | Description |
|---|---|---|---|
GET |
/api/v1/health |
No | Returns {"status": "ok", "version": "0.1.0"} |
Auth
| Method | Path | Auth | Description |
|---|---|---|---|
POST |
/api/v1/auth/register |
No | Create account with bcrypt-hashed password, returns AuthTokens |
POST |
/api/v1/auth/login |
No | Validate credentials, returns AuthTokens |
POST |
/api/v1/auth/refresh |
No | Rotate refresh token, returns new AuthTokens |
GET |
/api/v1/auth/me |
JWT | Returns UserProfile for the authenticated user |
Chat
| Method | Path | Auth | Description |
|---|---|---|---|
POST |
/api/v1/chat |
JWT | Route message through the orchestrator; returns ChatResponse or ExecutionPlan depending on execution mode |
WS |
/api/v1/chat/stream |
JWT (query param ?token=) |
Streaming chat — first frame is a ChatRequest, server yields text chunks, final frame is {"done": true, "response": "...", "actions": [...]}. 30-second heartbeat ping. |
Plans
| Method | Path | Auth | Description |
|---|---|---|---|
GET |
/api/v1/plans/playbook |
JWT | List all cached execution plan playbooks |
GET |
/api/v1/plans/playbook/{plan_id} |
JWT | Retrieve a specific playbook by ID |
Storage (Cloud Records)
| Method | Path | Auth | Description |
|---|---|---|---|
POST |
/api/v1/storage/records |
JWT | Upload an E2E encrypted record (verifies checksum, enforces storage quota) |
GET |
/api/v1/storage/records |
JWT | List record metadata with pagination (?table, ?page, ?limit); no blob bytes returned |
GET |
/api/v1/storage/records/{id} |
JWT | Download encrypted blob with X-Checksum response header |
PUT |
/api/v1/storage/records/{id} |
JWT | Replace an existing blob (verifies checksum, enforces quota) |
DELETE |
/api/v1/storage/records/{id} |
JWT | Delete a record and its S3 blob |
Vectors (Cloud Vector Store)
| Method | Path | Auth | Description |
|---|---|---|---|
POST |
/api/v1/storage/vectors/upsert |
JWT | Verify checksums and upsert encrypted vectors |
POST |
/api/v1/storage/vectors/search |
JWT | Search user-scoped vector namespace |
DELETE |
/api/v1/storage/vectors |
JWT | Delete vectors by ID list |
Backup
| Method | Path | Auth | Description |
|---|---|---|---|
PUT |
/api/v1/backup |
JWT | Upload encrypted backup blob with custom headers (X-Backup-Version, X-Backup-Timestamp, X-Backup-Checksum). Tier quota enforced. |
GET |
/api/v1/backup |
JWT | Download latest backup blob. Supports If-Modified-Since. |
GET |
/api/v1/backup/history |
JWT | List backup metadata (no blob content) |
DELETE |
/api/v1/backup/{backup_id} |
JWT | Delete a specific backup |
Plugins (Marketplace)
| Method | Path | Auth | Description |
|---|---|---|---|
GET |
/api/v1/plugins |
JWT (Power+) | Browse the marketplace (?category, ?q, ?page, ?sort=rating|installs|newest) |
GET |
/api/v1/plugins/{id} |
JWT (Power+) | Plugin detail with install count and ratings |
POST |
/api/v1/plugins/{id}/install |
JWT (Power+) | Install plugin; triggers Stripe Connect revenue split for paid plugins |
DELETE |
/api/v1/plugins/{id}/install |
JWT | Uninstall plugin |
Billing
| Method | Path | Auth | Description |
|---|---|---|---|
POST |
/api/v1/billing/checkout |
JWT | Create a Stripe checkout session, returns {"checkout_url": "..."} |
POST |
/api/v1/billing/webhook |
Stripe signature | Handle Stripe events: checkout.session.completed, customer.subscription.updated, customer.subscription.deleted, invoice.payment_failed |
GET |
/api/v1/billing/subscription |
JWT | Get current subscription information |
DELETE |
/api/v1/billing/subscription |
JWT | Cancel subscription and revert to free tier |
Data Model
9 tables managed by Alembic migrations. Source: app/models.py
Tables
| Table | Primary Key | Key Columns | Purpose |
|---|---|---|---|
users |
id (UUID) |
email (unique), password_hash, tier, stripe_customer_id, timestamps |
User accounts |
refresh_tokens |
id (UUID) |
user_id (FK), token_hash (SHA-256, unique), expires_at |
Hashed refresh tokens for rotation |
subscriptions |
id (UUID) |
user_id (FK, unique), stripe_subscription_id, tier, status, current_period_end |
Stripe subscription records |
storage_records |
id (UUID) |
user_id (FK), table_name, s3_key, checksum, size_bytes, timestamps |
S3 blob metadata (no plaintext content) |
backup_metadata |
id (UUID) |
user_id (FK), s3_key, version, timestamp, checksum, size_bytes |
Backup manifests |
plugins |
id (String) |
name, description, version, author_id (FK), category, price_cents, permissions (JSON), status, s3_package_key, install_count, avg_rating |
Marketplace plugin catalog |
plugin_installations |
id (UUID) |
plugin_id (FK), user_id (FK), unique constraint on (plugin_id, user_id) |
Per-user install tracking |
plugin_reviews |
id (UUID) |
plugin_id (FK), reviewer_id (FK), decision, notes, reviewed_at |
Admin review decisions |
revenue_events |
id (UUID) |
plugin_id (FK), user_id (FK), amount_cents, developer_share_cents, stripe_transfer_id |
70/30 revenue split ledger |
Enum Types
| Enum | Values |
|---|---|
billing_tier |
free, pro, power, team |
plugin_status |
pending_review, approved, rejected |
review_decision |
approved, rejected |
Migrations
| Version | Description |
|---|---|
001_initial_schema |
Creates all 9 tables with indexes and foreign key constraints |
002_seed_plugins |
Seeds 3 approved plugins: GitHub Sync (free), Slack Notifier (€4.99), Time Tracker (€9.99) |
AI Agent System
The agent system uses a registry pattern with LangChain tool-calling agents powered by GPT-4o. Source: app/agents/, app/core/agent_registry.py
Architecture
BaseAgent— Abstract base withuser_id,shared_memory, andvector_store_context.ChatAgent(BaseAgent)— Abstracthandle(query, context)andget_tools()methods, plus a shared_tool_loop(llm, messages, tools, max_iter=5)for iterative tool calling.AgentRegistry— Singleton registry with@registerdecorator,get(name),list_agents(), andcall_agent(name, query, context).
Registered Agents
| Agent | Registry Name | Tools | Description |
|---|---|---|---|
| TaskAgent | task_agent |
8 | Full task and comment CRUD. Status: todo / in_progress / done. Priority: high / medium / low. Tools: list_tasks, create_task, update_task, delete_task, list_tasks_due_today, list_task_comments, add_task_comment, delete_task_comment |
| ProjectAgent | project_agent |
6 | Project lifecycle management. Status: active / archived. Prefers archiving over deletion. Tools: list_projects, list_all_projects, get_project, create_project, update_project, delete_project |
| TimelineAgent | timeline_agent |
4 | Project milestones. Requires project_id for creation. Supports AI-suggestion and approval workflows. Tools: list_timelines, create_timeline, update_timeline, delete_timeline |
| NoteAgent | note_agent |
5 | Markdown note management. Optionally linked to projects. Tools: list_notes, get_note, create_note, update_note, delete_note |
All agents use the model configured by LLM_MODEL (default: GPT-4o) with temperature=0 via LiteLLM. Tools return JSON action descriptors that the Electron client interprets and applies locally.
Switching LLM Providers
The backend uses LiteLLM as a universal LLM gateway. All agents and the orchestrator instantiate models through a centralized factory in app/core/llm.py. To switch providers, change environment variables — no code changes required:
# OpenAI (default)
LLM_MODEL=gpt-4o
LLM_ROUTER_MODEL=gpt-4o-mini
# Anthropic
LLM_MODEL=anthropic/claude-3.5-sonnet
LLM_ROUTER_MODEL=anthropic/claude-3-haiku
# Google Gemini
LLM_MODEL=gemini/gemini-pro
LLM_ROUTER_MODEL=gemini/gemini-flash
# Local Ollama
LLM_MODEL=ollama/llama3
LLM_ROUTER_MODEL=ollama/llama3
# AWS Bedrock
LLM_MODEL=bedrock/anthropic.claude-v2
LLM_ROUTER_MODEL=bedrock/anthropic.claude-instant-v1
See the LiteLLM provider docs for the full list of 100+ supported providers and model naming conventions.
Orchestration & Execution Plans
Source: app/core/orchestrator.py, app/core/execution_plan.py
Orchestrator
classify_intent(message, context, registry)— Uses the router model (LLM_ROUTER_MODEL, default: GPT-4o-mini) to determine which agent should handle a message. Falls back totask_agentwhen classification is ambiguous.route_single(agent_name, message, context)— Routes to a single agent and returns aChatResponse.route_pipeline(agent_names, message, context)— Executes agents sequentially; each receivesprevious_resultsfrom earlier agents. A final LLM synthesis step merges all results.orchestrate(request)— Main entry point. Indirectmode, returns aChatResponse. Inplanmode, returns anExecutionPlan.orchestrate_stream(request)— Streaming variant that yields 50-character text chunks with a final JSON frame.
Execution Plans
PromptTemplateRegistry— Maps template IDs to server-side prompt text. Clients only ever see opaque IDs, never raw prompts.ExecutionPlanBuilder— Fluent builder API:add_step(),add_llm_step(template_id, vars),add_data_step(action, data_from_step). Validates step references onbuild().PlanCache— LRU cache (maxsize 1000) for storing plans as reusable playbooks.
Built-in Templates (6)
tpl_task_agent_default, tpl_timeline_agent_default, tpl_project_agent_default, tpl_note_agent_default, tpl_task_extract_from_project, tpl_note_weekly_summary
Built-in Playbooks (2)
| Playbook | Description |
|---|---|
create_tasks_from_project |
LLM extracts actionable tasks from project context, then creates task records |
generate_weekly_note |
LLM generates a weekly summary, then creates a note record |
Middleware
Middleware executes in this order on each request: TierRateLimit → Sanitizer → CORS → Router
JWT Authentication
Source: app/api/middleware/auth.py
- FastAPI dependency
get_current_uservalidates theBearerJWT and extractsuser_idandemail. - Live tier lookup — The current tier is fetched from the
subscriptionstable on every request (not cached in the JWT), so upgrades and downgrades take immediate effect. - Falls back to
freewhen no subscription row exists. - Raises
401 Unauthorizedon invalid or expired tokens. - Exempt paths:
/api/v1/auth/register,/api/v1/auth/login,/api/v1/billing/webhook
Tier-Based Rate Limiter
Source: app/api/middleware/rate_limit.py
TierRateLimitMiddleware— Sliding-window in-process rate limiter (no Redis dependency).- Per-user 60-second window sized by subscription tier:
| Tier | Requests / Minute |
|---|---|
| Free | 20 |
| Pro | 60 |
| Power | 120 |
| Team | 200 |
- Returns
429 Too Many Requestswith aRetry-Afterheader when the limit is exceeded. - Exempt paths: register, login, webhook, health
Response Sanitizer
Source: app/api/middleware/sanitizer.py
- Runs only on
/api/v1/chatendpoints. - Scans JSON response bodies and replaces leaked prompt IP fragments with
[REDACTED]. - Detects: system prompt openers, agent routing metadata, LangChain tool schemas, internal reasoning markers (
<thinking>,[INST]), and known prompt fingerprints. - Logs sanitization events as
WARNING. - Binary responses (storage, backup) are never touched.
Storage Layer
Blob Store
Source: app/storage/blob_store.py
- S3-backed storage for E2E encrypted blobs.
- Object keys follow the pattern:
{user_id}/{table}/{record_id} - Server-side SSE-S3 encryption at rest (additional layer on top of client-side E2E encryption).
- Methods:
upload(),download(),delete()(idempotent),list_keys() - The backend never inspects or decrypts blob content.
Vector Store
Source: app/storage/vector_store.py
- Runtime-configurable: Pinecone (when
PINECONE_API_KEYis set) or Qdrant (fallback). - User isolation: Pinecone uses
namespace=user_id; Qdrant filters byuser_idpayload field. - 32-dimensional SHA-256-derived float vectors (deterministic, not semantically meaningful on encrypted data — a documented trade-off for privacy).
- Encrypted blobs are stored as base64 in metadata/payload for verbatim retrieval.
- Methods:
upsert(),search(),delete()
Encryption Utilities
Source: app/storage/encryption.py
verify_checksum(blob, checksum)— SHA-256 hash comparison usinghmac.compare_digest(constant-time to prevent timing attacks).reject_if_tampered(blob, checksum)— Raises HTTP 400 on checksum mismatch.- No decryption key ever reaches the backend.
Billing & Tiers
Source: app/billing/stripe_service.py, app/billing/tier_manager.py
Feature Matrix
| Feature | Free | Pro | Power | Team |
|---|---|---|---|---|
| AI Agents | 3 | Unlimited | Unlimited | Unlimited |
| Batch Active | 2 | 10 | Unlimited | Unlimited |
| Cloud Storage | 0 GB | 5 GB | 25 GB | Unlimited |
| Backup Storage | 0 GB | 5 GB | 25 GB | Unlimited |
| LLM Providers | 1 | Unlimited | Unlimited | Unlimited |
| Batch Builder | — | — | ✓ | ✓ |
| Plugin Marketplace | — | — | ✓ | ✓ |
| SSO | — | — | — | ✓ |
| Rate Limit | 20 req/min | 60 req/min | 120 req/min | 200 req/min |
Stripe Integration
- Checkout —
create_checkout_session(user_id, tier)creates a Stripe Checkout session. Returns a stub URL when Stripe is not configured. - Webhooks — Handles
checkout.session.completed,customer.subscription.updated,customer.subscription.deleted, andinvoice.payment_failed. - Subscription management —
get_subscription()returns the current subscription record;cancel_subscription()cancels via the Stripe API and reverts the user to the free tier. - Price IDs:
price_pro_monthly,price_power_monthly,price_team_monthly
Tier Manager
get_tier(user_id)— Returns the user's current billing tier.check_feature(tier, feature)— Boolean feature gate check.require_feature(tier, feature)— Raises HTTP 403 if the feature is not available.enforce_quota(user_id, tier)/enforce_backup_quota(user_id, tier)— Raises HTTP 402 if storage limits are exceeded.
Plugin Marketplace
Source: app/marketplace/
Plugin Registry
- PostgreSQL-backed catalog of submitted and approved plugins.
list_plugins(db, category, query, page, sort)— Paginated listing (page size: 20) with optional filtering by category, text search, and sorting byrating,installs, ornewest.get_plugin(db, plugin_id)— Full manifest with install count and ratings.submit_plugin(db, manifest, s3_key)— Submits a plugin withpending_reviewstatus.approve_plugin()/reject_plugin(reason)— Admin workflow for plugin approval.record_install()/record_uninstall()— Tracks per-user installations and updates install counts.
Review Queue
- Automated security checklist before human review:
- Plugin ID must match
^[a-z0-9-]+$ - Permissions must be from the allowed set only
- No binary blobs in the manifest
- Plugin ID must match
- Allowed permissions:
read:tasks,write:tasks,read:projects,write:projects,read:notes,write:notes,read:timelines,write:timelines,read:calendar,write:calendar get_pending(db)— Lists plugins awaiting review.submit_review(db, plugin_id, reviewer_id, decision, notes)— Records the review decision.
Revenue Sharing
- 70% developer / 30% platform split on all paid plugin sales.
record_install(db, plugin_id, user_id, amount_cents)— Records the revenue event and triggers a Stripe Connect transfer for the developer share.get_earnings(db, developer_id, period)— Aggregated earnings report for plugin developers.- Gracefully stubs transfers when Stripe is not configured.
Seed Plugins
| Plugin | Category | Price |
|---|---|---|
| GitHub Sync | Productivity | Free |
| Slack Notifier | Communication | €4.99 |
| Time Tracker | Productivity | €9.99 |
Testing
Running Tests
# Run all tests
pytest
# Run a specific test file
pytest tests/test_auth.py
# Run with verbose output
pytest -v
Test Infrastructure
- Database: Async SQLite in-memory via
aiosqlite+StaticPool— fast, no PostgreSQL needed. - S3 mock:
moto[s3]with a fixture that patchesBlobStoresettings. - Auth helpers:
make_jwt(tier)andauth_header(tier)generate per-tier test tokens. - Seed data: Auto-creates one
User+Subscriptionper tier (free/pro/power/team) before each test. - Plugin seeds: Fixture adds 3 approved plugins for marketplace tests.
- FK enforcement: SQLite
PRAGMA foreign_keys=ON. - No external dependencies — all tests run fully offline.
Test Coverage
| File | Coverage |
|---|---|
test_auth.py |
Register, login, token access, refresh, expiration |
test_orchestrator.py |
Intent classification, single agent routing, pipeline, plan mode |
test_agents.py |
Each agent with mocked LLM: registration, tools, handle method |
test_storage.py |
Create, list, download, update, delete records; checksum rejection; quota enforcement |
test_backup.py |
Upload, download, history, delete; tier-based storage limits |
test_plugins.py |
List, install, uninstall, revenue events, tier gate enforcement |
test_agent_registry.py |
Registry singleton, registration, lookup, listing |
test_execution_plan.py |
Plan builder, template registry, plan cache |
test_middleware.py |
Rate limiting by tier, sanitizer prompt leak detection |
Project Structure
adiuva-api/
├── alembic.ini # Alembic configuration
├── BACKEND_PLAN.md # Architecture & design decisions
├── docker-compose.yml # Docker Compose (app + PostgreSQL)
├── Dockerfile # Multi-stage production build
├── requirements.txt # Python dependencies
│
├── alembic/ # Database migrations
│ ├── env.py # Alembic environment config
│ ├── script.py.mako # Migration template
│ └── versions/
│ ├── 001_initial_schema.py # Tables, indexes, FKs
│ └── 002_seed_plugins.py # Seed marketplace plugins
│
├── app/ # Application source
│ ├── main.py # FastAPI app factory, middleware, routes
│ ├── db.py # Async SQLAlchemy engine & session
│ ├── models.py # SQLAlchemy ORM models (9 tables)
│ ├── schemas.py # Pydantic request/response schemas
│ │
│ ├── config/
│ │ └── settings.py # Pydantic Settings (env vars)
│ │
│ ├── agents/ # LLM-powered domain agents
│ │ ├── task_agent.py # Task & comment CRUD (8 tools)
│ │ ├── project_agent.py # Project lifecycle (6 tools)
│ │ ├── timeline_agent.py # Milestones (4 tools)
│ │ └── note_agent.py # Markdown notes (5 tools)
│ │
│ ├── core/ # Orchestration engine
│ │ ├── agent_registry.py # BaseAgent, ChatAgent, AgentRegistry
│ │ ├── llm.py # LiteLLM factory (get_llm, get_router_llm)
│ │ ├── orchestrator.py # Intent classification & routing
│ │ └── execution_plan.py # Plan builder, templates, cache
│ │
│ ├── api/ # HTTP layer
│ │ ├── deps.py # Shared FastAPI dependencies
│ │ ├── middleware/
│ │ │ ├── auth.py # JWT validation, live tier lookup
│ │ │ ├── rate_limit.py # Sliding-window tier rate limiter
│ │ │ └── sanitizer.py # Prompt IP leak protection
│ │ └── routes/
│ │ ├── auth.py # Register, login, refresh, me
│ │ ├── chat.py # Chat + WebSocket streaming
│ │ ├── plans.py # Execution plan playbooks
│ │ ├── storage.py # E2E encrypted record CRUD
│ │ ├── vectors.py # Vector upsert, search, delete
│ │ ├── backup.py # Encrypted backup management
│ │ ├── plugins.py # Marketplace browse & install
│ │ └── billing.py # Stripe checkout & webhooks
│ │
│ ├── storage/ # Storage backends
│ │ ├── blob_store.py # S3 blob storage
│ │ ├── vector_store.py # Pinecone / Qdrant vector store
│ │ └── encryption.py # Checksum verification utilities
│ │
│ ├── billing/ # Subscription management
│ │ ├── stripe_service.py # Stripe API integration
│ │ └── tier_manager.py # Feature matrix & quota enforcement
│ │
│ └── marketplace/ # Plugin ecosystem
│ ├── plugin_registry.py # Catalog CRUD & search
│ ├── plugin_review.py # Security checklist & review queue
│ └── revenue_share.py # 70/30 split & Stripe Connect
│
└── tests/ # Test suite
├── conftest.py # Fixtures: DB, S3, auth, seeds
├── test_auth.py
├── test_orchestrator.py
├── test_agents.py
├── test_storage.py
├── test_backup.py
├── test_plugins.py
├── test_agent_registry.py
├── test_execution_plan.py
└── test_middleware.py
License
To be determined.