update env variables
This commit is contained in:
591
README.md
591
README.md
@@ -1,591 +0,0 @@
|
||||
# AdiuvAI Cloud API
|
||||
|
||||
**AI-powered project management backend with LLM orchestration and subscription billing.**
|
||||
|
||||
Built with FastAPI · Python 3.12 · PostgreSQL · LangChain · Stripe
|
||||
|
||||
---
|
||||
|
||||
## Table of Contents
|
||||
|
||||
- [Overview](#overview)
|
||||
- [Architecture](#architecture)
|
||||
- [Key Features](#key-features)
|
||||
- [Tech Stack](#tech-stack)
|
||||
- [Getting Started](#getting-started)
|
||||
- [Docker Deployment](#docker-deployment)
|
||||
- [Environment Variables](#environment-variables)
|
||||
- [API Reference](#api-reference)
|
||||
- [Data Model](#data-model)
|
||||
- [AI Agent System](#ai-agent-system)
|
||||
- [Orchestration & Execution Plans](#orchestration--execution-plans)
|
||||
- [Middleware](#middleware)
|
||||
- [Billing & Tiers](#billing--tiers)
|
||||
- [Testing](#testing)
|
||||
- [Project Structure](#project-structure)
|
||||
- [License](#license)
|
||||
|
||||
---
|
||||
|
||||
## Overview
|
||||
|
||||
AdiuvAI Cloud API is the FastAPI backend that powers the **AdiuvAI Electron desktop app**. It provides LLM-powered chat orchestration, text embedding generation, and Stripe-based subscription billing across four tiers.
|
||||
|
||||
### Design Principles
|
||||
|
||||
1. **Never expose prompts** — system prompts stay server-side; responses are sanitized to strip any leaked prompt fragments.
|
||||
2. **Stateless request handling** — all context comes from the client and JWT; no server-side session state.
|
||||
3. **Tier gates enforced server-side** — the server always reads the current tier from the database, never trusting client-reported values.
|
||||
|
||||
---
|
||||
|
||||
## Architecture
|
||||
|
||||
```
|
||||
┌──────────────┐ ┌────────────────────────────────────────────────────────┐
|
||||
│ Electron │ │ FastAPI (Uvicorn / Gunicorn) │
|
||||
│ Desktop App │────▶│ │
|
||||
│ (Client) │◀────│ Middleware: RateLimit → Sanitizer → CORS → Router │
|
||||
└──────────────┘ │ │
|
||||
│ ┌──────────────────┐ ┌────────────────────────────┐ │
|
||||
│ │ Auth Routes │ │ Chat Routes │ │
|
||||
│ │ Billing Routes │ │ ↓ │ │
|
||||
│ │ Agent Routes │ │ Orchestrator (GPT-4o-mini)│ │
|
||||
│ │ Device WS │ │ ↓ classify intent │ │
|
||||
│ └──────────────────┘ │ Agent Registry │ │
|
||||
│ │ ↓ │ │
|
||||
│ │ TaskAgent | ProjectAgent │ │
|
||||
│ │ NoteAgent | CheckptAgent │ │
|
||||
│ │ (GPT-4o + LangChain) │ │
|
||||
│ └────────────────────────────┘ │
|
||||
└────────────────────────────────────────────────────────┘
|
||||
│
|
||||
┌────────▼───┐
|
||||
│ PostgreSQL │
|
||||
│ (Auth, │
|
||||
│ Billing, │
|
||||
│ Agents) │
|
||||
└────────────┘
|
||||
│
|
||||
┌────────▼───┐
|
||||
│ Stripe │
|
||||
│ (Billing) │
|
||||
└────────────┘
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Key Features
|
||||
|
||||
1. **LLM-powered orchestration** — GPT-4o-mini classifies user intent and routes to the appropriate domain agent.
|
||||
2. **4 specialized AI agents** — Tasks (8 tools), Projects (6 tools), Timelines (4 tools), Notes (5 tools), all powered by GPT-4o via LangChain.
|
||||
3. **Execution plans & playbooks** — Server-side prompt template registry; clients receive only opaque template IDs, never raw prompts.
|
||||
4. **Text embeddings** — Generates text-embedding-3-small vectors for local client-side note search.
|
||||
5. **Stripe billing** — Four-tier subscription model (Free / Pro / Power / Team) with checkout sessions and full webhook lifecycle handling.
|
||||
6. **JWT authentication** — Access + refresh tokens with bcrypt password hashing, SHA-256 token hashing, and automatic rotation.
|
||||
7. **Prompt IP protection** — Sanitizer middleware strips system prompts, reasoning markers, tool schemas, and agent routing metadata from all chat responses.
|
||||
8. **Tier-based rate limiting** — Sliding-window per-user limiter scaling from 20 to 200 requests/min by subscription tier.
|
||||
9. **WebSocket streaming** — Real-time chat with 30-second heartbeat keep-alive and chunked text delivery.
|
||||
10. **Alembic migrations** — Versioned schema management.
|
||||
11. **Comprehensive test suite** — In-memory SQLite, per-tier test fixtures, and full API coverage without external dependencies.
|
||||
|
||||
---
|
||||
|
||||
## Tech Stack
|
||||
|
||||
| Package | Version | Purpose |
|
||||
|---|---|---|
|
||||
| `fastapi` | ≥ 0.115.0 | Web framework |
|
||||
| `uvicorn[standard]` | ≥ 0.34.0 | ASGI development server |
|
||||
| `gunicorn` | ≥ 22.0.0 | Production process manager |
|
||||
| `langchain` | ≥ 0.3.0 | LLM orchestration framework |
|
||||
| `langchain-openai` | ≥ 0.3.0 | OpenAI LLM provider integration |
|
||||
| `litellm` | ≥ 1.50.0 | Universal LLM gateway (100+ providers) |
|
||||
| `pydantic` | ≥ 2.10.0 | Data validation and serialization |
|
||||
| `pydantic-settings` | ≥ 2.7.0 | Environment-based configuration |
|
||||
| `python-jose[cryptography]` | ≥ 3.3.0 | JWT encoding and decoding |
|
||||
| `stripe` | ≥ 11.0.0 | Billing and payment integration |
|
||||
| `slowapi` | ≥ 0.1.9 | Rate limiting utilities |
|
||||
| `sqlalchemy` | ≥ 2.0.0 | Async ORM and query builder |
|
||||
| `asyncpg` | ≥ 0.30.0 | PostgreSQL async driver |
|
||||
| `alembic` | ≥ 1.14.0 | Database migration management |
|
||||
| `bcrypt` | ≥ 4.2.0 | Password hashing |
|
||||
| `python-dotenv` | ≥ 1.0.0 | `.env` file loading |
|
||||
| `httpx` | ≥ 0.28.0 | Async HTTP client (used in tests) |
|
||||
| `websockets` | ≥ 14.0 | WebSocket protocol support |
|
||||
| `psycopg2-binary` | ≥ 2.9.0 | Synchronous PostgreSQL driver (Alembic) |
|
||||
| `pytest` | ≥ 8.0.0 | Test framework |
|
||||
| `pytest-asyncio` | ≥ 0.24.0 | Async test support |
|
||||
| `aiosqlite` | ≥ 0.20.0 | In-memory SQLite for tests |
|
||||
| `ruff` | ≥ 0.8.0 | Linter and formatter |
|
||||
|
||||
---
|
||||
|
||||
## Getting Started
|
||||
|
||||
### Prerequisites
|
||||
|
||||
- Python 3.12+
|
||||
- PostgreSQL 16+
|
||||
- An OpenAI API key (for LLM features)
|
||||
- Stripe API keys (optional — billing stubs gracefully when unconfigured)
|
||||
|
||||
### Installation
|
||||
|
||||
```bash
|
||||
# Clone the repository
|
||||
git clone <repo-url> && cd adiuvai-api
|
||||
|
||||
# Create a virtual environment
|
||||
python -m venv .venv && source .venv/bin/activate
|
||||
|
||||
# Install dependencies
|
||||
pip install -r requirements.txt
|
||||
|
||||
# Configure environment
|
||||
cp .env.example .env
|
||||
# Edit .env with your DATABASE_URL, OPENAI_API_KEY, etc.
|
||||
```
|
||||
|
||||
### Database Setup
|
||||
|
||||
```bash
|
||||
# Start PostgreSQL (or use the Docker Compose database)
|
||||
docker compose up db -d
|
||||
|
||||
# Run migrations
|
||||
alembic upgrade head
|
||||
```
|
||||
|
||||
### Run the Development Server
|
||||
|
||||
```bash
|
||||
uvicorn app.main:app --reload --host 0.0.0.0 --port 8000
|
||||
```
|
||||
|
||||
Interactive API docs are available at [http://localhost:8000/docs](http://localhost:8000/docs) in development mode (`ENV=dev`). The `/docs` endpoint is disabled in production.
|
||||
|
||||
---
|
||||
|
||||
## Docker Deployment
|
||||
|
||||
### Quick Start
|
||||
|
||||
```bash
|
||||
docker compose up --build
|
||||
```
|
||||
|
||||
This starts two services:
|
||||
|
||||
- **app** — FastAPI server on port `8000`
|
||||
- **db** — PostgreSQL 16 (Alpine) on port `5432` with a persistent volume and health checks
|
||||
|
||||
### Dockerfile Details
|
||||
|
||||
The Dockerfile uses a multi-stage build:
|
||||
|
||||
1. **Builder stage** — Installs Python dependencies into a virtual environment.
|
||||
2. **Runtime stage** — Copies only the venv, app source, and Alembic migrations. Runs as a non-root user (`appuser`).
|
||||
3. **Production server** — Gunicorn with 4 Uvicorn workers, 120-second timeout, listening on port 8000.
|
||||
|
||||
```bash
|
||||
# Production command (run by the container)
|
||||
gunicorn app.main:app -k uvicorn.workers.UvicornWorker -w 4 --timeout 120 -b 0.0.0.0:8000
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Homelab / Self-Hosted Deployment
|
||||
|
||||
You can run the entire stack locally on a homelab with **no cloud dependencies except the LLM provider**.
|
||||
|
||||
### 1. Start all services
|
||||
|
||||
```bash
|
||||
docker compose up -d
|
||||
```
|
||||
|
||||
This starts PostgreSQL alongside the app.
|
||||
|
||||
### 2. Configure your `.env`
|
||||
|
||||
```bash
|
||||
# Database (uses the compose PostgreSQL)
|
||||
DATABASE_URL=postgresql+asyncpg://postgres:postgres@db:5432/adiuvai
|
||||
|
||||
# Billing — leave empty to stub (no Stripe needed)
|
||||
STRIPE_SECRET_KEY=
|
||||
STRIPE_WEBHOOK_SECRET=
|
||||
|
||||
# LLM — the only external service
|
||||
OPENAI_API_KEY=sk-...
|
||||
LLM_MODEL=gpt-4o
|
||||
LLM_ROUTER_MODEL=gpt-4o-mini
|
||||
|
||||
# Auth
|
||||
JWT_SECRET=your-secret-here
|
||||
ENV=dev
|
||||
```
|
||||
|
||||
### 3. Run migrations
|
||||
|
||||
```bash
|
||||
docker compose exec app alembic upgrade head
|
||||
```
|
||||
|
||||
### What runs where
|
||||
|
||||
| Service | Runs on | Port | Notes |
|
||||
|---|---|---|---|
|
||||
| FastAPI app | Docker | 8000 | API server |
|
||||
| PostgreSQL | Docker | 5432 | Auth, billing, agents |
|
||||
| Stripe | — | — | Stubbed when keys are empty |
|
||||
| OpenAI / LLM | Cloud | — | Only external dependency |
|
||||
|
||||
> **Want fully offline AI too?** Set `LLM_MODEL=ollama/llama3` and `LLM_ROUTER_MODEL=ollama/llama3`, then add an Ollama container or point at a local Ollama instance. See the [LLM provider switching](#switching-llm-providers) section.
|
||||
|
||||
---
|
||||
|
||||
## Environment Variables
|
||||
|
||||
All variables are loaded from a `.env` file via Pydantic Settings. Source: `app/config/settings.py`
|
||||
|
||||
| Variable | Type | Default | Description |
|
||||
|---|---|---|---|
|
||||
| `DATABASE_URL` | `str` | `postgresql+asyncpg://postgres:postgres@localhost:5432/adiuvai` | Async SQLAlchemy connection string |
|
||||
| `JWT_SECRET` | `str` | `change-me-in-production` | HMAC secret for JWT signing |
|
||||
| `JWT_ALGORITHM` | `str` | `HS256` | JWT signing algorithm |
|
||||
| `JWT_ACCESS_TOKEN_EXPIRE_MINUTES` | `int` | `30` | Access token time-to-live |
|
||||
| `JWT_REFRESH_TOKEN_EXPIRE_DAYS` | `int` | `30` | Refresh token time-to-live |
|
||||
| `STRIPE_SECRET_KEY` | `str` | `""` | Stripe API key (empty = stub mode) |
|
||||
| `STRIPE_WEBHOOK_SECRET` | `str` | `\"\"` | Stripe webhook signature secret |\n| `OPENAI_API_KEY` | `str` | `\"\"` | OpenAI key for LLM agent calls |
|
||||
| `LLM_MODEL` | `str` | `gpt-4o` | LiteLLM model identifier for agents (e.g. `anthropic/claude-3.5-sonnet`, `gemini/gemini-pro`, `ollama/llama3`) |
|
||||
| `LLM_ROUTER_MODEL` | `str` | `gpt-4o-mini` | Lighter model used for intent classification / routing |
|
||||
| `CORS_ORIGINS` | `list[str]` | `["app://.", "http://localhost:3000", "http://localhost:5173"]` | Allowed CORS origins |
|
||||
| `ENV` | `Literal` | `dev` | `dev` or `prod` — controls `/docs` visibility and SQL echo |
|
||||
|
||||
---
|
||||
|
||||
## API Reference
|
||||
|
||||
All routes are prefixed with `/api/v1`. **27 endpoints** total (25 REST + 1 WebSocket + 1 health check).
|
||||
|
||||
### Health
|
||||
|
||||
| Method | Path | Auth | Description |
|
||||
|---|---|---|---|
|
||||
| `GET` | `/api/v1/health` | No | Returns `{"status": "ok", "version": "0.1.0"}` |
|
||||
|
||||
### Auth
|
||||
|
||||
| Method | Path | Auth | Description |
|
||||
|---|---|---|---|
|
||||
| `POST` | `/api/v1/auth/register` | No | Create account with bcrypt-hashed password, returns `AuthTokens` |
|
||||
| `POST` | `/api/v1/auth/login` | No | Validate credentials, returns `AuthTokens` |
|
||||
| `POST` | `/api/v1/auth/refresh` | No | Rotate refresh token, returns new `AuthTokens` |
|
||||
| `GET` | `/api/v1/auth/me` | JWT | Returns `UserProfile` for the authenticated user |
|
||||
|
||||
### Chat
|
||||
|
||||
| Method | Path | Auth | Description |
|
||||
|---|---|---|---|
|
||||
| `POST` | `/api/v1/chat` | JWT | Route message through the orchestrator; returns `ChatResponse` or `ExecutionPlan` depending on execution mode |
|
||||
| `POST` | `/api/v1/chat/embed` | JWT | Generate a 1536-dim text embedding vector (`text-embedding-3-small`). Used by Electron for local note search. |
|
||||
| `WS` | `/api/v1/chat/stream` | JWT (query param `?token=`) | Streaming chat — first frame is a `ChatRequest`, server yields text chunks, final frame is `{"done": true, "response": "...", "actions": [...]}`. 30-second heartbeat ping. |
|
||||
|
||||
### Plans
|
||||
|
||||
| Method | Path | Auth | Description |
|
||||
|---|---|---|---|
|
||||
| `GET` | `/api/v1/plans/playbook` | JWT | List all cached execution plan playbooks |
|
||||
| `GET` | `/api/v1/plans/playbook/{plan_id}` | JWT | Retrieve a specific playbook by ID |
|
||||
|
||||
### Billing
|
||||
|
||||
| Method | Path | Auth | Description |
|
||||
|---|---|---|---|
|
||||
| `POST` | `/api/v1/billing/checkout` | JWT | Create a Stripe checkout session, returns `{"checkout_url": "..."}` |
|
||||
| `POST` | `/api/v1/billing/webhook` | Stripe signature | Handle Stripe events: `checkout.session.completed`, `customer.subscription.updated`, `customer.subscription.deleted`, `invoice.payment_failed` |
|
||||
| `GET` | `/api/v1/billing/subscription` | JWT | Get current subscription information |
|
||||
| `DELETE` | `/api/v1/billing/subscription` | JWT | Cancel subscription and revert to free tier |
|
||||
|
||||
---
|
||||
|
||||
## Data Model
|
||||
|
||||
3 tables managed by Alembic migrations. Source: `app/models.py`
|
||||
|
||||
### Tables
|
||||
|
||||
| Table | Primary Key | Key Columns | Purpose |
|
||||
|---|---|---|---|
|
||||
| `users` | `id` (UUID) | `email` (unique), `password_hash`, `tier`, `stripe_customer_id`, timestamps | User accounts |
|
||||
| `refresh_tokens` | `id` (UUID) | `user_id` (FK), `token_hash` (SHA-256, unique), `expires_at` | Hashed refresh tokens for rotation |
|
||||
| `subscriptions` | `id` (UUID) | `user_id` (FK, unique), `stripe_subscription_id`, `tier`, `status`, `current_period_end` | Stripe subscription records |
|
||||
|
||||
### Enum Types
|
||||
|
||||
| Enum | Values |
|
||||
|---|---|
|
||||
| `billing_tier` | `free`, `pro`, `power`, `team` |
|
||||
|
||||
### Migrations
|
||||
|
||||
| Version | Description |
|
||||
|---|---|
|
||||
| `001_initial_schema` | Creates core auth and billing tables with indexes and foreign key constraints |
|
||||
|
||||
---
|
||||
|
||||
## AI Agent System
|
||||
|
||||
The agent system uses a registry pattern with LangChain tool-calling agents powered by GPT-4o. Source: `app/agents/`, `app/core/agent_registry.py`
|
||||
|
||||
### Architecture
|
||||
|
||||
- **`BaseAgent`** — Abstract base with `user_id` and `shared_memory`.
|
||||
- **`ChatAgent(BaseAgent)`** — Abstract `handle(query, context)` and `get_tools()` methods, plus a shared `_tool_loop(llm, messages, tools, max_iter=5)` for iterative tool calling.
|
||||
- **`AgentRegistry`** — Singleton registry with `@register` decorator, `get(name)`, `list_agents()`, and `call_agent(name, query, context)`.
|
||||
|
||||
### Registered Agents
|
||||
|
||||
| Agent | Registry Name | Tools | Description |
|
||||
|---|---|---|---|
|
||||
| **TaskAgent** | `task_agent` | 8 | Full task and comment CRUD. Status: `todo` / `in_progress` / `done`. Priority: `high` / `medium` / `low`. Tools: `list_tasks`, `create_task`, `update_task`, `delete_task`, `list_tasks_due_today`, `list_task_comments`, `add_task_comment`, `delete_task_comment` |
|
||||
| **ProjectAgent** | `project_agent` | 6 | Project lifecycle management. Status: `active` / `archived`. Prefers archiving over deletion. Tools: `list_projects`, `list_all_projects`, `get_project`, `create_project`, `update_project`, `delete_project` |
|
||||
| **TimelineAgent** | `timeline_agent` | 4 | Project milestones. Requires `project_id` for creation. Supports AI-suggestion and approval workflows. Tools: `list_timelines`, `create_timeline`, `update_timeline`, `delete_timeline` |
|
||||
| **NoteAgent** | `note_agent` | 5 | Markdown note management. Optionally linked to projects. Tools: `list_notes`, `get_note`, `create_note`, `update_note`, `delete_note` |
|
||||
|
||||
All agents use the model configured by `LLM_MODEL` (default: GPT-4o) with `temperature=0` via LiteLLM. Tools return JSON action descriptors that the Electron client interprets and applies locally.
|
||||
|
||||
### Switching LLM Providers
|
||||
|
||||
The backend uses **LiteLLM** as a universal LLM gateway. All agents and the orchestrator instantiate models through a centralized factory in `app/core/llm.py`. To switch providers, change environment variables — no code changes required:
|
||||
|
||||
```bash
|
||||
# OpenAI (default)
|
||||
LLM_MODEL=gpt-4o
|
||||
LLM_ROUTER_MODEL=gpt-4o-mini
|
||||
|
||||
# Anthropic
|
||||
LLM_MODEL=anthropic/claude-3.5-sonnet
|
||||
LLM_ROUTER_MODEL=anthropic/claude-3-haiku
|
||||
|
||||
# Google Gemini
|
||||
LLM_MODEL=gemini/gemini-pro
|
||||
LLM_ROUTER_MODEL=gemini/gemini-flash
|
||||
|
||||
# Local Ollama
|
||||
LLM_MODEL=ollama/llama3
|
||||
LLM_ROUTER_MODEL=ollama/llama3
|
||||
|
||||
# AWS Bedrock
|
||||
LLM_MODEL=bedrock/anthropic.claude-v2
|
||||
LLM_ROUTER_MODEL=bedrock/anthropic.claude-instant-v1
|
||||
```
|
||||
|
||||
See the [LiteLLM provider docs](https://docs.litellm.ai/docs/providers) for the full list of 100+ supported providers and model naming conventions.
|
||||
|
||||
---
|
||||
|
||||
## Orchestration & Execution Plans
|
||||
|
||||
Source: `app/core/orchestrator.py`, `app/core/execution_plan.py`
|
||||
|
||||
### Orchestrator
|
||||
|
||||
1. **`classify_intent(message, context, registry)`** — Uses the router model (`LLM_ROUTER_MODEL`, default: GPT-4o-mini) to determine which agent should handle a message. Falls back to `task_agent` when classification is ambiguous.
|
||||
2. **`route_single(agent_name, message, context)`** — Routes to a single agent and returns a `ChatResponse`.
|
||||
3. **`route_pipeline(agent_names, message, context)`** — Executes agents sequentially; each receives `previous_results` from earlier agents. A final LLM synthesis step merges all results.
|
||||
4. **`orchestrate(request)`** — Main entry point. In `direct` mode, returns a `ChatResponse`. In `plan` mode, returns an `ExecutionPlan`.
|
||||
5. **`orchestrate_stream(request)`** — Streaming variant that yields 50-character text chunks with a final JSON frame.
|
||||
|
||||
### Execution Plans
|
||||
|
||||
- **`PromptTemplateRegistry`** — Maps template IDs to server-side prompt text. Clients only ever see opaque IDs, never raw prompts.
|
||||
- **`ExecutionPlanBuilder`** — Fluent builder API: `add_step()`, `add_llm_step(template_id, vars)`, `add_data_step(action, data_from_step)`. Validates step references on `build()`.
|
||||
- **`PlanCache`** — LRU cache (maxsize 1000) for storing plans as reusable playbooks.
|
||||
|
||||
### Built-in Templates (6)
|
||||
|
||||
`tpl_task_agent_default`, `tpl_timeline_agent_default`, `tpl_project_agent_default`, `tpl_note_agent_default`, `tpl_task_extract_from_project`, `tpl_note_weekly_summary`
|
||||
|
||||
### Built-in Playbooks (2)
|
||||
|
||||
| Playbook | Description |
|
||||
|---|---|
|
||||
| `create_tasks_from_project` | LLM extracts actionable tasks from project context, then creates task records |
|
||||
| `generate_weekly_note` | LLM generates a weekly summary, then creates a note record |
|
||||
|
||||
---
|
||||
|
||||
## Middleware
|
||||
|
||||
Middleware executes in this order on each request: **TierRateLimit → Sanitizer → CORS → Router**
|
||||
|
||||
### JWT Authentication
|
||||
|
||||
Source: `app/api/middleware/auth.py`
|
||||
|
||||
- FastAPI dependency `get_current_user` validates the `Bearer` JWT and extracts `user_id` and `email`.
|
||||
- **Live tier lookup** — The current tier is fetched from the `subscriptions` table on every request (not cached in the JWT), so upgrades and downgrades take immediate effect.
|
||||
- Falls back to `free` when no subscription row exists.
|
||||
- Raises `401 Unauthorized` on invalid or expired tokens.
|
||||
- **Exempt paths:** `/api/v1/auth/register`, `/api/v1/auth/login`, `/api/v1/billing/webhook`
|
||||
|
||||
### Tier-Based Rate Limiter
|
||||
|
||||
Source: `app/api/middleware/rate_limit.py`
|
||||
|
||||
- `TierRateLimitMiddleware` — Sliding-window in-process rate limiter (no Redis dependency).
|
||||
- Per-user 60-second window sized by subscription tier:
|
||||
|
||||
| Tier | Requests / Minute |
|
||||
|---|---|
|
||||
| Free | 20 |
|
||||
| Pro | 60 |
|
||||
| Power | 120 |
|
||||
| Team | 200 |
|
||||
|
||||
- Returns `429 Too Many Requests` with a `Retry-After` header when the limit is exceeded.
|
||||
- **Exempt paths:** register, login, webhook, health
|
||||
|
||||
### Response Sanitizer
|
||||
|
||||
Source: `app/api/middleware/sanitizer.py`
|
||||
|
||||
- Runs only on `/api/v1/chat` endpoints.
|
||||
- Scans JSON response bodies and replaces leaked prompt IP fragments with `[REDACTED]`.
|
||||
- Detects: system prompt openers, agent routing metadata, LangChain tool schemas, internal reasoning markers (`<thinking>`, `[INST]`), and known prompt fingerprints.
|
||||
- Logs sanitization events as `WARNING`.
|
||||
|
||||
---
|
||||
|
||||
## Billing & Tiers
|
||||
|
||||
Source: `app/billing/stripe_service.py`, `app/billing/tier_manager.py`
|
||||
|
||||
### Feature Matrix
|
||||
|
||||
| Feature | Free | Pro | Power | Team |
|
||||
|---|---|---|---|---|
|
||||
| AI Agents | 3 | Unlimited | Unlimited | Unlimited |
|
||||
| Batch Active | 2 | 10 | Unlimited | Unlimited |
|
||||
| LLM Providers | 1 | Unlimited | Unlimited | Unlimited |
|
||||
| Batch Builder | — | — | ✓ | ✓ |
|
||||
| SSO | — | — | — | ✓ |
|
||||
| Rate Limit | 20 req/min | 60 req/min | 120 req/min | 200 req/min |
|
||||
|
||||
### Stripe Integration
|
||||
|
||||
- **Checkout** — `create_checkout_session(user_id, tier)` creates a Stripe Checkout session. Returns a stub URL when Stripe is not configured.
|
||||
- **Webhooks** — Handles `checkout.session.completed`, `customer.subscription.updated`, `customer.subscription.deleted`, and `invoice.payment_failed`.
|
||||
- **Subscription management** — `get_subscription()` returns the current subscription record; `cancel_subscription()` cancels via the Stripe API and reverts the user to the free tier.
|
||||
- **Price IDs:** `price_pro_monthly`, `price_power_monthly`, `price_team_monthly`
|
||||
|
||||
### Tier Manager
|
||||
|
||||
- `get_tier(user_id)` — Returns the user's current billing tier.
|
||||
- `check_feature(tier, feature)` — Boolean feature gate check.
|
||||
- `require_feature(tier, feature)` — Raises HTTP 403 if the feature is not available.
|
||||
|
||||
---
|
||||
|
||||
## Testing
|
||||
|
||||
### Running Tests
|
||||
|
||||
```bash
|
||||
# Run all tests
|
||||
pytest
|
||||
|
||||
# Run a specific test file
|
||||
pytest tests/test_auth.py
|
||||
|
||||
# Run with verbose output
|
||||
pytest -v
|
||||
```
|
||||
|
||||
### Test Infrastructure
|
||||
|
||||
- **Database:** Async SQLite in-memory via `aiosqlite` + `StaticPool` — fast, no PostgreSQL needed.
|
||||
- **Auth helpers:** `make_jwt(tier)` and `auth_header(tier)` generate per-tier test tokens.
|
||||
- **Seed data:** Auto-creates one `User` + `Subscription` per tier (free/pro/power/team) before each test.
|
||||
- **FK enforcement:** SQLite `PRAGMA foreign_keys=ON`.
|
||||
- **No external dependencies** — all tests run fully offline.
|
||||
|
||||
### Test Coverage
|
||||
|
||||
| File | Coverage |
|
||||
|---|---|
|
||||
| `test_auth.py` | Register, login, token access, refresh, expiration |
|
||||
| `test_middleware.py` | Rate limiting by tier, sanitizer prompt leak detection |
|
||||
|
||||
---
|
||||
|
||||
## Project Structure
|
||||
|
||||
```
|
||||
adiuvai-api/
|
||||
├── alembic.ini # Alembic configuration
|
||||
├── docker-compose.yml # Docker Compose (app + PostgreSQL)
|
||||
├── Dockerfile # Multi-stage production build
|
||||
├── requirements.txt # Python dependencies
|
||||
│
|
||||
├── alembic/ # Database migrations
|
||||
│ ├── env.py # Alembic environment config
|
||||
│ ├── script.py.mako # Migration template
|
||||
│ └── versions/
|
||||
│ └── 001_initial_schema.py # Tables, indexes, FKs
|
||||
│
|
||||
├── app/ # Application source
|
||||
│ ├── main.py # FastAPI app factory, middleware, routes
|
||||
│ ├── db.py # Async SQLAlchemy engine & session
|
||||
│ ├── models.py # SQLAlchemy ORM models
|
||||
│ ├── schemas.py # Pydantic request/response schemas
|
||||
│ │
|
||||
│ ├── config/
|
||||
│ │ └── settings.py # Pydantic Settings (env vars)
|
||||
│ │
|
||||
│ ├── agents/ # LLM-powered domain agents
|
||||
│ │ ├── task_agent.py # Task & comment CRUD (8 tools)
|
||||
│ │ ├── project_agent.py # Project lifecycle (6 tools)
|
||||
│ │ ├── timeline_agent.py # Milestones (4 tools)
|
||||
│ │ └── note_agent.py # Markdown notes (5 tools)
|
||||
│ │
|
||||
│ ├── core/ # Orchestration engine
|
||||
│ │ ├── agent_registry.py # BaseAgent, ChatAgent, AgentRegistry
|
||||
│ │ ├── llm.py # LiteLLM factory (get_llm, get_router_llm)
|
||||
│ │ └── deep_agent.py # Deep agent orchestration
|
||||
│ │
|
||||
│ ├── api/ # HTTP layer
|
||||
│ │ ├── deps.py # Shared FastAPI dependencies
|
||||
│ │ ├── middleware/
|
||||
│ │ │ ├── rate_limit.py # Sliding-window tier rate limiter
|
||||
│ │ │ └── sanitizer.py # Prompt IP leak protection
|
||||
│ │ └── routes/
|
||||
│ │ ├── auth.py # Register, login, refresh, me
|
||||
│ │ ├── chat.py # Chat + embed endpoint
|
||||
│ │ ├── billing.py # Stripe checkout, webhooks, subscription
|
||||
│ │ ├── agents.py # Agent catalog, config, runs
|
||||
│ │ └── device_ws.py # Persistent device WebSocket
|
||||
│ │
|
||||
│ └── billing/
|
||||
│ ├── stripe_service.py # Stripe API wrapper
|
||||
│ └── tier_manager.py # Feature matrix, rate limits
|
||||
│
|
||||
└── tests/ # Test suite
|
||||
├── conftest.py # Fixtures: DB, auth, seeds
|
||||
├── test_auth.py
|
||||
├── test_orchestrator.py
|
||||
├── test_agents.py
|
||||
├── test_agent_registry.py
|
||||
├── test_execution_plan.py
|
||||
└── test_middleware.py
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## License
|
||||
|
||||
*To be determined.*
|
||||
|
||||
Reference in New Issue
Block a user