# RALPH LOOP PROMPT — Memory Subsystem Evolution (MemGPT + Mem0 + Mem0g-light)

> **How to run:**
> ```
> /ralph-loop "Implement the memory evolution exactly as specified in docs/PROMPT-memory-evolution.md. ALWAYS start each iteration by invoking the /caveman:caveman ultra skill at intensity 'full'. Output <promise>MEMORY EVOLUTION COMPLETE</promise> when all phases pass lint + tests." --max-iterations 40 --completion-promise "MEMORY EVOLUTION COMPLETE"
> ```

---

## MANDATORY PER-ITERATION PREAMBLE

**Every iteration MUST begin with these two actions, in order:**

1. **Activate caveman mode.** Invoke the `caveman:caveman ultra` skill at intensity `full` before any other tool call. All prose you emit during the iteration must follow caveman rules (drop articles, fragments OK, no filler, no pleasantries). Code/commits/PRs stay normal per caveman plugin rules.
2. **Read this file in full** (`docs/PROMPT-memory-evolution.md`) to re-anchor on the plan.

If caveman already active from prior iteration, re-assert it anyway — ralph loop restarts cold each time.

After preamble:

3. Inspect repo state: check which tasks already done by reading target files / running grep.
4. Pick next incomplete task in phase order (Phase 1 → 2 → 3 → 4 → 5). No skipping, no out-of-order.
5. Implement task.
6. Run relevant lint + tests for that phase before exit.
7. When ALL phases complete AND lints + tests green → output `<promise>MEMORY EVOLUTION COMPLETE</promise>`.

**DO NOT** implement multiple phases in one iteration unless they are tiny edits in the same file.

---

## LINT + TEST COMMANDS

Run after each phase:
- Backend lint: `cd api && ruff check . --fix`
- Backend tests: `cd api && pytest -q`
- Frontend lint: `cd adiuvAI && npx eslint . --fix`
- Frontend typecheck: `cd adiuvAI && npx tsc --noEmit`

---

## SOURCE OF TRUTH

Architectural rationale lives in [docs/memory-evolution-strategy.md](docs/memory-evolution-strategy.md). This file is the execution plan derived from it. If a conflict appears, the strategy doc wins on *why*, this doc wins on *how*.

**Zero-trust invariant:** all user-content writes/reads go through per-user Fernet in [api/app/core/memory_middleware.py](api/app/core/memory_middleware.py). Backend never stores plaintext user content. Embeddings may leak text to OpenAI — already accepted trade-off, documented in privacy policy.

**Tier gates** live in [api/app/billing/tier_manager.py](api/app/billing/tier_manager.py). New capabilities MUST be gated there, not ad-hoc in routes.

---

## WHAT THIS FEATURE DOES

Five goals from the strategy doc, executed in order:

1. **Activate real pgvector** on `associative` tier (replace keyword fallback). Pro+ only.
2. **Mem0-style Extract/Update pipeline** post-`store_episode`. Batch for Free, realtime for Pro+.
3. **`relational` tier (Mem0g-light)**: new table `memory_relations` — person/project/topic graph in Postgres.
4. **Settings > Memory UI** in Electron renderer — view/edit `core` + `relational`, GDPR forget.
5. **Proactive mining** (Power tier only, optional last): scheduled job promotes episodic patterns to `proactive`.

**Architectural anchors already in place** (do NOT re-create):
- `MemoryMiddleware.enrich_context` injects 4 tiers into orchestrator — extend, not replace.
- `MemoryAssociative.embedding` column exists (JSON fallback); swap to `pgvector.Vector(1536)` in migration.
- `get_llm("gpt-4o-mini", ...)` in [api/app/core/llm.py](api/app/core/llm.py) is canonical LLM factory.
- Tier-gating helper: `TierManager.has_feature(user, feature)` — add new feature enums.

---

## PHASE 1 — pgvector on associative tier (Pro+ gated)

### TASK 1.1: Alembic migration — switch `memory_associative.embedding` to `vector(1536)`

**File:** `api/alembic/versions/XXX_associative_pgvector.py` (new)

Contents:
- `CREATE EXTENSION IF NOT EXISTS vector;` (idempotent).
- `ALTER TABLE memory_associative ALTER COLUMN embedding TYPE vector(1536) USING embedding::text::vector;` must handle existing JSON rows, and fails if any stored array is not exactly 1536 long. If conversion risky, drop column and re-add: `ALTER TABLE memory_associative DROP COLUMN embedding; ALTER TABLE memory_associative ADD COLUMN embedding vector(1536);` (data loss acceptable: keyword fallback still works).
- Create IVFFlat index: `CREATE INDEX memory_associative_embedding_idx ON memory_associative USING ivfflat (embedding vector_cosine_ops) WITH (lists = 100);`
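- `downgrade()` reverses: drop index, then `ALTER TABLE memory_associative ALTER COLUMN embedding TYPE json USING embedding::text::json;`. Match the column's original type (`json` vs `jsonb`) to whatever the earlier migration created.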

Revision id: increment from latest in `api/alembic/versions/`. Check `004_add_memory_tables.py` for style.

**Done signal:** Migration applies cleanly on a fresh DB: `alembic upgrade head` exits 0.

---

### TASK 1.2: Update `MemoryAssociative.embedding` SQLAlchemy column

**File:** [api/app/models.py](api/app/models.py)

Replace:
```python
embedding: Mapped[list | None] = mapped_column(JSON, nullable=True)
```
with:
```python
from pgvector.sqlalchemy import Vector
...
embedding: Mapped[list | None] = mapped_column(Vector(1536), nullable=True)
```

Add `pgvector>=0.2.5` to `api/requirements.txt` (or `pyproject.toml` — check which is authoritative).

**Done signal:** `pgvector` import resolves, `pytest -q` still green on model import.

---

### TASK 1.3: Add `TierFeature.REAL_EMBEDDINGS` feature flag

**File:** `api/app/billing/tier_manager.py`

Add to the feature enum / matrix:
- `REAL_EMBEDDINGS = "real_embeddings"` → granted for `pro`, `power`, `team`. Free = False.

**Done signal:** `TierManager.has_feature(user, "real_embeddings")` returns correct bool per tier.
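
A minimal sketch of the feature matrix this task describes. `TIER_FEATURES` and `tier_has_feature` are hypothetical names; the real `TierManager.has_feature(user, feature)` takes a user object and may be structured differently, so treat this only as the shape of the gate.

```python
from __future__ import annotations

from enum import Enum


class TierFeature(str, Enum):
    """Feature flags named in this plan; values are the strings callers pass."""

    REAL_EMBEDDINGS = "real_embeddings"


# Hypothetical matrix: tier name -> features granted to that tier.
TIER_FEATURES: dict[str, frozenset[TierFeature]] = {
    "free": frozenset(),
    "pro": frozenset({TierFeature.REAL_EMBEDDINGS}),
    "power": frozenset({TierFeature.REAL_EMBEDDINGS}),
    "team": frozenset({TierFeature.REAL_EMBEDDINGS}),
}


def tier_has_feature(tier: str, feature: TierFeature | str) -> bool:
    """Return True if the tier grants the feature; unknown tiers/features get False."""
    try:
        wanted = TierFeature(feature)
    except ValueError:
        return False
    return wanted in TIER_FEATURES.get(tier, frozenset())
```

Defaulting unknown tiers and unknown feature strings to `False` keeps a typo from silently granting a paid capability.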

---

### TASK 1.4: Embedding helper

**File:** `api/app/core/embeddings.py` (new)

```python
async def embed_text(text: str) -> list[float] | None:
    """Call OpenAI text-embedding-3-small. Return None on failure (caller falls back to keyword)."""
```

Use `AsyncOpenAI` client (already a dep via LiteLLM). Truncate input to 8000 chars. On any exception log warning + return None; MUST NOT raise. Warning must not include the input text (zero-trust invariant).

**Done signal:** Unit test `test_embed_text_returns_1536_floats` passes with mocked client.
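
One possible shape for the helper, assuming the OpenAI v1 SDK (`AsyncOpenAI().embeddings.create`). The optional `client` parameter is an addition not in the signature above; it exists only so a test can inject a mock. The constant names are illustrative.

```python
from __future__ import annotations

import logging
from typing import Any

logger = logging.getLogger(__name__)

EMBEDDING_MODEL = "text-embedding-3-small"
MAX_INPUT_CHARS = 8000


async def embed_text(text: str, client: Any | None = None) -> list[float] | None:
    """Embed text; return None on any failure so the caller falls back to keyword."""
    try:
        if client is None:
            # Lazy import keeps this module importable where the SDK is absent.
            from openai import AsyncOpenAI

            client = AsyncOpenAI()  # reads OPENAI_API_KEY from the environment
        response = await client.embeddings.create(
            model=EMBEDDING_MODEL,
            input=text[:MAX_INPUT_CHARS],
        )
        return list(response.data[0].embedding)
    except Exception as exc:
        # Log the exception type only: the message could echo user content.
        logger.warning("embed_text failed: %s", type(exc).__name__)
        return None
```

The mocked unit test can pass any object exposing an async `embeddings.create(model=..., input=...)` that returns `.data[0].embedding`.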

---

### TASK 1.5: Wire embeddings into `_load_associative` + `store_associative`

**File:** [api/app/core/memory_middleware.py](api/app/core/memory_middleware.py)

In `_load_associative`:
1. Check user tier via `TierManager.has_feature(user, "real_embeddings")`.
2. If True → `embed_text(message)` → if vector not None run:
   ```sql
   SELECT * FROM memory_associative
   WHERE user_id = :uid
   ORDER BY embedding <=> :qvec
   LIMIT :k;
   ```
   Use SQLAlchemy `embedding.cosine_distance(qvec)` (pgvector).
3. Fallback (False or None): keep current keyword-order path.

Add new `store_associative(user_id, content)` method:
- Encrypt content with user Fernet.
- If tier has real_embeddings → compute embedding, store alongside.
- Else → store with `embedding=NULL` (still useful for future upgrade).

**Done signal:** Associative search returns semantically-closer results on a pro test user, keyword-ordered for free user.
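
For building test fixtures, it may help to compute the expected ordering independently. pgvector's `<=>` operator returns cosine distance (1 minus cosine similarity), so a pure-Python reference like the sketch below gives the order the SQL query should produce. The function names and the 2-dimensional toy vectors are illustrative only; real rows hold 1536 floats.

```python
from __future__ import annotations

import math


def cosine_distance(a: list[float], b: list[float]) -> float:
    """1 - cosine similarity, the quantity pgvector's <=> operator orders by."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_a * norm_b)


def rank_by_cosine(
    query: list[float], rows: list[tuple[str, list[float]]], k: int
) -> list[str]:
    """Return the ids of the k rows closest to the query, nearest first."""
    ranked = sorted(rows, key=lambda row: cosine_distance(query, row[1]))
    return [row_id for row_id, _ in ranked[:k]]


# Toy fixture: "b" points almost the same way as the query, "c" the opposite way.
rows = [("a", [0.0, 1.0]), ("b", [1.0, 0.1]), ("c", [-1.0, 0.0])]
nearest = rank_by_cosine([1.0, 0.0], rows, k=2)
```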

---

### TASK 1.6: Phase 1 checks

- `cd api && ruff check . --fix`
- `cd api && pytest -q tests/test_memory_middleware.py` (create minimal test if absent).
- Manual smoke: spin up docker compose, insert two associative memories via pro user, query → verify cosine ordering.

**Done signal:** All three green.

---

## PHASE 2 — Mem0-style Extract/Update pipeline

### TASK 2.1: Extraction prompt + schema

**File:** `api/app/core/memory_extraction.py` (new)

Define Pydantic models:
```python
class MemoryCandidate(BaseModel):
    type: Literal["fact", "preference", "relation", "routine"]
    content: str              # short canonical statement
    target_tier: Literal["core", "associative", "relational", "proactive"]
    subject: str | None = None     # only for relation
    predicate: str | None = None   # only for relation
    object: str | None = None      # only for relation
    confidence: float = 0.7

class ExtractionResult(BaseModel):
    candidates: list[MemoryCandidate]
```

Prompt template (system): "You are a memory extractor for a personal AI secretary. Given the last turn + core memory + recent episodes, identify durable facts, preferences, routines, and person/project relations. Output JSON matching the schema. Skip small talk. Max 5 candidates per turn."

Use `gpt-4o-mini`, `temperature=0`, `response_format={"type": "json_object"}`.

**Done signal:** Calling `extract_candidates(last_turn, core, recent)` on a fixture returns a valid `ExtractionResult`.

---

### TASK 2.2: Update decision (ADD / UPDATE / DELETE / NOOP)

**File:** `api/app/core/memory_extraction.py` (same file)

```python
async def decide_action(
    candidate: MemoryCandidate,
    existing: list[str],   # plaintext neighbours (top-3 by similarity in target tier)
) -> Literal["ADD", "UPDATE", "DELETE", "NOOP"]:
    ...  # body: LLM call + short-circuit described below
```

Uses a second `gpt-4o-mini` call with small prompt: "Given candidate and existing memories, decide ADD / UPDATE / DELETE / NOOP. Return only the verb."

Heuristic short-circuit: if `existing` empty → ADD without LLM (save cost).

**Done signal:** Unit tests for all 4 branches pass with mocked LLM.
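
A sketch of the decision step with the empty-neighbour short-circuit. Two things here are assumptions, not part of the spec: the LLM is injected as an `ask_llm` callable (so tests can mock it), and any reply that is not one of the four verbs is treated as `NOOP`. `Candidate` is a stand-in for `MemoryCandidate`.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Awaitable, Callable, Literal, get_args

Action = Literal["ADD", "UPDATE", "DELETE", "NOOP"]
_VALID_ACTIONS = frozenset(get_args(Action))


@dataclass
class Candidate:
    """Stand-in for MemoryCandidate; only `content` is needed here."""

    content: str


async def decide_action(
    candidate: Candidate,
    existing: list[str],
    ask_llm: Callable[[str], Awaitable[str]],
) -> Action:
    """Pick ADD / UPDATE / DELETE / NOOP for a candidate against its neighbours."""
    if not existing:
        return "ADD"  # nothing to compare against: skip the LLM call entirely
    prompt = (
        "Given candidate and existing memories, decide ADD / UPDATE / DELETE / NOOP. "
        "Return only the verb.\n"
        f"Candidate: {candidate.content}\n"
        "Existing:\n" + "\n".join(f"- {item}" for item in existing)
    )
    reply = await ask_llm(prompt)
    verb = reply.strip().strip(".").upper()
    # Assumption: an unparseable reply is safest treated as "do nothing".
    return verb if verb in _VALID_ACTIONS else "NOOP"
```

Exceptions from `ask_llm` propagate from this function; the sketch assumes the caller (`run_extraction`) is the layer that catches and logs them.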

---

### TASK 2.3: Pipeline orchestrator

**File:** `api/app/core/memory_extraction.py` (same file)

```python
async def run_extraction(
    db: AsyncSession,
    user_id: str,
    last_user_msg: str,
    last_assistant_msg: str,
    session_id: str | None,
) -> None:
    ...  # body: steps listed below
```

Steps:
1. Load small context: `core_memory` + last 5 episodes (via middleware helpers).
2. `extract_candidates(...)`.
3. For each candidate: similarity-search target tier → top-3 neighbours → `decide_action` → apply via `MemoryMiddleware.update_core` / `store_associative` / (new) `upsert_relation` / `store_proactive`.
4. Log Langfuse trace with `trace_id`.
5. MUST not raise — wrap in try/except, log warning.

**Done signal:** Calling `run_extraction` on a fake "user said my CFO is Giulia" produces a relation candidate and a core candidate, and writes them.
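
The control flow of steps 2, 3 and 5 can be sketched with the collaborators injected as callables. This is not the real signature: `run_extraction_safely` and its four parameters are hypothetical, and returning a count of applied writes (instead of `None`) is only there to make the sketch testable. The cap of 5 mirrors "Max 5 candidates per turn" from the extraction prompt.

```python
from __future__ import annotations

import logging
from typing import Any, Awaitable, Callable

logger = logging.getLogger(__name__)

MAX_CANDIDATES = 5


async def run_extraction_safely(
    extract: Callable[[], Awaitable[list[Any]]],
    find_neighbours: Callable[[Any], Awaitable[list[str]]],
    decide: Callable[[Any, list[str]], Awaitable[str]],
    apply: Callable[[Any, str], Awaitable[None]],
) -> int:
    """Run extract -> decide -> apply without ever raising; return writes applied."""
    try:
        candidates = await extract()
    except Exception as exc:
        logger.warning("extraction failed: %s", type(exc).__name__)
        return 0

    applied = 0
    for candidate in candidates[:MAX_CANDIDATES]:
        try:
            neighbours = await find_neighbours(candidate)
            action = await decide(candidate, neighbours)
            if action == "NOOP":
                continue
            await apply(candidate, action)
            applied += 1
        except Exception as exc:
            # One bad candidate must not abort the rest; log the type only.
            logger.warning("candidate skipped: %s", type(exc).__name__)
    return applied
```

Catching per candidate, in addition to the outer guard, means a single failed write does not discard the other candidates from the same turn.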

---

### TASK 2.4: Tier-gated dispatch

**File:** [api/app/core/memory_middleware.py](api/app/core/memory_middleware.py)

After `store_episode` success, dispatch extraction:
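- Pro / Power / Team → schedule realtime task (`asyncio.create_task(run_extraction(...))`): fire-and-forget, exceptions swallowed. Keep a reference to the returned task until it finishes; the event loop holds only a weak reference, so an unreferenced task can be garbage-collected mid-run.
- Free → enqueue a daily-batch marker row (new table `extraction_queue(user_id, episode_id, created_at)`, which needs its own Alembic migration). A separate cron (Phase 5 stub OK) drains it.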

Add `TierFeature.REALTIME_EXTRACTION` to tier_manager (Free=False).

**Done signal:** Pro user triggers realtime task (verified via log line); Free user gets queue row.

---

### TASK 2.5: Phase 2 checks

- `cd api && ruff check . --fix`
- `cd api && pytest -q tests/test_memory_extraction.py`

---

## PHASE 3 — `relational` tier (Mem0g-light)

### TASK 3.1: Alembic migration — `memory_relations` table

**File:** `api/alembic/versions/XXX_memory_relations.py` (new)

```sql
CREATE TABLE memory_relations (
  id UUID PRIMARY KEY,
  user_id UUID NOT NULL REFERENCES users(id),
  subject_label VARCHAR(128) NOT NULL,        -- canonical label (e.g. "Giulia")
  subject_type VARCHAR(32) NOT NULL,          -- 'person' | 'company' | 'project' | 'topic'
  predicate VARCHAR(64) NOT NULL,             -- 'works_at' | 'reports_to' | 'stakeholder_of' | 'last_contacted_on' | 'owes_followup' | custom
  object_label VARCHAR(128) NOT NULL,
  object_type VARCHAR(32) NOT NULL,
  confidence FLOAT NOT NULL DEFAULT 0.7,
  source_episode_id UUID NULL REFERENCES memory_episodic(id),
  notes_encrypted BYTEA NULL,                 -- Fernet, optional per-user commentary
  created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
  updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
  last_confirmed_at TIMESTAMPTZ NULL          -- used by TTL decay
);
CREATE INDEX memory_relations_user_subject_idx ON memory_relations(user_id, subject_label);
CREATE INDEX memory_relations_user_predicate_idx ON memory_relations(user_id, predicate);
```

**Done signal:** `alembic upgrade head` clean.

---

### TASK 3.2: `MemoryRelation` ORM model

**File:** [api/app/models.py](api/app/models.py)

Mirror the table above. `subject_label` / `object_label` are **plaintext** (entity names — treated as identifiers, not content). `notes_encrypted` uses Fernet like other tiers.

**Done signal:** Import of `MemoryRelation` resolves.

---

### TASK 3.3: Relational middleware methods

**File:** [api/app/core/memory_middleware.py](api/app/core/memory_middleware.py)

Add:
- `async def upsert_relation(user_id, subject, subject_type, predicate, object_, object_type, *, confidence=0.7, source_episode_id=None, notes=None) -> None`
- `async def query_relations(user_id, subject=None, predicate=None, object_=None, limit=20) -> list[MemoryRelation]`
- Extend `enrich_context` return dict with key `relational_memory` — list of short strings `"{subject} --{predicate}--> {object}"` filtered by recent/confident (top 10).
- Tier-gate: Free tier → skip (empty list). Pro = base (person/project predicates only). Power = all predicates incl. custom. Use new `TierFeature.RELATIONAL_MEMORY`.

**Done signal:** Unit tests: upsert then query returns row; tier gating enforces limits.
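
A sketch of how the `relational_memory` strings and the tier gate might fit together. Several details are assumptions: `Relation` stands in for `MemoryRelation`, the contents of `BASE_PREDICATES` are a placeholder (the plan does not fix which predicates count as "base"), ranking by confidence then recency is one reading of "recent/confident", and the Team tier is left out because the task does not specify it.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime


@dataclass
class Relation:
    """Stand-in for the MemoryRelation ORM row (only the fields used here)."""

    subject_label: str
    predicate: str
    object_label: str
    confidence: float
    updated_at: datetime


# Placeholder set: which predicates are "base" is not fixed by this plan.
BASE_PREDICATES = frozenset({"works_at", "reports_to", "stakeholder_of"})

# None means "no predicate restriction". Tiers absent from the map get nothing.
ALLOWED_PREDICATES: dict[str, frozenset[str] | None] = {
    "pro": BASE_PREDICATES,
    "power": None,
}


def relational_context(relations: list[Relation], tier: str, limit: int = 10) -> list[str]:
    """Format the top relations for prompt injection, honouring the tier gate."""
    if tier not in ALLOWED_PREDICATES:
        return []
    allowed = ALLOWED_PREDICATES[tier]
    visible = [r for r in relations if allowed is None or r.predicate in allowed]
    visible.sort(key=lambda r: (r.confidence, r.updated_at), reverse=True)
    return [
        f"{r.subject_label} --{r.predicate}--> {r.object_label}"
        for r in visible[:limit]
    ]
```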

---

### TASK 3.4: Orchestrator prompt injection

**File:** `api/app/core/deep_agent.py`

Where `core_memory` / `episodic` already injected into system prompt, add a new paragraph labelled **"Known people & projects:"** listing the `relational_memory` strings. Keep under 800 chars (truncate if longer).

**Done signal:** Running a turn with seeded relations — agent uses the info (verified via Langfuse trace + test).
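
The 800-character budget can be enforced while building the paragraph. The sketch below cuts at whole-line boundaries so no relation string is left half-written; that choice, the `- ` bullet prefix, and the function name `build_relations_paragraph` are assumptions, since the task only says "truncate if longer".

```python
from __future__ import annotations

HEADER = "Known people & projects:"
MAX_CHARS = 800


def build_relations_paragraph(lines: list[str], max_chars: int = MAX_CHARS) -> str:
    """Join relation strings under the header, dropping lines that exceed the budget."""
    paragraph = HEADER
    for line in lines:
        addition = "\n- " + line
        if len(paragraph) + len(addition) > max_chars:
            break  # stop at the first line that would overflow
        paragraph += addition
    # A header with no entries adds noise to the prompt, so emit nothing instead.
    return paragraph if paragraph != HEADER else ""
```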

---

### TASK 3.5: Hook into extraction pipeline

**File:** `api/app/core/memory_extraction.py`

When `candidate.type == "relation"` → call `upsert_relation(...)` instead of `update_core` / `store_associative`.

**Done signal:** End-to-end test: turn saying "Marco is the PM on Project Acme" produces a `person --stakeholder_of--> project` row.

---

### TASK 3.6: TTL + decay job

**File:** `api/app/core/memory_extraction.py` (or new `memory_maintenance.py`)

```python
async def decay_relations(db, user_id) -> None:
    # confidence *= 0.95 every 30 days since last_confirmed_at
    # delete rows with confidence < 0.2
    ...
```

Wire into the same daily batch cron as Free extraction (Phase 5 introduces scheduler — OK to define function now and call it from a stub).

**Done signal:** Function exists + has unit test on a seeded fixture.
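
The decay rule is easiest to unit-test as a pure function, separate from the database work. The sketch below uses hypothetical names and assumes `confidence` is the value as of `last_confirmed_at`, applying one 0.95 factor per full 30 days elapsed.

```python
from __future__ import annotations

from datetime import datetime

DECAY_FACTOR = 0.95
DECAY_PERIOD_DAYS = 30
DELETE_BELOW = 0.2


def decayed_confidence(
    confidence: float, last_confirmed_at: datetime, now: datetime
) -> float:
    """Apply one DECAY_FACTOR per full 30-day period since last confirmation."""
    elapsed_days = max((now - last_confirmed_at).days, 0)
    periods = elapsed_days // DECAY_PERIOD_DAYS
    return confidence * DECAY_FACTOR**periods


def should_delete(confidence: float, last_confirmed_at: datetime, now: datetime) -> bool:
    """True once the decayed confidence falls below the deletion threshold."""
    return decayed_confidence(confidence, last_confirmed_at, now) < DELETE_BELOW
```

If the daily job writes the decayed value back into `confidence`, running this formula again the next day would compound the decay. The job would then need to track when decay was last applied, or keep the undecayed value separately; the plan does not say which.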

---

### TASK 3.7: Phase 3 checks

- `cd api && ruff check . --fix`
- `cd api && pytest -q tests/test_memory_relations.py`

---

## PHASE 4 — Settings > Memory UI (Electron renderer)

### TASK 4.1: Backend endpoints for UI

**File:** `api/app/api/routes/auth.py` (memory sub-section) or new `api/app/api/routes/memory.py`

Routes (all `@require_auth`, return user-scoped data only):
- `GET /auth/me/memory/core` → `dict[str, str]` (plaintext, decrypted).
- `GET /auth/me/memory/relational` → `list[RelationOut]` (subject/pred/obj/confidence/last_confirmed_at).
- `PATCH /auth/me/memory/relational/{id}` → edit label/confidence; body validates predicate ∈ allowed set.
- `DELETE /auth/me/memory/relational/{id}` → hard delete (GDPR Art. 17).
- `DELETE /auth/me/memory/core/{key}` → remove a core k/v.
- `POST /auth/me/memory/forget-all` → wipe every memory tier for user (the 4 existing tiers plus `relational`); audit log entry. Requires `X-Confirm: true` header; reject with 400 otherwise. Do NOT delete the User row.

**Done signal:** OpenAPI schema shows all 6 routes; pytest green.

---

### TASK 4.2: tRPC + auth-manager wrappers

**File:** [adiuvAI/src/main/auth/auth-manager.ts](adiuvAI/src/main/auth/auth-manager.ts) + [adiuvAI/src/main/router/index.ts](adiuvAI/src/main/router/index.ts)

Add auth-manager methods (6) wrapping each HTTP endpoint. Add tRPC procedures in a new `memoryRouter` merged into app router.

**Done signal:** `trpc.memory.listRelational.useQuery()` resolves from renderer.

---

### TASK 4.3: `MemorySection` settings page

**File:** `adiuvAI/src/renderer/components/settings/MemorySection.tsx` (new)

Sections in order:
1. **Core preferences** — table of k/v from `trpc.memory.getCore`. Each row: key, value, edit pencil (inline input), trash icon (`deleteCore`). Add-row form at bottom.
2. **People & relationships** — table of relations. Columns: subject, predicate (select), object, confidence (progress bar), last confirmed (formatted via `formatRow`). Pencil → edit in drawer. Trash → `deleteRelation`.
3. **Danger zone** — red Card with "Forget everything" button. Confirm dialog (typed "forget" to enable) → calls `forgetAll` with `X-Confirm: true`.

Wire into `SECTIONS` in [adiuvAI/src/renderer/components/settings/types.ts](adiuvAI/src/renderer/components/settings/types.ts) as `{ id: 'memory', label: 'Memory', icon: Brain }`. Use `Brain` from `lucide-react`.

**Free tier gating:** if `profile.tier === 'free'` → relational table hidden with upgrade CTA instead. Use `usePlatform()` + profile tier check.

**Done signal:** `/settings` → Memory tab renders all three sections, edits/deletes round-trip to backend.

---

### TASK 4.4: i18n keys

Add translation keys to all 5 JSON files under namespace `settings.memory.*`:
- `corePreferences`, `peopleRelationships`, `dangerZone`, `forgetEverything`, `forgetConfirm`, `addEntry`, `noEntries`, `upgradeToSeePeople`.

Keep `common.*` reuse for `save`/`cancel`/`delete`/`edit` (already present).

**Done signal:** All 5 locale files include the new keys.

---

### TASK 4.5: Phase 4 checks

- `cd adiuvAI && npx eslint . --fix`
- `cd adiuvAI && npx tsc --noEmit`
- Manual: run `npm run start`, log in, open Settings > Memory, edit a core key, verify persisted via `GET /auth/me` memory echo.

---

## PHASE 5 — Proactive mining (Power tier only)

### TASK 5.1: Scheduler skeleton

**File:** `api/app/core/memory_maintenance.py`

Two entrypoints, callable from a cron runner (APScheduler already a dep — if not, add):
- `drain_extraction_queue()` — processes `extraction_queue` rows (Phase 2.4) for Free tier users, batched.
- `mine_proactive_patterns(user_id)` — for Power tier users only. Reads last 30 days episodic, runs a single `gpt-4o-mini` call: "Identify recurring temporal/behavioral patterns". Writes results to `memory_proactive` with `confidence`. Applies decay (conf *= 0.9 per 7 days since last sighting).

Register jobs in `app/main.py` startup (only if `settings.SCHEDULER_ENABLED=True`, default True; false in tests).

**Done signal:** `pytest -q` green (scheduler disabled). Manual: setting `SCHEDULER_ENABLED=True` + dev run logs "memory cron tick" every 1h.

---

### TASK 5.2: Surfacing proactive hints

**File:** `api/app/core/deep_agent.py` + `adiuvAI/src/renderer/components/home/DailyBrief.tsx` (if exists)

Backend already injects `proactive_hints` into prompt (middleware). Confirm still works after changes; add unit test with seeded proactive row → assert string present in final system prompt.

On renderer, if daily brief component exists, show proactive hints as chips under "I noticed…" header. If not, skip — not a regression.

**Done signal:** System prompt includes proactive line when row exists + confidence ≥ threshold.

---

### TASK 5.3: Tier gate

Add `TierFeature.PROACTIVE_MINING` to tier_manager — Power + Team only.

**Done signal:** Free/Pro user → no cron row for them; Power user → mining runs.

---

### TASK 5.4: Phase 5 checks

- `cd api && ruff check . --fix`
- `cd api && pytest -q`

---

## PHASE 6 — Completion

### TASK 6.1: Verify all files exist / modified

New files:
- [ ] `api/alembic/versions/*_associative_pgvector.py`
- [ ] `api/alembic/versions/*_memory_relations.py`
- [ ] `api/app/core/embeddings.py`
- [ ] `api/app/core/memory_extraction.py`
- [ ] `api/app/core/memory_maintenance.py`
- [ ] `api/app/api/routes/memory.py` (or new routes appended in `auth.py`)
- [ ] `adiuvAI/src/renderer/components/settings/MemorySection.tsx`

Modified files:
- [ ] `api/app/models.py` (MemoryAssociative.embedding Vector(1536), MemoryRelation class)
- [ ] `api/app/core/memory_middleware.py` (real pgvector path, relational methods, enrich_context extended, dispatch extraction after store_episode)
- [ ] `api/app/billing/tier_manager.py` (REAL_EMBEDDINGS, REALTIME_EXTRACTION, RELATIONAL_MEMORY, PROACTIVE_MINING features)
- [ ] `api/app/core/deep_agent.py` (relational injection)
- [ ] `api/app/main.py` (scheduler startup)
- [ ] `api/requirements.txt` (pgvector, APScheduler)
- [ ] `adiuvAI/src/main/auth/auth-manager.ts` (6 memory methods)
- [ ] `adiuvAI/src/main/router/index.ts` (memoryRouter merged)
- [ ] `adiuvAI/src/renderer/components/settings/types.ts` (memory section entry)
- [ ] `adiuvAI/src/renderer/locales/{en,it,es,fr,de}/translation.json` (settings.memory.* keys)

### TASK 6.2: Full gauntlet

Run all four commands, expect exit 0:
```bash
cd api && ruff check . --fix
cd api && pytest -q
cd adiuvAI && npx eslint . --fix
cd adiuvAI && npx tsc --noEmit
```

### TASK 6.3: Output completion promise

If gauntlet green and file checklist complete:

```
<promise>MEMORY EVOLUTION COMPLETE</promise>
```

---

## DO NOT

- Skip the per-iteration caveman preamble — it is part of the contract of this loop.
- Break zero-trust: never log / return plaintext user content in error paths. Relation `subject_label`/`object_label` ARE treated as identifiers — log OK. `notes_encrypted` never logged.
- Introduce A-Mem-style retroactive memory rewrites. Explicitly out of scope (strategy doc §3.3).
- Introduce AutoGPT-style reflective loops. Out of scope.
- Store format prefs or device-specific UI data in core memory — that's electron-store territory (see PROMPT-onboarding.md for precedent).
- Use Neo4j or any external graph DB — plain Postgres table is the spec.
- Call OpenAI embeddings for Free-tier users.
- Ship proactive mining (Phase 5) before Phase 3 (relational) is green — order matters.
- Delete user rows in `forget-all` — only memory rows.
- Let extraction pipeline or LLM normalization raise into the request path — always try/except, log, swallow.

---

## REFERENCE — Existing patterns to reuse

| Pattern | Source | Reuse for |
|---------|--------|-----------|
| Fernet per-user enc/dec | [api/app/core/memory_middleware.py](api/app/core/memory_middleware.py) `_get_fernet`, `_safe_decrypt` | New relational `notes_encrypted`, extraction writes |
| LLM factory | [api/app/core/llm.py](api/app/core/llm.py) `get_llm` | Extraction + normalization + proactive mining |
| Tier check | `api/app/billing/tier_manager.py` `has_feature` | All tier gates in this plan |
| Alembic async URL split | [api/alembic/env.py](api/alembic/env.py) | New migrations |
| tRPC procedure + authManager wrap | [adiuvAI/src/main/router/index.ts](adiuvAI/src/main/router/index.ts), [auth-manager.ts](adiuvAI/src/main/auth/auth-manager.ts) | 6 memory routes |
| Settings section pattern | [adiuvAI/src/renderer/components/settings/ProfileSection.tsx](adiuvAI/src/renderer/components/settings/ProfileSection.tsx) | MemorySection shape |
| shadcn table + drawer + confirm | Existing Settings sections | Memory tables + forget confirm |
| i18n labelKey pattern | See CLAUDE.md i18n section | All new strings |

---

## CAVEMAN MODE REMINDER

This document's plan is executed **under caveman:caveman ultra**. Every iteration: activate the skill first, then work. Terse prose in all user-facing text emitted during the loop. Code + commit messages + migration SQL stay normal per caveman plugin boundaries.

If caveman plugin unavailable for any reason, STOP the iteration and report instead of proceeding in default mode — the loop contract requires it.