diff --git a/AI_REFACTOR_PLAN.md b/AI_REFACTOR_PLAN.md
index 66f09f4..12fe505 100644
--- a/AI_REFACTOR_PLAN.md
+++ b/AI_REFACTOR_PLAN.md
@@ -509,4 +509,116 @@ Cloud Agent:
 | `msal` | MS identity platform auth |
 | `apscheduler>=4.0` | Agent scheduling |
 | `cryptography` (Fernet) | OAuth token encryption at rest |
+
+---
+
+## Phase 5 — Shared Memory (Agent KV + Chat WS Fix)
+
+> **Objective:** Give chat agents persistent memory via a KV store on the Electron client. Agents call `store_memory()` to remember user preferences, patterns, and corrections, and `recall_memories()` to retrieve them. All data lives in Electron's SQLite `agent_memory` table (local-first, never stored server-side). This also requires fixing the chat WS handler to support bidirectional tool calls, a critical gap that currently blocks all agent tools from working over the `/chat/stream` endpoint.
+>
+> **Electron Phase 5 plan:** `../adiuva/AI_REFACTOR_PLAN.md` Phase 5 section.
+>
+> **Why agent KV matters:** Chat agents are currently stateless — they can't remember "User prefers to-do in lowercase" or "Client X billing cycle is the 15th". With KV memory, agents become learning assistants that improve over time. Users feel the AI "knows them" without any data leaving their device.
+>
+> **Why the chat WS fix is critical:** The existing `/chat/stream` WS handler (`app/api/routes/chat.py`) never calls `set_client_executor()`. This means `execute_on_client()` raises `RuntimeError` whenever any agent tool tries to call it during a chat session. All 23 tools are broken over chat WS. This must be fixed before memory tools (or any tools) can work.
+>
+> **New Electron tables** (managed by Electron, accessed by backend via `execute_on_client`):
+> - `chat_messages`: `id`, `scope`, `role`, `content`, `error`, `created_at`
+> - `agent_memory`: `id`, `agent_name`, `key`, `value`, `scope`, `created_at`, `updated_at` (unique on `agent_name, key, scope`)
+
+### Step 5.1 — Fix chat WS for bidirectional tool calls (PREREQUISITE)
+
+> **This is the highest-priority backend fix.** Without it, zero agent tools work over the chat WS connection.
+
+- [ ] Rewrite `app/api/routes/chat.py` — `chat_stream()` WS handler:
+  - After auth + accept, receive first frame as `{"type": "chat_request", ...}` (not raw `ChatRequest`)
+  - Parse frame, extract `message` and `context`
+  - Set up a local `pending_calls: dict[str, asyncio.Future]` for tool-call round-trips
+  - Define executor callback:
+    ```python
+    async def execute_callback(payload: dict) -> dict:
+        call_id = payload["id"]
+        fut = asyncio.get_running_loop().create_future()
+        pending_calls[call_id] = fut
+        await websocket.send_text(json.dumps({"type": "tool_call", **payload}))
+        return await asyncio.wait_for(fut, timeout=30.0)
+    ```
+  - Call `set_client_executor(execute_callback)` before orchestrating
+  - Run two concurrent tasks:
+    1. **Receive loop**: dispatches incoming frames — `tool_result` resolves pending Futures, `pong` ignored
+    2. **Orchestration task**: calls `orchestrate_stream()`, wraps chunks in `{"type": "text_chunk", "text": "..."}` frames, sends `{"type": "final", "response": "..."}` on completion
+  - Call `clear_client_executor()` in finally block
+  - Keep heartbeat ping every 30s
+  - 30s timeout on each `tool_result`: when `asyncio.wait_for` raises `TimeoutError`, remove the stale entry from `pending_calls` and have the tool return an error string to the LLM
+- [ ] Update `orchestrate_stream()` in `app/core/orchestrator.py` if needed:
+  - Ensure it yields text chunks correctly (it currently slices the response into fixed 50-character chunks; consider yielding the full response as a single chunk for now)
+- **Files:** `app/api/routes/chat.py`, `app/core/orchestrator.py`
+- **Outcome:** Full bidirectional WS. Tool calls, text streaming, and heartbeats happen concurrently. All 23 existing agent tools now work over chat WS.
+
+### Step 5.2 — Agent memory tools
+
+- [ ] Create `app/agents/tools/memory_tools.py`:
+  - `create_memory_tools(agent_name: str) -> list[BaseTool]`: factory function that returns two LangChain `@tool` functions with `agent_name` bound via closure:
+    - **`store_memory(key: str, value: str, scope: str = "global")`**:
+      - Calls `execute_on_client(action="select", table="agentMemory", filters={"agentName": agent_name, "key": key, "scope": scope})`
+      - If row exists: `execute_on_client(action="update", table="agentMemory", data={"id": row["id"], "updates": {"value": value, "updatedAt": <now_ms>}})`
+      - If not: `execute_on_client(action="insert", table="agentMemory", data={"agentName": agent_name, "key": key, "value": value, "scope": scope})`
+      - Returns `"Stored memory: [key] = [value]"`
+    - **`recall_memories(key_pattern: str | None = None, scope: str = "global", limit: int = 10)`**:
+      - Calls `execute_on_client(action="select", table="agentMemory", filters={"agentName": agent_name, "scope": scope, "search": key_pattern})`
+      - Returns at most `limit` entries as a formatted list: `"key1: value1\nkey2: value2\n..."` or `"No memories found."`
+  - Timestamps are Unix milliseconds (consistent with Electron's `Date.now()`)
+  - Agent name scoping: each agent only sees its own memories (filtered by `agentName`)
+- **Files:** `app/agents/tools/memory_tools.py`
+- **Outcome:** Two reusable tools any agent can include. Upsert semantics via select-then-insert/update.
+
+### Step 5.3 — Register memory tools on all agents
+
+- [ ] Update `app/agents/task_agent.py`:
+  - Import `create_memory_tools` from `app.agents.tools.memory_tools`
+  - Add memory tools to `get_tools()`: `return [list_tasks, create_task, ..., *create_memory_tools("task_agent")]`
+  - Append to `_SYSTEM_PROMPT`: `"\n\nYou can store important facts about user preferences using store_memory and recall past facts using recall_memories. Store corrections, preferences, and patterns the user shares (e.g. 'User prefers short task titles', 'Default priority is medium'). Always check memories before giving advice."`
+- [ ] Update `app/agents/project_agent.py` — same pattern with `create_memory_tools("project_agent")`
+- [ ] Update `app/agents/note_agent.py` — same pattern with `create_memory_tools("note_agent")`
+- [ ] Update `app/agents/checkpoint_agent.py` — same pattern with `create_memory_tools("checkpoint_agent")`
+- **Files:** `app/agents/task_agent.py`, `app/agents/project_agent.py`, `app/agents/note_agent.py`, `app/agents/checkpoint_agent.py`
+- **Outcome:** All 4 chat agents can store and recall persistent memories. Each agent's memories are scoped by `agentName`.
+
+### Step 5.4 — Extend ChatContext with agent memories
+
+- [ ] Update `app/schemas.py`:
+  - Add `agent_memories: list[dict[str, Any]] = Field(default_factory=list)` to `ChatContext`
+  - These are pre-loaded by Electron (from `agent_memory` table) and included in every request
+- [ ] Agent `handle()` methods already receive full `context` dict — memories are visible in `context["agent_memories"]`
+- [ ] Agent system prompts reference memories from context: agents see pre-loaded memories AND can call `recall_memories` for targeted lookup
+- **Files:** `app/schemas.py`
+- **Outcome:** Backend receives pre-loaded memories from Electron. Agents have dual-path access: context injection (passive) + tool call (active).
+
+### Phase 5 — Verification
+
+| # | Scenario | Expected |
+|---|---|---|
+| 1 | **Chat WS bidirectional** | Connect → send `chat_request` → receive `tool_call` → respond `tool_result` → receive `text_chunk` → `final` |
+| 2 | **All existing tools work** | "List my tasks" over chat WS → `tool_call(select, tasks)` → Electron returns rows → LLM responds with real task data |
+| 3 | **Store memory** | "Remember that I prefer short task titles" → `store_memory("task_title_preference", "short")` → `tool_call(select, agentMemory)` finds no row → `tool_call(insert, agentMemory)` → Electron persists |
+| 4 | **Recall memory** | New chat session → "How should I name tasks?" → agent sees pre-loaded memory in context or calls `recall_memories` → references stored preference |
+| 5 | **Upsert semantics** | Store same key twice → only one row exists with updated value |
+| 6 | **Agent scope isolation** | `task_agent` stores memory → `note_agent` cannot see it (filtered by `agentName`) |
+| 7 | **Project scope** | Store memory with `scope="project:<uuid>"` → only visible in that project's chat context |
+| 8 | **Tool timeout** | Disconnect Electron mid-tool-call → 30s timeout → tool returns error → LLM handles gracefully |
+| 9 | **Sequential tool calls** | Agent calls `list_tasks` then `recall_memories` in sequence → both WS round-trips succeed |
+| 10 | **Existing tests pass** | `pytest` — no regressions in agent tools or orchestrator |
+
+### Phase 5 — Step Dependencies
+
+```
+Step 5.1 (chat WS fix) ──────────────► Step 5.2 (memory tools) ──► Step 5.3 (register on agents)
+                                                                  ──► Step 5.4 (extend ChatContext)
+
+Step 5.1 is the BLOCKER — nothing else works until bidirectional tool calls are wired.
+Steps 5.3 and 5.4 can run in parallel after 5.2.
+```
+
+---
+
 - **One step at a time.** Mark `[x]` and commit with `step N.N complete: <outcome>`.
\ No newline at end of file