From 38d7292df7ce7485203915ce5cf8ce9225ead0cf Mon Sep 17 00:00:00 2001
From: Ryan Chen
Date: Sat, 7 Feb 2026 13:18:57 -0500
Subject: [PATCH] docs: create roadmap (4 phases)

Phases:
1. Foundation: Database models and IMAP utilities
2. Account Management: Admin UI for email configuration (ACCT-01 to ACCT-07)
3. Email Ingestion: Sync engine and retention cleanup (SYNC-01 to SYNC-09, RETN-01 to RETN-05)
4. Query Tools: LangChain email analytics (QUERY-01 to QUERY-06)

All v1 requirements mapped to phases.
---
 .planning/REQUIREMENTS.md | 34 ++++++++++++--
 .planning/ROADMAP.md      | 94 +++++++++++++++++++++++++++++++++++++++
 .planning/STATE.md        | 64 ++++++++++++++++++++++++++
 3 files changed, 188 insertions(+), 4 deletions(-)
 create mode 100644 .planning/ROADMAP.md
 create mode 100644 .planning/STATE.md

diff --git a/.planning/REQUIREMENTS.md b/.planning/REQUIREMENTS.md
index b3dafa1..194c984 100644
--- a/.planning/REQUIREMENTS.md
+++ b/.planning/REQUIREMENTS.md
@@ -82,13 +82,39 @@ Which phases cover which requirements. Updated during roadmap creation.
 
 | Requirement | Phase | Status |
 |-------------|-------|--------|
-| (To be populated by roadmap) | | |
+| ACCT-01 | Phase 2 | Pending |
+| ACCT-02 | Phase 2 | Pending |
+| ACCT-03 | Phase 2 | Pending |
+| ACCT-04 | Phase 2 | Pending |
+| ACCT-05 | Phase 2 | Pending |
+| ACCT-06 | Phase 2 | Pending |
+| ACCT-07 | Phase 2 | Pending |
+| SYNC-01 | Phase 3 | Pending |
+| SYNC-02 | Phase 3 | Pending |
+| SYNC-03 | Phase 3 | Pending |
+| SYNC-04 | Phase 3 | Pending |
+| SYNC-05 | Phase 3 | Pending |
+| SYNC-06 | Phase 3 | Pending |
+| SYNC-07 | Phase 3 | Pending |
+| SYNC-08 | Phase 3 | Pending |
+| SYNC-09 | Phase 3 | Pending |
+| RETN-01 | Phase 3 | Pending |
+| RETN-02 | Phase 3 | Pending |
+| RETN-03 | Phase 3 | Pending |
+| RETN-04 | Phase 3 | Pending |
+| RETN-05 | Phase 3 | Pending |
+| QUERY-01 | Phase 4 | Pending |
+| QUERY-02 | Phase 4 | Pending |
+| QUERY-03 | Phase 4 | Pending |
+| QUERY-04 | Phase 4 | Pending |
+| QUERY-05 | Phase 4 | Pending |
+| QUERY-06 | Phase 4 | Pending |
 
 **Coverage:**
 - v1 requirements: 25 total
-- Mapped to phases: 0
-- Unmapped: 25 ⚠️
+- Mapped to phases: 25
+- Unmapped: 0
 
 ---
 *Requirements defined: 2026-02-04*
-*Last updated: 2026-02-04 after initial definition*
+*Last updated: 2026-02-07 after roadmap creation*
diff --git a/.planning/ROADMAP.md b/.planning/ROADMAP.md
new file mode 100644
index 0000000..aba3a15
--- /dev/null
+++ b/.planning/ROADMAP.md
@@ -0,0 +1,94 @@
+# Roadmap: SimbaRAG Email Integration
+
+## Overview
+
+Add IMAP email ingestion to SimbaRAG's existing document/finance/meal analytics capabilities. Admin users can configure email accounts, system syncs and embeds emails into ChromaDB on a schedule, automatically purges emails older than 30 days, and provides LangChain tools for inbox analytics through natural conversation.
+
+## Phases
+
+**Phase Numbering:**
+- Integer phases (1, 2, 3): Planned milestone work
+- Decimal phases (2.1, 2.2): Urgent insertions (marked with INSERTED)
+
+Decimal phases appear between their surrounding integers in numeric order.
+
+- [ ] **Phase 1: Foundation** - Database models and IMAP utilities
+- [ ] **Phase 2: Account Management** - Admin UI for configuring email accounts
+- [ ] **Phase 3: Email Ingestion** - Sync engine, embeddings, retention cleanup
+- [ ] **Phase 4: Query Tools** - LangChain tools for email analytics
+
+## Phase Details
+
+### Phase 1: Foundation
+**Goal**: Core infrastructure for email ingestion is in place
+**Depends on**: Nothing (first phase)
+**Requirements**: None (foundational infrastructure)
+**Success Criteria** (what must be TRUE):
+ 1. Database tables exist for email accounts, sync status, and email metadata
+ 2. IMAP connection utility can authenticate and list folders from test server
+ 3. Email body parser extracts text from both plain text and HTML formats
+ 4. Encryption utility securely stores and retrieves IMAP credentials
+**Plans**: TBD
+
+Plans:
+- [ ] 01-01: TBD
+
+### Phase 2: Account Management
+**Goal**: Admin users can configure and manage IMAP email accounts
+**Depends on**: Phase 1
+**Requirements**: ACCT-01, ACCT-02, ACCT-03, ACCT-04, ACCT-05, ACCT-06, ACCT-07
+**Success Criteria** (what must be TRUE):
+ 1. Admin can add new IMAP account with host, port, username, password, folder selection
+ 2. Admin can test IMAP connection and see success/failure before saving
+ 3. Admin can view list of configured accounts with masked credentials
+ 4. Admin can edit existing account configuration and delete accounts
+ 5. Only users in lldap_admin group can access email account endpoints
+**Plans**: TBD
+
+Plans:
+- [ ] 02-01: TBD
+
+### Phase 3: Email Ingestion
+**Goal**: System automatically syncs emails, creates embeddings, and purges old content
+**Depends on**: Phase 2
+**Requirements**: SYNC-01, SYNC-02, SYNC-03, SYNC-04, SYNC-05, SYNC-06, SYNC-07, SYNC-08, SYNC-09, RETN-01, RETN-02, RETN-03, RETN-04, RETN-05
+**Success Criteria** (what must be TRUE):
+ 1. System connects to configured IMAP accounts and fetches messages from selected folders
+ 2. System parses email metadata (subject, sender, date) and extracts body text from plain/HTML
+ 3. System generates embeddings and stores emails in ChromaDB with metadata
+ 4. System performs scheduled sync at configurable intervals (default hourly)
+ 5. System tracks last sync timestamp and performs incremental sync (only new emails)
+ 6. System automatically purges emails older than retention period (default 30 days)
+ 7. Admin can view sync logs showing success/failure, counts, and errors
+**Plans**: TBD
+
+Plans:
+- [ ] 03-01: TBD
+
+### Phase 4: Query Tools
+**Goal**: Admin users can query email content through conversational interface
+**Depends on**: Phase 3
+**Requirements**: QUERY-01, QUERY-02, QUERY-03, QUERY-04, QUERY-05, QUERY-06
+**Success Criteria** (what must be TRUE):
+ 1. LangChain agent has tool to search emails by content, sender, or date range
+ 2. Agent can identify most frequent senders in a timeframe
+ 3. Agent can analyze subject lines and identify common topics
+ 4. Agent can detect subscription/newsletter patterns (recurring senders, unsubscribe links)
+ 5. Agent can answer time-based queries ("emails this week", "emails in January")
+ 6. Only admin users can query email content via conversation interface
+**Plans**: TBD
+
+Plans:
+- [ ] 04-01: TBD
+
+## Progress
+
+**Execution Order:**
+Phases execute in numeric order: 1 → 2 → 3 → 4
+
+| Phase | Plans Complete | Status | Completed |
+|-------|----------------|--------|-----------|
+| 1. Foundation | 0/1 | Not started | - |
+| 2. Account Management | 0/1 | Not started | - |
+| 3. Email Ingestion | 0/1 | Not started | - |
+| 4. Query Tools | 0/1 | Not started | - |
diff --git a/.planning/STATE.md b/.planning/STATE.md
new file mode 100644
index 0000000..489f5e4
--- /dev/null
+++ b/.planning/STATE.md
@@ -0,0 +1,64 @@
+# Project State
+
+## Project Reference
+
+See: .planning/PROJECT.md (updated 2026-02-04)
+
+**Core value:** Personal information retrieval through natural conversation - ask about any aspect of your documented life (papers, finances, meals, emails) and get accurate, context-aware answers.
+**Current focus:** Phase 1 - Foundation
+
+## Current Position
+
+Phase: 1 of 4 (Foundation)
+Plan: Ready to plan
+Status: Ready to plan
+Last activity: 2026-02-07 — Roadmap created
+
+Progress: [░░░░░░░░░░] 0%
+
+## Performance Metrics
+
+**Velocity:**
+- Total plans completed: 0
+- Average duration: N/A
+- Total execution time: 0 hours
+
+**By Phase:**
+
+| Phase | Plans | Total | Avg/Plan |
+|-------|-------|-------|----------|
+| - | - | - | - |
+
+**Recent Trend:**
+- Last 5 plans: N/A
+- Trend: N/A
+
+*Updated after each plan completion*
+
+## Accumulated Context
+
+### Decisions
+
+Decisions are logged in PROJECT.md Key Decisions table.
+Recent decisions affecting current work:
+
+- IMAP only (no SMTP): User wants inbox analytics, not sending capabilities
+- Admin-only access: Email is privacy-sensitive, limit to trusted admins
+- 30-day retention: Balance utility with privacy/storage concerns
+- Scheduled sync: Reduces server load vs real-time polling
+- No attachment indexing: Complexity vs value, focus on text content first
+- ChromaDB for emails: Reuse existing vector store, no new infrastructure
+
+### Pending Todos
+
+None yet.
+
+### Blockers/Concerns
+
+None yet.
+
+## Session Continuity
+
+Last session: 2026-02-07
+Stopped at: Roadmap creation complete
+Resume file: None