docs: create roadmap (4 phases)

Phases:
1. Foundation: Database models and IMAP utilities
2. Account Management: Admin UI for email configuration (ACCT-01 to ACCT-07)
3. Email Ingestion: Sync engine and retention cleanup (SYNC-01 to SYNC-09, RETN-01 to RETN-05)
4. Query Tools: LangChain email analytics (QUERY-01 to QUERY-06)

All v1 requirements mapped to phases.
2026-02-07 13:18:57 -05:00
parent 8a8617887a
commit 38d7292df7
3 changed files with 188 additions and 4 deletions

@@ -82,13 +82,39 @@ Which phases cover which requirements. Updated during roadmap creation.
 | Requirement | Phase | Status |
 |-------------|-------|--------|
-| (To be populated by roadmap) | | |
+| ACCT-01 | Phase 2 | Pending |
+| ACCT-02 | Phase 2 | Pending |
+| ACCT-03 | Phase 2 | Pending |
+| ACCT-04 | Phase 2 | Pending |
+| ACCT-05 | Phase 2 | Pending |
+| ACCT-06 | Phase 2 | Pending |
+| ACCT-07 | Phase 2 | Pending |
+| SYNC-01 | Phase 3 | Pending |
+| SYNC-02 | Phase 3 | Pending |
+| SYNC-03 | Phase 3 | Pending |
+| SYNC-04 | Phase 3 | Pending |
+| SYNC-05 | Phase 3 | Pending |
+| SYNC-06 | Phase 3 | Pending |
+| SYNC-07 | Phase 3 | Pending |
+| SYNC-08 | Phase 3 | Pending |
+| SYNC-09 | Phase 3 | Pending |
+| RETN-01 | Phase 3 | Pending |
+| RETN-02 | Phase 3 | Pending |
+| RETN-03 | Phase 3 | Pending |
+| RETN-04 | Phase 3 | Pending |
+| RETN-05 | Phase 3 | Pending |
+| QUERY-01 | Phase 4 | Pending |
+| QUERY-02 | Phase 4 | Pending |
+| QUERY-03 | Phase 4 | Pending |
+| QUERY-04 | Phase 4 | Pending |
+| QUERY-05 | Phase 4 | Pending |
+| QUERY-06 | Phase 4 | Pending |
 **Coverage:**
 - v1 requirements: 25 total
-- Mapped to phases: 0
-- Unmapped: 25 ⚠️
+- Mapped to phases: 25
+- Unmapped: 0
 ---
 *Requirements defined: 2026-02-04*
-*Last updated: 2026-02-04 after initial definition*
+*Last updated: 2026-02-07 after roadmap creation*

.planning/ROADMAP.md Normal file

@@ -0,0 +1,94 @@
# Roadmap: SimbaRAG Email Integration
## Overview
Add IMAP email ingestion to SimbaRAG's existing document/finance/meal analytics capabilities. Admin users configure email accounts; the system syncs and embeds emails into ChromaDB on a schedule, automatically purges emails older than 30 days, and provides LangChain tools for inbox analytics through natural conversation.
## Phases
**Phase Numbering:**
- Integer phases (1, 2, 3): Planned milestone work
- Decimal phases (2.1, 2.2): Urgent insertions (marked with INSERTED)
Decimal phases appear between their surrounding integers in numeric order.
- [ ] **Phase 1: Foundation** - Database models and IMAP utilities
- [ ] **Phase 2: Account Management** - Admin UI for configuring email accounts
- [ ] **Phase 3: Email Ingestion** - Sync engine, embeddings, retention cleanup
- [ ] **Phase 4: Query Tools** - LangChain tools for email analytics
## Phase Details
### Phase 1: Foundation
**Goal**: Core infrastructure for email ingestion is in place
**Depends on**: Nothing (first phase)
**Requirements**: None (foundational infrastructure)
**Success Criteria** (what must be TRUE):
1. Database tables exist for email accounts, sync status, and email metadata
2. IMAP connection utility can authenticate and list folders from test server
3. Email body parser extracts text from both plain text and HTML formats
4. Encryption utility securely stores and retrieves IMAP credentials
**Plans**: TBD
Plans:
- [ ] 01-01: TBD
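As a rough illustration of success criterion 3, the body parser could be sketched with Python's standard library alone. This is a sketch under assumptions, not the planned implementation: the function name `extract_body_text` and the helper class `_TextExtractor` are hypothetical, and the plan for this phase is still TBD.

```python
from email import message_from_string, policy
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects visible text nodes, skipping <script> and <style> content."""

    def __init__(self) -> None:
        super().__init__()
        self._chunks: list[str] = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self._chunks.append(data.strip())

    def text(self) -> str:
        return " ".join(self._chunks)


def extract_body_text(raw_message: str) -> str:
    """Return the plain-text body, falling back to tag-stripped HTML.

    Hypothetical helper: prefers a text/plain part and only parses
    HTML when no plain part is available.
    """
    msg = message_from_string(raw_message, policy=policy.default)
    part = msg.get_body(preferencelist=("plain", "html"))
    if part is None:
        return ""
    content = part.get_content()
    if part.get_content_type() == "text/html":
        extractor = _TextExtractor()
        extractor.feed(content)
        extractor.close()
        return extractor.text()
    return content.strip()
```

Preferring the plain part keeps the embedded text free of markup when a sender provides both alternatives; the HTML path is only a fallback.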
### Phase 2: Account Management
**Goal**: Admin users can configure and manage IMAP email accounts
**Depends on**: Phase 1
**Requirements**: ACCT-01, ACCT-02, ACCT-03, ACCT-04, ACCT-05, ACCT-06, ACCT-07
**Success Criteria** (what must be TRUE):
1. Admin can add new IMAP account with host, port, username, password, folder selection
2. Admin can test IMAP connection and see success/failure before saving
3. Admin can view list of configured accounts with masked credentials
4. Admin can edit existing account configuration and delete accounts
5. Only users in lldap_admin group can access email account endpoints
**Plans**: TBD
Plans:
- [ ] 02-01: TBD
### Phase 3: Email Ingestion
**Goal**: System automatically syncs emails, creates embeddings, and purges old content
**Depends on**: Phase 2
**Requirements**: SYNC-01, SYNC-02, SYNC-03, SYNC-04, SYNC-05, SYNC-06, SYNC-07, SYNC-08, SYNC-09, RETN-01, RETN-02, RETN-03, RETN-04, RETN-05
**Success Criteria** (what must be TRUE):
1. System connects to configured IMAP accounts and fetches messages from selected folders
2. System parses email metadata (subject, sender, date) and extracts body text from plain/HTML
3. System generates embeddings and stores emails in ChromaDB with metadata
4. System performs scheduled sync at configurable intervals (default hourly)
5. System tracks last sync timestamp and performs incremental sync (only new emails)
6. System automatically purges emails older than retention period (default 30 days)
7. Admin can view sync logs showing success/failure, counts, and errors
**Plans**: TBD
Plans:
- [ ] 03-01: TBD
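Criteria 5 and 6 both reduce to date comparisons, which can be sketched as two small pure functions. The names `imap_since_criterion` and `expired_ids` are hypothetical, as is the assumption that received timestamps are kept as timezone-aware datetimes keyed by message ID. Note that IMAP's `SINCE` search key compares dates only and ignores the time of day, so a sync built on it would still need to skip messages it has already stored.

```python
from datetime import datetime, timedelta, timezone

_MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
           "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def imap_since_criterion(last_sync: datetime) -> str:
    """Build an IMAP SEARCH criterion from the last sync timestamp.

    Month names are hard-coded because strftime("%b") depends on locale.
    """
    d = last_sync.astimezone(timezone.utc)
    return f"SINCE {d.day:02d}-{_MONTHS[d.month - 1]}-{d.year}"


def expired_ids(received: dict[str, datetime], now: datetime,
                retention_days: int = 30) -> list[str]:
    """Return IDs of messages strictly older than the retention window."""
    cutoff = now - timedelta(days=retention_days)
    return sorted(mid for mid, ts in received.items() if ts < cutoff)


if __name__ == "__main__":
    now = datetime(2026, 2, 7, 12, 0, tzinfo=timezone.utc)
    print(imap_since_criterion(now))
    print(expired_ids({"old": now - timedelta(days=31),
                       "new": now - timedelta(days=1)}, now))
```

The returned IDs would then be passed to whatever delete operation the vector store and metadata tables expose; that step is left out here because the storage layer is not yet specified.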
### Phase 4: Query Tools
**Goal**: Admin users can query email content through conversational interface
**Depends on**: Phase 3
**Requirements**: QUERY-01, QUERY-02, QUERY-03, QUERY-04, QUERY-05, QUERY-06
**Success Criteria** (what must be TRUE):
1. LangChain agent has tool to search emails by content, sender, or date range
2. Agent can identify most frequent senders in a timeframe
3. Agent can analyze subject lines and identify common topics
4. Agent can detect subscription/newsletter patterns (recurring senders, unsubscribe links)
5. Agent can answer time-based queries ("emails this week", "emails in January")
6. Only admin users can query email content via conversation interface
**Plans**: TBD
Plans:
- [ ] 04-01: TBD
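The aggregation behind criterion 2 (most frequent senders in a timeframe) might look like the sketch below. It shows only the counting logic, not a LangChain tool wrapper. The function name `top_senders` and the `sender` and `date` metadata keys are assumptions, since the email metadata schema is not defined in this roadmap.

```python
from collections import Counter
from datetime import datetime
from email.utils import parseaddr


def top_senders(emails: list[dict], start: datetime, end: datetime,
                limit: int = 5) -> list[tuple[str, int]]:
    """Count messages per sender address within [start, end).

    Display names are dropped and addresses lowercased so that
    "Alice <alice@example.com>" and "alice@example.com" count together.
    """
    counts = Counter(
        parseaddr(e["sender"])[1].lower()
        for e in emails
        if start <= e["date"] < end
    )
    return counts.most_common(limit)


if __name__ == "__main__":
    sample = [
        {"sender": "Alice <alice@example.com>", "date": datetime(2026, 1, 5)},
        {"sender": "alice@example.com", "date": datetime(2026, 1, 9)},
        {"sender": "Bob <bob@example.com>", "date": datetime(2026, 1, 20)},
    ]
    print(top_senders(sample, datetime(2026, 1, 1), datetime(2026, 2, 1)))
```

Using a half-open interval makes month-style queries ("emails in January") straightforward: pass the first day of the month and the first day of the next.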
## Progress
**Execution Order:**
Phases execute in numeric order: 1 → 2 → 3 → 4
| Phase | Plans Complete | Status | Completed |
|-------|----------------|--------|-----------|
| 1. Foundation | 0/1 | Not started | - |
| 2. Account Management | 0/1 | Not started | - |
| 3. Email Ingestion | 0/1 | Not started | - |
| 4. Query Tools | 0/1 | Not started | - |

.planning/STATE.md Normal file

@@ -0,0 +1,64 @@
# Project State
## Project Reference
See: .planning/PROJECT.md (updated 2026-02-04)
**Core value:** Personal information retrieval through natural conversation - ask about any aspect of your documented life (papers, finances, meals, emails) and get accurate, context-aware answers.
**Current focus:** Phase 1 - Foundation
## Current Position
Phase: 1 of 4 (Foundation)
Plan: Ready to plan
Status: Ready to plan
Last activity: 2026-02-07 — Roadmap created
Progress: [░░░░░░░░░░] 0%
## Performance Metrics
**Velocity:**
- Total plans completed: 0
- Average duration: N/A
- Total execution time: 0 hours
**By Phase:**
| Phase | Plans | Total | Avg/Plan |
|-------|-------|-------|----------|
| - | - | - | - |
**Recent Trend:**
- Last 5 plans: N/A
- Trend: N/A
*Updated after each plan completion*
## Accumulated Context
### Decisions
Decisions are logged in PROJECT.md Key Decisions table.
Recent decisions affecting current work:
- IMAP only (no SMTP): User wants inbox analytics, not sending capabilities
- Admin-only access: Email is privacy-sensitive, limit to trusted admins
- 30-day retention: Balance utility with privacy/storage concerns
- Scheduled sync: Reduces server load vs real-time polling
- No attachment indexing: Complexity vs value, focus on text content first
- ChromaDB for emails: Reuse existing vector store, no new infrastructure
### Pending Todos
None yet.
### Blockers/Concerns
None yet.
## Session Continuity
Last session: 2026-02-07
Stopped at: Roadmap creation complete
Resume file: None