Roadmap: SimbaRAG Email Integration
Overview
Add IMAP email ingestion to SimbaRAG's existing document/finance/meal analytics capabilities. Admin users can configure email accounts; the system syncs and embeds emails into ChromaDB on a schedule, automatically purges emails older than 30 days, and provides LangChain tools for inbox analytics through natural conversation.
Phases
Phase Numbering:
- Integer phases (1, 2, 3): Planned milestone work
- Decimal phases (2.1, 2.2): Urgent insertions (marked with INSERTED)
Decimal phases appear between their surrounding integers in numeric order.
- Phase 1: Foundation - Database models and IMAP utilities
- Phase 2: Account Management - Admin UI for configuring email accounts
- Phase 3: Email Ingestion - Sync engine, embeddings, retention cleanup
- Phase 4: Query Tools - LangChain tools for email analytics
Phase Details
Phase 1: Foundation
Goal: Core infrastructure for email ingestion is in place
Depends on: Nothing (first phase)
Requirements: None (foundational infrastructure)
Success Criteria (what must be TRUE):
- Database tables exist for email accounts, sync status, and email metadata
- IMAP connection utility can authenticate and list folders from test server
- Email body parser extracts text from both plain text and HTML formats
- Encryption utility securely stores and retrieves IMAP credentials

Plans: TBD
Plans:
- 01-01: TBD
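
The body-parser criterion above could be approached with the standard library alone. The following is a minimal sketch, not the project's implementation: the names `extract_body` and `html_to_text` are hypothetical, and the policy of preferring the plain-text part over the HTML part is an assumption.

```python
from email import message_from_string
from email.message import Message
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects visible text, skipping <script> and <style> content."""

    def __init__(self) -> None:
        super().__init__()
        self._chunks: list[str] = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self._chunks.append(data.strip())

    def text(self) -> str:
        return " ".join(self._chunks)


def html_to_text(html: str) -> str:
    """Reduce an HTML fragment to whitespace-joined visible text."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


def extract_body(msg: Message) -> str:
    """Return the text/plain body if present, else text converted from text/html."""
    plain, html = None, None
    for part in msg.walk():
        if part.get_content_maintype() == "multipart":
            continue  # containers carry no body text themselves
        if part.get_filename():
            continue  # assumption: parts with a filename are attachments
        ctype = part.get_content_type()
        if ctype not in ("text/plain", "text/html"):
            continue
        payload = part.get_payload(decode=True)
        if payload is None:
            continue
        charset = part.get_content_charset() or "utf-8"
        text = payload.decode(charset, errors="replace")
        if ctype == "text/plain" and plain is None:
            plain = text
        elif ctype == "text/html" and html is None:
            html = text
    if plain is not None and plain.strip():
        return plain.strip()
    if html is not None:
        return html_to_text(html)
    return ""


if __name__ == "__main__":
    raw = "Content-Type: text/html; charset=utf-8\n\n<p>Hello <b>world</b></p>"
    print(extract_body(message_from_string(raw)))
```

Real mail often arrives as bytes from the IMAP server, in which case `email.message_from_bytes` would be the entry point instead of `message_from_string`.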
Phase 2: Account Management
Goal: Admin users can configure and manage IMAP email accounts
Depends on: Phase 1
Requirements: ACCT-01, ACCT-02, ACCT-03, ACCT-04, ACCT-05, ACCT-06, ACCT-07
Success Criteria (what must be TRUE):
- Admin can add new IMAP account with host, port, username, password, folder selection
- Admin can test IMAP connection and see success/failure before saving
- Admin can view list of configured accounts with masked credentials
- Admin can edit existing account configuration and delete accounts
- Only users in lldap_admin group can access email account endpoints

Plans: TBD
Plans:
- 02-01: TBD
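
As an illustration of the "masked credentials" criterion above, here is one possible shape for the account-list response. Everything here is an assumption made for the sketch: the `EmailAccount` fields mirror the form fields listed in the criteria, and `mask_username` and `to_public_dict` are hypothetical helper names. The password is replaced by a fixed-width placeholder so that its length is not revealed.

```python
from dataclasses import dataclass

PASSWORD_PLACEHOLDER = "********"  # fixed width, independent of real length


@dataclass
class EmailAccount:
    """Hypothetical in-memory view of a configured IMAP account."""
    host: str
    port: int
    username: str
    password: str
    folder: str = "INBOX"


def mask_username(username: str) -> str:
    """Keep the first character and any domain; hide the rest of the local part."""
    local, sep, domain = username.partition("@")
    if not local:
        return "****" + sep + domain
    return local[0] + "****" + sep + domain


def to_public_dict(account: EmailAccount) -> dict:
    """Build the representation that is safe to return from a list endpoint."""
    return {
        "host": account.host,
        "port": account.port,
        "username": mask_username(account.username),
        "password": PASSWORD_PLACEHOLDER,
        "folder": account.folder,
    }
```

Masking at serialization time, rather than in the UI, means the real secret never leaves the server in a list response.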
Phase 3: Email Ingestion
Goal: System automatically syncs emails, creates embeddings, and purges old content
Depends on: Phase 2
Requirements: SYNC-01, SYNC-02, SYNC-03, SYNC-04, SYNC-05, SYNC-06, SYNC-07, SYNC-08, SYNC-09, RETN-01, RETN-02, RETN-03, RETN-04, RETN-05
Success Criteria (what must be TRUE):
- System connects to configured IMAP accounts and fetches messages from selected folders
- System parses email metadata (subject, sender, date) and extracts body text from plain/HTML
- System generates embeddings and stores emails in ChromaDB with metadata
- System performs scheduled sync at configurable intervals (default hourly)
- System tracks last sync timestamp and performs incremental sync (only new emails)
- System automatically purges emails older than retention period (default 30 days)
- Admin can view sync logs showing success/failure, counts, and errors

Plans: TBD
Plans:
- 03-01: TBD
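
The incremental-sync and retention criteria above both reduce to timestamp comparisons. The sketch below shows that selection logic only, under stated assumptions: `StoredEmail`, `new_since`, and `expired_ids` are hypothetical names, timestamps are assumed to be timezone-aware UTC, and the actual IMAP fetch and ChromaDB deletion are left out.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Iterable, Optional


@dataclass(frozen=True)
class StoredEmail:
    """Hypothetical metadata record for one ingested email."""
    doc_id: str
    received_at: datetime  # assumed timezone-aware, UTC


def new_since(emails: Iterable[StoredEmail],
              last_sync: Optional[datetime]) -> list[StoredEmail]:
    """Incremental sync: keep only emails received after the last sync.

    A first sync (last_sync is None) keeps everything.
    """
    if last_sync is None:
        return list(emails)
    return [e for e in emails if e.received_at > last_sync]


def expired_ids(emails: Iterable[StoredEmail],
                now: datetime,
                retention_days: int = 30) -> list[str]:
    """Retention: ids of emails older than the retention period (default 30 days)."""
    cutoff = now - timedelta(days=retention_days)
    return [e.doc_id for e in emails if e.received_at < cutoff]


if __name__ == "__main__":
    now = datetime(2025, 3, 1, tzinfo=timezone.utc)
    inbox = [
        StoredEmail("a", datetime(2025, 1, 15, tzinfo=timezone.utc)),
        StoredEmail("b", datetime(2025, 2, 20, tzinfo=timezone.utc)),
    ]
    print(expired_ids(inbox, now))  # ids whose received_at is before 2025-01-30
```

Passing `now` in as a parameter, rather than calling `datetime.now()` inside the function, keeps the cutoff logic easy to test against fixed dates.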
Phase 4: Query Tools
Goal: Admin users can query email content through conversational interface
Depends on: Phase 3
Requirements: QUERY-01, QUERY-02, QUERY-03, QUERY-04, QUERY-05, QUERY-06
Success Criteria (what must be TRUE):
- LangChain agent has tool to search emails by content, sender, or date range
- Agent can identify most frequent senders in a timeframe
- Agent can analyze subject lines and identify common topics
- Agent can detect subscription/newsletter patterns (recurring senders, unsubscribe links)
- Agent can answer time-based queries ("emails this week", "emails in January")
- Only admin users can query email content via conversation interface

Plans: TBD
Plans:
- 04-01: TBD
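
To make the "most frequent senders in a timeframe" criterion above concrete, here is a small sketch of the aggregation such a tool might run. It is a plain function with no LangChain dependency; the name `top_senders` and the metadata keys `sender` and `date` are assumptions, and wrapping the function as an agent tool is not shown.

```python
from collections import Counter
from datetime import datetime, timezone


def top_senders(emails: list[dict],
                start: datetime,
                end: datetime,
                limit: int = 5) -> list[tuple[str, int]]:
    """Count emails per sender within [start, end) and return the most frequent.

    Each email is a metadata dict with hypothetical keys "sender" (str)
    and "date" (timezone-aware datetime). Sender addresses are lowercased
    so that case differences do not split the counts.
    """
    counts = Counter(
        e["sender"].lower()
        for e in emails
        if start <= e["date"] < end
    )
    return counts.most_common(limit)


if __name__ == "__main__":
    utc = timezone.utc
    sample = [
        {"sender": "News@example.com", "date": datetime(2025, 1, 3, tzinfo=utc)},
        {"sender": "news@example.com", "date": datetime(2025, 1, 10, tzinfo=utc)},
        {"sender": "bob@example.com", "date": datetime(2025, 1, 12, tzinfo=utc)},
    ]
    january = (datetime(2025, 1, 1, tzinfo=utc), datetime(2025, 2, 1, tzinfo=utc))
    print(top_senders(sample, *january))
```

The half-open interval `[start, end)` lets a query such as "emails in January" be expressed as January 1 to February 1 without double-counting messages on the boundary.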
Progress
Execution Order: Phases execute in numeric order: 1 → 2 → 3 → 4
| Phase | Plans Complete | Status | Completed |
|---|---|---|---|
| 1. Foundation | 0/1 | Not started | - |
| 2. Account Management | 0/1 | Not started | - |
| 3. Email Ingestion | 0/1 | Not started | - |
| 4. Query Tools | 0/1 | Not started | - |