simbarag/.planning/ROADMAP.md
Ryan Chen 38d7292df7 docs: create roadmap (4 phases)
Phases:
1. Foundation: Database models and IMAP utilities
2. Account Management: Admin UI for email configuration (ACCT-01 to ACCT-07)
3. Email Ingestion: Sync engine and retention cleanup (SYNC-01 to SYNC-09, RETN-01 to RETN-05)
4. Query Tools: LangChain email analytics (QUERY-01 to QUERY-06)

All v1 requirements mapped to phases.
2026-02-07 13:18:57 -05:00

Roadmap: SimbaRAG Email Integration

Overview

Add IMAP email ingestion to SimbaRAG's existing document/finance/meal analytics capabilities. Admin users can configure email accounts. The system syncs and embeds emails into ChromaDB on a schedule, automatically purges emails older than the retention period (30 days by default), and provides LangChain tools for inbox analytics through natural conversation.

Phases

Phase Numbering:

  • Integer phases (1, 2, 3): Planned milestone work
  • Decimal phases (2.1, 2.2): Urgent insertions (marked with INSERTED)

Decimal phases appear between their surrounding integers in numeric order.

  • Phase 1: Foundation - Database models and IMAP utilities
  • Phase 2: Account Management - Admin UI for configuring email accounts
  • Phase 3: Email Ingestion - Sync engine, embeddings, retention cleanup
  • Phase 4: Query Tools - LangChain tools for email analytics

Phase Details

Phase 1: Foundation

Goal: Core infrastructure for email ingestion is in place
Depends on: Nothing (first phase)
Requirements: None (foundational infrastructure)
Success Criteria (what must be TRUE):

  1. Database tables exist for email accounts, sync status, and email metadata
  2. IMAP connection utility can authenticate and list folders from test server
  3. Email body parser extracts text from both plain text and HTML formats
  4. Encryption utility securely stores and retrieves IMAP credentials

Plans: TBD

Plans:

  • 01-01: TBD
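
Success criterion 3 for Phase 1 (extracting body text from both plain text and HTML) could be sketched as follows. This is a minimal illustration using only the Python standard library, not the project's implementation. The names `extract_body`, `html_to_text`, and `_TextExtractor` are hypothetical, and the choice to prefer the text/plain part over text/html is an assumption.

```python
from email import message_from_string
from email.message import Message
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects text nodes, skipping the contents of <script> and <style>."""

    def __init__(self) -> None:
        super().__init__()
        self._chunks: list[str] = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self._chunks.append(data.strip())

    def text(self) -> str:
        return " ".join(self._chunks)


def html_to_text(html: str) -> str:
    """Reduce an HTML fragment to its visible text (hypothetical helper)."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


def extract_body(msg: Message) -> str:
    """Return body text, preferring text/plain and falling back to text/html."""
    plain, html = None, None
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers carry no body text themselves
        if part.get_content_disposition() == "attachment":
            continue  # assumption: attachments are not indexed
        payload = part.get_payload(decode=True)
        if payload is None:
            continue
        charset = part.get_content_charset() or "utf-8"
        text = payload.decode(charset, errors="replace")
        ctype = part.get_content_type()
        if ctype == "text/plain" and plain is None:
            plain = text
        elif ctype == "text/html" and html is None:
            html = text
    if plain is not None:
        return plain.strip()
    if html is not None:
        return html_to_text(html)
    return ""
```

A message fetched over IMAP would typically be parsed with `email.message_from_bytes` and then passed to `extract_body`. Messages with neither a text/plain nor a text/html part yield an empty string in this sketch.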

Phase 2: Account Management

Goal: Admin users can configure and manage IMAP email accounts
Depends on: Phase 1
Requirements: ACCT-01, ACCT-02, ACCT-03, ACCT-04, ACCT-05, ACCT-06, ACCT-07
Success Criteria (what must be TRUE):

  1. Admin can add new IMAP account with host, port, username, password, folder selection
  2. Admin can test IMAP connection and see success/failure before saving
  3. Admin can view list of configured accounts with masked credentials
  4. Admin can edit existing account configuration and delete accounts
  5. Only users in lldap_admin group can access email account endpoints

Plans: TBD

Plans:

  • 02-01: TBD
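
Success criterion 3 for Phase 2 (listing accounts with masked credentials) might look something like the sketch below. The field names, the `mask_username` and `account_summary` helpers, and the masking format are all assumptions for illustration. The roadmap does not specify how masking should work.

```python
def mask_username(username: str) -> str:
    """Keep the first character and the domain; hide the rest of the local part."""
    local, sep, domain = username.partition("@")
    if not local:
        return username
    return local[0] + "*" * (len(local) - 1) + sep + domain


def account_summary(account: dict) -> dict:
    """Build a list-view record that never includes the stored password."""
    return {
        "host": account["host"],
        "port": account["port"],
        "username": mask_username(account["username"]),
        # Fixed-width placeholder so the real password length is not leaked.
        "password": "********",
    }
```

Returning a fixed placeholder instead of a partially revealed password keeps the list view from exposing anything about the stored secret.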

Phase 3: Email Ingestion

Goal: System automatically syncs emails, creates embeddings, and purges old content
Depends on: Phase 2
Requirements: SYNC-01, SYNC-02, SYNC-03, SYNC-04, SYNC-05, SYNC-06, SYNC-07, SYNC-08, SYNC-09, RETN-01, RETN-02, RETN-03, RETN-04, RETN-05
Success Criteria (what must be TRUE):

  1. System connects to configured IMAP accounts and fetches messages from selected folders
  2. System parses email metadata (subject, sender, date) and extracts body text from plain/HTML
  3. System generates embeddings and stores emails in ChromaDB with metadata
  4. System performs scheduled sync at configurable intervals (default hourly)
  5. System tracks last sync timestamp and performs incremental sync (only new emails)
  6. System automatically purges emails older than retention period (default 30 days)
  7. Admin can view sync logs showing success/failure, counts, and errors

Plans: TBD

Plans:

  • 03-01: TBD
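
Success criteria 5 and 6 for Phase 3 (incremental sync from the last sync timestamp, and purging past the retention period) reduce to two date comparisons. The sketch below shows that selection logic in isolation, with no IMAP or ChromaDB calls. `EmailRecord`, `select_new`, and `select_expired` are hypothetical names, and a production sync might track IMAP UIDs rather than timestamps alone.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Iterable, Optional


@dataclass(frozen=True)
class EmailRecord:
    """Hypothetical minimal view of a stored email."""
    message_id: str
    received_at: datetime


def select_new(records: Iterable[EmailRecord],
               last_sync: Optional[datetime]) -> list[EmailRecord]:
    """Incremental sync: keep only emails received after the last sync."""
    if last_sync is None:
        return list(records)  # first run: everything is new
    return [r for r in records if r.received_at > last_sync]


def select_expired(records: Iterable[EmailRecord],
                   now: datetime,
                   retention_days: int = 30) -> list[EmailRecord]:
    """Retention cleanup: emails older than the cutoff are due for purging."""
    cutoff = now - timedelta(days=retention_days)
    return [r for r in records if r.received_at < cutoff]


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    stored = [
        EmailRecord("old", now - timedelta(days=45)),
        EmailRecord("recent", now - timedelta(days=2)),
    ]
    print([r.message_id for r in select_expired(stored, now)])
```

Passing `now` explicitly rather than reading the clock inside the function keeps the cleanup logic deterministic and easy to test.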

Phase 4: Query Tools

Goal: Admin users can query email content through conversational interface
Depends on: Phase 3
Requirements: QUERY-01, QUERY-02, QUERY-03, QUERY-04, QUERY-05, QUERY-06
Success Criteria (what must be TRUE):

  1. LangChain agent has tool to search emails by content, sender, or date range
  2. Agent can identify most frequent senders in a timeframe
  3. Agent can analyze subject lines and identify common topics
  4. Agent can detect subscription/newsletter patterns (recurring senders, unsubscribe links)
  5. Agent can answer time-based queries ("emails this week", "emails in January")
  6. Only admin users can query email content via conversation interface

Plans: TBD

Plans:

  • 04-01: TBD
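
Success criterion 2 for Phase 4 (most frequent senders in a timeframe) can be illustrated with a plain function that an agent tool might wrap. This sketch deliberately avoids any LangChain or ChromaDB API. The `top_senders` name and the `sender`/`date` metadata keys are assumptions, not the project's schema.

```python
from collections import Counter
from datetime import datetime
from email.utils import parseaddr


def top_senders(emails: list[dict], start: datetime, end: datetime,
                limit: int = 5) -> list[tuple[str, int]]:
    """Count emails per sender address within the half-open window [start, end)."""
    counts = Counter(
        # parseaddr splits "Name <addr>" into (name, addr); compare by address only.
        parseaddr(e["sender"])[1].lower()
        for e in emails
        if start <= e["date"] < end
    )
    return counts.most_common(limit)
```

Normalizing to the lowercased address means "Alice <alice@example.com>" and "ALICE@example.com" count as the same sender. A time-based query such as "emails in January" maps onto the same `start`/`end` window.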

Progress

Execution Order: Phases execute in numeric order: 1 → 2 → 3 → 4

| Phase | Plans Complete | Status | Completed |
|-------|----------------|--------|-----------|
| 1. Foundation | 0/1 | Not started | - |
| 2. Account Management | 0/1 | Not started | - |
| 3. Email Ingestion | 0/1 | Not started | - |
| 4. Query Tools | 0/1 | Not started | - |