Replace ChromaDB with pgvector #28

Merged
ryan merged 1 commit from refactor/chromadb-to-pgvector into main 2026-04-24 08:44:40 -04:00
Owner

Summary

  • Swap ChromaDB for pgvector, consolidating all storage onto PostgreSQL
  • Remove ChromaDB Docker volume, env vars, and ~1400 lines of legacy code
  • Lazy-init PGVector to avoid eager DB connections at import time
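
The lazy-init point can be illustrated with a minimal sketch. It is not the PR's actual code: `FakeVectorStore`, `get_vector_store`, and the connection string are hypothetical stand-ins, and the real accessor would presumably construct the PGVector store where the stand-in is built. The point is only that importing the module opens no connection; the store is created on first use and reused afterwards.

```python
from functools import lru_cache


class FakeVectorStore:
    """Hypothetical stand-in for the real vector store; counts constructions."""

    instances = 0

    def __init__(self, connection: str) -> None:
        # A real store would open (or prepare) a database connection here.
        FakeVectorStore.instances += 1
        self.connection = connection


@lru_cache(maxsize=1)
def get_vector_store() -> FakeVectorStore:
    """Build the store on first call and return the cached instance afterwards."""
    # Assumption: placeholder connection string, not taken from the PR.
    return FakeVectorStore("postgresql+psycopg://user:pass@localhost:5432/db")
```

Callers use `get_vector_store()` instead of a module-level global, so tests and tooling that only import the module never touch the database.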

After merge

Re-index documents, then remove the old volume:

  1. POST /api/rag/index
  2. POST /api/rag/index-obsidian
  3. docker volume rm the old chromadb_data volume
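
The two re-index calls above could be scripted roughly as follows. This is a sketch under assumptions: the base URL `http://localhost:8000` is hypothetical, and the endpoints are assumed to accept an empty POST body with no authentication, which the PR does not state.

```python
import urllib.request

# The two endpoint paths come from the steps above; everything else is assumed.
REINDEX_PATHS = ["/api/rag/index", "/api/rag/index-obsidian"]


def build_reindex_requests(base_url: str) -> list[urllib.request.Request]:
    """Create one POST request per re-index endpoint without sending anything."""
    base = base_url.rstrip("/")
    return [
        urllib.request.Request(base + path, data=b"", method="POST")
        for path in REINDEX_PATHS
    ]


def run_reindex(base_url: str = "http://localhost:8000") -> None:
    """Send the requests in order; urlopen raises on a non-2xx response."""
    for request in build_reindex_requests(base_url):
        with urllib.request.urlopen(request) as response:
            print(request.full_url, response.status)
```

Step 3 stays a manual `docker volume rm`; the actual volume name may carry a Compose project prefix, so it is worth checking `docker volume ls` first.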

Test plan

  • [x] All 71 unit tests pass
  • [ ] docker compose up --build starts cleanly
  • [ ] Index + query cycle works end-to-end
  • [ ] Conversation with simba_search returns results
ryan added 1 commit 2026-04-24 08:44:07 -04:00
Consolidate onto PostgreSQL by using pgvector instead of a separate
ChromaDB instance. This removes a Docker volume, a large dependency,
and simplifies the stack without meaningful performance impact at
our document scale.

- Swap langchain-chroma for langchain-postgres (PGVector)
- Use pgvector/pgvector:pg16 Docker image with init script
- Lazy-initialize vector store to avoid eager DB connections
- Add SQL helpers for stats/delete/list (replacing _collection access)
- Remove legacy main.py, chunker, petmd scraper, and /api/query endpoint
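
As a rough illustration of what SQL helpers for stats/delete/list might look like (not the code in this commit), the sketch below runs against an in-memory SQLite database so it is self-contained. The table and column names are assumed to mirror the default langchain-postgres layout (`langchain_pg_collection`, `langchain_pg_embedding`); the collection name `docs` and all function names are hypothetical, and against PostgreSQL the `?` placeholders would become the driver's own style (for example `%s` with psycopg).

```python
import sqlite3

# Simplified schema; assumed to mirror the default langchain-postgres tables.
SCHEMA = """
CREATE TABLE langchain_pg_collection (uuid TEXT PRIMARY KEY, name TEXT UNIQUE);
CREATE TABLE langchain_pg_embedding (
    id TEXT PRIMARY KEY,
    collection_id TEXT REFERENCES langchain_pg_collection(uuid),
    document TEXT
);
"""


def count_chunks(conn: sqlite3.Connection, collection: str) -> int:
    """Stats helper: number of stored chunks in one collection."""
    row = conn.execute(
        "SELECT COUNT(*) FROM langchain_pg_embedding e "
        "JOIN langchain_pg_collection c ON e.collection_id = c.uuid "
        "WHERE c.name = ?",
        (collection,),
    ).fetchone()
    return row[0]


def list_chunk_ids(conn: sqlite3.Connection, collection: str) -> list[str]:
    """List helper: chunk ids in one collection, sorted for stable output."""
    rows = conn.execute(
        "SELECT e.id FROM langchain_pg_embedding e "
        "JOIN langchain_pg_collection c ON e.collection_id = c.uuid "
        "WHERE c.name = ? ORDER BY e.id",
        (collection,),
    ).fetchall()
    return [r[0] for r in rows]


def delete_chunks(conn: sqlite3.Connection, collection: str) -> int:
    """Delete helper: remove every chunk in a collection, return rows deleted."""
    cur = conn.execute(
        "DELETE FROM langchain_pg_embedding WHERE collection_id IN "
        "(SELECT uuid FROM langchain_pg_collection WHERE name = ?)",
        (collection,),
    )
    conn.commit()
    return cur.rowcount


def demo_connection() -> sqlite3.Connection:
    """Build an in-memory database with one collection and two chunks."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO langchain_pg_collection VALUES ('c1', 'docs')")
    conn.executemany(
        "INSERT INTO langchain_pg_embedding VALUES (?, 'c1', ?)",
        [("a", "first chunk"), ("b", "second chunk")],
    )
    conn.commit()
    return conn
```

Querying the tables directly replaces reaching into a private `_collection` attribute, at the cost of coupling the helpers to the store's table layout.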

Re-index required after deploy (POST /api/rag/index + /index-obsidian).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
ryan merged commit 3b8fa3e7a0 into main 2026-04-24 08:44:40 -04:00

Reference: ryan/simbarag#28