Consolidate onto PostgreSQL by using pgvector instead of a separate
ChromaDB instance. This removes a Docker volume, a large dependency,
and simplifies the stack without meaningful performance impact at
our document scale.
- Swap langchain-chroma for langchain-postgres (PGVector)
- Use pgvector/pgvector:pg16 Docker image with init script
- Lazy-initialize vector store to avoid eager DB connections
- Add SQL helpers for stats/delete/list (replacing _collection access)
- Remove legacy main.py, chunker, petmd scraper, and /api/query endpoint
Re-index required after deploy (POST /api/rag/index + /index-obsidian).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Give the LangChain agent a save_user_memory tool so users can ask it to
remember preferences and personal facts. Memories are stored per-user in
a new user_memories table and injected into the system prompt on each
conversation turn.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Update llm.py to use OpenAI client with custom base_url for llama-server
- Update agents.py to use ChatOpenAI instead of ChatOllama
- Remove unused ollama imports from main.py, chunker.py, query.py
- Add LLAMA_SERVER_URL and LLAMA_MODEL_NAME env vars
- Remove ollama and langchain-ollama dependencies
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>