# CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
## Project Overview
SimbaRAG is a RAG (Retrieval-Augmented Generation) conversational AI system for querying information about Simba (a cat). It ingests documents from Paperless-NGX, stores embeddings in ChromaDB, and uses LLMs (Ollama or OpenAI) to answer questions.
## Commands

### Development

```bash
# Start dev environment with hot reload
docker compose -f docker-compose.dev.yml up --build

# View logs
docker compose -f docker-compose.dev.yml logs -f raggr
```
### Database Migrations (Aerich/Tortoise ORM)

```bash
# Generate migration (must run in Docker with DB access)
docker compose -f docker-compose.dev.yml exec raggr aerich migrate --name describe_change

# Apply migrations (auto-runs on startup, manual if needed)
docker compose -f docker-compose.dev.yml exec raggr aerich upgrade

# View migration history
docker compose exec raggr aerich history
```
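Aerich reads its settings from a `TORTOISE_ORM` dict (here `aerich_config.py`). The sketch below is only an illustration of what that config might contain; the model module paths are assumptions inferred from the blueprint layout described under Architecture, not the repository's actual values.

```python
# Hypothetical sketch of aerich_config.py; module paths are assumptions.
import os

TORTOISE_ORM = {
    "connections": {"default": os.environ["DATABASE_URL"]},
    "apps": {
        "models": {
            # "aerich.models" must be listed so Aerich can track migration state
            "models": [
                "blueprints.users.models",
                "blueprints.conversation.models",
                "aerich.models",
            ],
            "default_connection": "default",
        }
    },
}
```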
### Frontend

```bash
cd raggr-frontend
yarn install
yarn build   # Production build
yarn dev     # Dev server (rarely needed, backend serves frontend)
```
### Production

```bash
docker compose build raggr
docker compose up -d
```
## Architecture

```
┌─────────────────────────────────────────────────────────────┐
│                        Docker Compose                        │
├─────────────────────────────────────────────────────────────┤
│ raggr (port 8080)              │ postgres (port 5432)        │
│ ├── Quart backend              │ PostgreSQL 16               │
│ ├── React frontend (served)    │                             │
│ └── ChromaDB (volume)          │                             │
└─────────────────────────────────────────────────────────────┘
```
**Backend** (root directory):

- `app.py` - Quart application entry, serves API and static frontend
- `main.py` - RAG logic, document indexing, LLM interaction, LangChain agent
- `llm.py` - LLM client with Ollama primary, OpenAI fallback (see the sketch after this list)
- `aerich_config.py` - Database migration configuration
- `blueprints/` - API routes organized as Quart blueprints
  - `users/` - OIDC auth, JWT tokens, RBAC with LDAP groups
  - `conversation/` - Chat conversations and message history
  - `rag/` - Document indexing endpoints (admin-only)
- `config/` - Configuration modules
  - `oidc_config.py` - OIDC authentication configuration
- `utils/` - Reusable utilities
  - `chunker.py` - Document chunking for embeddings
  - `cleaner.py` - PDF cleaning and summarization
  - `image_process.py` - Image description with LLM
  - `request.py` - Paperless-NGX API client
- `scripts/` - Administrative and utility scripts
  - `add_user.py` - Create users manually
  - `user_message_stats.py` - User message statistics
  - `manage_vectorstore.py` - Vector store management CLI
  - `inspect_vector_store.py` - Inspect ChromaDB contents
  - `query.py` - Query generation utilities
- `migrations/` - Database migration files
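As a rough illustration of the Ollama-primary / OpenAI-fallback behaviour attributed to `llm.py`, a client might look like the sketch below. The `langchain-ollama` / `langchain-openai` classes, the reachability check, and the model names are assumptions for illustration; the actual implementation may be structured differently.

```python
# Illustrative sketch only; class choices and model names are assumptions.
import os

from langchain_ollama import ChatOllama
from langchain_openai import ChatOpenAI


def get_llm():
    """Prefer the local Ollama server, falling back to OpenAI if it is unreachable."""
    ollama_url = os.environ.get("OLLAMA_URL")
    if ollama_url:
        try:
            llm = ChatOllama(base_url=ollama_url, model="llama3.1")  # placeholder model name
            llm.invoke("ping")  # cheap reachability check
            return llm
        except Exception:
            pass  # Ollama unreachable: fall through to the hosted fallback
    # ChatOpenAI reads OPENAI_API_KEY from the environment
    return ChatOpenAI(model="gpt-4o-mini")  # placeholder model name
```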
**Frontend** (`raggr-frontend/`):

- React 19 with Rsbuild bundler
- Tailwind CSS for styling
- Built to `dist/`, served by backend at `/`
**Auth Flow:** LLDAP → Authelia (OIDC) → Backend JWT → Frontend localStorage
## Key Patterns

- All endpoints are async (`async def`)
- Use `@jwt_refresh_token_required` for authenticated endpoints
- Use `@admin_required` for admin-only endpoints (checks `lldap_admin` group)
- Tortoise ORM models in `blueprints/*/models.py`
- Frontend API services in `raggr-frontend/src/api/`
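A minimal sketch of how these patterns combine in a blueprint. The blueprint name, route paths, and the decorator import location are assumptions made for illustration; check the actual `blueprints/` modules for the real definitions.

```python
# Illustrative only: import path, blueprint name, and routes are assumptions.
from quart import Blueprint, jsonify

from blueprints.users.decorators import admin_required, jwt_refresh_token_required  # assumed module

conversation = Blueprint("conversation", __name__, url_prefix="/api/conversations")


@conversation.route("/", methods=["GET"])
@jwt_refresh_token_required  # verifies the JWT before the handler runs
async def list_conversations():
    # All endpoints are async (async def)
    return jsonify({"conversations": []})


@conversation.route("/admin/reindex", methods=["POST"])
@admin_required  # only members of the lldap_admin group reach this handler
async def reindex():
    return jsonify({"status": "queued"})
```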
## Environment Variables

See `.env.example`. Key ones:

- `DATABASE_URL` - PostgreSQL connection
- `OIDC_*` - Authelia OIDC configuration
- `OLLAMA_URL` - Local LLM server
- `OPENAI_API_KEY` - Fallback LLM
- `PAPERLESS_TOKEN` / `BASE_URL` - Document source
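These are plain environment variables and can be read as in the sketch below; which module actually consumes each one is not specified here, so treat this purely as orientation.

```python
# Illustrative only; the real config modules may structure this differently.
import os

DATABASE_URL = os.environ["DATABASE_URL"]              # PostgreSQL connection (required)
OLLAMA_URL = os.environ.get("OLLAMA_URL")              # local LLM server (primary)
OPENAI_API_KEY = os.environ.get("OPENAI_API_KEY")      # fallback LLM
PAPERLESS_TOKEN = os.environ.get("PAPERLESS_TOKEN")    # Paperless-NGX API token
```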