
# SimbaRAG Documentation

Welcome to the SimbaRAG documentation! This guide will help you understand, develop, and deploy the SimbaRAG conversational AI system.

## Getting Started

New to SimbaRAG? Start here:

1. Read the main README for the project overview and architecture
2. Follow the Development Guide to set up your environment
3. Learn about authentication setup with OIDC and LDAP in the Authentication Guide

## Documentation Structure

### Core Guides

### Quick Reference

| Task | Documentation |
| --- | --- |
| Set up local dev environment | Development Guide → Quick Start |
| Run database migrations | Deployment Guide → Migration Workflow |
| Index documents | Vector Store Guide → Management Commands |
| Configure authentication | Authentication Guide |
| Run administrative scripts | Development Guide → Scripts |

## Common Tasks

### Development

```shell
# Start the local development stack
docker compose -f docker-compose.dev.yml up -d
export DATABASE_URL="postgres://raggr:raggr_dev_password@localhost:5432/raggr"
export CHROMADB_PATH="./chromadb"
python app.py
```

### Database Migrations

```shell
# Generate a migration
aerich migrate --name "your_change"

# Apply pending migrations
aerich upgrade

# View migration history
aerich history
```
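The aerich commands above require a `[tool.aerich]` section in `pyproject.toml` pointing at the Tortoise ORM settings. A typical entry looks like the following; the dotted path `config.TORTOISE_ORM` is an assumption about where this project keeps its ORM config, not a confirmed value:

```toml
[tool.aerich]
tortoise_orm = "config.TORTOISE_ORM"  # dotted path to the TORTOISE_ORM dict (assumed)
location = "./migrations"             # matches the migrations/ directory in this repo
src_folder = "./."
```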

### Vector Store Management

```shell
# Show statistics
python scripts/manage_vectorstore.py stats

# Index new documents
python scripts/manage_vectorstore.py index

# Reindex everything
python scripts/manage_vectorstore.py reindex
```

## Architecture Overview

SimbaRAG is built with:

- **Backend:** Quart (async Python), LangChain, Tortoise ORM
- **Frontend:** React 19, Rsbuild, Tailwind CSS
- **Database:** PostgreSQL (users, conversations)
- **Vector Store:** ChromaDB (document embeddings)
- **LLM:** Ollama (primary), OpenAI (fallback)
- **Auth:** Authelia (OIDC), LLDAP (user directory)

See the README for a detailed architecture diagram.

## Project Structure

```
simbarag/
├── app.py                  # Quart app entry point
├── main.py                 # RAG & LangChain agent
├── llm.py                  # LLM client
├── blueprints/             # API routes
├── config/                 # Configuration
├── utils/                  # Utilities
├── scripts/                # Admin scripts
├── raggr-frontend/         # React UI
├── migrations/             # Database migrations
├── docs/                   # This documentation
├── docker-compose.yml      # Production Docker setup
└── docker-compose.dev.yml  # Development Docker setup
```

## Key Concepts

### RAG (Retrieval-Augmented Generation)

SimbaRAG uses RAG to answer questions about Simba:

1. Documents are fetched from Paperless-NGX
2. Documents are chunked and embedded using OpenAI
3. Embeddings are stored in ChromaDB
4. User queries are embedded and matched against the store
5. Relevant chunks are passed to the LLM as context
6. The LLM generates an answer using the retrieved context
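The six steps can be sketched end to end in plain Python. Everything below is illustrative: the toy `embed` function, the fixed-size `chunk` splitter, and the in-memory `store` stand in for the real OpenAI embeddings, text splitter, and ChromaDB.

```python
import math

def embed(text: str) -> list[float]:
    # Toy stand-in for a real embedding model: bag-of-letters counts.
    vec = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - ord("a")] += 1.0
    return vec

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def chunk(text: str, size: int = 40) -> list[str]:
    # Naive fixed-size chunking; real splitters respect sentence boundaries.
    return [text[i:i + size] for i in range(0, len(text), size)]

# Steps 1-3: fetch, chunk, embed, and store documents.
documents = ["Simba visited the vet on March 3 for a routine checkup."]
store = [(c, embed(c)) for doc in documents for c in chunk(doc)]

# Steps 4-5: embed the query and retrieve the best-matching chunks.
def retrieve(query: str, k: int = 2) -> list[str]:
    qv = embed(query)
    ranked = sorted(store, key=lambda item: cosine(qv, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]

# Step 6 would pass these chunks to the LLM as context for generation.
context = retrieve("When did Simba see the vet?")
```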

### LangChain Agent

The conversational agent has two tools:

- `simba_search`: queries the vector store of Simba's documents
- `web_search`: searches the web via the Tavily API

The agent selects a tool automatically based on the query.
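A minimal sketch of the routing idea, without LangChain itself: in the real agent the LLM chooses a tool from its description, whereas this toy `route` function keys off the query text. The tool names match the ones above, but their bodies are placeholders.

```python
from typing import Callable

# Placeholder tool implementations; the real ones query ChromaDB and Tavily.
def simba_search(query: str) -> str:
    return f"[vector store results for: {query}]"

def web_search(query: str) -> str:
    return f"[web results for: {query}]"

TOOLS: dict[str, Callable[[str], str]] = {
    "simba_search": simba_search,
    "web_search": web_search,
}

def route(query: str) -> str:
    # Crude keyword heuristic standing in for LLM-driven tool selection:
    # queries mentioning Simba go to the vector store, everything else to the web.
    name = "simba_search" if "simba" in query.lower() else "web_search"
    return TOOLS[name](query)
```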

### Authentication Flow

1. The user initiates OIDC login via Authelia
2. Authelia authenticates against LLDAP
3. The backend receives OIDC tokens and issues a JWT
4. The frontend stores the JWT in localStorage
5. Subsequent requests use the JWT for authorization
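Steps 3–5 hinge on issuing and verifying a JWT. Below is a self-contained HS256 sketch using only the standard library; the real backend presumably uses a JWT library with a configured secret, so `SECRET` and the claim set here are assumptions for illustration.

```python
import base64
import hashlib
import hmac
import json
import time

SECRET = b"dev-only-secret"  # assumption: the real key comes from configuration

def _b64(data: bytes) -> str:
    # JWT uses unpadded URL-safe base64.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def issue_jwt(sub: str, ttl: int = 3600) -> str:
    header = _b64(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64(json.dumps({"sub": sub, "exp": int(time.time()) + ttl}).encode())
    signing_input = f"{header}.{payload}".encode()
    sig = _b64(hmac.new(SECRET, signing_input, hashlib.sha256).digest())
    return f"{header}.{payload}.{sig}"

def verify_jwt(token: str) -> dict:
    header, payload, sig = token.split(".")
    signing_input = f"{header}.{payload}".encode()
    expected = _b64(hmac.new(SECRET, signing_input, hashlib.sha256).digest())
    if not hmac.compare_digest(sig, expected):
        raise ValueError("bad signature")
    padded = payload + "=" * (-len(payload) % 4)
    claims = json.loads(base64.urlsafe_b64decode(padded))
    if claims["exp"] < time.time():
        raise ValueError("expired")
    return claims
```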

## Environment Variables

Key environment variables (see `.env.example` for the complete list):

| Variable | Purpose |
| --- | --- |
| `DATABASE_URL` | PostgreSQL connection |
| `CHROMADB_PATH` | Vector store location |
| `OLLAMA_URL` | Local LLM server |
| `OPENAI_API_KEY` | OpenAI for embeddings/fallback |
| `PAPERLESS_TOKEN` | Document source API |
| `OIDC_*` | Authentication configuration |
| `TAVILY_KEY` | Web search API |

## API Endpoints

### Authentication

- `GET /api/user/oidc/login` - start the OIDC flow
- `GET /api/user/oidc/callback` - OIDC callback
- `POST /api/user/refresh` - refresh the JWT

### Conversations

- `POST /api/conversation/` - create a conversation
- `GET /api/conversation/` - list conversations
- `POST /api/conversation/query` - send a chat message

### RAG (Admin Only)

- `GET /api/rag/stats` - vector store stats
- `POST /api/rag/index` - index documents
- `POST /api/rag/reindex` - reindex all
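Authenticated endpoints expect the JWT in an `Authorization: Bearer` header. The sketch below builds (but does not send) a request to the chat endpoint with the standard library; the base URL and the JSON payload shape are assumptions, not the documented request schema.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080"  # assumption: local dev server address

def build_query_request(
    token: str, message: str, conversation_id: str
) -> urllib.request.Request:
    # Payload field names are illustrative; check the blueprint for the real schema.
    body = json.dumps({"conversation_id": conversation_id, "message": message}).encode()
    return urllib.request.Request(
        url=f"{BASE_URL}/api/conversation/query",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )

req = build_query_request("<jwt>", "When was Simba's last vet visit?", "123")
# urllib.request.urlopen(req) would actually send it; omitted here.
```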

## Troubleshooting

### Common Issues

| Issue | Solution |
| --- | --- |
| Port already in use | Find the process holding the port: `lsof -ti:8080` |
| Database connection error | Ensure PostgreSQL is running: `docker compose ps` |
| ChromaDB errors | Clear and reindex: `python scripts/manage_vectorstore.py reindex` |
| Import errors | Check you are in the `services/raggr/` directory |
| Frontend not building | `cd raggr-frontend && yarn install && yarn build` |

See individual guides for detailed troubleshooting.

## Contributing

1. Read the Development Guide
2. Set up your local environment
3. Make changes and test them locally
4. Generate migrations if needed
5. Submit a pull request

## Additional Resources

## Need Help?

- Check the relevant guide in this documentation
- Review the troubleshooting sections
- Check application logs: `docker compose logs -f`
- Inspect the database: `docker compose exec postgres psql -U raggr`

**Documentation Version:** 1.0
**Last Updated:** January 2026