17 Commits

Author SHA1 Message Date
Ryan Chen
32020a6c60 Replace Ollama with llama-server (OpenAI-compatible API)
- Update llm.py to use OpenAI client with custom base_url for llama-server
- Update agents.py to use ChatOpenAI instead of ChatOllama
- Remove unused ollama imports from main.py, chunker.py, query.py
- Add LLAMA_SERVER_URL and LLAMA_MODEL_NAME env vars
- Remove ollama and langchain-ollama dependencies

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-31 21:39:23 -05:00
Ryan Chen
713a058c4f Adding roadmap 2026-01-31 17:28:53 -05:00
Ryan Chen
12f7d9ead1 fixing dockerfile 2026-01-31 17:17:56 -05:00
Ryan Chen
ad39904dda reorganization 2026-01-31 17:13:27 -05:00
Ryan Chen
1fd2e860b2 nani 2026-01-31 16:47:57 -05:00
Ryan Chen
7cfad5baba Adding mkdocs and privileged tools 2026-01-31 16:20:35 -05:00
ryan
f68a79bdb7 Add Simba facts to system prompt and Tavily API key config
Expanded the assistant system prompt with comprehensive Simba facts including
medical history, and added TAVILY_KEY env var for web search integration.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-31 16:08:41 -05:00
ryan
52153cdf1e dockerfile 2026-01-11 17:35:43 -05:00
ryan
6eb3775e0f Merge pull request 'Adding web search infra' (#13) from rc/langchain-migration into main
Reviewed-on: #13
2026-01-11 17:35:36 -05:00
Ryan Chen
b3793d2d32 Adding web search infra 2026-01-11 17:35:05 -05:00
ryan
033429798e Merge pull request 'RAG optimizations' (#12) from rc/langchain-migration into main
Reviewed-on: #12
2026-01-11 09:36:59 -05:00
Ryan Chen
733ffae8cf RAG optimizations 2026-01-11 09:36:36 -05:00
ryan
0895668ddd Merge pull request 'rc/langchain-migration' (#11) from rc/langchain-migration into main
Reviewed-on: #11
2026-01-11 09:22:40 -05:00
Ryan Chen
07512409f1 Adding loading indicator 2026-01-11 09:22:28 -05:00
Ryan Chen
12eb110313 linter 2026-01-11 09:12:37 -05:00
ryan
1a026f76a1 Merge pull request 'okok' (#10) from rc/01012025-retitling into main
Reviewed-on: #10
2026-01-01 22:00:32 -05:00
Ryan Chen
da3a464897 okok 2026-01-01 22:00:12 -05:00
97 changed files with 4078 additions and 573 deletions

BIN
.DS_Store vendored Normal file

Binary file not shown.


@@ -14,12 +14,15 @@ JWT_SECRET_KEY=your-secret-key-here
 PAPERLESS_TOKEN=your-paperless-token
 BASE_URL=192.168.1.5:8000
-# Ollama Configuration
-OLLAMA_URL=http://192.168.1.14:11434
-OLLAMA_HOST=http://192.168.1.14:11434
+# llama-server Configuration (OpenAI-compatible API)
+# If set, uses llama-server as the primary LLM backend with OpenAI as fallback
+LLAMA_SERVER_URL=http://192.168.1.213:8080/v1
+LLAMA_MODEL_NAME=llama-3.1-8b-instruct
 # ChromaDB Configuration
-CHROMADB_PATH=/path/to/chromadb
+# For Docker: This is automatically set to /app/data/chromadb
+# For local development: Set to a local directory path
+CHROMADB_PATH=./data/chromadb
 # OpenAI Configuration
 OPENAI_API_KEY=your-openai-api-key
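
The comment on `LLAMA_SERVER_URL` describes a primary/fallback ordering. A minimal sketch of that selection rule, using only the standard library, might look like the following. The helper name `select_llm_backends` and the dictionary layout are hypothetical; the default model names are the ones that appear in `agents.py` in this change set.

```python
import os
from typing import Mapping


def select_llm_backends(env: Mapping[str, str]) -> list[dict]:
    """Return backend settings in priority order (hypothetical helper)."""
    backends = []
    llama_url = env.get("LLAMA_SERVER_URL")
    if llama_url:  # unset or empty means llama-server is skipped
        backends.append(
            {
                "name": "llama-server",
                "base_url": llama_url,
                "model": env.get("LLAMA_MODEL_NAME", "llama-3.1-8b-instruct"),
            }
        )
    # OpenAI is always last, so it acts as the fallback (or the only backend).
    backends.append({"name": "openai", "base_url": None, "model": "gpt-5-mini"})
    return backends


if __name__ == "__main__":
    for backend in select_llm_backends(os.environ):
        print(backend["name"], backend["model"])
```

Leaving `LLAMA_SERVER_URL` unset in `.env` therefore yields an OpenAI-only setup under this sketch, which matches the `else` branch in `agents.py`.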

2
.gitignore vendored

@@ -14,5 +14,7 @@ wheels/
 # Database files
 chromadb/
+chromadb_openai/
+chroma_db/
 database/
 *.db

6
.pre-commit-config.yaml Normal file

@@ -0,0 +1,6 @@
repos:
  - repo: https://github.com/astral-sh/ruff-pre-commit
    rev: v0.8.2
    hooks:
      - id: ruff # Linter
      - id: ruff-format # Formatter

109
CLAUDE.md Normal file

@@ -0,0 +1,109 @@
# CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
## Project Overview
SimbaRAG is a RAG (Retrieval-Augmented Generation) conversational AI system for querying information about Simba (a cat). It ingests documents from Paperless-NGX, stores embeddings in ChromaDB, and uses LLMs (Ollama or OpenAI) to answer questions.
## Commands
### Development
```bash
# Start dev environment with hot reload
docker compose -f docker-compose.dev.yml up --build
# View logs
docker compose -f docker-compose.dev.yml logs -f raggr
```
### Database Migrations (Aerich/Tortoise ORM)
```bash
# Generate migration (must run in Docker with DB access)
docker compose -f docker-compose.dev.yml exec raggr aerich migrate --name describe_change
# Apply migrations (auto-runs on startup, manual if needed)
docker compose -f docker-compose.dev.yml exec raggr aerich upgrade
# View migration history
docker compose exec raggr aerich history
```
### Frontend
```bash
cd raggr-frontend
yarn install
yarn build # Production build
yarn dev # Dev server (rarely needed, backend serves frontend)
```
### Production
```bash
docker compose build raggr
docker compose up -d
```
## Architecture
```
┌─────────────────────────────────────────────────────────────┐
│ Docker Compose │
├─────────────────────────────────────────────────────────────┤
│ raggr (port 8080) │ postgres (port 5432) │
│ ├── Quart backend │ PostgreSQL 16 │
│ ├── React frontend (served) │ │
│ └── ChromaDB (volume) │ │
└─────────────────────────────────────────────────────────────┘
```
**Backend** (root directory):
- `app.py` - Quart application entry, serves API and static frontend
- `main.py` - RAG logic, document indexing, LLM interaction, LangChain agent
- `llm.py` - LLM client with Ollama primary, OpenAI fallback
- `aerich_config.py` - Database migration configuration
- `blueprints/` - API routes organized as Quart blueprints
- `users/` - OIDC auth, JWT tokens, RBAC with LDAP groups
- `conversation/` - Chat conversations and message history
- `rag/` - Document indexing endpoints (admin-only)
- `config/` - Configuration modules
- `oidc_config.py` - OIDC authentication configuration
- `utils/` - Reusable utilities
- `chunker.py` - Document chunking for embeddings
- `cleaner.py` - PDF cleaning and summarization
- `image_process.py` - Image description with LLM
- `request.py` - Paperless-NGX API client
- `scripts/` - Administrative and utility scripts
- `add_user.py` - Create users manually
- `user_message_stats.py` - User message statistics
- `manage_vectorstore.py` - Vector store management CLI
- `inspect_vector_store.py` - Inspect ChromaDB contents
- `query.py` - Query generation utilities
- `migrations/` - Database migration files
**Frontend** (`raggr-frontend/`):
- React 19 with Rsbuild bundler
- Tailwind CSS for styling
- Built to `dist/`, served by backend at `/`
**Auth Flow**: LLDAP → Authelia (OIDC) → Backend JWT → Frontend localStorage
## Key Patterns
- All endpoints are async (`async def`)
- Use `@jwt_refresh_token_required` for authenticated endpoints
- Use `@admin_required` for admin-only endpoints (checks `lldap_admin` group)
- Tortoise ORM models in `blueprints/*/models.py`
- Frontend API services in `raggr-frontend/src/api/`
## Environment Variables
See `.env.example`. Key ones:
- `DATABASE_URL` - PostgreSQL connection
- `OIDC_*` - Authelia OIDC configuration
- `OLLAMA_URL` - Local LLM server
- `OPENAI_API_KEY` - Fallback LLM
- `PAPERLESS_TOKEN` / `BASE_URL` - Document source


@@ -1,110 +0,0 @@
# Development Environment Setup
This guide explains how to run the application in development mode with hot reload enabled.
## Quick Start
### Development Mode (Hot Reload)
```bash
# Start all services in development mode
docker-compose -f docker-compose.dev.yml up --build
# Or run in detached mode
docker-compose -f docker-compose.dev.yml up -d --build
```
### Production Mode
```bash
# Start production services
docker-compose up --build
```
## What's Different in Dev Mode?
### Backend (Quart/Flask)
- **Hot Reload**: Python code changes are automatically detected and the server restarts
- **Source Mounted**: Your local `services/raggr` directory is mounted as a volume
- **Debug Mode**: Flask runs with `debug=True` for better error messages
- **Environment**: `FLASK_ENV=development` and `PYTHONUNBUFFERED=1` for immediate log output
### Frontend (React + rsbuild)
- **Auto Rebuild**: Frontend automatically rebuilds when files change
- **Watch Mode**: rsbuild runs in watch mode, rebuilding to `dist/` on save
- **Source Mounted**: Your local `services/raggr/raggr-frontend` directory is mounted as a volume
- **Served by Backend**: Built files are served by the backend, no separate dev server
## Ports
- **Application**: 8080 (accessible at `http://localhost:8080` or `http://YOUR_IP:8080`)
The backend serves both the API and the auto-rebuilt frontend, making it accessible from other machines on your network.
## Useful Commands
```bash
# View logs
docker-compose -f docker-compose.dev.yml logs -f
# View logs for specific service
docker-compose -f docker-compose.dev.yml logs -f raggr-backend
docker-compose -f docker-compose.dev.yml logs -f raggr-frontend
# Rebuild after dependency changes
docker-compose -f docker-compose.dev.yml up --build
# Stop all services
docker-compose -f docker-compose.dev.yml down
# Stop and remove volumes (fresh start)
docker-compose -f docker-compose.dev.yml down -v
```
## Making Changes
### Backend Changes
1. Edit any Python file in `services/raggr/`
2. Save the file
3. The Quart server will automatically restart
4. Check logs to confirm reload
### Frontend Changes
1. Edit any file in `services/raggr/raggr-frontend/src/`
2. Save the file
3. The browser will automatically refresh (Hot Module Replacement)
4. No need to rebuild
### Dependency Changes
**Backend** (pyproject.toml):
```bash
# Rebuild the backend service
docker-compose -f docker-compose.dev.yml up --build raggr-backend
```
**Frontend** (package.json):
```bash
# Rebuild the frontend service
docker-compose -f docker-compose.dev.yml up --build raggr-frontend
```
## Troubleshooting
### Port Already in Use
If you see port binding errors, make sure no other services are running on ports 8080 or 3000.
### Changes Not Reflected
1. Check if the file is properly mounted (check docker-compose.dev.yml volumes)
2. Verify the file isn't in an excluded directory (node_modules, __pycache__)
3. Check container logs for errors
### Frontend Not Connecting to Backend
Make sure your frontend API calls point to the correct backend URL. If accessing from the same machine, use `http://localhost:8080`. If accessing from another device on the network, use `http://YOUR_IP:8080`.
## Notes
- Both services bind to `0.0.0.0` and expose ports, making them accessible on your network
- Node modules and Python cache are excluded from volume mounts to use container versions
- Database and ChromaDB data persist in Docker volumes across restarts
- Access the app from any device on your network using your host machine's IP address


@@ -25,6 +25,9 @@ RUN uv pip install --system -e .
 COPY *.py ./
 COPY blueprints ./blueprints
 COPY migrations ./migrations
+COPY utils ./utils
+COPY config ./config
+COPY scripts ./scripts
 COPY startup.sh ./
 RUN chmod +x startup.sh

53
Dockerfile.dev Normal file

@@ -0,0 +1,53 @@
FROM python:3.13-slim
WORKDIR /app
# Install system dependencies, Node.js, uv, and yarn
RUN apt-get update && apt-get install -y \
build-essential \
curl \
&& curl -fsSL https://deb.nodesource.com/setup_20.x | bash - \
&& apt-get install -y nodejs \
&& npm install -g yarn \
&& rm -rf /var/lib/apt/lists/* \
&& curl -LsSf https://astral.sh/uv/install.sh | sh
# Add uv to PATH
ENV PATH="/root/.local/bin:$PATH"
# Copy dependency files
COPY pyproject.toml ./
# Install Python dependencies using uv
RUN uv pip install --system -e .
# Copy frontend package files and install dependencies
COPY raggr-frontend/package.json raggr-frontend/yarn.lock* raggr-frontend/
WORKDIR /app/raggr-frontend
RUN yarn install
# Copy application source code
WORKDIR /app
COPY . .
# Build frontend
WORKDIR /app/raggr-frontend
RUN yarn build
# Create ChromaDB and database directories
WORKDIR /app
RUN mkdir -p /app/chromadb /app/database
# Make startup script executable
RUN chmod +x /app/startup-dev.sh
# Set environment variables
ENV PYTHONPATH=/app
ENV CHROMADB_PATH=/app/chromadb
ENV PYTHONUNBUFFERED=1
# Expose port
EXPOSE 8080
# Default command
CMD ["/app/startup-dev.sh"]

371
README.md

@@ -1,7 +1,370 @@
# simbarag # SimbaRAG 🐱
**Goal:** Learn how retrieval-augmented generation works and also create a neat little tool to ask about Simba's health. A Retrieval-Augmented Generation (RAG) conversational AI system for querying information about Simba the cat. Built with LangChain, ChromaDB, and modern web technologies.
**Current objectives:** ## Features
- [ ] Successfully use RAG to ask a question about existing information (e.g. how many teeth has Simba had extracted) - 🤖 **Intelligent Conversations** - LangChain-powered agent with tool use and memory
- 📚 **Document Retrieval** - RAG system using ChromaDB vector store
- 🔍 **Web Search** - Integrated Tavily API for real-time web searches
- 🔐 **OIDC Authentication** - Secure auth via Authelia with LDAP group support
- 💬 **Multi-Conversation** - Manage multiple conversation threads per user
- 🎨 **Modern UI** - React 19 frontend with Tailwind CSS
- 🐳 **Docker Ready** - Containerized deployment with Docker Compose
## System Architecture
```mermaid
graph TB
subgraph "Client Layer"
Browser[Web Browser]
end
subgraph "Frontend - React"
UI[React UI<br/>Tailwind CSS]
Auth[Auth Service]
API[API Client]
end
subgraph "Backend - Quart/Python"
App[Quart App<br/>app.py]
subgraph "Blueprints"
Users[Users Blueprint<br/>OIDC + JWT]
Conv[Conversation Blueprint<br/>Chat Management]
RAG[RAG Blueprint<br/>Document Indexing]
end
Agent[LangChain Agent<br/>main.py]
LLM[LLM Client<br/>llm.py]
end
subgraph "Tools & Utilities"
Search[Simba Search Tool]
Web[Web Search Tool<br/>Tavily]
end
subgraph "Data Layer"
Postgres[(PostgreSQL<br/>Users & Conversations)]
Chroma[(ChromaDB<br/>Vector Store)]
end
subgraph "External Services"
Authelia[Authelia<br/>OIDC Provider]
LLDAP[LLDAP<br/>User Directory]
Ollama[Ollama<br/>Local LLM]
OpenAI[OpenAI<br/>Fallback LLM]
Paperless[Paperless-NGX<br/>Documents]
TavilyAPI[Tavily API<br/>Web Search]
end
Browser --> UI
UI --> Auth
UI --> API
API --> App
App --> Users
App --> Conv
App --> RAG
Conv --> Agent
Agent --> Search
Agent --> Web
Agent --> LLM
Search --> Chroma
Web --> TavilyAPI
RAG --> Chroma
RAG --> Paperless
Users --> Postgres
Conv --> Postgres
Users --> Authelia
Authelia --> LLDAP
LLM --> Ollama
LLM -.Fallback.-> OpenAI
style Browser fill:#e1f5ff
style UI fill:#fff3cd
style App fill:#d4edda
style Agent fill:#d4edda
style Postgres fill:#f8d7da
style Chroma fill:#f8d7da
style Ollama fill:#e2e3e5
style OpenAI fill:#e2e3e5
```
## Quick Start
### Prerequisites
- Docker & Docker Compose
- PostgreSQL (or use Docker)
- Ollama (optional, for local LLM)
- Paperless-NGX instance (for document source)
### Installation
1. **Clone the repository**
```bash
git clone https://github.com/yourusername/simbarag.git
cd simbarag
```
2. **Configure environment variables**
```bash
cp .env.example .env
# Edit .env with your configuration
```
3. **Start the services**
```bash
# Development (local PostgreSQL only)
docker compose -f docker-compose.dev.yml up -d
# Or full Docker deployment
docker compose up -d
```
4. **Access the application**
Open `http://localhost:8080` in your browser.
## Development
### Local Development Setup
```bash
# 1. Start PostgreSQL
docker compose -f docker-compose.dev.yml up -d
# 2. Set environment variables
export DATABASE_URL="postgres://raggr:raggr_dev_password@localhost:5432/raggr"
export CHROMADB_PATH="./chromadb"
export $(grep -v '^#' .env | xargs)
# 3. Install dependencies
pip install -r requirements.txt
cd raggr-frontend && yarn install && yarn build && cd ..
# 4. Run migrations
aerich upgrade
# 5. Start the server
python app.py
```
See [docs/development.md](docs/development.md) for detailed development guide.
## Project Structure
```
simbarag/
├── app.py # Quart application entry point
├── main.py # RAG logic & LangChain agent
├── llm.py # LLM client with Ollama/OpenAI
├── aerich_config.py # Database migration configuration
├── blueprints/ # API route blueprints
│ ├── users/ # Authentication & authorization
│ ├── conversation/ # Chat conversations
│ └── rag/ # Document indexing
├── config/ # Configuration modules
│ └── oidc_config.py # OIDC authentication settings
├── utils/ # Reusable utilities
│ ├── chunker.py # Document chunking for embeddings
│ ├── cleaner.py # PDF cleaning and summarization
│ ├── image_process.py # Image description with LLM
│ └── request.py # Paperless-NGX API client
├── scripts/ # Administrative scripts
│ ├── add_user.py
│ ├── user_message_stats.py
│ ├── manage_vectorstore.py
│ └── inspect_vector_store.py
├── raggr-frontend/ # React frontend
│ └── src/
├── migrations/ # Database migrations
├── docs/ # Documentation
│ ├── index.md # Documentation hub
│ ├── development.md # Development guide
│ ├── deployment.md # Deployment & migrations
│ ├── VECTORSTORE.md # Vector store management
│ ├── MIGRATIONS.md # Migration reference
│ └── authentication.md # Authentication setup
├── docker-compose.yml # Production compose
├── docker-compose.dev.yml # Development compose
├── Dockerfile # Production Dockerfile
├── Dockerfile.dev # Development Dockerfile
├── CLAUDE.md # AI assistant instructions
└── README.md # This file
```
## Key Technologies
### Backend
- **Quart** - Async Python web framework
- **LangChain** - Agent framework with tool use
- **Tortoise ORM** - Async ORM for PostgreSQL
- **Aerich** - Database migration tool
- **ChromaDB** - Vector database for embeddings
- **OpenAI** - Embeddings & LLM (fallback)
- **Ollama** - Local LLM (primary)
### Frontend
- **React 19** - UI framework
- **Rsbuild** - Fast bundler
- **Tailwind CSS** - Utility-first styling
- **Axios** - HTTP client
### Authentication
- **Authelia** - OIDC provider
- **LLDAP** - Lightweight LDAP server
- **JWT** - Token-based auth
## API Endpoints
### Authentication
- `GET /api/user/oidc/login` - Initiate OIDC login
- `GET /api/user/oidc/callback` - OIDC callback handler
- `POST /api/user/refresh` - Refresh JWT token
### Conversations
- `POST /api/conversation/` - Create conversation
- `GET /api/conversation/` - List conversations
- `GET /api/conversation/<id>` - Get conversation with messages
- `POST /api/conversation/query` - Send message and get response
### RAG (Admin Only)
- `GET /api/rag/stats` - Vector store statistics
- `POST /api/rag/index` - Index new documents
- `POST /api/rag/reindex` - Clear and reindex all
## Configuration
### Environment Variables
| Variable | Description | Default |
|----------|-------------|---------|
| `DATABASE_URL` | PostgreSQL connection string | `postgres://...` |
| `CHROMADB_PATH` | ChromaDB storage path | `./chromadb` |
| `OLLAMA_URL` | Ollama server URL | `http://localhost:11434` |
| `OPENAI_API_KEY` | OpenAI API key | - |
| `PAPERLESS_TOKEN` | Paperless-NGX API token | - |
| `BASE_URL` | Paperless-NGX base URL | - |
| `OIDC_ISSUER` | OIDC provider URL | - |
| `OIDC_CLIENT_ID` | OIDC client ID | - |
| `OIDC_CLIENT_SECRET` | OIDC client secret | - |
| `JWT_SECRET_KEY` | JWT signing key | - |
| `TAVILY_KEY` | Tavily web search API key | - |
See `.env.example` for full list.
## Scripts
### User Management
```bash
# Add a new user
python scripts/add_user.py
# View message statistics
python scripts/user_message_stats.py
```
### Vector Store Management
```bash
# Show vector store statistics
python scripts/manage_vectorstore.py stats
# Index new documents from Paperless
python scripts/manage_vectorstore.py index
# Clear and reindex everything
python scripts/manage_vectorstore.py reindex
# Inspect vector store contents
python scripts/inspect_vector_store.py
```
See [docs/vectorstore.md](docs/vectorstore.md) for details.
## Database Migrations
```bash
# Generate a new migration
aerich migrate --name "describe_your_changes"
# Apply pending migrations
aerich upgrade
# View migration history
aerich history
# Rollback last migration
aerich downgrade
```
See [docs/deployment.md](docs/deployment.md) for detailed migration workflows.
## LangChain Agent
The conversational agent has access to two tools:
1. **simba_search** - Query the vector store for Simba's documents
- Used for: Medical records, veterinary history, factual information
2. **web_search** - Search the web via Tavily API
- Used for: Recent events, external knowledge, general questions
The agent automatically selects the appropriate tool based on the user's query.
## Authentication Flow
```
User → Authelia (OIDC) → Backend (JWT) → Frontend (localStorage)
LLDAP
```
1. User clicks "Login"
2. Frontend redirects to Authelia
3. User authenticates via Authelia (backed by LLDAP)
4. Authelia redirects back with authorization code
5. Backend exchanges code for OIDC tokens
6. Backend issues JWT tokens
7. Frontend stores tokens in localStorage
## Contributing
1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Run tests and linting
5. Submit a pull request
## Documentation
- [Development Guide](docs/development.md) - Setup and development workflow
- [Deployment Guide](docs/deployment.md) - Deployment and migrations
- [Vector Store Guide](docs/vectorstore.md) - Managing the vector database
- [Authentication Guide](docs/authentication.md) - OIDC and LDAP setup
## License
[Your License Here]
## Acknowledgments
- Built for Simba, the most important cat in the world 🐱
- Powered by LangChain, ChromaDB, and the open-source community


@@ -6,6 +6,7 @@ from tortoise.contrib.quart import register_tortoise
 import blueprints.conversation
 import blueprints.conversation.logic
+import blueprints.rag
 import blueprints.users
 import blueprints.users.models
 from main import consult_simba_oracle
@@ -22,12 +23,12 @@ jwt = JWTManager(app)
 # Register blueprints
 app.register_blueprint(blueprints.users.user_blueprint)
 app.register_blueprint(blueprints.conversation.conversation_blueprint)
+app.register_blueprint(blueprints.rag.rag_blueprint)
 # Database configuration with environment variable support
 DATABASE_URL = os.getenv(
-    "DATABASE_URL",
-    "postgres://raggr:raggr_dev_password@localhost:5432/raggr"
+    "DATABASE_URL", "postgres://raggr:raggr_dev_password@localhost:5432/raggr"
 )
 TORTOISE_CONFIG = {
@@ -123,10 +124,17 @@ async def get_messages():
             }
         )
+    name = conversation.name
+    if len(messages) > 8:
+        name = await blueprints.conversation.logic.rename_conversation(
+            user=user,
+            conversation=conversation,
+        )
     return jsonify(
         {
             "id": str(conversation.id),
-            "name": conversation.name,
+            "name": name,
             "messages": messages,
             "created_at": conversation.created_at.isoformat(),
             "updated_at": conversation.updated_at.isoformat(),


@@ -0,0 +1,172 @@
import datetime

from quart import Blueprint, jsonify, request
from quart_jwt_extended import (
    get_jwt_identity,
    jwt_refresh_token_required,
)

import blueprints.users.models

from .agents import main_agent
from .logic import (
    add_message_to_conversation,
    get_conversation_by_id,
    rename_conversation,
)
from .models import (
    Conversation,
    PydConversation,
    PydListConversation,
)

conversation_blueprint = Blueprint(
    "conversation_api", __name__, url_prefix="/api/conversation"
)


@conversation_blueprint.post("/query")
@jwt_refresh_token_required
async def query():
    current_user_uuid = get_jwt_identity()
    user = await blueprints.users.models.User.get(id=current_user_uuid)
    data = await request.get_json()
    query = data.get("query")
    conversation_id = data.get("conversation_id")
    conversation = await get_conversation_by_id(conversation_id)
    await conversation.fetch_related("messages")
    await add_message_to_conversation(
        conversation=conversation,
        message=query,
        speaker="user",
        user=user,
    )

    # Build conversation history from recent messages (last 10 for context)
    recent_messages = (
        conversation.messages[-10:]
        if len(conversation.messages) > 10
        else conversation.messages
    )

    messages_payload = [
        {
            "role": "system",
            "content": """You are a helpful cat assistant named Simba that understands veterinary terms. When there are questions to you specifically, they are referring to Simba the cat. Answer the user as if you were a cat named Simba. Don't act too catlike. Be assertive.
SIMBA FACTS (as of January 2026):
- Name: Simba
- Species: Feline (Domestic Short Hair / American Short Hair)
- Sex: Male, Neutered
- Date of Birth: August 8, 2016 (approximately 9 years 5 months old)
- Color: Orange
- Current Weight: 16 lbs (as of 1/8/2026)
- Owner: Ryan Chen
- Location: Long Island City, NY
- Veterinarian: Court Square Animal Hospital
Medical Conditions:
- Hypertrophic Cardiomyopathy (HCM): Diagnosed 12/11/2025. Concentric left ventricular hypertrophy with no left atrial dilation. Grade II-III/VI systolic heart murmur. No cardiac medications currently needed. Must avoid Domitor, acepromazine, and ketamine during anesthesia.
- Dental Issues: Prior extraction of teeth 307 and 407 due to resorption. Tooth 107 extracted on 1/8/2026. Early resorption lesions present on teeth 207, 309, and 409.
Recent Medical Events:
- 1/8/2026: Dental cleaning and tooth 107 extraction. Prescribed Onsior for 3 days. Oravet sealant applied.
- 12/11/2025: Echocardiogram confirming HCM diagnosis. Pre-op bloodwork was normal.
- 12/1/2025: Visited for decreased appetite/nausea. Received subcutaneous fluids and Cerenia.
Diet & Lifestyle:
- Diet: Hill's I/D wet and dry food
- Supplements: Plaque Off
- Indoor only cat, only pet in the household
Upcoming Appointments:
- Rabies Vaccine: Due 2/19/2026
- Routine Examination: Due 6/1/2026
- FVRCP-3yr Vaccine: Due 10/2/2026
IMPORTANT: When users ask factual questions about Simba's health, medical history, veterinary visits, medications, weight, or any information that would be in documents, you MUST use the simba_search tool to retrieve accurate information before answering. Do not rely on general knowledge - always search the documents for factual questions.""",
        }
    ]

    # Add recent conversation history
    for msg in recent_messages[:-1]:  # Exclude the message we just added
        # Speaker is a plain enum.Enum, so compare its value, not the member
        role = "user" if msg.speaker.value == "user" else "assistant"
        messages_payload.append({"role": role, "content": msg.text})

    # Add current query
    messages_payload.append({"role": "user", "content": query})

    payload = {"messages": messages_payload}
    response = await main_agent.ainvoke(payload)
    message = response.get("messages", [])[-1].content
    await add_message_to_conversation(
        conversation=conversation,
        message=message,
        speaker="simba",
        user=user,
    )
    return jsonify({"response": message})


@conversation_blueprint.route("/<conversation_id>")
@jwt_refresh_token_required
async def get_conversation(conversation_id: str):
    conversation = await Conversation.get(id=conversation_id)
    current_user_uuid = get_jwt_identity()
    user = await blueprints.users.models.User.get(id=current_user_uuid)
    await conversation.fetch_related("messages")

    # Manually serialize the conversation with messages
    messages = []
    for msg in conversation.messages:
        messages.append(
            {
                "id": str(msg.id),
                "text": msg.text,
                "speaker": msg.speaker.value,
                "created_at": msg.created_at.isoformat(),
            }
        )

    name = conversation.name
    if len(messages) > 8 and "datetime" in name.lower():
        name = await rename_conversation(
            user=user,
            conversation=conversation,
        )
        print(name)

    return jsonify(
        {
            "id": str(conversation.id),
            "name": name,
            "messages": messages,
            "created_at": conversation.created_at.isoformat(),
            "updated_at": conversation.updated_at.isoformat(),
        }
    )


@conversation_blueprint.post("/")
@jwt_refresh_token_required
async def create_conversation():
    user_uuid = get_jwt_identity()
    user = await blueprints.users.models.User.get(id=user_uuid)
    conversation = await Conversation.create(
        name=f"{user.username} {datetime.datetime.now().timestamp}",
        user=user,
    )
    serialized_conversation = await PydConversation.from_tortoise_orm(conversation)
    return jsonify(serialized_conversation.model_dump())


@conversation_blueprint.get("/")
@jwt_refresh_token_required
async def get_all_conversations():
    user_uuid = get_jwt_identity()
    user = await blueprints.users.models.User.get(id=user_uuid)
    conversations = Conversation.filter(user=user)
    serialized_conversations = await PydListConversation.from_queryset(conversations)
    return jsonify(serialized_conversations.model_dump())
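
The `/query` handler assembles its agent payload from a system prompt, a window of recent messages, and the current query. A self-contained sketch of that assembly step is below. The `Speaker` member names, the `Message` dataclass, and `build_payload` are hypothetical stand-ins rather than the project's models; the sketch also assumes the stored list already ends with the message that was just saved, which is what the `[:-1]` slice relies on.

```python
import enum
from dataclasses import dataclass


class Speaker(enum.Enum):  # hypothetical stand-in for the model's Speaker enum
    USER = "user"
    SIMBA = "simba"


@dataclass
class Message:  # hypothetical stand-in for ConversationMessage
    speaker: Speaker
    text: str


def build_payload(system_prompt, stored, query, window=10):
    """Assemble system prompt + recent history + current query."""
    recent = stored[-window:]  # slicing also covers lists shorter than the window
    payload = [{"role": "system", "content": system_prompt}]
    for msg in recent[:-1]:  # assumes the last stored message is the current query
        # Compare the enum's value: a plain Enum member never equals a str.
        role = "user" if msg.speaker.value == "user" else "assistant"
        payload.append({"role": role, "content": msg.text})
    payload.append({"role": "user", "content": query})
    return payload


if __name__ == "__main__":
    history = [
        Message(Speaker.USER, "How is my heart?"),
        Message(Speaker.SIMBA, "Stable, thank you."),
        Message(Speaker.USER, "What do I weigh?"),
    ]
    for entry in build_payload("You are Simba.", history, "What do I weigh?"):
        print(entry["role"], "->", entry["content"])
```

Because `stored[-window:]` returns the whole list when it is shorter than the window, the explicit length check in the handler is optional under these assumptions.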


@@ -0,0 +1,88 @@
import os
from typing import cast

from langchain.agents import create_agent
from langchain.chat_models import BaseChatModel
from langchain.tools import tool
from langchain_openai import ChatOpenAI
from tavily import AsyncTavilyClient

from blueprints.rag.logic import query_vector_store

# Configure LLM with llama-server or OpenAI fallback
llama_url = os.getenv("LLAMA_SERVER_URL")
if llama_url:
    llama_chat = ChatOpenAI(
        base_url=llama_url,
        api_key="not-needed",
        model=os.getenv("LLAMA_MODEL_NAME", "llama-3.1-8b-instruct"),
    )
else:
    llama_chat = None

openai_fallback = ChatOpenAI(model="gpt-5-mini")
model_with_fallback = cast(
    BaseChatModel,
    llama_chat.with_fallbacks([openai_fallback]) if llama_chat else openai_fallback,
)
client = AsyncTavilyClient(os.getenv("TAVILY_KEY"), "")


@tool
async def web_search(query: str) -> str:
    """Search the web for current information using Tavily.

    Use this tool when you need to:
    - Find current information not in the knowledge base
    - Look up recent events, news, or updates
    - Verify facts or get additional context
    - Search for information outside of Simba's documents

    Args:
        query: The search query to look up on the web

    Returns:
        Search results from the web with titles, content, and source URLs
    """
    response = await client.search(query=query, search_depth="basic")
    results = response.get("results", [])
    if not results:
        return "No results found for the query."
    formatted = "\n\n".join(
        [
            f"**{result['title']}**\n{result['content']}\nSource: {result['url']}"
            for result in results[:5]
        ]
    )
    return formatted


@tool(response_format="content_and_artifact")
async def simba_search(query: str):
    """Search through Simba's medical records, veterinary documents, and personal information.

    Use this tool whenever the user asks questions about:
    - Simba's health history, medical records, or veterinary visits
    - Medications, treatments, or diagnoses
    - Weight, diet, or physical characteristics over time
    - Veterinary recommendations or advice
    - Ryan's (the owner's) information related to Simba
    - Any factual information that would be found in documents

    Args:
        query: The user's question or information need about Simba

    Returns:
        Relevant information from Simba's documents
    """
    print(f"[SIMBA SEARCH] Tool called with query: {query}")
    serialized, docs = await query_vector_store(query=query)
    print(f"[SIMBA SEARCH] Found {len(docs)} documents")
    print(f"[SIMBA SEARCH] Serialized result length: {len(serialized)}")
    print(f"[SIMBA SEARCH] First 200 chars: {serialized[:200]}")
    return serialized, docs


main_agent = create_agent(model=model_with_fallback, tools=[simba_search, web_search])
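
The result formatting inside `web_search` can be exercised without a Tavily key by pulling it into a pure function. The sketch below does that. `format_search_results` is a hypothetical name, and the sample result is made up for illustration; the `title`, `content`, and `url` keys and the five-result cap are taken from the tool above.

```python
def format_search_results(results, limit=5):
    """Render search hits as Markdown-style text (hypothetical helper)."""
    if not results:
        return "No results found for the query."
    return "\n\n".join(
        f"**{result['title']}**\n{result['content']}\nSource: {result['url']}"
        for result in results[:limit]
    )


if __name__ == "__main__":
    sample = [  # made-up data, shaped like the dicts the tool reads
        {
            "title": "Feline HCM overview",
            "content": "Hypertrophic cardiomyopathy is common in cats.",
            "url": "https://example.com/hcm",
        }
    ]
    print(format_search_results(sample))
    print(format_search_results([]))
```

Keeping the formatting separate from the network call would also make the empty-result message and the cap easy to unit test.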


@@ -1,9 +1,10 @@
 import tortoise.exceptions
+from langchain_openai import ChatOpenAI
-from .models import Conversation, ConversationMessage
 import blueprints.users.models
+from .models import Conversation, ConversationMessage, RenameConversationOutputSchema
 async def create_conversation(name: str = "") -> Conversation:
     conversation = await Conversation.create(name=name)
@@ -58,3 +59,22 @@ async def get_conversation_transcript(
         messages.append(f"{message.speaker} at {message.created_at}: {message.text}")
     return "\n".join(messages)
+async def rename_conversation(
+    user: blueprints.users.models.User,
+    conversation: Conversation,
+) -> str:
+    messages: str = await get_conversation_transcript(
+        user=user, conversation=conversation
+    )
+    llm = ChatOpenAI(model="gpt-4o-mini")
+    structured_llm = llm.with_structured_output(RenameConversationOutputSchema)
+    prompt = f"Summarize the following conversation into a sassy one-liner title:\n\n{messages}"
+    response = structured_llm.invoke(prompt)
+    new_name: str = response.get("title", "")
+    conversation.name = new_name
+    await conversation.save()
+    return new_name


@@ -1,11 +1,18 @@
 import enum
+from dataclasses import dataclass
-from tortoise.models import Model
 from tortoise import fields
 from tortoise.contrib.pydantic import (
-    pydantic_queryset_creator,
     pydantic_model_creator,
+    pydantic_queryset_creator,
 )
+from tortoise.models import Model
+@dataclass
+class RenameConversationOutputSchema:
+    title: str
+    justification: str
 class Speaker(enum.Enum):


@@ -0,0 +1,47 @@
from quart import Blueprint, jsonify
from quart_jwt_extended import jwt_refresh_token_required

from .logic import get_vector_store_stats, index_documents, vector_store
from blueprints.users.decorators import admin_required

rag_blueprint = Blueprint("rag_api", __name__, url_prefix="/api/rag")


@rag_blueprint.get("/stats")
@jwt_refresh_token_required
async def get_stats():
    """Get vector store statistics."""
    stats = get_vector_store_stats()
    return jsonify(stats)


@rag_blueprint.post("/index")
@admin_required
async def trigger_index():
    """Trigger indexing of documents from Paperless-NGX. Admin only."""
    try:
        await index_documents()
        stats = get_vector_store_stats()
        return jsonify({"status": "success", "stats": stats})
    except Exception as e:
        return jsonify({"status": "error", "message": str(e)}), 500


@rag_blueprint.post("/reindex")
@admin_required
async def trigger_reindex():
    """Clear and reindex all documents. Admin only."""
    try:
        # Clear existing documents
        collection = vector_store._collection
        all_docs = collection.get()

        if all_docs["ids"]:
            collection.delete(ids=all_docs["ids"])

        # Reindex
        await index_documents()
        stats = get_vector_store_stats()
        return jsonify({"status": "success", "stats": stats})
    except Exception as e:
        return jsonify({"status": "error", "message": str(e)}), 500


@@ -0,0 +1,75 @@
import os
import tempfile

import httpx


class PaperlessNGXService:
    def __init__(self):
        self.base_url = os.getenv("BASE_URL")
        self.token = os.getenv("PAPERLESS_TOKEN")
        self.url = f"http://{os.getenv('BASE_URL')}/api/documents/?tags__id=8"
        self.headers = {"Authorization": f"Token {os.getenv('PAPERLESS_TOKEN')}"}

    def get_data(self):
        print(f"Getting data from: {self.url}")
        r = httpx.get(self.url, headers=self.headers)
        results = r.json()["results"]

        # Follow pagination links until the API stops returning a "next" URL
        nextLink = r.json().get("next")

        while nextLink:
            r = httpx.get(nextLink, headers=self.headers)
            results += r.json()["results"]
            nextLink = r.json().get("next")

        return results

    def get_doc_by_id(self, doc_id: int):
        url = f"http://{os.getenv('BASE_URL')}/api/documents/{doc_id}/"
        r = httpx.get(url, headers=self.headers)
        return r.json()

    def download_pdf_from_id(self, id: int) -> str:
        download_url = f"http://{os.getenv('BASE_URL')}/api/documents/{id}/download/"

        response = httpx.get(
            download_url, headers=self.headers, follow_redirects=True, timeout=30
        )
        response.raise_for_status()
        # Use a temporary file for the downloaded PDF
        temp_file = tempfile.NamedTemporaryFile(delete=False, suffix=".pdf")
        temp_file.write(response.content)
        temp_file.close()
        temp_pdf_path = temp_file.name
        pdf_to_process = temp_pdf_path
        return pdf_to_process

    def upload_cleaned_content(self, document_id, data):
        PUTS_URL = f"http://{os.getenv('BASE_URL')}/api/documents/{document_id}/"
        r = httpx.put(PUTS_URL, headers=self.headers, data=data)
        r.raise_for_status()

    def upload_description(self, description_filepath, file, title, exif_date: str):
        POST_URL = f"http://{os.getenv('BASE_URL')}/api/documents/post_document/"
        files = {"document": ("description_filepath", file, "application/txt")}
        data = {
            "title": title,
            # Paperless-ngx's post_document endpoint names this field "created"
            "created": exif_date,
            "document_type": 3,
            "tags": [7],
        }

        r = httpx.post(POST_URL, headers=self.headers, data=data, files=files)
        r.raise_for_status()

    def get_tags(self):
        GET_URL = f"http://{os.getenv('BASE_URL')}/api/tags/"
        r = httpx.get(GET_URL, headers=self.headers)
        data = r.json()
        return {tag["id"]: tag["name"] for tag in data["results"]}

    def get_doctypes(self):
        GET_URL = f"http://{os.getenv('BASE_URL')}/api/document_types/"
        r = httpx.get(GET_URL, headers=self.headers)
        data = r.json()
        return {doctype["id"]: doctype["name"] for doctype in data["results"]}

blueprints/rag/logic.py (new file, 101 lines)

@@ -0,0 +1,101 @@
import datetime
import os

from langchain_chroma import Chroma
from langchain_core.documents import Document
from langchain_openai import OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter

from .fetchers import PaperlessNGXService

embeddings = OpenAIEmbeddings(model="text-embedding-3-small")

vector_store = Chroma(
    collection_name="simba_docs",
    embedding_function=embeddings,
    persist_directory=os.getenv("CHROMADB_PATH", ""),
)

text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,  # chunk size (characters)
    chunk_overlap=200,  # chunk overlap (characters)
    add_start_index=True,  # track index in original document
)


def date_to_epoch(date_str: str) -> float:
    split_date = date_str.split("-")
    date = datetime.datetime(
        int(split_date[0]),
        int(split_date[1]),
        int(split_date[2]),
        0,
        0,
        0,
    )

    return date.timestamp()


async def fetch_documents_from_paperless_ngx() -> list[Document]:
    ppngx = PaperlessNGXService()
    data = ppngx.get_data()
    doctypes = ppngx.get_doctypes()
    documents = []
    for doc in data:
        metadata = {
            "created_date": date_to_epoch(doc["created_date"]),
            "filename": doc["original_file_name"],
            "document_type": doctypes.get(doc["document_type"], ""),
        }
        documents.append(Document(page_content=doc["content"], metadata=metadata))

    return documents


async def index_documents():
    documents = await fetch_documents_from_paperless_ngx()

    splits = text_splitter.split_documents(documents)
    await vector_store.aadd_documents(documents=splits)


async def query_vector_store(query: str):
    retrieved_docs = await vector_store.asimilarity_search(query, k=2)
    serialized = "\n\n".join(
        (f"Source: {doc.metadata}\nContent: {doc.page_content}")
        for doc in retrieved_docs
    )
    return serialized, retrieved_docs


def get_vector_store_stats():
    """Get statistics about the vector store."""
    collection = vector_store._collection
    count = collection.count()
    return {
        "total_documents": count,
        "collection_name": collection.name,
    }


def list_all_documents(limit: int = 10):
    """List documents in the vector store with their metadata."""
    collection = vector_store._collection
    results = collection.get(limit=limit, include=["metadatas", "documents"])

    documents = []
    for i, doc_id in enumerate(results["ids"]):
        documents.append(
            {
                "id": doc_id,
                "metadata": results["metadatas"][i]
                if results.get("metadatas")
                else None,
                "content_preview": results["documents"][i][:200]
                if results.get("documents")
                else None,
            }
        )

    return documents

blueprints/rag/models.py (new file, empty)

@@ -7,7 +7,7 @@ from quart_jwt_extended import (
 )
 from .models import User
 from .oidc_service import OIDCUserService
-from oidc_config import oidc_config
+from config.oidc_config import oidc_config
 import secrets
 import httpx
 from urllib.parse import urlencode
@@ -60,7 +60,7 @@ async def oidc_login():
         "client_id": oidc_config.client_id,
         "response_type": "code",
         "redirect_uri": oidc_config.redirect_uri,
-        "scope": "openid email profile",
+        "scope": "openid email profile groups",
         "state": state,
         "code_challenge": code_challenge,
         "code_challenge_method": "S256",
@@ -115,7 +115,9 @@ async def oidc_callback():
         token_response = await client.post(token_endpoint, data=token_data)
     if token_response.status_code != 200:
-        return jsonify({"error": f"Failed to exchange code for token: {token_response.text}"}), 400
+        return jsonify(
+            {"error": f"Failed to exchange code for token: {token_response.text}"}
+        ), 400
     tokens = token_response.json()
@@ -141,7 +143,13 @@ async def oidc_callback():
     return jsonify(
         access_token=access_token,
         refresh_token=refresh_token,
-        user={"id": str(user.id), "username": user.username, "email": user.email},
+        user={
+            "id": str(user.id),
+            "username": user.username,
+            "email": user.email,
+            "groups": user.ldap_groups,
+            "is_admin": user.is_admin(),
+        },
     )


@@ -0,0 +1,26 @@
"""
Authentication decorators for role-based access control.
"""

from functools import wraps

from quart import jsonify
from quart_jwt_extended import jwt_refresh_token_required, get_jwt_identity

from .models import User


def admin_required(fn):
    """
    Decorator that requires the user to be an admin (member of lldap_admin group).
    Must be used on async route handlers.
    """

    @wraps(fn)
    @jwt_refresh_token_required
    async def wrapper(*args, **kwargs):
        user_id = get_jwt_identity()
        user = await User.get_or_none(id=user_id)
        if not user or not user.is_admin():
            return jsonify({"error": "Admin access required"}), 403
        return await fn(*args, **kwargs)

    return wrapper


@@ -12,8 +12,13 @@ class User(Model):
     email = fields.CharField(max_length=100, unique=True)
     # OIDC fields
-    oidc_subject = fields.CharField(max_length=255, unique=True, null=True, index=True)  # "sub" claim from OIDC
-    auth_provider = fields.CharField(max_length=50, default="local")  # "local" or "oidc"
+    oidc_subject = fields.CharField(
+        max_length=255, unique=True, null=True, index=True
+    )  # "sub" claim from OIDC
+    auth_provider = fields.CharField(
+        max_length=50, default="local"
+    )  # "local" or "oidc"
+    ldap_groups = fields.JSONField(default=[])  # LDAP groups from OIDC claims
     created_at = fields.DatetimeField(auto_now_add=True)
     updated_at = fields.DatetimeField(auto_now=True)
@@ -21,6 +26,14 @@ class User(Model):
     class Meta:
         table = "users"
+    def has_group(self, group: str) -> bool:
+        """Check if user belongs to a specific LDAP group."""
+        return group in (self.ldap_groups or [])
+    def is_admin(self) -> bool:
+        """Check if user is an admin (member of lldap_admin group)."""
+        return self.has_group("lldap_admin")
     def set_password(self, plain_password: str):
         self.password = bcrypt.hashpw(
             plain_password.encode("utf-8"),


@@ -1,6 +1,7 @@
 """
 OIDC User Management Service
 """
+
 from typing import Dict, Any, Optional
 from uuid import uuid4
 from .models import User
@@ -31,10 +32,10 @@ class OIDCUserService:
             # Update user info from latest claims (optional)
             user.email = claims.get("email", user.email)
             user.username = (
-                claims.get("preferred_username")
-                or claims.get("name")
-                or user.username
+                claims.get("preferred_username") or claims.get("name") or user.username
             )
+            # Update LDAP groups from claims
+            user.ldap_groups = claims.get("groups", [])
             await user.save()
             return user
@@ -47,6 +48,7 @@ class OIDCUserService:
             user.oidc_subject = oidc_subject
             user.auth_provider = "oidc"
             user.password = None  # Clear password
+            user.ldap_groups = claims.get("groups", [])
             await user.save()
             return user
@@ -58,14 +60,17 @@ class OIDCUserService:
             or f"user_{oidc_subject[:8]}"
         )
+        # Extract LDAP groups from claims
+        groups = claims.get("groups", [])
         user = await User.create(
             id=uuid4(),
             username=username,
-            email=email
-            or f"{oidc_subject}@oidc.local",  # Fallback if no email claim
+            email=email or f"{oidc_subject}@oidc.local",  # Fallback if no email claim
             oidc_subject=oidc_subject,
             auth_provider="oidc",
             password=None,
+            ldap_groups=groups,
         )
         return user

config/__init__.py (new file, empty)

@@ -1,6 +1,7 @@
 """
 OIDC Configuration for Authelia Integration
 """
+
 import os
 from typing import Dict, Any
 from authlib.jose import jwt


@@ -1,5 +1,3 @@
-version: "3.8"
 services:
   postgres:
     image: postgres:16-alpine
@@ -17,55 +15,56 @@ services:
       timeout: 5s
       retries: 5
-  raggr-backend:
-    build:
-      context: ./services/raggr
-      dockerfile: Dockerfile.dev
-    image: torrtle/simbarag:dev
-    ports:
-      - "8080:8080"
-    env_file:
-      - .env
-    environment:
-      - PAPERLESS_TOKEN=${PAPERLESS_TOKEN}
-      - BASE_URL=${BASE_URL}
-      - OLLAMA_URL=${OLLAMA_URL:-http://localhost:11434}
-      - CHROMADB_PATH=/app/chromadb
-      - OPENAI_API_KEY=${OPENAI_API_KEY}
-      - JWT_SECRET_KEY=${JWT_SECRET_KEY}
-      - OIDC_ISSUER=${OIDC_ISSUER}
-      - OIDC_CLIENT_ID=${OIDC_CLIENT_ID}
-      - OIDC_CLIENT_SECRET=${OIDC_CLIENT_SECRET}
-      - OIDC_REDIRECT_URI=${OIDC_REDIRECT_URI}
-      - OIDC_USE_DISCOVERY=${OIDC_USE_DISCOVERY:-true}
-      - DATABASE_URL=postgres://raggr:raggr_dev_password@postgres:5432/raggr
-      - FLASK_ENV=development
-      - PYTHONUNBUFFERED=1
-    depends_on:
-      postgres:
-        condition: service_healthy
-    volumes:
-      # Mount source code for hot reload
-      - ./services/raggr:/app
-      # Exclude node_modules and Python cache
-      - /app/raggr-frontend/node_modules
-      - /app/__pycache__
-      # Persist data
-      - chromadb_data:/app/chromadb
-    command: sh -c "chmod +x /app/startup-dev.sh && /app/startup-dev.sh"
-  raggr-frontend:
-    build:
-      context: ./services/raggr/raggr-frontend
-      dockerfile: Dockerfile.dev
-    environment:
-      - NODE_ENV=development
-    volumes:
-      # Mount source code for hot reload
-      - ./services/raggr/raggr-frontend:/app
-      # Exclude node_modules to use container's version
-      - /app/node_modules
-    command: sh -c "yarn build && yarn watch:build"
+  # raggr service disabled - run locally for development
+  # raggr:
+  #   build:
+  #     context: .
+  #     dockerfile: Dockerfile.dev
+  #   image: torrtle/simbarag:dev
+  #   ports:
+  #     - "8080:8080"
+  #   env_file:
+  #     - .env
+  #   environment:
+  #     - PAPERLESS_TOKEN=${PAPERLESS_TOKEN}
+  #     - BASE_URL=${BASE_URL}
+  #     - OLLAMA_URL=${OLLAMA_URL:-http://localhost:11434}
+  #     - CHROMADB_PATH=/app/data/chromadb
+  #     - OPENAI_API_KEY=${OPENAI_API_KEY}
+  #     - JWT_SECRET_KEY=${JWT_SECRET_KEY}
+  #     - OIDC_ISSUER=${OIDC_ISSUER}
+  #     - OIDC_CLIENT_ID=${OIDC_CLIENT_ID}
+  #     - OIDC_CLIENT_SECRET=${OIDC_CLIENT_SECRET}
+  #     - OIDC_REDIRECT_URI=${OIDC_REDIRECT_URI}
+  #     - OIDC_USE_DISCOVERY=${OIDC_USE_DISCOVERY:-true}
+  #     - DATABASE_URL=postgres://raggr:raggr_dev_password@postgres:5432/raggr
+  #     - FLASK_ENV=development
+  #     - PYTHONUNBUFFERED=1
+  #     - NODE_ENV=development
+  #     - TAVILY_KEY=${TAVILY_KEY}
+  #   depends_on:
+  #     postgres:
+  #       condition: service_healthy
+  #   volumes:
+  #     - chromadb_data:/app/data/chromadb
+  #     - ./migrations:/app/migrations  # Bind mount for migrations (bidirectional)
+  #   develop:
+  #     watch:
+  #       # Sync+restart on any file change in root directory
+  #       - action: sync+restart
+  #         path: .
+  #         target: /app
+  #         ignore:
+  #           - __pycache__/
+  #           - "*.pyc"
+  #           - "*.pyo"
+  #           - "*.pyd"
+  #           - .git/
+  #           - chromadb/
+  #           - node_modules/
+  #           - raggr-frontend/dist/
+  #           - docs/
+  #           - .venv/
 volumes:
   chromadb_data:


@@ -3,6 +3,8 @@ version: "3.8"
 services:
   postgres:
     image: postgres:16-alpine
+    ports:
+      - "5432:5432"
     environment:
       - POSTGRES_USER=${POSTGRES_USER:-raggr}
       - POSTGRES_PASSWORD=${POSTGRES_PASSWORD:-changeme}
@@ -18,15 +20,16 @@ services:
   raggr:
     build:
-      context: ./services/raggr
+      context: .
       dockerfile: Dockerfile
     image: torrtle/simbarag:latest
-    network_mode: host
+    ports:
+      - "8080:8080"
     environment:
       - PAPERLESS_TOKEN=${PAPERLESS_TOKEN}
       - BASE_URL=${BASE_URL}
       - OLLAMA_URL=${OLLAMA_URL:-http://localhost:11434}
-      - CHROMADB_PATH=/app/chromadb
+      - CHROMADB_PATH=/app/data/chromadb
       - OPENAI_API_KEY=${OPENAI_API_KEY}
       - JWT_SECRET_KEY=${JWT_SECRET_KEY}
       - OIDC_ISSUER=${OIDC_ISSUER}
@@ -35,11 +38,12 @@ services:
       - OIDC_REDIRECT_URI=${OIDC_REDIRECT_URI}
       - OIDC_USE_DISCOVERY=${OIDC_USE_DISCOVERY:-true}
       - DATABASE_URL=${DATABASE_URL:-postgres://raggr:changeme@postgres:5432/raggr}
+      - TAVILY_KEY=${TAVILY_KEY}
     depends_on:
       postgres:
         condition: service_healthy
     volumes:
-      - chromadb_data:/app/chromadb
+      - chromadb_data:/app/data/chromadb
     restart: unless-stopped
 volumes:

docs/TASKS.md (new file, 53 lines)

@@ -0,0 +1,53 @@
# Tasks & Feature Requests
## Feature Requests
### YNAB Integration (Admin-Only)
- **Description**: Integration with YNAB (You Need A Budget) API to enable financial data queries and insights
- **Requirements**:
  - Admin-guarded endpoint (requires `lldap_admin` group)
  - YNAB API token configuration in environment variables
  - Sync budget data, transactions, and categories
  - Store YNAB data for RAG queries
- **Endpoints**:
  - `POST /api/admin/ynab/sync` - Trigger YNAB data sync
  - `GET /api/admin/ynab/status` - Check sync status and last update
  - `GET /api/admin/ynab/budgets` - List available budgets
- **Implementation Notes**:
  - Use YNAB API v1 (https://api.youneedabudget.com/v1)
  - Consider rate limiting (200 requests per hour)
  - Store transaction data in PostgreSQL with appropriate indexing
  - Index transaction descriptions and categories in ChromaDB for RAG queries
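
The rate-limiting note above could be handled on the client side with a small sliding-window limiter. This is only a sketch: `SlidingWindowLimiter`, `try_acquire`, and the injectable `clock` are hypothetical names that do not exist in the codebase or in any YNAB client library, and the default of 200 calls per hour is taken from the note above.

```python
import time
from collections import deque
from typing import Callable, Deque


class SlidingWindowLimiter:
    """Allow at most `max_calls` within any rolling `window_seconds` period."""

    def __init__(
        self,
        max_calls: int = 200,
        window_seconds: float = 3600.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.max_calls = max_calls
        self.window_seconds = window_seconds
        self._clock = clock  # injectable so the limiter can be tested without sleeping
        self._calls: Deque[float] = deque()

    def try_acquire(self) -> bool:
        """Record a call and return True, or return False if the budget is spent."""
        now = self._clock()
        # Drop timestamps that have aged out of the window.
        while self._calls and now - self._calls[0] >= self.window_seconds:
            self._calls.popleft()
        if len(self._calls) >= self.max_calls:
            return False
        self._calls.append(now)
        return True
```

A sync job would call `try_acquire()` before each request and back off (or reschedule) when it returns `False`.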
### Money Insights
- **Description**: AI-powered financial insights and analysis based on YNAB data
- **Features**:
  - Spending pattern analysis
  - Budget vs. actual comparisons
  - Category-based spending trends
  - Anomaly detection (unusual transactions)
  - Natural language queries like "How much did I spend on groceries last month?"
  - Month-over-month and year-over-year comparisons
- **Implementation Notes**:
  - Leverage existing LangChain agent architecture
  - Add custom tools for financial calculations
  - Use LLM to generate insights and summaries
  - Create visualizations or data exports for frontend display
## Backlog
- [ ] YNAB API client module
- [ ] YNAB data models (Budget, Transaction, Category, Account)
- [ ] Database schema for financial data
- [ ] YNAB sync background job/scheduler
- [ ] Financial insights LangChain tools
- [ ] Admin UI for YNAB configuration
- [ ] Frontend components for money insights display
## Technical Debt
_To be added_
## Bugs
_To be added_

docs/VECTORSTORE.md (new file, 97 lines)

@@ -0,0 +1,97 @@
# Vector Store Management
This document describes how to manage the ChromaDB vector store used for RAG (Retrieval-Augmented Generation).
## Configuration
The vector store location is controlled by the `CHROMADB_PATH` environment variable:
- **Development (local)**: Set in `.env` to a local path (e.g., `/path/to/chromadb`)
- **Docker**: Automatically set to `/app/data/chromadb` and persisted via Docker volume
## Management Commands
### CLI (Command Line)
Use the `scripts/manage_vectorstore.py` script for vector store operations:
```bash
# Show statistics
python scripts/manage_vectorstore.py stats
# Index documents from Paperless-NGX (incremental)
python scripts/manage_vectorstore.py index
# Clear and reindex all documents
python scripts/manage_vectorstore.py reindex
# List documents
python scripts/manage_vectorstore.py list 10
python scripts/manage_vectorstore.py list 20 --show-content
```
### Docker
Run commands inside the Docker container:
```bash
# Show statistics
docker compose exec raggr python scripts/manage_vectorstore.py stats
# Reindex all documents
docker compose exec raggr python scripts/manage_vectorstore.py reindex
```
### API Endpoints
The following authenticated endpoints are available:
- `GET /api/rag/stats` - Get vector store statistics
- `POST /api/rag/index` - Trigger indexing of new documents (admin only)
- `POST /api/rag/reindex` - Clear and reindex all documents (admin only)
## How It Works
1. **Document Fetching**: Documents are fetched from Paperless-NGX via the API
2. **Chunking**: Documents are split into chunks of ~1000 characters with 200 character overlap
3. **Embedding**: Chunks are embedded using OpenAI's `text-embedding-3-small` model
4. **Storage**: Embeddings are stored in ChromaDB with metadata (filename, document type, date)
5. **Retrieval**: User queries are embedded and similar chunks are retrieved for RAG
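
To make the chunking step concrete, here is a deliberately simplified sketch of fixed-size chunking with overlap. It is not how `RecursiveCharacterTextSplitter` works internally (that splitter prefers to break on separators such as paragraphs and sentences); `chunk_text` is a hypothetical helper that only illustrates how a 1000-character window with a 200-character overlap advances through a document.

```python
from typing import List, Tuple


def chunk_text(
    text: str, chunk_size: int = 1000, overlap: int = 200
) -> List[Tuple[int, str]]:
    """Return (start_index, chunk) pairs covering `text` with overlapping windows."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")

    step = chunk_size - overlap  # each window starts this far after the previous one
    chunks: List[Tuple[int, str]] = []
    start = 0
    while start < len(text):
        chunks.append((start, text[start : start + chunk_size]))
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
        start += step
    return chunks
```

With the default settings each new chunk starts 800 characters after the previous one, so the last 200 characters of one chunk reappear at the start of the next.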
## Troubleshooting
### "Error creating hnsw segment reader"
This indicates a corrupted index. Solution:
```bash
python scripts/manage_vectorstore.py reindex
```
### Empty results
Check if documents are indexed:
```bash
python scripts/manage_vectorstore.py stats
```
If count is 0, run:
```bash
python scripts/manage_vectorstore.py index
```
### Different results in Docker vs local
Docker and local environments use separate ChromaDB instances. To sync:
1. Index inside Docker: `docker compose exec raggr python scripts/manage_vectorstore.py reindex`
2. Or mount the same volume for both environments
## Production Considerations
1. **Volume Persistence**: Use Docker volumes or persistent storage for ChromaDB
2. **Backup**: Regularly backup the ChromaDB data directory
3. **Reindexing**: Schedule periodic reindexing to keep data fresh
4. **Monitoring**: Monitor the `/api/rag/stats` endpoint for document counts
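
As one possible way to implement the backup recommendation, the sketch below archives the ChromaDB data directory into a timestamped tarball using only the standard library. `backup_chromadb` and the archive naming scheme are assumptions made for illustration; the repository does not ship such a helper, and it is probably safest to run something like this while the application is not writing to the store.

```python
import shutil
import time
from pathlib import Path


def backup_chromadb(data_dir: str, backup_dir: str) -> Path:
    """Archive `data_dir` into `backup_dir` and return the path of the archive."""
    source = Path(data_dir)
    if not source.is_dir():
        raise FileNotFoundError(f"ChromaDB directory not found: {source}")

    target_dir = Path(backup_dir)
    target_dir.mkdir(parents=True, exist_ok=True)

    stamp = time.strftime("%Y%m%d-%H%M%S")
    archive_base = target_dir / f"chromadb-{stamp}"
    # make_archive appends the ".tar.gz" suffix itself and returns the final path.
    archive = shutil.make_archive(str(archive_base), "gztar", root_dir=str(source))
    return Path(archive)
```

Inside the container this might be called as `backup_chromadb("/app/data/chromadb", "/app/data/backups")`, where the second path is purely an example.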

docs/authentication.md (new file, 274 lines)

@@ -0,0 +1,274 @@
# Authentication Architecture
This document describes the authentication stack for SimbaRAG: LLDAP → Authelia → OAuth2/OIDC.
## Overview
```
┌─────────┐     ┌──────────┐     ┌──────────────┐     ┌──────────┐
│  LLDAP  │────▶│ Authelia │────▶│ OAuth2/OIDC  │────▶│ SimbaRAG │
│ (Users) │     │  (IdP)   │     │    (Flow)    │     │  (App)   │
└─────────┘     └──────────┘     └──────────────┘     └──────────┘
```
| Component | Role |
|-----------|------|
| **LLDAP** | Lightweight LDAP server storing users and groups |
| **Authelia** | Identity provider that authenticates against LLDAP and issues OIDC tokens |
| **SimbaRAG** | Relying party that consumes OIDC tokens and manages sessions |
## OIDC Configuration
### Environment Variables
| Variable | Description | Default |
|----------|-------------|---------|
| `OIDC_ISSUER` | Authelia server URL | Required |
| `OIDC_CLIENT_ID` | Client ID registered in Authelia | Required |
| `OIDC_CLIENT_SECRET` | Client secret for token exchange | Required |
| `OIDC_REDIRECT_URI` | Callback URL after authentication | Required |
| `OIDC_USE_DISCOVERY` | Enable automatic discovery | `true` |
| `JWT_SECRET_KEY` | Secret for signing backend JWTs | Required |
### Discovery
When `OIDC_USE_DISCOVERY=true`, the application fetches endpoints from:
```
{OIDC_ISSUER}/.well-known/openid-configuration
```
This provides:
- Authorization endpoint
- Token endpoint
- JWKS URI for signature verification
- Supported scopes and claims
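
A minimal sketch of how the discovery URL can be built and how the endpoints listed above can be pulled out of the returned document. The helper names `discovery_url` and `extract_endpoints` are hypothetical and not taken from `oidc_config.py`; the field names (`authorization_endpoint`, `token_endpoint`, `jwks_uri`) are the standard ones from OpenID Connect Discovery.

```python
from typing import Any, Dict


def discovery_url(issuer: str) -> str:
    """Build the well-known discovery URL, tolerating a trailing slash on the issuer."""
    return issuer.rstrip("/") + "/.well-known/openid-configuration"


def extract_endpoints(discovery_doc: Dict[str, Any]) -> Dict[str, str]:
    """Pick the endpoints the login flow needs, failing loudly if one is missing."""
    required = ("authorization_endpoint", "token_endpoint", "jwks_uri")
    missing = [key for key in required if key not in discovery_doc]
    if missing:
        raise ValueError(f"Discovery document is missing: {', '.join(missing)}")
    return {key: discovery_doc[key] for key in required}
```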
## Authentication Flow
### 1. Login Initiation
```
GET /api/user/oidc/login
```
1. Generate PKCE code verifier and challenge (S256)
2. Generate CSRF state token
3. Store state in session storage
4. Return authorization URL for frontend redirect
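
Step 1 above can be sketched with the standard library alone. The function names below are illustrative and are not the ones used in the blueprint; the S256 transform itself (base64url of the SHA-256 digest, without padding) follows RFC 7636.

```python
import base64
import hashlib
import secrets


def make_code_verifier(num_bytes: int = 32) -> str:
    """Generate a random URL-safe verifier (43 characters for 32 random bytes)."""
    return secrets.token_urlsafe(num_bytes)


def make_code_challenge(verifier: str) -> str:
    """Derive the S256 challenge: base64url(sha256(verifier)) with padding removed."""
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
```

The verifier stays on the server alongside the state token; only the challenge is sent in the authorization URL, and the verifier is revealed later during the token exchange.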
### 2. Authorization
User is redirected to Authelia where they:
1. Enter LDAP credentials
2. Complete MFA if configured
3. Consent to requested scopes
### 3. Callback
```
GET /api/user/oidc/callback?code=...&state=...
```
1. Validate state matches stored value (CSRF protection)
2. Exchange authorization code for tokens using PKCE verifier
3. Verify ID token signature using JWKS
4. Validate claims (issuer, audience, expiration)
5. Create or update user in database
6. Issue backend JWT tokens (access + refresh)
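
The claim checks in step 4 might look roughly like the sketch below. It assumes the ID token signature has already been verified (step 3) and that `claims` is the decoded payload; `validate_id_token_claims` and its `leeway` parameter are hypothetical and are not the application's actual validation code.

```python
import time
from typing import Any, Dict, Optional


def validate_id_token_claims(
    claims: Dict[str, Any],
    issuer: str,
    client_id: str,
    now: Optional[float] = None,
    leeway: float = 0.0,
) -> None:
    """Raise ValueError unless issuer, audience, and expiration all check out."""
    current = time.time() if now is None else now

    if claims.get("iss") != issuer:
        raise ValueError("unexpected issuer")

    # "aud" may be a single string or a list of audiences.
    aud = claims.get("aud")
    audiences = aud if isinstance(aud, list) else [aud]
    if client_id not in audiences:
        raise ValueError("token was not issued for this client")

    exp = claims.get("exp")
    if not isinstance(exp, (int, float)) or current >= exp + leeway:
        raise ValueError("token is expired or has no exp claim")
```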
### 4. Token Refresh
```
POST /api/user/refresh
Authorization: Bearer <refresh_token>
```
Issues a new access token without re-authentication.
## User Model
```python
class User(Model):
    id = UUIDField(primary_key=True)
    username = CharField(max_length=255)
    password = BinaryField(null=True)  # Nullable for OIDC-only users
    email = CharField(max_length=100, unique=True)

    # OIDC fields
    oidc_subject = CharField(max_length=255, unique=True, null=True)
    auth_provider = CharField(max_length=50, default="local")  # "local" or "oidc"
    ldap_groups = JSONField(default=[])  # LDAP groups from OIDC claims

    created_at = DatetimeField(auto_now_add=True)
    updated_at = DatetimeField(auto_now=True)

    def has_group(self, group: str) -> bool:
        """Check if user belongs to a specific LDAP group."""
        return group in (self.ldap_groups or [])

    def is_admin(self) -> bool:
        """Check if user is an admin (member of lldap_admin group)."""
        return self.has_group("lldap_admin")
```
### User Provisioning
The `OIDCUserService` handles automatic user creation:
1. Extract claims from ID token (`sub`, `email`, `preferred_username`)
2. Check if user exists by `oidc_subject`
3. If not, check by email for migration from local auth
4. Create new user or update existing
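
The lookup order above can be illustrated without a database. The sketch below models users as plain dictionaries in a list; `resolve_user` is a hypothetical stand-in for the real `OIDCUserService`, which works against the Tortoise `User` model, and the fallback values mirror the ones visible in the service code.

```python
from typing import Any, Dict, List


def resolve_user(claims: Dict[str, Any], users: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Find, migrate, or create a user record from OIDC claims."""
    sub = claims["sub"]
    email = claims.get("email")
    groups = claims.get("groups", [])

    # 1. Existing OIDC user: match on the stable "sub" claim and refresh groups.
    for user in users:
        if user.get("oidc_subject") == sub:
            user["ldap_groups"] = groups
            return user

    # 2. Local account with the same email: migrate it to OIDC.
    if email:
        for user in users:
            if user.get("email") == email:
                user.update(oidc_subject=sub, auth_provider="oidc", ldap_groups=groups)
                return user

    # 3. No match: provision a new OIDC user with fallbacks for missing claims.
    new_user = {
        "username": claims.get("preferred_username")
        or claims.get("name")
        or f"user_{sub[:8]}",
        "email": email or f"{sub}@oidc.local",
        "oidc_subject": sub,
        "auth_provider": "oidc",
        "ldap_groups": groups,
    }
    users.append(new_user)
    return new_user
```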
## JWT Tokens
Backend issues its own JWTs after OIDC authentication:
| Token Type | Purpose | Typical Lifetime |
|------------|---------|------------------|
| Access Token | API authorization | 15 minutes |
| Refresh Token | Obtain new access tokens | 7 days |
### Claims
```json
{
"identity": "<user-uuid>",
"type": "access|refresh",
"exp": 1234567890,
"iat": 1234567890
}
```
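
To show how `iat` and `exp` relate to the lifetimes in the table, here is a small sketch that builds the claim set for each token type. In the application these claims are produced by the JWT library, so `build_claims` and the two lifetime constants are illustrative assumptions only.

```python
import time
from typing import Any, Dict, Optional

ACCESS_LIFETIME_SECONDS = 15 * 60  # 15 minutes
REFRESH_LIFETIME_SECONDS = 7 * 24 * 60 * 60  # 7 days


def build_claims(
    user_id: str, token_type: str, now: Optional[int] = None
) -> Dict[str, Any]:
    """Return the claim set for an access or refresh token."""
    lifetimes = {
        "access": ACCESS_LIFETIME_SECONDS,
        "refresh": REFRESH_LIFETIME_SECONDS,
    }
    if token_type not in lifetimes:
        raise ValueError(f"unknown token type: {token_type}")

    issued_at = int(time.time()) if now is None else now
    return {
        "identity": user_id,
        "type": token_type,
        "iat": issued_at,
        "exp": issued_at + lifetimes[token_type],
    }
```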
## Protected Endpoints
All API endpoints use the `@jwt_refresh_token_required` decorator for basic authentication:
```python
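from quart import Blueprint
from quart_jwt_extended import get_jwt_identity, jwt_refresh_token_required

blueprint = Blueprint("example", __name__)

@blueprint.route("/example")
@jwt_refresh_token_required
async def protected_endpoint():
    user_id = get_jwt_identity()
    # ...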
```
---
## Role-Based Access Control (RBAC)
RBAC is implemented using LDAP groups passed through Authelia as OIDC claims. Users in the `lldap_admin` group have admin privileges.
### Architecture
```
┌─────────────────────────────────────────────────────────────┐
│ LLDAP │
│ Groups: lldap_admin, lldap_user │
└─────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────┐
│ Authelia │
│ Scope: groups → Claim: groups = ["lldap_admin"] │
└─────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────┐
│ SimbaRAG │
│ 1. Extract groups from ID token │
│ 2. Store in User.ldap_groups │
│ 3. Check membership with @admin_required decorator │
└─────────────────────────────────────────────────────────────┘
```
### Authelia Configuration
Ensure Authelia is configured to pass the `groups` claim:
```yaml
identity_providers:
  oidc:
    clients:
      - client_id: simbarag
        scopes:
          - openid
          - profile
          - email
          - groups  # Required for RBAC
```
### Admin-Only Endpoints
The `@admin_required` decorator protects privileged endpoints:
```python
from blueprints.users.decorators import admin_required


@blueprint.post("/admin-action")
@admin_required
async def admin_only_endpoint():
    # Only users in lldap_admin group can access
    ...
```
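
The pattern behind `@admin_required` can be shown without Quart or a database. In the sketch below, `require_group` is a hypothetical, framework-free generalisation: the group lookup is passed in as an async callable, and the wrapped handler returns a plain `(body, status)` tuple instead of a Quart response.

```python
import asyncio
from functools import wraps
from typing import Any, Awaitable, Callable, List


def require_group(group: str, load_groups: Callable[[], Awaitable[List[str]]]):
    """Build a decorator that rejects callers who are not members of `group`."""

    def decorator(fn: Callable[..., Awaitable[Any]]):
        @wraps(fn)
        async def wrapper(*args: Any, **kwargs: Any) -> Any:
            groups = await load_groups()
            if group not in (groups or []):
                # Same shape as the real decorator's rejection: error body plus 403.
                return {"error": "Admin access required"}, 403
            return await fn(*args, **kwargs)

        return wrapper

    return decorator


async def current_user_groups() -> List[str]:
    """Stand-in for loading the current user's LDAP groups."""
    return ["lldap_user"]


@require_group("lldap_admin", current_user_groups)
async def reindex():
    return {"status": "success"}, 200


if __name__ == "__main__":
    print(asyncio.run(reindex()))
```

Because the check runs before the handler is awaited, a non-admin caller never reaches the privileged code path.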
**Protected endpoints:**
| Endpoint | Access | Description |
|----------|--------|-------------|
| `POST /api/rag/index` | Admin | Trigger document indexing |
| `POST /api/rag/reindex` | Admin | Clear and reindex all documents |
| `GET /api/rag/stats` | All users | View vector store statistics |
### User Response
The OIDC callback returns group information:
```json
{
"access_token": "...",
"refresh_token": "...",
"user": {
"id": "uuid",
"username": "john",
"email": "john@example.com",
"groups": ["lldap_admin", "lldap_user"],
"is_admin": true
}
}
```
---
## Security Considerations
### Current Gaps
| Issue | Risk | Mitigation |
|-------|------|------------|
| In-memory session storage | State lost on restart, not scalable | Use Redis for production |
| No token revocation | Tokens valid until expiry | Implement blacklist or short expiry |
| No audit logging | Cannot track auth events | Add event logging |
| Single JWT secret | Compromise affects all tokens | Rotate secrets, use asymmetric keys |
### Recommendations
1. **Use Redis** for OIDC state storage in production
2. **Implement logout** with token blacklisting
3. **Add audit logging** for authentication events
4. **Rotate JWT secrets** regularly
5. **Use short-lived access tokens** (15 min) with refresh
---
## File Reference
| File | Purpose |
|------|---------|
| `services/raggr/oidc_config.py` | OIDC client configuration and discovery |
| `services/raggr/blueprints/users/models.py` | User model definition with group helpers |
| `services/raggr/blueprints/users/oidc_service.py` | User provisioning from OIDC claims |
| `services/raggr/blueprints/users/__init__.py` | Auth endpoints and flow |
| `services/raggr/blueprints/users/decorators.py` | Auth decorators (`@admin_required`) |

docs/deployment.md (new file, 188 lines)

@@ -0,0 +1,188 @@
# Deployment & Migrations Guide
This document covers database migrations and deployment workflows for SimbaRAG.
## Migration Workflow
Migrations are managed by [Aerich](https://github.com/tortoise/aerich), the migration tool for Tortoise ORM.
### Key Principles
1. **Generate migrations in Docker** - Aerich needs database access to detect schema changes
2. **Migrations auto-apply on startup** - Both `startup.sh` and `startup-dev.sh` run `aerich upgrade`
3. **Commit migrations to git** - Migration files must be in the repo for production deploys
### Generating a New Migration
#### Development (Recommended)
With `docker-compose.dev.yml`, your local `services/raggr` directory is synced to the container. Migrations generated inside the container appear on your host automatically.
```bash
# 1. Start the dev environment
docker compose -f docker-compose.dev.yml up -d
# 2. Generate migration (runs inside container, syncs to host)
docker compose -f docker-compose.dev.yml exec raggr aerich migrate --name describe_your_change
# 3. Verify migration was created
ls services/raggr/migrations/models/
# 4. Commit the migration
git add services/raggr/migrations/
git commit -m "Add migration: describe_your_change"
```
#### Production Container
For production, migration files are baked into the image. You must generate migrations in dev first.
```bash
# If you need to generate a migration from production (not recommended):
docker compose exec raggr aerich migrate --name describe_your_change
# Copy the file out of the container
docker cp $(docker compose ps -q raggr):/app/migrations/models/ ./services/raggr/migrations/
```
### Applying Migrations
Migrations apply automatically on container start via the startup scripts.
**Manual application (if needed):**
```bash
# Dev
docker compose -f docker-compose.dev.yml exec raggr aerich upgrade
# Production
docker compose exec raggr aerich upgrade
```
### Checking Migration Status
```bash
# View applied migrations
docker compose exec raggr aerich history
# View pending migrations
docker compose exec raggr aerich heads
```
### Rolling Back
```bash
# Downgrade one migration
docker compose exec raggr aerich downgrade
# Downgrade to specific version
docker compose exec raggr aerich downgrade -v 1
```
## Deployment Workflows
### Development
```bash
# Start with watch mode (auto-restarts on file changes)
docker compose -f docker-compose.dev.yml up
# Or with docker compose watch (requires Docker Compose v2.22+)
docker compose -f docker-compose.dev.yml watch
```
The dev environment:
- Syncs `services/raggr/` to `/app` in the container
- Rebuilds frontend on changes
- Auto-applies migrations on startup
### Production
```bash
# Build and deploy
docker compose build raggr
docker compose up -d
# View logs
docker compose logs -f raggr
# Verify migrations applied
docker compose exec raggr aerich history
```
### Fresh Deploy (New Database)
On first deploy with an empty database, `startup-dev.sh` runs `aerich init-db` instead of `aerich upgrade`. This creates all tables from the current models.
For production (`startup.sh`), ensure the database exists and run:
```bash
# If aerich table doesn't exist yet
docker compose exec raggr aerich init-db
# Or if migrating from existing schema
docker compose exec raggr aerich upgrade
```
## Troubleshooting
### "No migrations found" on startup
The `migrations/models/` directory is empty or not copied into the image.
**Fix:** Ensure migrations are committed and the Dockerfile copies them:
```dockerfile
COPY migrations ./migrations
```
### Migration fails with "relation already exists"
The database has tables but aerich doesn't know about them (fresh aerich setup on existing DB).
**Fix:** Fake the initial migration:
```bash
# Mark the initial migration as applied without running it (needs an Aerich release whose upgrade command supports --fake)
docker compose exec raggr aerich upgrade --fake
```
### Model changes not detected
Aerich compares models against the last migration's state. If state is out of sync:
```bash
# Regenerate migration state (dangerous - review carefully)
docker compose exec raggr aerich migrate --name fix_state
```
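To see why a stale state hides model changes, here is a conceptual sketch of snapshot diffing. It is an illustration only: `diff_columns` and the dict-of-sets snapshot format are hypothetical, and this is not Aerich's actual algorithm or its `MODELS_STATE` encoding.

```python
from typing import Dict, Set

Snapshot = Dict[str, Set[str]]  # hypothetical format: table name -> column names


def diff_columns(old_state: Snapshot, new_state: Snapshot) -> Dict[str, Dict[str, Set[str]]]:
    """Return the columns added to or removed from each table between two snapshots."""
    changes: Dict[str, Dict[str, Set[str]]] = {}
    for table in sorted(old_state.keys() | new_state.keys()):
        old_cols = old_state.get(table, set())
        new_cols = new_state.get(table, set())
        added, removed = new_cols - old_cols, old_cols - new_cols
        if added or removed:
            changes[table] = {"added": added, "removed": removed}
    return changes
```

If the stored snapshot already matches the models, the diff is empty and no migration is produced, even when the real database schema differs from both.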
### Database connection errors
Ensure PostgreSQL is healthy before running migrations:
```bash
# Check postgres status
docker compose ps postgres
# Wait for postgres then run migrations
docker compose exec raggr bash -c "sleep 5 && aerich upgrade"
```
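A fixed `sleep 5` can be too short or needlessly long. As a minimal sketch, assuming the database is reachable over TCP from wherever the script runs, a small polling helper can wait until the port accepts connections. The name `wait_for_port` is hypothetical and not part of the project.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Return True once a TCP connection to host:port succeeds, False after `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # create_connection raises OSError on refusal, timeout, or DNS failure
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

For example, `wait_for_port("postgres", 5432)` could gate `aerich upgrade`, where the host and port are assumed values to replace with your compose service name and port. An open port only shows that the server is listening, not that it is ready for queries, so a healthcheck is the stricter option where one is available.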
## File Reference
| File | Purpose |
|------|---------|
| `pyproject.toml` | Aerich config (`[tool.aerich]` section) |
| `migrations/models/` | Migration files |
| `startup.sh` | Production startup (runs `aerich upgrade`) |
| `startup-dev.sh` | Dev startup (runs `aerich upgrade` or `init-db`) |
| `app.py` | Contains `TORTOISE_CONFIG` |
| `aerich_config.py` | Aerich initialization configuration |
## Quick Reference
| Task | Command |
|------|---------|
| Generate migration | `docker compose -f docker-compose.dev.yml exec raggr aerich migrate --name name` |
| Apply migrations | `docker compose exec raggr aerich upgrade` |
| View history | `docker compose exec raggr aerich history` |
| Rollback | `docker compose exec raggr aerich downgrade` |
| Fresh init | `docker compose exec raggr aerich init-db` |

docs/development.md

@@ -0,0 +1,258 @@
# Development Guide
This guide explains how to run SimbaRAG in development mode.
## Quick Start
### Option 1: Local Development (Recommended)
Run PostgreSQL in Docker and the application locally for faster iteration:
```bash
# 1. Start PostgreSQL
docker compose -f docker-compose.dev.yml up -d
# 2. Set environment variables
export DATABASE_URL="postgres://raggr:raggr_dev_password@localhost:5432/raggr"
export CHROMADB_PATH="./chromadb"
export $(grep -v '^#' .env | xargs) # Load other vars from .env
# 3. Install dependencies (first time)
pip install -r requirements.txt
cd raggr-frontend && yarn install && yarn build && cd ..
# 4. Run migrations
aerich upgrade
# 5. Start the server
python app.py
```
The application will be available at `http://localhost:8080`.
### Option 2: Full Docker Development
Run everything in Docker with hot reload (slower, but matches production):
```bash
# Uncomment the raggr service in docker-compose.dev.yml first!
# Start all services
docker compose -f docker-compose.dev.yml up --build
# View logs
docker compose -f docker-compose.dev.yml logs -f raggr
```
## Project Structure
```
raggr/
├── app.py # Quart application entry point
├── main.py # RAG logic and LangChain agent
├── llm.py # LLM client (llama-server + OpenAI fallback)
├── aerich_config.py # Database migration configuration
├── blueprints/ # API route blueprints
│ ├── users/ # Authentication (OIDC, JWT, RBAC)
│ ├── conversation/ # Chat conversations and messages
│ └── rag/ # Document indexing (admin only)
├── config/ # Configuration modules
│ └── oidc_config.py # OIDC authentication settings
├── utils/ # Reusable utilities
│ ├── chunker.py # Document chunking for embeddings
│ ├── cleaner.py # PDF cleaning and summarization
│ ├── image_process.py # Image description with LLM
│ └── request.py # Paperless-NGX API client
├── scripts/ # Administrative scripts
│ ├── add_user.py # Create users manually
│ ├── user_message_stats.py # User message statistics
│ ├── manage_vectorstore.py # Vector store management
│ ├── inspect_vector_store.py # Inspect ChromaDB contents
│ └── query.py # Query generation utilities
├── raggr-frontend/ # React frontend
│ └── src/ # Frontend source code
├── migrations/ # Database migrations
└── docs/ # Documentation
```
## Making Changes
### Backend Changes
**Local development:**
1. Edit Python files
2. Save
3. Restart `python app.py` (or use a tool like `watchdog` for auto-reload)
**Docker development:**
1. Edit Python files
2. Files are synced via Docker watch mode
3. Container automatically restarts
### Frontend Changes
```bash
cd raggr-frontend
# Development mode with hot reload
yarn dev
# Production build (for testing)
yarn build
```
The backend serves built files from `raggr-frontend/dist/`.
### Database Model Changes
When you modify Tortoise ORM models:
```bash
# Generate migration
aerich migrate --name "describe_your_change"
# Apply migration
aerich upgrade
# View history
aerich history
```
See [deployment.md](deployment.md) for detailed migration workflows.
### Adding Dependencies
**Backend:**
```bash
# Add to requirements.txt or use uv
pip install package-name
pip freeze > requirements.txt
```
**Frontend:**
```bash
cd raggr-frontend
yarn add package-name
```
## Useful Commands
### Database
```bash
# Connect to PostgreSQL
docker compose -f docker-compose.dev.yml exec postgres psql -U raggr -d raggr
# Reset database
docker compose -f docker-compose.dev.yml down -v
docker compose -f docker-compose.dev.yml up -d
aerich init-db
```
### Vector Store
```bash
# Show statistics
python scripts/manage_vectorstore.py stats
# Index new documents from Paperless
python scripts/manage_vectorstore.py index
# Clear and reindex everything
python scripts/manage_vectorstore.py reindex
```
See [vectorstore.md](vectorstore.md) for details.
### Scripts
```bash
# Add a new user
python scripts/add_user.py
# View message statistics
python scripts/user_message_stats.py
# Inspect vector store contents
python scripts/inspect_vector_store.py
```
## Environment Variables
Copy `.env.example` to `.env` and configure:
| Variable | Description | Example |
|----------|-------------|---------|
| `DATABASE_URL` | PostgreSQL connection | `postgres://user:pass@localhost:5432/db` |
| `CHROMADB_PATH` | ChromaDB storage path | `./chromadb` |
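| `LLAMA_SERVER_URL` | llama-server URL (OpenAI-compatible API); OpenAI is used when unset | `http://llama.example.com:8080/v1` |
| `LLAMA_MODEL_NAME` | Model name sent to llama-server | `llama-3.1-8b-instruct` |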
| `OPENAI_API_KEY` | OpenAI API key (fallback LLM) | `sk-...` |
| `PAPERLESS_TOKEN` | Paperless-NGX API token | `...` |
| `BASE_URL` | Paperless-NGX URL | `https://paperless.example.com` |
| `OIDC_ISSUER` | OIDC provider URL | `https://auth.example.com` |
| `OIDC_CLIENT_ID` | OIDC client ID | `simbarag` |
| `OIDC_CLIENT_SECRET` | OIDC client secret | `...` |
| `JWT_SECRET_KEY` | JWT signing key | `random-secret` |
| `TAVILY_KEY` | Tavily web search API key | `tvly-...` |
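The LLM backend is chosen from the environment. The sketch below mirrors the selection logic in `llm.py` from this change as a pure function, so it can be checked without network access. `select_backend` and `LLMBackend` are hypothetical names used for illustration only.

```python
import os
from typing import Mapping, NamedTuple, Optional


class LLMBackend(NamedTuple):
    provider: str
    model: str
    base_url: Optional[str]


def select_backend(env: Mapping[str, str] = os.environ) -> LLMBackend:
    """Prefer llama-server when LLAMA_SERVER_URL is set and non-empty, else fall back to OpenAI."""
    llama_url = env.get("LLAMA_SERVER_URL")
    if llama_url:
        model = env.get("LLAMA_MODEL_NAME", "llama-3.1-8b-instruct")
        return LLMBackend("llama_server", model, llama_url)
    # No custom base URL: the OpenAI client uses its default endpoint
    return LLMBackend("openai", "gpt-4o-mini", None)
```

An empty `LLAMA_SERVER_URL` behaves like an unset one, so clearing the variable is enough to switch back to OpenAI.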
## Troubleshooting
### Port Already in Use
```bash
# Find and kill process on port 8080
lsof -ti:8080 | xargs kill -9
# Or change the port in app.py
```
### Database Connection Errors
```bash
# Check if PostgreSQL is running
docker compose -f docker-compose.dev.yml ps postgres
# View PostgreSQL logs
docker compose -f docker-compose.dev.yml logs postgres
```
### Frontend Not Building
```bash
cd raggr-frontend
rm -rf node_modules dist
yarn install
yarn build
```
### ChromaDB Errors
```bash
# Clear and recreate ChromaDB
rm -rf chromadb/
python scripts/manage_vectorstore.py reindex
```
### Import Errors After Reorganization
Ensure you're in the project root directory when running scripts, or use:
```bash
# Add project root to Python path
export PYTHONPATH="${PYTHONPATH}:$(pwd)"
python scripts/your_script.py
```
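As an alternative to exporting `PYTHONPATH`, a script can add the project root to `sys.path` itself. This is a minimal sketch, assuming the script lives one directory below the project root (as `scripts/` does in the tree above). `add_project_root` is a hypothetical helper, not something the repository provides.

```python
import sys
from pathlib import Path


def add_project_root(script_path: str, levels_up: int = 1) -> Path:
    """Put the directory `levels_up` levels above the script's own directory first on sys.path."""
    # parents[0] is the script's directory, parents[1] the directory above it, and so on
    root = Path(script_path).resolve().parents[levels_up]
    root_str = str(root)
    if root_str not in sys.path:
        sys.path.insert(0, root_str)
    return root
```

Calling `add_project_root(__file__)` at the top of a script, before imports such as `from utils.chunker import Chunker`, lets the script run from any working directory.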
## Hot Tips
- Use `python -m pdb app.py` for debugging
- Enable Quart debug mode in `app.py`: `app.run(debug=True)`
- Check API logs: They appear in the terminal running `python app.py`
- Frontend logs: Open browser DevTools console
- Use `docker compose -f docker-compose.dev.yml down -v` for a clean slate

docs/index.md

@@ -0,0 +1,203 @@
# SimbaRAG Documentation
Welcome to the SimbaRAG documentation! This guide will help you understand, develop, and deploy the SimbaRAG conversational AI system.
## Getting Started
New to SimbaRAG? Start here:
1. Read the main [README](../README.md) for project overview and architecture
2. Follow the [Development Guide](development.md) to set up your environment
3. Learn about [Authentication](authentication.md) setup with OIDC and LDAP
## Documentation Structure
### Core Guides
- **[Development Guide](development.md)** - Local development setup, project structure, and workflows
- **[Deployment Guide](deployment.md)** - Database migrations, deployment workflows, and troubleshooting
- **[Vector Store Guide](VECTORSTORE.md)** - Managing ChromaDB, indexing documents, and RAG operations
- **[Migrations Guide](MIGRATIONS.md)** - Database migration reference
- **[Authentication Guide](authentication.md)** - OIDC, Authelia, LLDAP configuration and user management
### Quick Reference
| Task | Documentation |
|------|---------------|
| Set up local dev environment | [Development Guide → Quick Start](development.md#quick-start) |
| Run database migrations | [Deployment Guide → Migration Workflow](deployment.md#migration-workflow) |
| Index documents | [Vector Store Guide → Management Commands](VECTORSTORE.md#management-commands) |
| Configure authentication | [Authentication Guide](authentication.md) |
| Run administrative scripts | [Development Guide → Scripts](development.md#scripts) |
## Common Tasks
### Development
```bash
# Start local development
docker compose -f docker-compose.dev.yml up -d
export DATABASE_URL="postgres://raggr:raggr_dev_password@localhost:5432/raggr"
export CHROMADB_PATH="./chromadb"
python app.py
```
### Database Migrations
```bash
# Generate migration
aerich migrate --name "your_change"
# Apply migrations
aerich upgrade
# View history
aerich history
```
### Vector Store Management
```bash
# Show statistics
python scripts/manage_vectorstore.py stats
# Index new documents
python scripts/manage_vectorstore.py index
# Reindex everything
python scripts/manage_vectorstore.py reindex
```
## Architecture Overview
SimbaRAG is built with:
- **Backend**: Quart (async Python), LangChain, Tortoise ORM
- **Frontend**: React 19, Rsbuild, Tailwind CSS
- **Database**: PostgreSQL (users, conversations)
- **Vector Store**: ChromaDB (document embeddings)
- **LLM**: llama-server via its OpenAI-compatible API (primary), OpenAI (fallback)
- **Auth**: Authelia (OIDC), LLDAP (user directory)
See the [README](../README.md#system-architecture) for a detailed architecture diagram.
## Project Structure
```
simbarag/
├── app.py # Quart app entry point
├── main.py # RAG & LangChain agent
├── llm.py # LLM client
├── blueprints/ # API routes
├── config/ # Configuration
├── utils/ # Utilities
├── scripts/ # Admin scripts
├── raggr-frontend/ # React UI
├── migrations/ # Database migrations
├── docs/ # This documentation
├── docker-compose.yml # Production Docker setup
└── docker-compose.dev.yml # Development Docker setup
```
## Key Concepts
### RAG (Retrieval-Augmented Generation)
SimbaRAG uses RAG to answer questions about Simba:
1. Documents are fetched from Paperless-NGX
2. Documents are chunked and embedded using OpenAI
3. Embeddings are stored in ChromaDB
4. User queries are embedded and matched against the store
5. Relevant chunks are passed to the LLM for context
6. LLM generates an answer using retrieved context
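The steps above can be sketched end to end with a toy, dependency-free pipeline. This is an illustration under loud assumptions: word counts stand in for OpenAI embeddings, an in-memory list stands in for ChromaDB, and every function name here is hypothetical.

```python
import math
import re
from collections import Counter


def chunk(text: str, size: int = 8) -> list[str]:
    """Split text into chunks of at most `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (a stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


def build_prompt(query: str, context: list[str]) -> str:
    """Place the retrieved chunks ahead of the question for the LLM."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Context:\n{joined}\n\nQuestion: {query}"
```

The real system differs in every component, but the flow is the same: chunk, embed, store, rank by similarity, then pass the best matches to the LLM as context.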
### LangChain Agent
The conversational agent has two tools:
- **simba_search**: Queries the vector store for Simba's documents
- **web_search**: Searches the web via Tavily API
The agent automatically selects tools based on the query.
### Authentication Flow
1. User initiates OIDC login via Authelia
2. Authelia authenticates against LLDAP
3. Backend receives OIDC tokens and issues JWT
4. Frontend stores JWT in localStorage
5. Subsequent requests use JWT for authorization
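Steps 3 to 5 rely on a signed token that the backend can verify without a database lookup. The following is a minimal HS256 sketch using only the standard library. It is not the project's implementation (the dependency list suggests `quart-jwt-extended` handles this), and `issue_jwt`, `verify_jwt`, and the chosen claims are assumptions for illustration.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url_encode(data: bytes) -> str:
    """Base64url without padding, as used by JWT."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    """Restore the stripped padding before decoding."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input: str, secret: str) -> bytes:
    return hmac.new(secret.encode("utf-8"), signing_input.encode("ascii"), hashlib.sha256).digest()


def issue_jwt(subject: str, secret: str, ttl_seconds: int = 900, now: Optional[float] = None) -> str:
    """Create a signed token with sub, iat, and exp claims."""
    issued_at = int(time.time() if now is None else now)
    header = {"alg": "HS256", "typ": "JWT"}
    payload = {"sub": subject, "iat": issued_at, "exp": issued_at + ttl_seconds}
    signing_input = ".".join(
        _b64url_encode(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    return f"{signing_input}.{_b64url_encode(_sign(signing_input, secret))}"


def verify_jwt(token: str, secret: str, now: Optional[float] = None) -> dict:
    """Check signature, algorithm, and expiry; return the payload or raise ValueError."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("malformed token")
    header_b64, payload_b64, signature_b64 = parts
    expected = _sign(f"{header_b64}.{payload_b64}", secret)
    # Constant-time comparison avoids leaking how many bytes matched
    if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
        raise ValueError("bad signature")
    if json.loads(_b64url_decode(header_b64)).get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    payload = json.loads(_b64url_decode(payload_b64))
    if payload["exp"] <= (time.time() if now is None else now):
        raise ValueError("token expired")
    return payload
```

The signature is checked before the payload is parsed, so untrusted content is never interpreted until the token is known to come from the holder of `JWT_SECRET_KEY`.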
## Environment Variables
Key environment variables (see `.env.example` for complete list):
| Variable | Purpose |
|----------|---------|
| `DATABASE_URL` | PostgreSQL connection |
| `CHROMADB_PATH` | Vector store location |
| `LLAMA_SERVER_URL` | Local LLM server (llama-server) |
| `OPENAI_API_KEY` | OpenAI for embeddings/fallback |
| `PAPERLESS_TOKEN` | Document source API |
| `OIDC_*` | Authentication configuration |
| `TAVILY_KEY` | Web search API |
## API Endpoints
### Authentication
- `GET /api/user/oidc/login` - Start OIDC flow
- `GET /api/user/oidc/callback` - OIDC callback
- `POST /api/user/refresh` - Refresh JWT
### Conversations
- `POST /api/conversation/` - Create conversation
- `GET /api/conversation/` - List conversations
- `POST /api/conversation/query` - Chat message
### RAG (Admin Only)
- `GET /api/rag/stats` - Vector store stats
- `POST /api/rag/index` - Index documents
- `POST /api/rag/reindex` - Reindex all
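As a hedged example of calling one of these endpoints, the sketch below builds (but does not send) a request for `POST /api/conversation/query`. The JSON fields `query` and `conversation_id` match what the frontend service sends. The `Authorization: Bearer` header scheme and the helper name `build_query_request` are assumptions.

```python
import json
import urllib.request


def build_query_request(base_url: str, jwt: str, query: str, conversation_id: str) -> urllib.request.Request:
    """Build a chat query request; pass it to urllib.request.urlopen to send it."""
    body = json.dumps({"query": query, "conversation_id": conversation_id}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/conversation/query",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {jwt}",  # assumed scheme, check the users blueprint
        },
    )
```

With the local setup from the Development Guide, `base_url` would be `http://localhost:8080`.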
## Troubleshooting
### Common Issues
| Issue | Solution |
|-------|----------|
| Port already in use | Check if services are running: `lsof -ti:8080` |
| Database connection error | Ensure PostgreSQL is running: `docker compose ps` |
| ChromaDB errors | Clear and reindex: `python scripts/manage_vectorstore.py reindex` |
| Import errors | Check you're in `services/raggr/` directory |
| Frontend not building | `cd raggr-frontend && yarn install && yarn build` |
See individual guides for detailed troubleshooting.
## Contributing
1. Read the [Development Guide](development.md)
2. Set up your local environment
3. Make changes and test locally
4. Generate migrations if needed
5. Submit a pull request
## Additional Resources
- [LangChain Documentation](https://python.langchain.com/)
- [ChromaDB Documentation](https://docs.trychroma.com/)
- [Quart Documentation](https://quart.palletsprojects.com/)
- [Tortoise ORM Documentation](https://tortoise.github.io/)
- [Authelia Documentation](https://www.authelia.com/)
## Need Help?
- Check the relevant guide in this documentation
- Review troubleshooting sections
- Check application logs: `docker compose logs -f`
- Inspect database: `docker compose exec postgres psql -U raggr`
---
**Documentation Version**: 1.0
**Last Updated**: January 2026

index.html

@@ -0,0 +1,81 @@
<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no">
<meta name="author" content="Paperless-ngx project and contributors">
<meta name="robots" content="noindex,nofollow">
<title>
Paperless-ngx sign in
</title>
<link href="/static/bootstrap.min.css" rel="stylesheet">
<link href="/static/base.css" rel="stylesheet">
</head>
<body class="text-center">
<div class="position-absolute top-50 start-50 translate-middle">
<form class="form-accounts" id="form-account" method="post">
<input type="hidden" name="csrfmiddlewaretoken" value="KLQ3mMraTFHfK9sMmc6DJcNIS6YixeHnSJiT3A12LYB49HeEXOpx5RnY9V6uPSrD">
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 2897.4 896.6" width='300' class='logo mb-4'>
<path class="leaf" d="M140,713.7c-3.4-16.4-10.3-49.1-11.2-49.1c-145.7-87.1-128.4-238-80.2-324.2C59,449,251.2,524,139.1,656.8 c-0.9,1.7,5.2,22.4,10.3,41.4c22.4-37.9,56-83.6,54.3-87.9C65.9,273.9,496.9,248.1,586.6,39.4c40.5,201.8-20.7,513.9-367.2,593.2 c-1.7,0.9-62.9,108.6-65.5,109.5c0-1.7-25.9-0.9-22.4-9.5C133.1,727.4,136.6,720.6,140,713.7L140,713.7z M135.7,632.6 c44-50.9-7.8-137.9-38.8-166.4C149.5,556.7,146,609.3,135.7,632.6L135.7,632.6z" transform="translate(0)" style="fill:#17541f"/>
<g class="text" style="fill:#000">
<path d="M1022.3,428.7c-17.8-19.9-42.7-29.8-74.7-29.8c-22.3,0-42.4,5.7-60.5,17.3c-18.1,11.6-32.3,27.5-42.5,47.8 s-15.3,42.9-15.3,67.8c0,24.9,5.1,47.5,15.3,67.8c10.3,20.3,24.4,36.2,42.5,47.8c18.1,11.5,38.3,17.3,60.5,17.3 c32,0,56.9-9.9,74.7-29.8v20.4v0.2h84.5V408.3h-84.5V428.7z M1010.5,575c-10.2,11.7-23.6,17.6-40.2,17.6s-29.9-5.9-40-17.6 s-15.1-26.1-15.1-43.3c0-17.1,5-31.6,15.1-43.3s23.4-17.6,40-17.6c16.6,0,30,5.9,40.2,17.6s15.3,26.1,15.3,43.3 S1020.7,563.3,1010.5,575z" transform="translate(0)"/>
<path d="M1381,416.1c-18.1-11.5-38.3-17.3-60.5-17.4c-32,0-56.9,9.9-74.7,29.8v-20.4h-84.5v390.7h84.5v-164 c17.8,19.9,42.7,29.8,74.7,29.8c22.3,0,42.4-5.7,60.5-17.3s32.3-27.5,42.5-47.8c10.2-20.3,15.3-42.9,15.3-67.8s-5.1-47.5-15.3-67.8 C1413.2,443.6,1399.1,427.7,1381,416.1z M1337.9,575c-10.1,11.7-23.4,17.6-40,17.6s-29.9-5.9-40-17.6s-15.1-26.1-15.1-43.3 c0-17.1,5-31.6,15.1-43.3s23.4-17.6,40-17.6s29.9,5.9,40,17.6s15.1,26.1,15.1,43.3S1347.9,563.3,1337.9,575z" transform="translate(0)"/>
<path d="M1672.2,416.8c-20.5-12-43-18-67.6-18c-24.9,0-47.6,5.9-68,17.6c-20.4,11.7-36.5,27.7-48.2,48s-17.6,42.7-17.6,67.3 c0.3,25.2,6.2,47.8,17.8,68c11.5,20.2,28,36,49.3,47.6c21.3,11.5,45.9,17.3,73.8,17.3c48.6,0,86.8-14.7,114.7-44l-52.5-48.9 c-8.6,8.3-17.6,14.6-26.7,19c-9.3,4.3-21.1,6.4-35.3,6.4c-11.6,0-22.5-3.6-32.7-10.9c-10.3-7.3-17.1-16.5-20.7-27.8h180l0.4-11.6 c0-29.6-6-55.7-18-78.2S1692.6,428.8,1672.2,416.8z M1558.3,503.2c2.1-12.1,7.5-21.8,16.2-29.1s18.7-10.9,30-10.9 s21.2,3.6,29.8,10.9c8.6,7.2,13.9,16.9,16,29.1H1558.3z" transform="translate(0)"/>
<path d="M1895.3,411.7c-11,5.6-20.3,13.7-28,24.4h-0.1v-28h-84.5v247.3h84.5V536.3c0-22.6,4.7-38.1,14.2-46.5 c9.5-8.5,22.7-12.7,39.6-12.7c6.2,0,13.5,1,21.8,3.1l10.7-72c-5.9-3.3-14.5-4.9-25.8-4.9C1917.1,403.3,1906.3,406.1,1895.3,411.7z" transform="translate(0)"/>
<rect x="1985" y="277.4" width="84.5" height="377.8" transform="translate(0)"/>
<path d="M2313.2,416.8c-20.5-12-43-18-67.6-18c-24.9,0-47.6,5.9-68,17.6s-36.5,27.7-48.2,48c-11.7,20.3-17.6,42.7-17.6,67.3 c0.3,25.2,6.2,47.8,17.8,68c11.5,20.2,28,36,49.3,47.6c21.3,11.5,45.9,17.3,73.8,17.3c48.6,0,86.8-14.7,114.7-44l-52.5-48.9 c-8.6,8.3-17.6,14.6-26.7,19c-9.3,4.3-21.1,6.4-35.3,6.4c-11.6,0-22.5-3.6-32.7-10.9c-10.3-7.3-17.1-16.5-20.7-27.8h180l0.4-11.6 c0-29.6-6-55.7-18-78.2S2333.6,428.8,2313.2,416.8z M2199.3,503.2c2.1-12.1,7.5-21.8,16.2-29.1s18.7-10.9,30-10.9 s21.2,3.6,29.8,10.9c8.6,7.2,13.9,16.9,16,29.1H2199.3z" transform="translate(0)"/>
<path d="M2583.6,507.7c-13.8-4.4-30.6-8.1-50.5-11.1c-15.1-2.7-26.1-5.2-32.9-7.6c-6.8-2.4-10.2-6.1-10.2-11.1s2.3-8.7,6.7-10.9 c4.4-2.2,11.5-3.3,21.3-3.3c11.6,0,24.3,2.4,38.1,7.2c13.9,4.8,26.2,11,36.9,18.4l32.4-58.2c-11.3-7.4-26.2-14.7-44.9-21.8 c-18.7-7.1-39.6-10.7-62.7-10.7c-33.7,0-60.2,7.6-79.3,22.7c-19.1,15.1-28.7,36.1-28.7,63.1c0,19,4.8,33.9,14.4,44.7 c9.6,10.8,21,18.5,34,22.9c13.1,4.5,28.9,8.3,47.6,11.6c14.6,2.7,25.1,5.3,31.6,7.8s9.8,6.5,9.8,11.8c0,10.4-9.7,15.6-29.3,15.6 c-13.7,0-28.5-2.3-44.7-6.9c-16.1-4.6-29.2-11.3-39.3-20.2l-33.3,60c9.2,7.4,24.6,14.7,46.2,22c21.7,7.3,45.2,10.9,70.7,10.9 c34.7,0,62.9-7.4,84.5-22.4c21.7-15,32.5-37.3,32.5-66.9c0-19.3-5-34.2-15.1-44.9S2597.4,512.1,2583.6,507.7z" transform="translate(0)"/>
<path d="M2883.4,575.3c0-19.3-5-34.2-15.1-44.9s-22-18.3-35.8-22.7c-13.8-4.4-30.6-8.1-50.5-11.1c-15.1-2.7-26.1-5.2-32.9-7.6 c-6.8-2.4-10.2-6.1-10.2-11.1s2.3-8.7,6.7-10.9c4.4-2.2,11.5-3.3,21.3-3.3c11.6,0,24.3,2.4,38.1,7.2c13.9,4.8,26.2,11,36.9,18.4 l32.4-58.2c-11.3-7.4-26.2-14.7-44.9-21.8c-18.7-7.1-39.6-10.7-62.7-10.7c-33.7,0-60.2,7.6-79.3,22.7 c-19.1,15.1-28.7,36.1-28.7,63.1c0,19,4.8,33.9,14.4,44.7c9.6,10.8,21,18.5,34,22.9c13.1,4.5,28.9,8.3,47.6,11.6 c14.6,2.7,25.1,5.3,31.6,7.8s9.8,6.5,9.8,11.8c0,10.4-9.7,15.6-29.3,15.6c-13.7,0-28.5-2.3-44.7-6.9c-16.1-4.6-29.2-11.3-39.3-20.2 l-33.3,60c9.2,7.4,24.6,14.7,46.2,22c21.7,7.3,45.2,10.9,70.7,10.9c34.7,0,62.9-7.4,84.5-22.4 C2872.6,627.2,2883.4,604.9,2883.4,575.3z" transform="translate(0)"/>
<rect x="2460.7" y="738.7" width="59.6" height="17.2" transform="translate(0)"/>
<path d="M2596.5,706.4c-5.7,0-11,1-15.8,3s-9,5-12.5,8.9v-9.4h-19.4v93.6h19.4v-52c0-8.6,2.1-15.3,6.3-20c4.2-4.7,9.5-7.1,15.9-7.1 c7.8,0,13.4,2.3,16.8,6.7c3.4,4.5,5.1,11.3,5.1,20.5v52h19.4v-56.8c0-12.8-3.2-22.6-9.5-29.3 C2615.8,709.8,2607.3,706.4,2596.5,706.4z" transform="translate(0)"/>
<path d="M2733.8,717.7c-3.6-3.4-7.9-6.1-13.1-8.2s-10.6-3.1-16.2-3.1c-8.7,0-16.5,2.1-23.5,6.3s-12.5,10-16.5,17.3 c-4,7.3-6,15.4-6,24.4c0,8.9,2,17.1,6,24.3c4,7.3,9.5,13,16.5,17.2s14.9,6.3,23.5,6.3c5.6,0,11-1,16.2-3.1 c5.1-2.1,9.5-4.8,13.1-8.2v24.4c0,8.5-2.5,14.8-7.6,18.7c-5,3.9-11,5.9-18,5.9c-6.7,0-12.4-1.6-17.3-4.7c-4.8-3.1-7.6-7.7-8.3-13.8 h-19.4c0.6,7.7,2.9,14.2,7.1,19.5s9.6,9.3,16.2,12c6.6,2.7,13.8,4,21.7,4c12.8,0,23.5-3.4,32-10.1c8.6-6.7,12.8-17.1,12.8-31.1 V708.9h-19.2V717.7z M2732.2,770.1c-2.5,4.7-6,8.3-10.4,11.2c-4.4,2.7-9.4,4-14.9,4c-5.7,0-10.8-1.4-15.2-4.3s-7.8-6.7-10.2-11.4 c-2.3-4.8-3.5-9.8-3.5-15.2c0-5.5,1.1-10.6,3.5-15.3s5.8-8.5,10.2-11.3s9.5-4.2,15.2-4.2c5.5,0,10.5,1.4,14.9,4s7.9,6.3,10.4,11 s3.8,10,3.8,15.8S2734.7,765.4,2732.2,770.1z" transform="translate(0)"/>
<polygon points="2867.9,708.9 2846.5,708.9 2820.9,741.9 2795.5,708.9 2773.1,708.9 2809.1,755 2771.5,802.5 2792.9,802.5 2820.1,767.9 2847.2,802.6 2869.6,802.6 2832,754.4 " transform="translate(0)"/>
<path d="M757.6,293.7c-20-10.8-42.6-16.2-67.8-16.2H600c-8.5,39.2-21.1,76.4-37.6,111.3c-9.9,20.8-21.1,40.6-33.6,59.4v207.2h88.9 V521.5h72c25.2,0,47.8-5.4,67.8-16.2s35.7-25.6,47.1-44.2c11.4-18.7,17.1-39.1,17.1-61.3c0.1-22.7-5.6-43.3-17-61.9 C793.3,319.2,777.6,304.5,757.6,293.7z M716.6,434.3c-9.3,8.9-21.6,13.3-36.7,13.3l-62.2,0.4v-92.5l62.2-0.4 c15.1,0,27.3,4.4,36.7,13.3c9.4,8.9,14,19.9,14,32.9C730.6,414.5,726,425.4,716.6,434.3z" transform="translate(0)"/>
</g>
</svg>
<p>
Please sign in.
</p>
<div class="form-floating form-stacked-top">
<input type="text" name="login" id="inputUsername" placeholder="Username" class="form-control" autocorrect="off" autocapitalize="none" required autofocus>
<label for="inputUsername">Username</label>
</div>
<div class="form-floating form-stacked-bottom">
<input type="password" name="password" id="inputPassword" placeholder="Password" class="form-control" required>
<label for="inputPassword">Password</label>
</div>
<div class="d-grid mt-3">
<button class="btn btn-lg btn-primary" type="submit">Sign in</button>
</div>
</form>
</div>
</body>
</html>

llm.py

@@ -0,0 +1,46 @@
import os
import logging

from openai import OpenAI
from dotenv import load_dotenv

load_dotenv()

logging.basicConfig(level=logging.INFO)


class LLMClient:
    def __init__(self):
        llama_url = os.getenv("LLAMA_SERVER_URL")
        if llama_url:
            self.client = OpenAI(base_url=llama_url, api_key="not-needed")
            self.model = os.getenv("LLAMA_MODEL_NAME", "llama-3.1-8b-instruct")
            self.PROVIDER = "llama_server"
            logging.info("Using llama_server as LLM backend")
        else:
            self.client = OpenAI()
            self.model = "gpt-4o-mini"
            self.PROVIDER = "openai"
            logging.info("Using OpenAI as LLM backend")

    def chat(
        self,
        prompt: str,
        system_prompt: str,
    ):
        response = self.client.chat.completions.create(
            model=self.model,
            messages=[
                {
                    "role": "system",
                    "content": system_prompt,
                },
                {"role": "user", "content": prompt},
            ],
        )
        return response.choices[0].message.content


if __name__ == "__main__":
    client = LLMClient()
    print(client.chat(prompt="Hello!", system_prompt="You are a helpful assistant."))

View File

@@ -1,30 +1,20 @@
+import argparse
 import datetime
 import logging
 import os
 import sqlite3
-import argparse
-import chromadb
-import ollama
 import time
-from request import PaperlessNGXService
-from chunker import Chunker
-from cleaner import pdf_to_image, summarize_pdf_image
-from llm import LLMClient
-from query import QueryGenerator
 from dotenv import load_dotenv
-_dotenv_loaded = load_dotenv()
+import chromadb
+from utils.chunker import Chunker
+from utils.cleaner import pdf_to_image, summarize_pdf_image
+from llm import LLMClient
+from scripts.query import QueryGenerator
+from utils.request import PaperlessNGXService
-# Configure ollama client with URL from environment or default to localhost
+_dotenv_loaded = load_dotenv()
-ollama_client = ollama.Client(
-    host=os.getenv("OLLAMA_URL", "http://localhost:11434"), timeout=10.0
-)
 client = chromadb.PersistentClient(path=os.getenv("CHROMADB_PATH", ""))
 simba_docs = client.get_or_create_collection(name="simba_docs2")
@@ -186,7 +176,7 @@ def consult_oracle(
 def llm_chat(input: str, transcript: str = "") -> str:
     system_prompt = "You are a helpful assistant that understands veterinary terms."
     transcript_prompt = f"Here is the message transcript thus far {transcript}."
     prompt = f"""Answer the user in as if you were a cat named Simba. Don't act too catlike. Be assertive.
     {transcript_prompt if len(transcript) > 0 else ""}
     Respond to this prompt: {input}"""
     output = llm_client.chat(prompt=prompt, system_prompt=system_prompt)

View File

@@ -0,0 +1,72 @@
from tortoise import BaseDBAsyncClient

RUN_IN_TRANSACTION = True


async def upgrade(db: BaseDBAsyncClient) -> str:
    return """
CREATE TABLE IF NOT EXISTS "users" (
"id" UUID NOT NULL PRIMARY KEY,
"username" VARCHAR(255) NOT NULL,
"password" BYTEA,
"email" VARCHAR(100) NOT NULL UNIQUE,
"oidc_subject" VARCHAR(255) UNIQUE,
"auth_provider" VARCHAR(50) NOT NULL DEFAULT 'local',
"ldap_groups" JSONB NOT NULL,
"created_at" TIMESTAMPTZ NOT NULL DEFAULT CURRENT_TIMESTAMP,
"updated_at" TIMESTAMPTZ NOT NULL DEFAULT CURRENT_TIMESTAMP
);
CREATE INDEX IF NOT EXISTS "idx_users_oidc_su_5aec5a" ON "users" ("oidc_subject");
CREATE TABLE IF NOT EXISTS "conversations" (
"id" UUID NOT NULL PRIMARY KEY,
"name" VARCHAR(255) NOT NULL,
"created_at" TIMESTAMPTZ NOT NULL DEFAULT CURRENT_TIMESTAMP,
"updated_at" TIMESTAMPTZ NOT NULL DEFAULT CURRENT_TIMESTAMP,
"user_id" UUID REFERENCES "users" ("id") ON DELETE CASCADE
);
CREATE TABLE IF NOT EXISTS "conversation_messages" (
"id" UUID NOT NULL PRIMARY KEY,
"text" TEXT NOT NULL,
"created_at" TIMESTAMPTZ NOT NULL DEFAULT CURRENT_TIMESTAMP,
"speaker" VARCHAR(10) NOT NULL,
"conversation_id" UUID NOT NULL REFERENCES "conversations" ("id") ON DELETE CASCADE
);
COMMENT ON COLUMN "conversation_messages"."speaker" IS 'USER: user\nSIMBA: simba';
CREATE TABLE IF NOT EXISTS "aerich" (
"id" SERIAL NOT NULL PRIMARY KEY,
"version" VARCHAR(255) NOT NULL,
"app" VARCHAR(100) NOT NULL,
"content" JSONB NOT NULL
);"""


async def downgrade(db: BaseDBAsyncClient) -> str:
    return """
    """
MODELS_STATE = (
"eJztmm1v4jgQx78Kyquu1KtatnRX1emkQOkttwuceNinXhWZ2ICviZ1NnG1R1e9+tkmIkz"
"gUKFDY401bxh5s/zzO/Mfpo+FSiJzgpEbJT+QHgGFKjMvSo0GAi/gf2vbjkgE8L2kVBgYG"
"jnSwlZ6yBQwC5gOb8cYhcALETRAFto+9aDASOo4wUpt3xGSUmEKCf4TIYnSE2Bj5vOHmlp"
"sxgegBBfFH784aYuTA1LwxFGNLu8UmnrT1+42ra9lTDDewbOqELkl6exM2pmTWPQwxPBE+"
"om2ECPIBQ1BZhphltOzYNJ0xNzA/RLOpwsQA0RCEjoBh/D4MiS0YlORI4sf5H8YSeDhqgR"
"YTJlg8Pk1XlaxZWg0xVO2D2Tl6e/FGrpIGbOTLRknEeJKOgIGpq+SagJS/cyhrY+DrUcb9"
"MzD5RFfBGBsSjkkMxSBjQKtRM1zwYDmIjNiYfyxXKnMwfjY7kiTvJVFSHtfTqG9FTeVpm0"
"CaILR9JJZsAZYHecVbGHaRHmbaM4MURq4n8R87CpivAbaJM4kOwRy+vUaz3u2Zzb/FStwg"
"+OFIRGavLlrK0jrJWI8uMlsx+5LSl0bvQ0l8LH1vt+rZ2J/16303xJxAyKhF6L0FoHJeY2"
"sMJrWxoQdX3Ni052FjX3Vjo8kr+xog31ougyguL0gj0dy2uImrJw2Reod32pwhYOThXVMf"
"4RH5iCYSYYPPAxBblywi0dGPvmZXoSXWZBY+uJ+pETUo+Or4mhCbZk+zWzOv6oZkOAD23T"
"3woVUA00VBAEYoyAOtRp7XHzvImUkzPUtVwDWn37ibT5UitpIVLVOFUYpevsktu1kLIHzd"
"MBpbjDSHzjMqWIG4mBi21I08iOK9FsUMPWhSfo9b9Sjj/vsiiuel8vrXXiqLx9L3qGl+fZ"
"PK5J/arT/j7opUrn1qVw8K+VcUUnmFHHgI3OnEgCgg6yR0c1IgtbuK+ysfHaPfrXcuSyKj"
"/0O6jWbVvCwF2B0AY7EtTlWZZ6cLFJlnp4U1pmjKHCA10Sz3mNe4rvOZv6cS1s5ceL1Qym"
"bvz3aW4rOaVhMuy2rbTSo5WTNopFtcSxRrNXG0D9ps/7WZ2MdlLy1Vn33RaFu4uPRAENxT"
"XxOZVUyAP9HDVL0yMAcTNq1/drWk18GrCr2qyi2OrNpomZ1veskb91fjtvqtVzczdJELsL"
"NMlM4c1hOiz5/4dQbo2eliomee6snJHoqhbQXh4F9kayqHYpJZv5WAZoN0uzw3cuC5lh9b"
"nk9/Ylgk2vVAc47be4oaDrWB84I0lOZaWSRMK8VRWskFqQOBZ418GnqaO7y/uu2WHmnGLQ"
"O0T/gqbyC22XHJwQG73Rjem9vNpHix8vkXCdk7g8wzVXzB4SLhf3KRcHjV9kts7OwmP1cQ"
"PvcaJPd/Jet5F7LLYnS770BM5GN7bGhq56jleF71DJI+O1M+N0jBdby2ehaYM8EQ7fyrim"
"j5Juq38tn5u/P3by/O3/MuciYzy7s5D4NGq/dMtSwOgvaKq1jrKS6HWjmRzvxoLCOYp933"
"E+BGajk+IkNEk96LJbLi8lryeGO3DmuTx0tk2/Wnl6f/AHvgrXs="
)

mkdocs.yml

@@ -0,0 +1,25 @@
site_name: SimbaRAG Documentation
site_description: Documentation for SimbaRAG - RAG-powered conversational AI

theme:
  name: material
  features:
    - content.code.copy
    - navigation.sections
    - navigation.expand

markdown_extensions:
  - admonition
  - pymdownx.highlight:
      anchor_linenums: true
  - pymdownx.superfences
  - pymdownx.tabbed:
      alternate_style: true
  - tables
  - toc:
      permalink: true

nav:
  - Home: index.md
  - Architecture:
      - Authentication: authentication.md

pyproject.toml

@@ -0,0 +1,42 @@
[project]
name = "raggr"
version = "0.1.0"
description = "Add your description here"
readme = "README.md"
requires-python = ">=3.13"
dependencies = [
"chromadb>=1.1.0",
"python-dotenv>=1.0.0",
"flask>=3.1.2",
"httpx>=0.28.1",
"openai>=2.0.1",
"pydantic>=2.11.9",
"pillow>=10.0.0",
"pymupdf>=1.24.0",
"black>=25.9.0",
"pillow-heif>=1.1.1",
"flask-jwt-extended>=4.7.1",
"bcrypt>=5.0.0",
"pony>=0.7.19",
"flask-login>=0.6.3",
"quart>=0.20.0",
"tortoise-orm>=0.25.1",
"quart-jwt-extended>=0.1.0",
"pre-commit>=4.3.0",
"tortoise-orm-stubs>=1.0.2",
"aerich>=0.8.0",
"tomlkit>=0.13.3",
"authlib>=1.3.0",
"asyncpg>=0.30.0",
"langchain-openai>=1.1.6",
"langchain>=1.2.0",
"langchain-chroma>=1.0.0",
"langchain-community>=0.4.1",
"jq>=1.10.0",
"tavily-python>=0.7.17",
]
[tool.aerich]
tortoise_orm = "app.TORTOISE_CONFIG"
location = "./migrations"
src_folder = "./."

View File

@@ -8,8 +8,11 @@ COPY package.json yarn.lock* ./
 # Install dependencies
 RUN yarn install
+# Copy application source code
+COPY . .
 # Expose rsbuild dev server port (default 3000)
 EXPOSE 3000
-# The actual source code will be mounted as a volume
-# CMD will be specified in docker-compose
+# Default command
+CMD ["sh", "-c", "yarn build && yarn watch:build"]

View File

@@ -37,7 +37,7 @@ class ConversationService {
     conversation_id: string,
   ): Promise<QueryResponse> {
     const response = await userService.fetchWithRefreshToken(
-      `${this.baseUrl}/query`,
+      `${this.conversationBaseUrl}/query`,
       {
         method: "POST",
         body: JSON.stringify({ query, conversation_id }),

View File

@@ -40,6 +40,7 @@ export const ChatScreen = ({ setAuthenticated }: ChatScreenProps) => {
const [selectedConversation, setSelectedConversation] =
useState<Conversation | null>(null);
const [sidebarCollapsed, setSidebarCollapsed] = useState<boolean>(false);
+ const [isLoading, setIsLoading] = useState<boolean>(false);
const messagesEndRef = useRef<HTMLDivElement>(null);
const simbaAnswers = ["meow.", "hiss...", "purrrrrr", "yowOWROWWowowr"];
@@ -80,6 +81,7 @@ export const ChatScreen = ({ setAuthenticated }: ChatScreenProps) => {
setConversations(parsedConversations);
setSelectedConversation(parsedConversations[0]);
console.log(parsedConversations);
+ console.log("JELLYFISH@");
} catch (error) {
console.error("Failed to load messages:", error);
}
@@ -104,11 +106,18 @@ export const ChatScreen = ({ setAuthenticated }: ChatScreenProps) => {
useEffect(() => {
const loadMessages = async () => {
+ console.log(selectedConversation);
+ console.log("JELLYFISH");
if (selectedConversation == null) return;
try {
const conversation = await conversationService.getConversation(
selectedConversation.id,
);
+ // Update the conversation title in case it changed
+ setSelectedConversation({
+ id: conversation.id,
+ title: conversation.name,
+ });
setMessages(
conversation.messages.map((message) => ({
text: message.text,
@@ -120,14 +129,15 @@ export const ChatScreen = ({ setAuthenticated }: ChatScreenProps) => {
}
};
loadMessages();
- }, [selectedConversation]);
+ }, [selectedConversation?.id]);
const handleQuestionSubmit = async () => {
- if (!query.trim()) return; // Don't submit empty messages
+ if (!query.trim() || isLoading) return; // Don't submit empty messages or while loading
const currMessages = messages.concat([{ text: query, speaker: "user" }]);
setMessages(currMessages);
setQuery(""); // Clear input immediately after submission
+ setIsLoading(true);
if (simbaMode) {
console.log("simba mode activated");
@@ -142,6 +152,7 @@ export const ChatScreen = ({ setAuthenticated }: ChatScreenProps) => {
},
]),
);
+ setIsLoading(false);
return;
}
@@ -162,6 +173,8 @@ export const ChatScreen = ({ setAuthenticated }: ChatScreenProps) => {
if (error instanceof Error && error.message.includes("Session expired")) {
setAuthenticated(false);
}
+ } finally {
+ setIsLoading(false);
}
};
@@ -180,9 +193,11 @@ export const ChatScreen = ({ setAuthenticated }: ChatScreenProps) => {
return (
<div className="h-screen flex flex-row bg-[#F9F5EB]">
{/* Sidebar - Expanded */}
- <aside className={`hidden md:flex md:flex-col bg-white border-r border-gray-200 p-4 overflow-y-auto transition-all duration-300 ${sidebarCollapsed ? 'w-20' : 'w-64'}`}>
+ <aside
+ className={`hidden md:flex md:flex-col bg-[#F9F5EB] border-r border-gray-200 p-4 overflow-y-auto transition-all duration-300 ${sidebarCollapsed ? "w-20" : "w-64"}`}
+ >
{!sidebarCollapsed ? (
- <>
+ <div className="bg-[#F9F5EB]">
<div className="flex flex-row items-center gap-2 mb-6">
<img
src={catIcon}
@@ -205,7 +220,7 @@ export const ChatScreen = ({ setAuthenticated }: ChatScreenProps) => {
logout
</button>
</div>
- </>
+ </div>
) : (
<div className="flex flex-col items-center gap-4">
<img
@@ -243,7 +258,18 @@ export const ChatScreen = ({ setAuthenticated }: ChatScreenProps) => {
</header>
{/* Messages area */}
- <div className="flex-1 overflow-y-auto px-4 py-4">
+ {selectedConversation && (
+ <div className="sticky top-0 mx-auto w-full">
+ <div className="bg-[#F9F5EB] text-black px-6 w-full py-3">
+ <h2 className="text-lg font-semibold">
+ {selectedConversation.title || "Untitled Conversation"}
+ </h2>
+ </div>
+ </div>
+ )}
+ <div className="flex-1 overflow-y-auto relative px-4 py-6">
+ {/* Floating conversation name */}
<div className="max-w-2xl mx-auto flex flex-col gap-4">
{showConversations && (
<div className="md:hidden">
@@ -260,6 +286,7 @@ export const ChatScreen = ({ setAuthenticated }: ChatScreenProps) => {
}
return <QuestionBubble key={index} text={msg.text} />;
})}
+ {isLoading && <AnswerBubble text="" loading={true} />}
<div ref={messagesEndRef} />
</div>
</div>
@@ -273,6 +300,7 @@ export const ChatScreen = ({ setAuthenticated }: ChatScreenProps) => {
handleKeyDown={handleKeyDown}
handleQuestionSubmit={handleQuestionSubmit}
setSimbaMode={setSimbaMode}
+ isLoading={isLoading}
/>
</div>
</footer>

View File

@@ -0,0 +1,56 @@
import { useEffect, useState, useRef } from "react";
type MessageInputProps = {
handleQueryChange: (event: React.ChangeEvent<HTMLTextAreaElement>) => void;
handleKeyDown: (event: React.ChangeEvent<HTMLTextAreaElement>) => void;
handleQuestionSubmit: () => void;
setSimbaMode: (sdf: boolean) => void;
query: string;
isLoading: boolean;
};
export const MessageInput = ({
query,
handleKeyDown,
handleQueryChange,
handleQuestionSubmit,
setSimbaMode,
isLoading,
}: MessageInputProps) => {
return (
<div className="flex flex-col gap-4 sticky bottom-0 bg-[#3D763A] p-6 rounded-xl">
<div className="flex flex-row justify-between grow">
<textarea
className="p-3 sm:p-4 border border-blue-200 rounded-md grow bg-[#F9F5EB] min-h-[44px] resize-y"
onChange={handleQueryChange}
onKeyDown={handleKeyDown}
value={query}
rows={2}
placeholder="Type your message... (Press Enter to send, Shift+Enter for new line)"
/>
</div>
<div className="flex flex-row justify-between gap-2 grow">
<button
className={`p-3 sm:p-4 min-h-[44px] border border-blue-400 rounded-md flex-grow text-sm sm:text-base ${
isLoading
? "bg-gray-400 cursor-not-allowed opacity-50"
: "bg-[#EDA541] hover:bg-blue-400 cursor-pointer"
}`}
onClick={() => handleQuestionSubmit()}
type="submit"
disabled={isLoading}
>
{isLoading ? "Sending..." : "Submit"}
</button>
</div>
<div className="flex flex-row justify-center gap-2 grow items-center">
<input
type="checkbox"
onChange={(event) => setSimbaMode(event.target.checked)}
className="w-5 h-5 cursor-pointer"
/>
<p className="text-sm sm:text-base">simba mode?</p>
</div>
</div>
);
};

0
scripts/__init__.py Normal file
View File

View File

@@ -1,18 +1,21 @@
- import httpx
- import os
- from pathlib import Path
import logging
- import tempfile
- from image_process import describe_simba_image
- from request import PaperlessNGXService
+ import os
import sqlite3
+ import httpx
+ from dotenv import load_dotenv
+ import sys
+ from pathlib import Path
+ # Add parent directory to path for imports
+ sys.path.insert(0, str(Path(__file__).parent.parent))
+ from utils.image_process import describe_simba_image
+ from utils.request import PaperlessNGXService
logging.basicConfig(level=logging.INFO)
- from dotenv import load_dotenv
load_dotenv()
# Configuration from environment variables
@@ -89,7 +92,7 @@ if __name__ == "__main__":
image_date = description.image_date
description_filepath = os.path.join(
- "/Users/ryanchen/Programs/raggr", f"SIMBA_DESCRIBE_001.txt"
+ "/Users/ryanchen/Programs/raggr", "SIMBA_DESCRIBE_001.txt"
)
file = open(description_filepath, "w+")
file.write(image_description)

View File

@@ -0,0 +1,92 @@
#!/usr/bin/env python3
"""CLI tool to inspect the vector store contents."""
import argparse
import os
from dotenv import load_dotenv
from blueprints.rag.logic import (
get_vector_store_stats,
index_documents,
list_all_documents,
)
# Load .env from the root directory
root_dir = os.path.abspath(os.path.join(os.path.dirname(__file__), "../.."))
env_path = os.path.join(root_dir, ".env")
load_dotenv(env_path)
def print_stats():
"""Print vector store statistics."""
stats = get_vector_store_stats()
print("=== Vector Store Statistics ===")
print(f"Collection Name: {stats['collection_name']}")
print(f"Total Documents: {stats['total_documents']}")
print()
def print_documents(limit: int = 10, show_content: bool = False):
"""Print documents in the vector store."""
docs = list_all_documents(limit=limit)
print(f"=== Documents (showing {len(docs)} of {limit} requested) ===\n")
for i, doc in enumerate(docs, 1):
print(f"Document {i}:")
print(f" ID: {doc['id']}")
print(f" Metadata: {doc['metadata']}")
if show_content:
print(f" Content Preview: {doc['content_preview']}")
print()
async def run_index():
"""Run the indexing process."""
print("Starting indexing process...")
await index_documents()
print("Indexing complete!")
print_stats()
def main():
import asyncio
parser = argparse.ArgumentParser(description="Inspect the vector store contents")
parser.add_argument(
"--stats", action="store_true", help="Show vector store statistics"
)
parser.add_argument(
"--list", type=int, metavar="N", help="List N documents from the vector store"
)
parser.add_argument(
"--show-content",
action="store_true",
help="Show content preview when listing documents",
)
parser.add_argument(
"--index",
action="store_true",
help="Index documents from Paperless-NGX into the vector store",
)
args = parser.parse_args()
# Handle indexing first if requested
if args.index:
asyncio.run(run_index())
return
# If no arguments provided, show stats by default
if not any([args.stats, args.list]):
args.stats = True
if args.stats:
print_stats()
if args.list:
print_documents(limit=args.list, show_content=args.show_content)
if __name__ == "__main__":
main()

View File

@@ -0,0 +1,121 @@
#!/usr/bin/env python3
"""Management script for vector store operations."""
import argparse
import asyncio
import sys
from blueprints.rag.logic import (
get_vector_store_stats,
index_documents,
list_all_documents,
vector_store,
)
def stats():
"""Show vector store statistics."""
stats = get_vector_store_stats()
print("=== Vector Store Statistics ===")
print(f"Collection: {stats['collection_name']}")
print(f"Total Documents: {stats['total_documents']}")
async def index():
"""Index documents from Paperless-NGX."""
print("Starting indexing process...")
print("Fetching documents from Paperless-NGX...")
await index_documents()
print("✓ Indexing complete!")
stats()
async def reindex():
"""Clear and reindex all documents."""
print("Clearing existing documents...")
collection = vector_store._collection
all_docs = collection.get()
if all_docs["ids"]:
print(f"Deleting {len(all_docs['ids'])} existing documents...")
collection.delete(ids=all_docs["ids"])
print("✓ Cleared")
else:
print("Collection is already empty")
await index()
def list_docs(limit: int = 10, show_content: bool = False):
"""List documents in the vector store."""
docs = list_all_documents(limit=limit)
print(f"\n=== Documents (showing {len(docs)}) ===\n")
for i, doc in enumerate(docs, 1):
print(f"Document {i}:")
print(f" ID: {doc['id']}")
print(f" Metadata: {doc['metadata']}")
if show_content:
print(f" Content: {doc['content_preview']}")
print()
def main():
parser = argparse.ArgumentParser(
description="Manage vector store for RAG system",
formatter_class=argparse.RawDescriptionHelpFormatter,
epilog="""
Examples:
%(prog)s stats # Show vector store statistics
%(prog)s index # Index new documents from Paperless-NGX
%(prog)s reindex # Clear and reindex all documents
%(prog)s list 10 # List first 10 documents
%(prog)s list 20 --show-content # List 20 documents with content preview
""",
)
subparsers = parser.add_subparsers(dest="command", help="Command to execute")
# Stats command
subparsers.add_parser("stats", help="Show vector store statistics")
# Index command
subparsers.add_parser("index", help="Index documents from Paperless-NGX")
# Reindex command
subparsers.add_parser("reindex", help="Clear and reindex all documents")
# List command
list_parser = subparsers.add_parser("list", help="List documents in vector store")
list_parser.add_argument(
"limit", type=int, default=10, nargs="?", help="Number of documents to list"
)
list_parser.add_argument(
"--show-content", action="store_true", help="Show content preview"
)
args = parser.parse_args()
if not args.command:
parser.print_help()
sys.exit(1)
try:
if args.command == "stats":
stats()
elif args.command == "index":
asyncio.run(index())
elif args.command == "reindex":
asyncio.run(reindex())
elif args.command == "list":
list_docs(limit=args.limit, show_content=args.show_content)
except KeyboardInterrupt:
print("\n\nOperation cancelled by user")
sys.exit(1)
except Exception as e:
print(f"\n❌ Error: {e}", file=sys.stderr)
sys.exit(1)
if __name__ == "__main__":
main()
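
The sub-command layout used by the management script above, including the optional positional `limit`, can be tried in isolation. This is a minimal sketch rather than the script itself: `build_parser` is a hypothetical helper name, only two of the four commands are reproduced, and the handlers are left out so that just the parsing behavior is visible.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with one sub-command per operation (hypothetical helper)."""
    parser = argparse.ArgumentParser(description="Manage vector store (sketch)")
    # dest="command" records which sub-command was chosen, or None if none was.
    subparsers = parser.add_subparsers(dest="command")

    subparsers.add_parser("stats")

    list_parser = subparsers.add_parser("list")
    # nargs="?" makes the positional optional, so the default applies
    # when the user runs `list` without a number.
    list_parser.add_argument("limit", type=int, default=10, nargs="?")
    list_parser.add_argument("--show-content", action="store_true")
    return parser


args = build_parser().parse_args(["list", "20", "--show-content"])
print(args.command, args.limit, args.show_content)
```

Because the sub-parsers are not marked as required, running with no arguments leaves `args.command` as `None`, which is the case the script handles by printing help and exiting.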

View File

@@ -1,18 +1,11 @@
import json
- import os
from typing import Literal
import datetime
- from ollama import Client
from openai import OpenAI
from pydantic import BaseModel, Field
- # Configure ollama client with URL from environment or default to localhost
- ollama_client = Client(
- host=os.getenv("OLLAMA_URL", "http://localhost:11434"), timeout=10.0
- )
# This uses inferred filters, which means using LLM to create the metadata filters

39
scripts/test_query.py Normal file
View File

@@ -0,0 +1,39 @@
#!/usr/bin/env python3
"""Test the query_vector_store function."""
import asyncio
import os
from dotenv import load_dotenv
from blueprints.rag.logic import query_vector_store
# Load .env from the root directory
root_dir = os.path.abspath(os.path.join(os.path.dirname(__file__), "../.."))
env_path = os.path.join(root_dir, ".env")
load_dotenv(env_path)
async def test_query(query: str):
"""Test a query against the vector store."""
print(f"Query: {query}\n")
result, docs = await query_vector_store(query)
print(f"Found {len(docs)} documents\n")
print("Serialized result:")
print(result)
print("\n" + "=" * 80 + "\n")
async def main():
queries = [
"What is Simba's weight?",
"What medications is Simba taking?",
"Tell me about Simba's recent vet visits",
]
for query in queries:
await test_query(query)
if __name__ == "__main__":
asyncio.run(main())

View File

@@ -0,0 +1,79 @@
#!/usr/bin/env python3
"""
Script to show how many messages each user has written
"""
import asyncio
from tortoise import Tortoise
from blueprints.users.models import User
from blueprints.conversation.models import Speaker
import os
async def get_user_message_stats():
"""Get message count statistics per user"""
# Initialize database connection
database_url = os.getenv("DATABASE_URL", "sqlite://raggr.db")
await Tortoise.init(
db_url=database_url,
modules={
"models": [
"blueprints.users.models",
"blueprints.conversation.models",
]
},
)
print("\n📊 User Message Statistics\n")
print(
f"{'Username':<20} {'Total Messages':<15} {'User Messages':<15} {'Conversations':<15}"
)
print("=" * 70)
# Get all users
users = await User.all()
total_users = 0
total_messages = 0
for user in users:
# Get all conversations for this user
conversations = await user.conversations.all()
if not conversations:
continue
total_users += 1
# Count messages across all conversations
user_message_count = 0
total_message_count = 0
for conversation in conversations:
messages = await conversation.messages.all()
total_message_count += len(messages)
# Count only user messages (not assistant responses)
user_messages = [msg for msg in messages if msg.speaker == Speaker.USER]
user_message_count += len(user_messages)
total_messages += user_message_count
print(
f"{user.username:<20} {total_message_count:<15} {user_message_count:<15} {len(conversations):<15}"
)
print("=" * 70)
print("\n📈 Summary:")
print(f" Total active users: {total_users}")
print(f" Total user messages: {total_messages}")
print(
f" Average messages per user: {total_messages / total_users if total_users > 0 else 0:.1f}\n"
)
await Tortoise.close_connections()
if __name__ == "__main__":
asyncio.run(get_user_message_stats())
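
The counting step in the script above (total messages versus messages written by the user) can be separated from the database access. A minimal sketch under stated assumptions: `Message` is a hypothetical stand-in for a stored row, not the Tortoise model, and the speaker values `"user"` and `"simba"` are taken from the column comment in the migration.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Message:
    # Hypothetical stand-in for a stored message row.
    username: str
    speaker: str  # "user" or "simba"


def message_stats(messages: list[Message]) -> dict[str, tuple[int, int]]:
    """Return {username: (total_messages, user_messages)}."""
    totals: Counter = Counter()
    from_user: Counter = Counter()
    for msg in messages:
        totals[msg.username] += 1
        # Count only what the user wrote, not the assistant's replies.
        if msg.speaker == "user":
            from_user[msg.username] += 1
    return {name: (totals[name], from_user[name]) for name in totals}


rows = [Message("ryan", "user"), Message("ryan", "simba"), Message("ryan", "user")]
print(message_stats(rows))
```

Keeping the aggregation as a pure function makes it testable without a database connection; the script itself loads every message per conversation before counting.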

View File

@@ -1,16 +0,0 @@
.git
.gitignore
README.md
.env
.DS_Store
chromadb/
chroma_db/
raggr-frontend/node_modules/
__pycache__/
*.pyc
*.pyo
*.pyd
.Python
.venv/
venv/
.pytest_cache/

View File

@@ -1 +0,0 @@
3.13

View File

@@ -1,33 +0,0 @@
FROM python:3.13-slim
WORKDIR /app
# Install system dependencies and uv
RUN apt-get update && apt-get install -y \
build-essential \
curl \
&& rm -rf /var/lib/apt/lists/* \
&& curl -LsSf https://astral.sh/uv/install.sh | sh
# Add uv to PATH
ENV PATH="/root/.local/bin:$PATH"
# Copy dependency files
COPY pyproject.toml ./
# Install Python dependencies using uv
RUN uv pip install --system -e .
# Create ChromaDB and database directories
RUN mkdir -p /app/chromadb /app/database
# Expose port
EXPOSE 8080
# Set environment variables
ENV PYTHONPATH=/app
ENV CHROMADB_PATH=/app/chromadb
ENV PYTHONUNBUFFERED=1
# The actual source code will be mounted as a volume
# No CMD here - will be specified in docker-compose

View File

@@ -1,72 +0,0 @@
import datetime
from quart_jwt_extended import (
jwt_refresh_token_required,
get_jwt_identity,
)
from quart import Blueprint, jsonify
from .models import (
Conversation,
PydConversation,
PydListConversation,
)
import blueprints.users.models
conversation_blueprint = Blueprint(
"conversation_api", __name__, url_prefix="/api/conversation"
)
@conversation_blueprint.route("/<conversation_id>")
async def get_conversation(conversation_id: str):
conversation = await Conversation.get(id=conversation_id)
await conversation.fetch_related("messages")
# Manually serialize the conversation with messages
messages = []
for msg in conversation.messages:
messages.append(
{
"id": str(msg.id),
"text": msg.text,
"speaker": msg.speaker.value,
"created_at": msg.created_at.isoformat(),
}
)
return jsonify(
{
"id": str(conversation.id),
"name": conversation.name,
"messages": messages,
"created_at": conversation.created_at.isoformat(),
"updated_at": conversation.updated_at.isoformat(),
}
)
@conversation_blueprint.post("/")
@jwt_refresh_token_required
async def create_conversation():
user_uuid = get_jwt_identity()
user = await blueprints.users.models.User.get(id=user_uuid)
conversation = await Conversation.create(
name=f"{user.username} {datetime.datetime.now().timestamp}",
user=user,
)
serialized_conversation = await PydConversation.from_tortoise_orm(conversation)
return jsonify(serialized_conversation.model_dump())
@conversation_blueprint.get("/")
@jwt_refresh_token_required
async def get_all_conversations():
user_uuid = get_jwt_identity()
user = await blueprints.users.models.User.get(id=user_uuid)
conversations = Conversation.filter(user=user)
serialized_conversations = await PydListConversation.from_queryset(conversations)
return jsonify(serialized_conversations.model_dump())

View File

@@ -1,73 +0,0 @@
import os
from ollama import Client
from openai import OpenAI
import logging
from dotenv import load_dotenv
load_dotenv()
logging.basicConfig(level=logging.INFO)
TRY_OLLAMA = os.getenv("TRY_OLLAMA", False)
class LLMClient:
def __init__(self):
try:
self.ollama_client = Client(
host=os.getenv("OLLAMA_URL", "http://localhost:11434"), timeout=1.0
)
self.ollama_client.chat(
model="gemma3:4b", messages=[{"role": "system", "content": "test"}]
)
self.PROVIDER = "ollama"
logging.info("Using Ollama as LLM backend")
except Exception as e:
print(e)
self.openai_client = OpenAI()
self.PROVIDER = "openai"
logging.info("Using OpenAI as LLM backend")
def chat(
self,
prompt: str,
system_prompt: str,
):
# Instituting a fallback if my gaming PC is not on
if self.PROVIDER == "ollama":
try:
response = self.ollama_client.chat(
model="gemma3:4b",
messages=[
{
"role": "system",
"content": system_prompt,
},
{"role": "user", "content": prompt},
],
)
output = response.message.content
return output
except Exception as e:
logging.error(f"Could not connect to OLLAMA: {str(e)}")
response = self.openai_client.responses.create(
model="gpt-4o-mini",
input=[
{
"role": "system",
"content": system_prompt,
},
{"role": "user", "content": prompt},
],
)
output = response.output_text
return output
if __name__ == "__main__":
client = Client()
client.chat(model="gemma3:4b", messages=[{"role": "system", "promp": "hack"}])
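
The fallback idea in the removed class (try the local backend, fall back to a hosted one) can be sketched without any client library. This is an illustration under stated assumptions, not the project's code: `chat_with_fallback` and the two stub backends are hypothetical names, and each backend is modelled as a plain callable taking `(prompt, system_prompt)`.

```python
from typing import Callable

# A backend is any callable that maps (prompt, system_prompt) to a reply.
ChatFn = Callable[[str, str], str]


def chat_with_fallback(
    primary: ChatFn, fallback: ChatFn, prompt: str, system_prompt: str
) -> str:
    """Try the primary backend; on any error, use the fallback instead."""
    try:
        return primary(prompt, system_prompt)
    except Exception as exc:
        print(f"primary backend failed: {exc}")
        return fallback(prompt, system_prompt)


def offline_backend(prompt: str, system_prompt: str) -> str:
    # Stub that simulates an unreachable local server.
    raise ConnectionError("server unreachable")


def hosted_backend(prompt: str, system_prompt: str) -> str:
    # Stub that stands in for a hosted model.
    return f"echo: {prompt}"


print(chat_with_fallback(offline_backend, hosted_backend, "hi", "be brief"))
```

Passing both backends in up front means the fallback is always available. In the removed class, `openai_client` appears to be created only when the startup probe fails, so a later Ollama error during `chat` would seem to reach an attribute that was never set.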

View File

@@ -1,71 +0,0 @@
from tortoise import BaseDBAsyncClient
RUN_IN_TRANSACTION = True
async def upgrade(db: BaseDBAsyncClient) -> str:
return """
CREATE TABLE IF NOT EXISTS "users" (
"id" UUID NOT NULL PRIMARY KEY,
"username" VARCHAR(255) NOT NULL,
"password" BYTEA,
"email" VARCHAR(100) NOT NULL UNIQUE,
"oidc_subject" VARCHAR(255) UNIQUE,
"auth_provider" VARCHAR(50) NOT NULL DEFAULT 'local',
"created_at" TIMESTAMPTZ NOT NULL DEFAULT CURRENT_TIMESTAMP,
"updated_at" TIMESTAMPTZ NOT NULL DEFAULT CURRENT_TIMESTAMP
);
CREATE INDEX IF NOT EXISTS "idx_users_oidc_su_5aec5a" ON "users" ("oidc_subject");
CREATE TABLE IF NOT EXISTS "conversations" (
"id" UUID NOT NULL PRIMARY KEY,
"name" VARCHAR(255) NOT NULL,
"created_at" TIMESTAMPTZ NOT NULL DEFAULT CURRENT_TIMESTAMP,
"updated_at" TIMESTAMPTZ NOT NULL DEFAULT CURRENT_TIMESTAMP,
"user_id" UUID REFERENCES "users" ("id") ON DELETE CASCADE
);
CREATE TABLE IF NOT EXISTS "conversation_messages" (
"id" UUID NOT NULL PRIMARY KEY,
"text" TEXT NOT NULL,
"created_at" TIMESTAMPTZ NOT NULL DEFAULT CURRENT_TIMESTAMP,
"speaker" VARCHAR(10) NOT NULL,
"conversation_id" UUID NOT NULL REFERENCES "conversations" ("id") ON DELETE CASCADE
);
COMMENT ON COLUMN "conversation_messages"."speaker" IS 'USER: user\nSIMBA: simba';
CREATE TABLE IF NOT EXISTS "aerich" (
"id" SERIAL NOT NULL PRIMARY KEY,
"version" VARCHAR(255) NOT NULL,
"app" VARCHAR(100) NOT NULL,
"content" JSONB NOT NULL
);"""
async def downgrade(db: BaseDBAsyncClient) -> str:
return """
"""
MODELS_STATE = (
"eJztmmtP4zgUhv9KlE+MxCLoUGaEViulpex0Z9qO2nR3LjuK3MRtvSROJnYGKsR/X9u5J0"
"56AUqL+gXosU9sPz7OeY/Lveq4FrTJSdvFv6BPAEUuVi+VexUDB7I/pO3Higo8L23lBgom"
"tnAwMz1FC5gQ6gOTssYpsAlkJgsS00deNBgObJsbXZN1RHiWmgKMfgbQoO4M0jn0WcP3H8"
"yMsAXvIIk/ejfGFEHbys0bWXxsYTfowhO28bh7dS168uEmhunagYPT3t6Czl2cdA8CZJ1w"
"H942gxj6gEIrsww+y2jZsSmcMTNQP4DJVK3UYMEpCGwOQ/19GmCTM1DESPzH+R/qGngYao"
"4WYcpZ3D+Eq0rXLKwqH6r9QRsevb14I1bpEjrzRaMgoj4IR0BB6Cq4piDF7xLK9hz4cpRx"
"/wJMNtFNMMaGlGMaQzHIGNBm1FQH3Bk2xDM6Zx8bzWYNxr+1oSDJegmULovrMOr7UVMjbO"
"NIU4SmD/mSDUDLIK9YC0UOlMPMexaQWpHrSfzHjgJma7AG2F5Eh6CGr97tdUa61vvMV+IQ"
"8tMWiDS9w1sawrooWI8uCluRPET5p6t/UPhH5dug3ynGftJP/6byOYGAugZ2bw1gZc5rbI"
"3B5DY28KwNNzbvedjYF93YaPKZfSXQN9bLIBmXR6SRaG5b3MTNkwZPvdMbac7gMMrwrl0f"
"ohn+CBcCYZfNA2BTliwi0TGOHrOr0FJrOgsf3CZqJBsUbHVsTZCG2VMbtbWrjioYToB5cw"
"t8y6iA6UBCwAySMtBW5Hn9cQjtRJrJWWYFXC984m6+VarYClZuw80wytErNzkNp2gBmK3b"
"isbmI9XQWaKCMxBXE8NGdiMPonivRTGFd5KUrzOrHGXcf19EcV0q73zRc1k8lr5HPe3Lm1"
"wm/zTo/xl3z0jl9qdB66CQX6OQKitk4kFwIxMDvIDs4MApSYHc7mbcX/joqONRZ3ip8Iz+"
"Lx51ey3tUiHImQB1tS3OVZlnpysUmWenlTUmbyocoGyiWe81L3F9ynf+nkpYs3Dh9UgpW7"
"w/21mKSzWtJFzW1bbPqeREzSCRbnEtUa3V+NE+aLP912Z8H9e9tMz67ItG28LFpQcIuXV9"
"SWS2EAb+Qg4z61WAOVnQsP7Z1ZJeBq/F9WpWbjFkrW5fG36VS964fzZuW1/1jlagCx2A7H"
"WiNHF4mhBdfuKfMkDPTlcTPXWqpyR7XGSZBgkm/0FTUjlUkyz6bQS0GKTb5fksB55p+bnh"
"+e4vZFWJdjnQkuP23qKq7ZrAfkQaynNtrhKmzeoobZa1+aG4fZ3F7eHrn1exscntcqlIWX"
"Y1X/pfh6e5n99lgbTde3kN+sicq5J6Lmo5rqvoQNpnZ0q6Lq64IpZWdBxzIRiinX9RYSe+"
"HfmtcXb+7vz924vz96yLmElieVfzMuj29SUVHD8I0muXav2RcTnUb6mcY0djHREXdt9PgM"
"9SX7ARKcSS9P7XaNCvvE6NXQogx5gt8LuFTHqs2IjQH7uJtYYiX3X9dz/Fr3kKuZk/oCW7"
"eN3mZeHD/9BpOYI="
)

View File

@@ -1,12 +0,0 @@
[project]
name = "raggr"
version = "0.1.0"
description = "Add your description here"
readme = "README.md"
requires-python = ">=3.13"
dependencies = ["chromadb>=1.1.0", "python-dotenv>=1.0.0", "flask>=3.1.2", "httpx>=0.28.1", "ollama>=0.6.0", "openai>=2.0.1", "pydantic>=2.11.9", "pillow>=10.0.0", "pymupdf>=1.24.0", "black>=25.9.0", "pillow-heif>=1.1.1", "flask-jwt-extended>=4.7.1", "bcrypt>=5.0.0", "pony>=0.7.19", "flask-login>=0.6.3", "quart>=0.20.0", "tortoise-orm>=0.25.1", "quart-jwt-extended>=0.1.0", "pre-commit>=4.3.0", "tortoise-orm-stubs>=1.0.2", "aerich>=0.8.0", "tomlkit>=0.13.3", "authlib>=1.3.0", "asyncpg>=0.30.0"]
[tool.aerich]
tortoise_orm = "app.TORTOISE_CONFIG"
location = "./migrations"
src_folder = "./."

View File

@@ -1,43 +0,0 @@
import { useEffect, useState, useRef } from "react";
type MessageInputProps = {
handleQueryChange: (event: React.ChangeEvent<HTMLTextAreaElement>) => void;
handleKeyDown: (event: React.ChangeEvent<HTMLTextAreaElement>) => void;
handleQuestionSubmit: () => void;
setSimbaMode: (sdf: boolean) => void;
query: string;
}
export const MessageInput = ({query, handleKeyDown, handleQueryChange, handleQuestionSubmit, setSimbaMode}: MessageInputProps) => {
return (
<div className="flex flex-col gap-4 sticky bottom-0 bg-[#3D763A] p-6 rounded-xl">
<div className="flex flex-row justify-between grow">
<textarea
className="p-3 sm:p-4 border border-blue-200 rounded-md grow bg-[#F9F5EB] min-h-[44px] resize-y"
onChange={handleQueryChange}
onKeyDown={handleKeyDown}
value={query}
rows={2}
placeholder="Type your message... (Press Enter to send, Shift+Enter for new line)"
/>
</div>
<div className="flex flex-row justify-between gap-2 grow">
<button
className="p-3 sm:p-4 min-h-[44px] border border-blue-400 bg-[#EDA541] hover:bg-blue-400 cursor-pointer rounded-md flex-grow text-sm sm:text-base"
onClick={() => handleQuestionSubmit()}
type="submit"
>
Submit
</button>
</div>
<div className="flex flex-row justify-center gap-2 grow items-center">
<input
type="checkbox"
onChange={(event) => setSimbaMode(event.target.checked)}
className="w-5 h-5 cursor-pointer"
/>
<p className="text-sm sm:text-base">simba mode?</p>
</div>
</div>
);
}

View File

@@ -2,13 +2,12 @@
set -e
echo "Initializing directories..."
- mkdir -p /app/chromadb
- echo "Waiting for frontend to build..."
- while [ ! -f /app/raggr-frontend/dist/index.html ]; do
- sleep 1
- done
- echo "Frontend built successfully!"
+ mkdir -p /app/data/chromadb
+ echo "Rebuilding frontend..."
+ cd /app/raggr-frontend
+ yarn build
+ cd /app
echo "Setting up database..."
# Give PostgreSQL a moment to be ready (healthcheck in docker-compose handles this)
@@ -22,8 +21,5 @@ else
aerich init-db
fi
- echo "Starting reindex process..."
- python main.py "" --reindex || echo "Reindex failed, continuing anyway..."
echo "Starting Flask application in debug mode..."
python app.py

0
utils/__init__.py Normal file
View File

View File

@@ -3,7 +3,6 @@ from math import ceil
import re
from typing import Union
from uuid import UUID, uuid4
- from ollama import Client
from chromadb.utils.embedding_functions.openai_embedding_function import (
OpenAIEmbeddingFunction,
)
@@ -13,10 +12,6 @@ from llm import LLMClient
load_dotenv()
- ollama_client = Client(
- host=os.getenv("OLLAMA_HOST", "http://localhost:11434"), timeout=1.0
- )
def remove_headers_footers(text, header_patterns=None, footer_patterns=None):
if header_patterns is None:

View File

@@ -8,7 +8,7 @@ import ollama
from PIL import Image
import fitz
- from request import PaperlessNGXService
+ from .request import PaperlessNGXService
load_dotenv()
