Compare commits
32 Commits
update-fav ... f68a79bdb7
f68a79bdb7
52153cdf1e
6eb3775e0f
b3793d2d32
033429798e
733ffae8cf
0895668ddd
07512409f1
12eb110313
1a026f76a1
da3a464897
913875188a
f5e2d68cd2
70799ffb7d
7f1d4fbdda
5ebdd60ea0
289045e7d0
ceea83cb54
1b60aab97c
210bfc1476
454fb1b52c
c3f2501585
1da21fabee
dd5690ee53
5e7ac28b6f
29f8894e4a
19d1df2f68
e577cb335b
591788dfa4
561b5bddce
ddd455a4c6
07424e77e0
46
.env.example
Normal file
@@ -0,0 +1,46 @@
# Database Configuration
# PostgreSQL is recommended (required for OIDC features)
DATABASE_URL=postgres://raggr:changeme@postgres:5432/raggr

# PostgreSQL credentials (if using docker-compose postgres service)
POSTGRES_USER=raggr
POSTGRES_PASSWORD=changeme
POSTGRES_DB=raggr

# JWT Configuration
JWT_SECRET_KEY=your-secret-key-here

# Paperless Configuration
PAPERLESS_TOKEN=your-paperless-token
BASE_URL=192.168.1.5:8000

# Ollama Configuration
OLLAMA_URL=http://192.168.1.14:11434
OLLAMA_HOST=http://192.168.1.14:11434

# ChromaDB Configuration
# For Docker: This is automatically set to /app/data/chromadb
# For local development: Set to a local directory path
CHROMADB_PATH=./data/chromadb

# OpenAI Configuration
OPENAI_API_KEY=your-openai-api-key

# Immich Configuration
IMMICH_URL=http://192.168.1.5:2283
IMMICH_API_KEY=your-immich-api-key
SEARCH_QUERY=simba cat
DOWNLOAD_DIR=./simba_photos

# OIDC Configuration (Authelia)
OIDC_ISSUER=https://auth.example.com
OIDC_CLIENT_ID=simbarag
OIDC_CLIENT_SECRET=your-client-secret-here
OIDC_REDIRECT_URI=http://localhost:8080/
OIDC_USE_DISCOVERY=true

# Optional: Manual OIDC endpoints (if discovery is disabled)
# OIDC_AUTHORIZATION_ENDPOINT=https://auth.example.com/api/oidc/authorization
# OIDC_TOKEN_ENDPOINT=https://auth.example.com/api/oidc/token
# OIDC_USERINFO_ENDPOINT=https://auth.example.com/api/oidc/userinfo
# OIDC_JWKS_URI=https://auth.example.com/api/oidc/jwks
9
.gitignore
vendored
@@ -9,5 +9,12 @@ wheels/
# Virtual environments
.venv


# Environment files
.env

# Database files
chromadb/
chromadb_openai/
chroma_db/
database/
*.db
110
DEV-README.md
Normal file
@@ -0,0 +1,110 @@
# Development Environment Setup

This guide explains how to run the application in development mode with hot reload enabled.

## Quick Start

### Development Mode (Hot Reload)

```bash
# Start all services in development mode
docker-compose -f docker-compose.dev.yml up --build

# Or run in detached mode
docker-compose -f docker-compose.dev.yml up -d --build
```

### Production Mode

```bash
# Start production services
docker-compose up --build
```

## What's Different in Dev Mode?

### Backend (Quart/Flask)
- **Hot Reload**: Python code changes are automatically detected and the server restarts
- **Source Mounted**: Your local `services/raggr` directory is mounted as a volume
- **Debug Mode**: Quart runs with `debug=True` for better error messages
- **Environment**: `FLASK_ENV=development` and `PYTHONUNBUFFERED=1` for immediate log output
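
For reference, here is a minimal sketch of a reload-friendly entry point consistent with the settings above. This is an illustration only: the file name `run_dev.py` is an assumption, and `app` is the Quart application defined in `services/raggr/app.py`.

```python
# run_dev.py - hypothetical dev entry point (sketch, not part of this change set)
from app import app  # the Quart application from services/raggr/app.py

if __name__ == "__main__":
    # debug=True gives verbose tracebacks; use_reloader=True restarts the
    # server when Python files change; 0.0.0.0 exposes it on the network.
    app.run(host="0.0.0.0", port=8080, debug=True, use_reloader=True)
```
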
### Frontend (React + rsbuild)
- **Auto Rebuild**: Frontend automatically rebuilds when files change
- **Watch Mode**: rsbuild runs in watch mode, rebuilding to `dist/` on save
- **Source Mounted**: Your local `services/raggr/raggr-frontend` directory is mounted as a volume
- **Served by Backend**: Built files are served by the backend, no separate dev server

## Ports

- **Application**: 8080 (accessible at `http://localhost:8080` or `http://YOUR_IP:8080`)

The backend serves both the API and the auto-rebuilt frontend, making it accessible from other machines on your network.

## Useful Commands

```bash
# View logs
docker-compose -f docker-compose.dev.yml logs -f

# View logs for specific service
docker-compose -f docker-compose.dev.yml logs -f raggr-backend
docker-compose -f docker-compose.dev.yml logs -f raggr-frontend

# Rebuild after dependency changes
docker-compose -f docker-compose.dev.yml up --build

# Stop all services
docker-compose -f docker-compose.dev.yml down

# Stop and remove volumes (fresh start)
docker-compose -f docker-compose.dev.yml down -v
```

## Making Changes

### Backend Changes
1. Edit any Python file in `services/raggr/`
2. Save the file
3. The Quart server will automatically restart
4. Check logs to confirm reload

### Frontend Changes
1. Edit any file in `services/raggr/raggr-frontend/src/`
2. Save the file
3. The browser will automatically refresh (Hot Module Replacement)
4. No need to rebuild

### Dependency Changes

**Backend** (pyproject.toml):
```bash
# Rebuild the backend service
docker-compose -f docker-compose.dev.yml up --build raggr-backend
```

**Frontend** (package.json):
```bash
# Rebuild the frontend service
docker-compose -f docker-compose.dev.yml up --build raggr-frontend
```

## Troubleshooting

### Port Already in Use
If you see port binding errors, make sure no other services are running on ports 8080 or 3000.

### Changes Not Reflected
1. Check if the file is properly mounted (check docker-compose.dev.yml volumes)
2. Verify the file isn't in an excluded directory (`node_modules`, `__pycache__`)
3. Check container logs for errors

### Frontend Not Connecting to Backend
Make sure your frontend API calls point to the correct backend URL. If accessing from the same machine, use `http://localhost:8080`. If accessing from another device on the network, use `http://YOUR_IP:8080`.

## Notes

- Both services bind to `0.0.0.0` and expose ports, making them accessible on your network
- Node modules and Python cache are excluded from volume mounts to use container versions
- Database and ChromaDB data persist in Docker volumes across restarts
- Access the app from any device on your network using your host machine's IP address
@@ -1,15 +0,0 @@
import os

TORTOISE_ORM = {
    "connections": {"default": os.getenv("DATABASE_URL", "sqlite:///app/database/raggr.db")},
    "apps": {
        "models": {
            "models": [
                "blueprints.conversation.models",
                "blueprints.users.models",
                "aerich.models",
            ],
            "default_connection": "default",
        },
    },
}
@@ -1,72 +0,0 @@
import datetime

from quart_jwt_extended import (
    jwt_refresh_token_required,
    get_jwt_identity,
)

from quart import Blueprint, jsonify
from .models import (
    Conversation,
    PydConversation,
    PydListConversation,
)

import blueprints.users.models

conversation_blueprint = Blueprint(
    "conversation_api", __name__, url_prefix="/api/conversation"
)


@conversation_blueprint.route("/<conversation_id>")
async def get_conversation(conversation_id: str):
    conversation = await Conversation.get(id=conversation_id)
    await conversation.fetch_related("messages")

    # Manually serialize the conversation with messages
    messages = []
    for msg in conversation.messages:
        messages.append(
            {
                "id": str(msg.id),
                "text": msg.text,
                "speaker": msg.speaker.value,
                "created_at": msg.created_at.isoformat(),
            }
        )

    return jsonify(
        {
            "id": str(conversation.id),
            "name": conversation.name,
            "messages": messages,
            "created_at": conversation.created_at.isoformat(),
            "updated_at": conversation.updated_at.isoformat(),
        }
    )


@conversation_blueprint.post("/")
@jwt_refresh_token_required
async def create_conversation():
    user_uuid = get_jwt_identity()
    user = await blueprints.users.models.User.get(id=user_uuid)
    conversation = await Conversation.create(
        name=f"{user.username} {datetime.datetime.now().timestamp}",
        user=user,
    )

    serialized_conversation = await PydConversation.from_tortoise_orm(conversation)
    return jsonify(serialized_conversation.model_dump())


@conversation_blueprint.get("/")
@jwt_refresh_token_required
async def get_all_conversations():
    user_uuid = get_jwt_identity()
    user = await blueprints.users.models.User.get(id=user_uuid)
    conversations = Conversation.filter(user=user)
    serialized_conversations = await PydListConversation.from_queryset(conversations)

    return jsonify(serialized_conversations.model_dump())
@@ -1,40 +0,0 @@
from quart import Blueprint, jsonify, request
from quart_jwt_extended import (
    create_access_token,
    create_refresh_token,
    jwt_refresh_token_required,
    get_jwt_identity,
)
from .models import User


user_blueprint = Blueprint("user_api", __name__, url_prefix="/api/user")


@user_blueprint.route("/login", methods=["POST"])
async def login():
    data = await request.get_json()
    username = data.get("username")
    password = data.get("password")

    user = await User.filter(username=username).first()

    if not user or not user.verify_password(password):
        return jsonify({"msg": "Invalid credentials"}), 401

    access_token = create_access_token(identity=str(user.id))
    refresh_token = create_refresh_token(identity=str(user.id))

    return jsonify(
        access_token=access_token,
        refresh_token=refresh_token,
        user={"id": user.id, "username": user.username},
    )


@user_blueprint.route("/refresh", methods=["POST"])
@jwt_refresh_token_required
async def refresh():
    user_id = get_jwt_identity()
    new_token = create_access_token(identity=user_id)
    return jsonify(access_token=new_token)
13
classifier.py
Normal file
@@ -0,0 +1,13 @@
import os

from llm import LLMClient

USE_OPENAI = os.getenv("OLLAMA_URL")


class Classifier:
    def __init__(self):
        self.llm_client = LLMClient()

    def classify_query_by_action(self, query):
        _prompt = "Classify the query into one of the following options: "
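
The new classifier ends mid-method, with only the prompt prefix in place. A hedged sketch of how `classify_query_by_action` might be finished is below; `LLMClient`'s interface is not shown anywhere in this change set, so the `complete()` call and the action labels are assumptions, not the repository's actual API.

```python
# Sketch only: LLMClient's real methods are not visible in this diff,
# so complete() and the ACTIONS labels below are hypothetical.
from llm import LLMClient

ACTIONS = ["document_search", "web_search", "chat"]  # hypothetical labels


class Classifier:
    def __init__(self):
        self.llm_client = LLMClient()

    def classify_query_by_action(self, query):
        prompt = (
            "Classify the query into one of the following options: "
            + ", ".join(ACTIONS)
            + f"\n\nQuery: {query}\nReply with the option name only."
        )
        raw = self.llm_client.complete(prompt)  # assumed LLMClient method
        label = raw.strip().lower()
        return label if label in ACTIONS else "chat"  # default to plain chat
```
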
67
docker-compose.dev.yml
Normal file
@@ -0,0 +1,67 @@
services:
  postgres:
    image: postgres:16-alpine
    environment:
      - POSTGRES_USER=raggr
      - POSTGRES_PASSWORD=raggr_dev_password
      - POSTGRES_DB=raggr
    ports:
      - "5432:5432"
    volumes:
      - postgres_data:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U raggr"]
      interval: 5s
      timeout: 5s
      retries: 5

  raggr:
    build:
      context: ./services/raggr
      dockerfile: Dockerfile.dev
    image: torrtle/simbarag:dev
    ports:
      - "8080:8080"
    env_file:
      - .env
    environment:
      - PAPERLESS_TOKEN=${PAPERLESS_TOKEN}
      - BASE_URL=${BASE_URL}
      - OLLAMA_URL=${OLLAMA_URL:-http://localhost:11434}
      - CHROMADB_PATH=/app/data/chromadb
      - OPENAI_API_KEY=${OPENAI_API_KEY}
      - JWT_SECRET_KEY=${JWT_SECRET_KEY}
      - OIDC_ISSUER=${OIDC_ISSUER}
      - OIDC_CLIENT_ID=${OIDC_CLIENT_ID}
      - OIDC_CLIENT_SECRET=${OIDC_CLIENT_SECRET}
      - OIDC_REDIRECT_URI=${OIDC_REDIRECT_URI}
      - OIDC_USE_DISCOVERY=${OIDC_USE_DISCOVERY:-true}
      - DATABASE_URL=postgres://raggr:raggr_dev_password@postgres:5432/raggr
      - FLASK_ENV=development
      - PYTHONUNBUFFERED=1
      - NODE_ENV=development
      - TAVILY_KEY=${TAVILY_KEY}
    depends_on:
      postgres:
        condition: service_healthy
    volumes:
      - chromadb_data:/app/data/chromadb
    develop:
      watch:
        # Sync+restart on any file change under services/raggr
        - action: sync+restart
          path: ./services/raggr
          target: /app
          ignore:
            - __pycache__/
            - "*.pyc"
            - "*.pyo"
            - "*.pyd"
            - .git/
            - chromadb/
            - node_modules/
            - raggr-frontend/dist/

volumes:
  chromadb_data:
  postgres_data:
@@ -1,19 +1,51 @@
version: "3.8"

services:
  postgres:
    image: postgres:16-alpine
    ports:
      - "5432:5432"
    environment:
      - POSTGRES_USER=${POSTGRES_USER:-raggr}
      - POSTGRES_PASSWORD=${POSTGRES_PASSWORD:-changeme}
      - POSTGRES_DB=${POSTGRES_DB:-raggr}
    volumes:
      - postgres_data:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U ${POSTGRES_USER:-raggr}"]
      interval: 10s
      timeout: 5s
      retries: 5
    restart: unless-stopped

  raggr:
    build:
      context: ./services/raggr
      dockerfile: Dockerfile
    image: torrtle/simbarag:latest
    network_mode: host
    ports:
      - "8080:8080"
    environment:
      - PAPERLESS_TOKEN=${PAPERLESS_TOKEN}
      - BASE_URL=${BASE_URL}
      - OLLAMA_URL=${OLLAMA_URL:-http://localhost:11434}
      - CHROMADB_PATH=/app/chromadb
      - CHROMADB_PATH=/app/data/chromadb
      - OPENAI_API_KEY=${OPENAI_API_KEY}
      - JWT_SECRET_KEY=${JWT_SECRET_KEY}
      - OIDC_ISSUER=${OIDC_ISSUER}
      - OIDC_CLIENT_ID=${OIDC_CLIENT_ID}
      - OIDC_CLIENT_SECRET=${OIDC_CLIENT_SECRET}
      - OIDC_REDIRECT_URI=${OIDC_REDIRECT_URI}
      - OIDC_USE_DISCOVERY=${OIDC_USE_DISCOVERY:-true}
      - DATABASE_URL=${DATABASE_URL:-postgres://raggr:changeme@postgres:5432/raggr}
      - TAVILY_KEY=${TAVILY_KEY}
    depends_on:
      postgres:
        condition: service_healthy
    volumes:
      - chromadb_data:/app/chromadb
      - database_data:/app/database
      - chromadb_data:/app/data/chromadb
    restart: unless-stopped

volumes:
  chromadb_data:
  database_data:
  postgres_data:
@@ -1,63 +0,0 @@
from tortoise import BaseDBAsyncClient

RUN_IN_TRANSACTION = True


async def upgrade(db: BaseDBAsyncClient) -> str:
    return """
        CREATE TABLE IF NOT EXISTS "conversations" (
            "id" CHAR(36) NOT NULL PRIMARY KEY,
            "name" VARCHAR(255) NOT NULL,
            "created_at" TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
            "updated_at" TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP
        );
        CREATE TABLE IF NOT EXISTS "conversation_messages" (
            "id" CHAR(36) NOT NULL PRIMARY KEY,
            "text" TEXT NOT NULL,
            "created_at" TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
            "speaker" VARCHAR(10) NOT NULL /* USER: user\nSIMBA: simba */,
            "conversation_id" CHAR(36) NOT NULL REFERENCES "conversations" ("id") ON DELETE CASCADE
        );
        CREATE TABLE IF NOT EXISTS "users" (
            "id" CHAR(36) NOT NULL PRIMARY KEY,
            "username" VARCHAR(255) NOT NULL,
            "password" BLOB NOT NULL,
            "email" VARCHAR(100) NOT NULL UNIQUE,
            "created_at" TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
            "updated_at" TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP
        );
        CREATE TABLE IF NOT EXISTS "aerich" (
            "id" INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
            "version" VARCHAR(255) NOT NULL,
            "app" VARCHAR(100) NOT NULL,
            "content" JSON NOT NULL
        );"""


async def downgrade(db: BaseDBAsyncClient) -> str:
    return """
    """


MODELS_STATE = (
"eJztmG1v4jgQx79KlFddaa9q2W53VZ1OCpTecrvACcLdPtwqMskAVhMnazvboorvfrbJE4"
"kJpWq3UPGmhRkPtn8ztv/2nRmEHvjsuBWSn0AZ4jgk5oVxZxIUgPig9b82TBRFuVcaOBr7"
"KsAttFQeNGacIpcL5wT5DITJA+ZSHCWdkdj3pTF0RUNMprkpJvhHDA4Pp8BnQIXj23dhxs"
"SDW2Dp1+jamWDwvZVxY0/2rewOn0fKNhp1Lq9US9nd2HFDPw5I3jqa81lIsuZxjL1jGSN9"
"UyBAEQevMA05ymTaqWk5YmHgNIZsqF5u8GCCYl/CMH+fxMSVDAzVk/xz9oe5BR6BWqLFhE"
"sWd4vlrPI5K6spu2p9sAZHb85fqVmGjE+pcioi5kIFIo6WoYprDlL9r6BszRDVo0zbl2CK"
"gT4EY2rIOeY1lIJMAT2MmhmgW8cHMuUz8bXx9m0Nxn+sgSIpWimUoajrZdX3Eldj6ZNIc4"
"QuBTllB/EqyEvh4TgAPczVyBJSLwk9Tj/sKGAxB69P/HmyCGr42p1ue2hb3b/lTALGfvgK"
"kWW3paehrPOS9ei8lIrsR4x/O/YHQ341vvZ77XLtZ+3sr6YcE4p56JDwxkFeYb2m1hTMSm"
"LjyHtgYlcjD4l91sSqwcuTZHJd2AKlYYzc6xtEPWfFUzgdgTE0BVZNfzOJvPo4AD87NkuJ"
"1hyu3eUv7mbGF2kZp9YivLARrqNXdQWNoGxBRMzbS/qWPdXQ2aBQChDvJ1ScYiIPgmWvBQ"
"uHW812bAurHmXafl8ES9022/5sr+ywqSw56lqfX63ssp/6vT/T5gUZ0/rUbx7Uy0s85Krq"
"hUWAroHqxX2bxIHKakfgQMSFSnYL4c+8dMzRsD24MGIG9D8y7HSb1oXBcDBG5gNuAKcn97"
"gAnJ6s1f/SVVpAxYNmu21eE/qYe/6zblYbtviKHtMDrdK8CingKfkI80r9bpZfO02xoruE"
"maKbTEzoykV8EJMEvlzY1rBlXbbNxXpt+5RKbsSUJKpIN2Wv1WpyaR+02f5rM5nHbR+Uij"
"H7otF+waNShBi7CammMpuYIDrXwyxGlWCO53x5/9k9nDX0mlKwFvWWYNbs9KzBF73mTdsX"
"C7f5xW5bJbwQIOxvU6ZZwOPU6OYl/5gVenpyP9VTJ3uquudwcXiZF4fDs+eLSOy2z55PKQ"
"0toNid6cRh4qmVhyhvszP6sEPWvDdp5aHU9KVqTxL2rIeEemr9rXF69u7s/Zvzs/eiiRpJ"
"ZnlXU/2dnr1BDsrLivYOt/6YLYQcxGAGUi6NLSAmzfcT4NNolZBwIJrz7K9hv7f2bSYNKY"
"EcETHBbx52+WvDx4x/302sNRTlrOsfkstvxqXDSP5AU/eK8yuPl8X/Etg7Fw=="
)
@@ -1,60 +0,0 @@
from tortoise import BaseDBAsyncClient

RUN_IN_TRANSACTION = True


async def upgrade(db: BaseDBAsyncClient) -> str:
    return """
        -- SQLite doesn't support ADD CONSTRAINT, so we need to recreate the table
        CREATE TABLE "conversations_new" (
            "id" CHAR(36) NOT NULL PRIMARY KEY,
            "name" VARCHAR(255) NOT NULL,
            "created_at" TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
            "updated_at" TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
            "user_id" CHAR(36),
            FOREIGN KEY ("user_id") REFERENCES "users" ("id") ON DELETE CASCADE
        );
        INSERT INTO "conversations_new" ("id", "name", "created_at", "updated_at")
        SELECT "id", "name", "created_at", "updated_at" FROM "conversations";
        DROP TABLE "conversations";
        ALTER TABLE "conversations_new" RENAME TO "conversations";"""


async def downgrade(db: BaseDBAsyncClient) -> str:
    return """
        -- Recreate table without user_id column
        CREATE TABLE "conversations_new" (
            "id" CHAR(36) NOT NULL PRIMARY KEY,
            "name" VARCHAR(255) NOT NULL,
            "created_at" TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
            "updated_at" TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP
        );
        INSERT INTO "conversations_new" ("id", "name", "created_at", "updated_at")
        SELECT "id", "name", "created_at", "updated_at" FROM "conversations";
        DROP TABLE "conversations";
        ALTER TABLE "conversations_new" RENAME TO "conversations";"""


MODELS_STATE = (
"eJztmWtP2zAUhv9KlE8gbQg6xhCaJqWlbB20ndp0F9gUuYnbWiROiJ1Bhfjvs91cnMRNKe"
"PSon6B9vic2H5s57w+vdU934Eu2Wn4+C8MCaDIx/qRdqtj4EH2Qdn+RtNBEGSt3EDB0BUB"
"tuQpWsCQ0BDYlDWOgEsgMzmQ2CEK4s5w5Lrc6NvMEeFxZoowuoqgRf0xpBMYsoaLP8yMsA"
"NvIEm+BpfWCEHXyY0bObxvYbfoNBC2waB1fCI8eXdDy/bdyMOZdzClEx+n7lGEnB0ew9vG"
"EMMQUOhI0+CjjKedmGYjZgYaRjAdqpMZHDgCkcth6B9HEbY5A030xP/sf9KXwMNQc7QIU8"
"7i9m42q2zOwqrzrhpfjN7Wu4NtMUuf0HEoGgUR/U4EAgpmoYJrBlL8L6FsTECoRpn4F2Cy"
"gT4EY2LIOGZ7KAGZAHoYNd0DN5YL8ZhO2Nfa+/cVGL8bPUGSeQmUPtvXs13fiZtqszaONE"
"Noh5BP2QK0DPKYtVDkQTXMfGQBqROH7iQfVhQwm4PTxe40PgQVfM1Wu9k3jfY3PhOPkCtX"
"IDLMJm+pCeu0YN06KCxF+hDtR8v8ovGv2nm30yzu/dTPPNf5mEBEfQv71xZwpPOaWBMwuY"
"WNAueBC5uP3Czsiy5sPHhpXQkMreUyiBTyH2kkHtszLuLDkwZPvaNLZc7gMMrwTvwQojE+"
"hVOBsMXGAbCtShax6BjEj1lVaJk1G0UIrlM1Im8KNjs2J0hn2dPoN4zjpi4YDoF9eQ1Cx5"
"oD04OEgDEkZaD1OPLktAfdVJqpWcoCrj174mq+VeaxFaz8mi8xytErN3k1r2gBmM3bifvm"
"PVXQWaCCJYj3E8OWvJAbUbzWopjCG0XKN5lVjTLxXxdRXJXKmz/NXBZPpO9W2/i5ncvkZ9"
"3O58RdksqNs259o5Bfo5AqK2QSQHCpEgP8AtnEkVeSArnVlcJf+Ojog36zd6TxjP4b91vt"
"unGkEeQNgX6/Jc7dMvd273HJ3Nude8fkTYUDJCea5V7zitDHfOevqYS1CwWv/5SyxfrZyl"
"JcqGkV22VZbfuUSk7cGRTSLblLzNdq/GhvtNn6azO+jssWLeWYddFoz1C4DAAh136o2Jl1"
"hEE4VcOUowowh1M6u/+sHs4KenUuWGW9xZjVWx2j90uteRN/eePWf5lNo4AXegC5y2zTNO"
"Bx9ujiI/+YO3Rv936qp0r2lHXP5uLwOi8Om9L6q1jYtHJXEoCLyp6l35Efp/a5VvXkJ615"
"GjBE9kRXaOW4pVItg8xnZeRyC88pvynVMsdc2Azxyr9ozhSV57e1vf0P+4fvDvYPmYsYSW"
"r5UPEyaHXMBeqYHwTllXa+6pBCNto4BcmPxhIQY/f1BPg00s3HFGJFev/a73bmlqqSkALI"
"AWYTvHCQTd9oLiL0z2piraDIZ11dVy+W0Au5mT+gripqPWch5u4f/FVgYA=="
)
@@ -1,12 +0,0 @@
[project]
name = "raggr"
version = "0.1.0"
description = "Add your description here"
readme = "README.md"
requires-python = ">=3.13"
dependencies = ["chromadb>=1.1.0", "python-dotenv>=1.0.0", "flask>=3.1.2", "httpx>=0.28.1", "ollama>=0.6.0", "openai>=2.0.1", "pydantic>=2.11.9", "pillow>=10.0.0", "pymupdf>=1.24.0", "black>=25.9.0", "pillow-heif>=1.1.1", "flask-jwt-extended>=4.7.1", "bcrypt>=5.0.0", "pony>=0.7.19", "flask-login>=0.6.3", "quart>=0.20.0", "tortoise-orm>=0.25.1", "quart-jwt-extended>=0.1.0", "pre-commit>=4.3.0", "tortoise-orm-stubs>=1.0.2", "aerich>=0.8.0", "tomlkit>=0.13.3"]

[tool.aerich]
tortoise_orm = "app.TORTOISE_CONFIG"
location = "./migrations"
src_folder = "./."
@@ -1,228 +0,0 @@
import { useEffect, useState } from "react";
import { conversationService } from "../api/conversationService";
import { QuestionBubble } from "./QuestionBubble";
import { AnswerBubble } from "./AnswerBubble";
import { ConversationList } from "./ConversationList";
import { parse } from "node:path/win32";

type Message = {
  text: string;
  speaker: "simba" | "user";
};

type QuestionAnswer = {
  question: string;
  answer: string;
};

type Conversation = {
  title: string;
  id: string;
};

type ChatScreenProps = {
  setAuthenticated: (isAuth: boolean) => void;
};

export const ChatScreen = ({ setAuthenticated }: ChatScreenProps) => {
  const [query, setQuery] = useState<string>("");
  const [answer, setAnswer] = useState<string>("");
  const [simbaMode, setSimbaMode] = useState<boolean>(false);
  const [questionsAnswers, setQuestionsAnswers] = useState<QuestionAnswer[]>(
    [],
  );
  const [messages, setMessages] = useState<Message[]>([]);
  const [conversations, setConversations] = useState<Conversation[]>([
    { title: "simba meow meow", id: "uuid" },
  ]);
  const [showConversations, setShowConversations] = useState<boolean>(false);
  const [selectedConversation, setSelectedConversation] =
    useState<Conversation | null>(null);

  const simbaAnswers = ["meow.", "hiss...", "purrrrrr", "yowOWROWWowowr"];

  const handleSelectConversation = (conversation: Conversation) => {
    setShowConversations(false);
    setSelectedConversation(conversation);
    const loadMessages = async () => {
      try {
        const fetchedConversation = await conversationService.getConversation(
          conversation.id,
        );
        setMessages(
          fetchedConversation.messages.map((message) => ({
            text: message.text,
            speaker: message.speaker,
          })),
        );
      } catch (error) {
        console.error("Failed to load messages:", error);
      }
    };
    loadMessages();
  };

  const loadConversations = async () => {
    try {
      const fetchedConversations =
        await conversationService.getAllConversations();
      const parsedConversations = fetchedConversations.map((conversation) => ({
        id: conversation.id,
        title: conversation.name,
      }));
      setConversations(parsedConversations);
      setSelectedConversation(parsedConversations[0]);
      console.log(parsedConversations);
    } catch (error) {
      console.error("Failed to load messages:", error);
    }
  };

  const handleCreateNewConversation = async () => {
    const newConversation = await conversationService.createConversation();
    await loadConversations();
    setSelectedConversation({
      title: newConversation.name,
      id: newConversation.id,
    });
  };

  useEffect(() => {
    loadConversations();
  }, []);

  useEffect(() => {
    const loadMessages = async () => {
      if (selectedConversation == null) return;
      try {
        const conversation = await conversationService.getConversation(
          selectedConversation.id,
        );
        setMessages(
          conversation.messages.map((message) => ({
            text: message.text,
            speaker: message.speaker,
          })),
        );
      } catch (error) {
        console.error("Failed to load messages:", error);
      }
    };
    loadMessages();
  }, [selectedConversation]);

  const handleQuestionSubmit = async () => {
    const currMessages = messages.concat([{ text: query, speaker: "user" }]);
    setMessages(currMessages);

    if (simbaMode) {
      console.log("simba mode activated");
      const randomIndex = Math.floor(Math.random() * simbaAnswers.length);
      const randomElement = simbaAnswers[randomIndex];
      setAnswer(randomElement);
      setQuestionsAnswers(
        questionsAnswers.concat([
          {
            question: query,
            answer: randomElement,
          },
        ]),
      );
      return;
    }

    try {
      const result = await conversationService.sendQuery(
        query,
        selectedConversation.id,
      );
      setQuestionsAnswers(
        questionsAnswers.concat([{ question: query, answer: result.response }]),
      );
      setMessages(
        currMessages.concat([{ text: result.response, speaker: "simba" }]),
      );
      setQuery(""); // Clear input after successful send
    } catch (error) {
      console.error("Failed to send query:", error);
      // If session expired, redirect to login
      if (error instanceof Error && error.message.includes("Session expired")) {
        setAuthenticated(false);
      }
    }
  };

  const handleQueryChange = (event: React.ChangeEvent<HTMLTextAreaElement>) => {
    setQuery(event.target.value);
  };

  return (
    <div className="h-screen bg-opacity-20">
      <div className="bg-white/85 h-screen">
        <div className="flex flex-row justify-center py-4">
          <div className="flex flex-col gap-4 min-w-xl max-w-xl">
            <div className="flex flex-row justify-between">
              <header className="flex flex-row justify-center gap-2 sticky top-0 z-10 bg-white">
                <h1 className="text-3xl">ask simba!</h1>
              </header>
              <div className="flex flex-row gap-2">
                <button
                  className="p-2 border border-green-400 bg-green-200 hover:bg-green-400 cursor-pointer rounded-md"
                  onClick={() => setShowConversations(!showConversations)}
                >
                  {showConversations
                    ? "hide conversations"
                    : "show conversations"}
                </button>
                <button
                  className="p-2 border border-red-400 bg-red-200 hover:bg-red-400 cursor-pointer rounded-md"
                  onClick={() => setAuthenticated(false)}
                >
                  logout
                </button>
              </div>
            </div>
            {showConversations && (
              <ConversationList
                conversations={conversations}
                onCreateNewConversation={handleCreateNewConversation}
                onSelectConversation={handleSelectConversation}
              />
            )}
            {messages.map((msg, index) => {
              if (msg.speaker === "simba") {
                return <AnswerBubble key={index} text={msg.text} />;
              }
              return <QuestionBubble key={index} text={msg.text} />;
            })}
            <footer className="flex flex-col gap-2 sticky bottom-0">
              <div className="flex flex-row justify-between gap-2 grow">
                <textarea
                  className="p-4 border border-blue-200 rounded-md grow bg-white"
                  onChange={handleQueryChange}
                  value={query}
                />
              </div>
              <div className="flex flex-row justify-between gap-2 grow">
                <button
                  className="p-4 border border-blue-400 bg-blue-200 hover:bg-blue-400 cursor-pointer rounded-md flex-grow"
                  onClick={() => handleQuestionSubmit()}
                  type="submit"
                >
                  Submit
                </button>
              </div>
              <div className="flex flex-row justify-center gap-2 grow">
                <input
                  type="checkbox"
                  onChange={(event) => setSimbaMode(event.target.checked)}
                />
                <p>simba mode?</p>
              </div>
            </footer>
          </div>
        </div>
      </div>
    </div>
  );
};
@@ -1,80 +0,0 @@
import { useState } from "react";
import { userService } from "../api/userService";

type LoginScreenProps = {
  setAuthenticated: (isAuth: boolean) => void;
};

export const LoginScreen = ({ setAuthenticated }: LoginScreenProps) => {
  const [username, setUsername] = useState<string>("");
  const [password, setPassword] = useState<string>("");
  const [error, setError] = useState<string>("");

  const handleLogin = async () => {
    if (!username || !password) {
      setError("Please enter username and password");
      return;
    }

    try {
      const result = await userService.login(username, password);
      localStorage.setItem("access_token", result.access_token);
      localStorage.setItem("refresh_token", result.refresh_token);
      setAuthenticated(true);
      setError("");
    } catch (err) {
      setError("Login failed. Please check your credentials.");
      console.error("Login error:", err);
    }
  };

  return (
    <div className="h-screen bg-opacity-20">
      <div className="bg-white/85 h-screen">
        <div className="flex flex-row justify-center py-4">
          <div className="flex flex-col gap-4 min-w-xl max-w-xl">
            <div className="flex flex-col gap-1">
              <div className="flex flex-grow justify-center w-full bg-amber-400">
                <h1 className="text-xl font-bold">
                  I AM LOOKING FOR A DESIGNER. THIS APP WILL REMAIN UGLY UNTIL A
                  DESIGNER COMES.
                </h1>
              </div>
              <header className="flex flex-row justify-center gap-2 grow sticky top-0 z-10 bg-white">
                <h1 className="text-3xl">ask simba!</h1>
              </header>
              <label htmlFor="username">username</label>
              <input
                type="text"
                id="username"
                name="username"
                value={username}
                onChange={(e) => setUsername(e.target.value)}
                className="border border-s-slate-950 p-3 rounded-md"
              />
              <label htmlFor="password">password</label>
              <input
                type="password"
                id="password"
                name="password"
                value={password}
                onChange={(e) => setPassword(e.target.value)}
                className="border border-s-slate-950 p-3 rounded-md"
              />
              {error && (
                <div className="text-red-600 font-semibold">{error}</div>
              )}
            </div>

            <button
              className="p-4 border border-blue-400 bg-blue-200 hover:bg-blue-400 cursor-pointer rounded-md flex-grow"
              onClick={handleLogin}
            >
              login
            </button>
          </div>
        </div>
      </div>
    </div>
  );
};
@@ -1,7 +0,0 @@
type QuestionBubbleProps = {
  text: string;
};

export const QuestionBubble = ({ text }: QuestionBubbleProps) => {
  return <div className="rounded-md bg-stone-200 p-3">🤦: {text}</div>;
};
@@ -24,7 +24,6 @@ RUN uv pip install --system -e .
# Copy application code
COPY *.py ./
COPY blueprints ./blueprints
COPY aerich.toml ./
COPY migrations ./migrations
COPY startup.sh ./
RUN chmod +x startup.sh
53
services/raggr/Dockerfile.dev
Normal file
@@ -0,0 +1,53 @@
FROM python:3.13-slim

WORKDIR /app

# Install system dependencies, Node.js, uv, and yarn
RUN apt-get update && apt-get install -y \
    build-essential \
    curl \
    && curl -fsSL https://deb.nodesource.com/setup_20.x | bash - \
    && apt-get install -y nodejs \
    && npm install -g yarn \
    && rm -rf /var/lib/apt/lists/* \
    && curl -LsSf https://astral.sh/uv/install.sh | sh

# Add uv to PATH
ENV PATH="/root/.local/bin:$PATH"

# Copy dependency files
COPY pyproject.toml ./

# Install Python dependencies using uv
RUN uv pip install --system -e .

# Copy frontend package files and install dependencies
COPY raggr-frontend/package.json raggr-frontend/yarn.lock* raggr-frontend/
WORKDIR /app/raggr-frontend
RUN yarn install

# Copy application source code
WORKDIR /app
COPY . .

# Build frontend
WORKDIR /app/raggr-frontend
RUN yarn build

# Create ChromaDB and database directories
WORKDIR /app
RUN mkdir -p /app/chromadb /app/database

# Make startup script executable
RUN chmod +x /app/startup-dev.sh

# Set environment variables
ENV PYTHONPATH=/app
ENV CHROMADB_PATH=/app/chromadb
ENV PYTHONUNBUFFERED=1

# Expose port
EXPOSE 8080

# Default command
CMD ["/app/startup-dev.sh"]
97
services/raggr/VECTORSTORE.md
Normal file
@@ -0,0 +1,97 @@
# Vector Store Management

This document describes how to manage the ChromaDB vector store used for RAG (Retrieval-Augmented Generation).

## Configuration

The vector store location is controlled by the `CHROMADB_PATH` environment variable:

- **Development (local)**: Set in `.env` to a local path (e.g., `/path/to/chromadb`)
- **Docker**: Automatically set to `/app/data/chromadb` and persisted via Docker volume

## Management Commands

### CLI (Command Line)

Use the `manage_vectorstore.py` script for vector store operations:

```bash
# Show statistics
python manage_vectorstore.py stats

# Index documents from Paperless-NGX (incremental)
python manage_vectorstore.py index

# Clear and reindex all documents
python manage_vectorstore.py reindex

# List documents
python manage_vectorstore.py list 10
python manage_vectorstore.py list 20 --show-content
```

### Docker

Run commands inside the Docker container:

```bash
# Show statistics
docker compose -f docker-compose.dev.yml exec -T raggr python manage_vectorstore.py stats

# Reindex all documents
docker compose -f docker-compose.dev.yml exec -T raggr python manage_vectorstore.py reindex
```

### API Endpoints

The following authenticated endpoints are available:

- `GET /api/rag/stats` - Get vector store statistics
- `POST /api/rag/index` - Trigger indexing of new documents
- `POST /api/rag/reindex` - Clear and reindex all documents
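
A hypothetical call against the stats endpoint is shown below. The host and port come from the dev setup; since these routes are decorated with `@jwt_refresh_token_required`, the refresh token returned by `/api/user/login` goes in the Authorization header.

```python
# Usage sketch; the token value is a placeholder, not a real credential.
import httpx

refresh_token = "<refresh-token-from-/api/user/login>"
resp = httpx.get(
    "http://localhost:8080/api/rag/stats",
    headers={"Authorization": f"Bearer {refresh_token}"},
)
print(resp.json())
```
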
## How It Works

1. **Document Fetching**: Documents are fetched from Paperless-NGX via the API
2. **Chunking**: Documents are split into chunks of ~1000 characters with 200 character overlap
3. **Embedding**: Chunks are embedded using OpenAI's `text-embedding-3-large` model
4. **Storage**: Embeddings are stored in ChromaDB with metadata (filename, document type, date)
5. **Retrieval**: User queries are embedded and similar chunks are retrieved for RAG
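
A minimal sketch of steps 2-4, mirroring `services/raggr/blueprints/rag/logic.py` from this change set (note that the code there embeds with `text-embedding-3-small`); the `index_text` helper is illustrative rather than an existing function:

```python
import os

from langchain_chroma import Chroma
from langchain_openai import OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter

embeddings = OpenAIEmbeddings(model="text-embedding-3-small")
vector_store = Chroma(
    collection_name="simba_docs",
    embedding_function=embeddings,
    persist_directory=os.getenv("CHROMADB_PATH", ""),
)
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)


def index_text(doc_id: str, text: str, metadata: dict) -> None:
    """Illustrative helper: chunk one document and store it with metadata."""
    chunks = splitter.split_text(text)
    vector_store.add_texts(
        texts=chunks,
        metadatas=[{**metadata, "doc_id": doc_id} for _ in chunks],
    )
```
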
## Troubleshooting

### "Error creating hnsw segment reader"

This indicates a corrupted index. Solution:

```bash
python manage_vectorstore.py reindex
```

### Empty results

Check if documents are indexed:

```bash
python manage_vectorstore.py stats
```

If count is 0, run:

```bash
python manage_vectorstore.py index
```

### Different results in Docker vs local

Docker and local environments use separate ChromaDB instances. To sync:

1. Index inside Docker: `docker compose exec -T raggr python manage_vectorstore.py reindex`
2. Or mount the same volume for both environments

## Production Considerations

1. **Volume Persistence**: Use Docker volumes or persistent storage for ChromaDB
2. **Backup**: Regularly backup the ChromaDB data directory
3. **Reindexing**: Schedule periodic reindexing to keep data fresh
4. **Monitoring**: Monitor the `/api/rag/stats` endpoint for document counts
@@ -1,16 +1,27 @@
# GENERATED BY CLAUDE

import os
import sys
import uuid
import asyncio
from tortoise import Tortoise
from blueprints.users.models import User

from dotenv import load_dotenv

load_dotenv()

# Database configuration with environment variable support
DATABASE_PATH = os.getenv("DATABASE_PATH", "database/raggr.db")
DATABASE_URL = os.getenv("DATABASE_URL", f"sqlite://{DATABASE_PATH}")

print(DATABASE_URL)


async def add_user(username: str, email: str, password: str):
    """Add a new user to the database"""
    await Tortoise.init(
        db_url="sqlite://database/raggr.db",
        db_url=DATABASE_URL,
        modules={
            "models": [
                "blueprints.users.models",
@@ -56,7 +67,7 @@ async def add_user(username: str, email: str, password: str):
async def list_users():
    """List all users in the database"""
    await Tortoise.init(
        db_url="sqlite://database/raggr.db",
        db_url=DATABASE_URL,
        modules={
            "models": [
                "blueprints.users.models",
@@ -94,6 +105,11 @@ def print_usage():
    print("\nExamples:")
    print("  python add_user.py add ryan ryan@example.com mypassword123")
    print("  python add_user.py list")
    print("\nEnvironment Variables:")
    print("  DATABASE_PATH - Path to database file (default: database/raggr.db)")
    print("  DATABASE_URL - Full database URL (overrides DATABASE_PATH)")
    print("\n  Example with custom database:")
    print("  DATABASE_PATH=dev.db python add_user.py list")


async def main():
20
services/raggr/aerich_config.py
Normal file
@@ -0,0 +1,20 @@
import os

# Database configuration with environment variable support
# Use DATABASE_PATH for relative paths or DATABASE_URL for full connection strings
DATABASE_PATH = os.getenv("DATABASE_PATH", "database/raggr.db")
DATABASE_URL = os.getenv("DATABASE_URL", f"sqlite://{DATABASE_PATH}")

TORTOISE_ORM = {
    "connections": {"default": DATABASE_URL},
    "apps": {
        "models": {
            "models": [
                "blueprints.conversation.models",
                "blueprints.users.models",
                "aerich.models",
            ],
            "default_connection": "default",
        },
    },
}
@@ -1,16 +1,15 @@
import os

from quart import Quart, request, jsonify, render_template, send_from_directory
from quart import Quart, jsonify, render_template, request, send_from_directory
from quart_jwt_extended import JWTManager, get_jwt_identity, jwt_refresh_token_required
from tortoise.contrib.quart import register_tortoise

from quart_jwt_extended import JWTManager, jwt_refresh_token_required, get_jwt_identity

from main import consult_simba_oracle

import blueprints.users
import blueprints.conversation
import blueprints.conversation.logic
import blueprints.rag
import blueprints.users
import blueprints.users.models
from main import consult_simba_oracle

app = Quart(
    __name__,
@@ -24,10 +23,16 @@ jwt = JWTManager(app)
# Register blueprints
app.register_blueprint(blueprints.users.user_blueprint)
app.register_blueprint(blueprints.conversation.conversation_blueprint)
app.register_blueprint(blueprints.rag.rag_blueprint)


# Database configuration with environment variable support
DATABASE_URL = os.getenv(
    "DATABASE_URL", "postgres://raggr:raggr_dev_password@localhost:5432/raggr"
)

TORTOISE_CONFIG = {
    "connections": {"default": "sqlite://database/raggr.db"},
    "connections": {"default": DATABASE_URL},
    "apps": {
        "models": {
            "models": [
@@ -119,10 +124,17 @@ async def get_messages():
            }
        )

    name = conversation.name
    if len(messages) > 8:
        name = await blueprints.conversation.logic.rename_conversation(
            user=user,
            conversation=conversation,
        )

    return jsonify(
        {
            "id": str(conversation.id),
            "name": conversation.name,
            "name": name,
            "messages": messages,
            "created_at": conversation.created_at.isoformat(),
            "updated_at": conversation.updated_at.isoformat(),
172
services/raggr/blueprints/conversation/__init__.py
Normal file
@@ -0,0 +1,172 @@
import datetime

from quart import Blueprint, jsonify, request
from quart_jwt_extended import (
    get_jwt_identity,
    jwt_refresh_token_required,
)

import blueprints.users.models

from .agents import main_agent
from .logic import (
    add_message_to_conversation,
    get_conversation_by_id,
    rename_conversation,
)
from .models import (
    Conversation,
    PydConversation,
    PydListConversation,
)

conversation_blueprint = Blueprint(
    "conversation_api", __name__, url_prefix="/api/conversation"
)


@conversation_blueprint.post("/query")
@jwt_refresh_token_required
async def query():
    current_user_uuid = get_jwt_identity()
    user = await blueprints.users.models.User.get(id=current_user_uuid)
    data = await request.get_json()
    query = data.get("query")
    conversation_id = data.get("conversation_id")
    conversation = await get_conversation_by_id(conversation_id)
    await conversation.fetch_related("messages")
    await add_message_to_conversation(
        conversation=conversation,
        message=query,
        speaker="user",
        user=user,
    )

    # Build conversation history from recent messages (last 10 for context)
    recent_messages = (
        conversation.messages[-10:]
        if len(conversation.messages) > 10
        else conversation.messages
    )

    messages_payload = [
        {
            "role": "system",
            "content": """You are a helpful cat assistant named Simba that understands veterinary terms. When there are questions to you specifically, they are referring to Simba the cat. Answer the user in as if you were a cat named Simba. Don't act too catlike. Be assertive.

SIMBA FACTS (as of January 2026):
- Name: Simba
- Species: Feline (Domestic Short Hair / American Short Hair)
- Sex: Male, Neutered
- Date of Birth: August 8, 2016 (approximately 9 years 5 months old)
- Color: Orange
- Current Weight: 16 lbs (as of 1/8/2026)
- Owner: Ryan Chen
- Location: Long Island City, NY
- Veterinarian: Court Square Animal Hospital

Medical Conditions:
- Hypertrophic Cardiomyopathy (HCM): Diagnosed 12/11/2025. Concentric left ventricular hypertrophy with no left atrial dilation. Grade II-III/VI systolic heart murmur. No cardiac medications currently needed. Must avoid Domitor, acepromazine, and ketamine during anesthesia.
- Dental Issues: Prior extraction of teeth 307 and 407 due to resorption. Tooth 107 extracted on 1/8/2026. Early resorption lesions present on teeth 207, 309, and 409.

Recent Medical Events:
- 1/8/2026: Dental cleaning and tooth 107 extraction. Prescribed Onsior for 3 days. Oravet sealant applied.
- 12/11/2025: Echocardiogram confirming HCM diagnosis. Pre-op bloodwork was normal.
- 12/1/2025: Visited for decreased appetite/nausea. Received subcutaneous fluids and Cerenia.

Diet & Lifestyle:
- Diet: Hill's I/D wet and dry food
- Supplements: Plaque Off
- Indoor only cat, only pet in the household

Upcoming Appointments:
- Rabies Vaccine: Due 2/19/2026
- Routine Examination: Due 6/1/2026
- FVRCP-3yr Vaccine: Due 10/2/2026

IMPORTANT: When users ask factual questions about Simba's health, medical history, veterinary visits, medications, weight, or any information that would be in documents, you MUST use the simba_search tool to retrieve accurate information before answering. Do not rely on general knowledge - always search the documents for factual questions.""",
        }
    ]

    # Add recent conversation history
    for msg in recent_messages[:-1]:  # Exclude the message we just added
        role = "user" if msg.speaker == "user" else "assistant"
        messages_payload.append({"role": role, "content": msg.text})

    # Add current query
    messages_payload.append({"role": "user", "content": query})

    payload = {"messages": messages_payload}

    response = await main_agent.ainvoke(payload)
    message = response.get("messages", [])[-1].content
    await add_message_to_conversation(
        conversation=conversation,
        message=message,
        speaker="simba",
        user=user,
    )
    return jsonify({"response": message})


@conversation_blueprint.route("/<conversation_id>")
@jwt_refresh_token_required
async def get_conversation(conversation_id: str):
    conversation = await Conversation.get(id=conversation_id)
    current_user_uuid = get_jwt_identity()
    user = await blueprints.users.models.User.get(id=current_user_uuid)
    await conversation.fetch_related("messages")

    # Manually serialize the conversation with messages
    messages = []
    for msg in conversation.messages:
        messages.append(
            {
                "id": str(msg.id),
                "text": msg.text,
                "speaker": msg.speaker.value,
                "created_at": msg.created_at.isoformat(),
            }
        )
    name = conversation.name
    if len(messages) > 8 and "datetime" in name.lower():
        name = await rename_conversation(
            user=user,
            conversation=conversation,
        )
    print(name)

    return jsonify(
        {
            "id": str(conversation.id),
            "name": name,
            "messages": messages,
            "created_at": conversation.created_at.isoformat(),
            "updated_at": conversation.updated_at.isoformat(),
        }
    )


@conversation_blueprint.post("/")
@jwt_refresh_token_required
async def create_conversation():
    user_uuid = get_jwt_identity()
    user = await blueprints.users.models.User.get(id=user_uuid)
    conversation = await Conversation.create(
        name=f"{user.username} {datetime.datetime.now().timestamp}",
        user=user,
    )

    serialized_conversation = await PydConversation.from_tortoise_orm(conversation)
    return jsonify(serialized_conversation.model_dump())


@conversation_blueprint.get("/")
@jwt_refresh_token_required
async def get_all_conversations():
    user_uuid = get_jwt_identity()
    user = await blueprints.users.models.User.get(id=user_uuid)
    conversations = Conversation.filter(user=user)
    serialized_conversations = await PydListConversation.from_queryset(conversations)

    return jsonify(serialized_conversations.model_dump())
78
services/raggr/blueprints/conversation/agents.py
Normal file
@@ -0,0 +1,78 @@
import os
from typing import cast

from langchain.agents import create_agent
from langchain.chat_models import BaseChatModel
from langchain.tools import tool
from langchain_ollama import ChatOllama
from langchain_openai import ChatOpenAI
from tavily import AsyncTavilyClient

from blueprints.rag.logic import query_vector_store

openai_gpt_5_mini = ChatOpenAI(model="gpt-5-mini")
ollama_deepseek = ChatOllama(model="llama3.1:8b", base_url=os.getenv("OLLAMA_URL"))
model_with_fallback = cast(
    BaseChatModel, ollama_deepseek.with_fallbacks([openai_gpt_5_mini])
)
client = AsyncTavilyClient(os.getenv("TAVILY_KEY"), "")


@tool
async def web_search(query: str) -> str:
    """Search the web for current information using Tavily.

    Use this tool when you need to:
    - Find current information not in the knowledge base
    - Look up recent events, news, or updates
    - Verify facts or get additional context
    - Search for information outside of Simba's documents

    Args:
        query: The search query to look up on the web

    Returns:
        Search results from the web with titles, content, and source URLs
    """
    response = await client.search(query=query, search_depth="basic")
    results = response.get("results", [])

    if not results:
        return "No results found for the query."

    formatted = "\n\n".join(
        [
            f"**{result['title']}**\n{result['content']}\nSource: {result['url']}"
            for result in results[:5]
        ]
    )
    return formatted


@tool(response_format="content_and_artifact")
async def simba_search(query: str):
    """Search through Simba's medical records, veterinary documents, and personal information.

    Use this tool whenever the user asks questions about:
    - Simba's health history, medical records, or veterinary visits
    - Medications, treatments, or diagnoses
    - Weight, diet, or physical characteristics over time
    - Veterinary recommendations or advice
    - Ryan's (the owner's) information related to Simba
    - Any factual information that would be found in documents

    Args:
        query: The user's question or information need about Simba

    Returns:
        Relevant information from Simba's documents
    """
    print(f"[SIMBA SEARCH] Tool called with query: {query}")
    serialized, docs = await query_vector_store(query=query)
    print(f"[SIMBA SEARCH] Found {len(docs)} documents")
    print(f"[SIMBA SEARCH] Serialized result length: {len(serialized)}")
    print(f"[SIMBA SEARCH] First 200 chars: {serialized[:200]}")
    return serialized, docs


main_agent = create_agent(model=model_with_fallback, tools=[simba_search, web_search])
@@ -1,9 +1,10 @@
import tortoise.exceptions

from .models import Conversation, ConversationMessage
from langchain_openai import ChatOpenAI

import blueprints.users.models

from .models import Conversation, ConversationMessage, RenameConversationOutputSchema


async def create_conversation(name: str = "") -> Conversation:
    conversation = await Conversation.create(name=name)
@@ -58,3 +59,22 @@ async def get_conversation_transcript(
        messages.append(f"{message.speaker} at {message.created_at}: {message.text}")

    return "\n".join(messages)


async def rename_conversation(
    user: blueprints.users.models.User,
    conversation: Conversation,
) -> str:
    messages: str = await get_conversation_transcript(
        user=user, conversation=conversation
    )

    llm = ChatOpenAI(model="gpt-4o-mini")
    structured_llm = llm.with_structured_output(RenameConversationOutputSchema)

    prompt = f"Summarize the following conversation into a sassy one-liner title:\n\n{messages}"
    response = structured_llm.invoke(prompt)
    new_name: str = response.get("title", "")
    conversation.name = new_name
    await conversation.save()
    return new_name
@@ -1,11 +1,18 @@
import enum
from dataclasses import dataclass

from tortoise.models import Model
from tortoise import fields
from tortoise.contrib.pydantic import (
    pydantic_queryset_creator,
    pydantic_model_creator,
    pydantic_queryset_creator,
)
from tortoise.models import Model


@dataclass
class RenameConversationOutputSchema:
    title: str
    justification: str


class Speaker(enum.Enum):
46
services/raggr/blueprints/rag/__init__.py
Normal file
@@ -0,0 +1,46 @@
from quart import Blueprint, jsonify
from quart_jwt_extended import jwt_refresh_token_required

from .logic import get_vector_store_stats, index_documents, vector_store

rag_blueprint = Blueprint("rag_api", __name__, url_prefix="/api/rag")


@rag_blueprint.get("/stats")
@jwt_refresh_token_required
async def get_stats():
    """Get vector store statistics."""
    stats = get_vector_store_stats()
    return jsonify(stats)


@rag_blueprint.post("/index")
@jwt_refresh_token_required
async def trigger_index():
    """Trigger indexing of documents from Paperless-NGX."""
    try:
        await index_documents()
        stats = get_vector_store_stats()
        return jsonify({"status": "success", "stats": stats})
    except Exception as e:
        return jsonify({"status": "error", "message": str(e)}), 500


@rag_blueprint.post("/reindex")
@jwt_refresh_token_required
async def trigger_reindex():
    """Clear and reindex all documents."""
    try:
        # Clear existing documents
        collection = vector_store._collection
        all_docs = collection.get()

        if all_docs["ids"]:
            collection.delete(ids=all_docs["ids"])

        # Reindex
        await index_documents()
        stats = get_vector_store_stats()
        return jsonify({"status": "success", "stats": stats})
    except Exception as e:
        return jsonify({"status": "error", "message": str(e)}), 500
75
services/raggr/blueprints/rag/fetchers.py
Normal file
@@ -0,0 +1,75 @@
import os
import tempfile

import httpx


class PaperlessNGXService:
    def __init__(self):
        self.base_url = os.getenv("BASE_URL")
        self.token = os.getenv("PAPERLESS_TOKEN")
        self.url = f"http://{os.getenv('BASE_URL')}/api/documents/?tags__id=8"
        self.headers = {"Authorization": f"Token {os.getenv('PAPERLESS_TOKEN')}"}

    def get_data(self):
        print(f"Getting data from: {self.url}")
        r = httpx.get(self.url, headers=self.headers)
        results = r.json()["results"]

        nextLink = r.json().get("next")

        while nextLink:
            r = httpx.get(nextLink, headers=self.headers)
            results += r.json()["results"]
            nextLink = r.json().get("next")

        return results

    def get_doc_by_id(self, doc_id: int):
        url = f"http://{os.getenv('BASE_URL')}/api/documents/{doc_id}/"
        r = httpx.get(url, headers=self.headers)
        return r.json()

    def download_pdf_from_id(self, id: int) -> str:
        download_url = f"http://{os.getenv('BASE_URL')}/api/documents/{id}/download/"
        response = httpx.get(
            download_url, headers=self.headers, follow_redirects=True, timeout=30
        )
        response.raise_for_status()
        # Use a temporary file for the downloaded PDF
        temp_file = tempfile.NamedTemporaryFile(delete=False, suffix=".pdf")
        temp_file.write(response.content)
        temp_file.close()
        temp_pdf_path = temp_file.name
        pdf_to_process = temp_pdf_path
        return pdf_to_process

    def upload_cleaned_content(self, document_id, data):
        PUTS_URL = f"http://{os.getenv('BASE_URL')}/api/documents/{document_id}/"
        r = httpx.put(PUTS_URL, headers=self.headers, data=data)
        r.raise_for_status()

    def upload_description(self, description_filepath, file, title, exif_date: str):
        POST_URL = f"http://{os.getenv('BASE_URL')}/api/documents/post_document/"
        files = {"document": ("description_filepath", file, "application/txt")}
        data = {
            "title": title,
            "create": exif_date,
            "document_type": 3,
            "tags": [7],
        }

        r = httpx.post(POST_URL, headers=self.headers, data=data, files=files)
        r.raise_for_status()

    def get_tags(self):
        GET_URL = f"http://{os.getenv('BASE_URL')}/api/tags/"
        r = httpx.get(GET_URL, headers=self.headers)
        data = r.json()
        return {tag["id"]: tag["name"] for tag in data["results"]}

    def get_doctypes(self):
        GET_URL = f"http://{os.getenv('BASE_URL')}/api/document_types/"
        r = httpx.get(GET_URL, headers=self.headers)
        data = r.json()
        return {doctype["id"]: doctype["name"] for doctype in data["results"]}
101
services/raggr/blueprints/rag/logic.py
Normal file
@@ -0,0 +1,101 @@
import datetime
import os

from langchain_chroma import Chroma
from langchain_core.documents import Document
from langchain_openai import OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter

from .fetchers import PaperlessNGXService


embeddings = OpenAIEmbeddings(model="text-embedding-3-small")

vector_store = Chroma(
    collection_name="simba_docs",
    embedding_function=embeddings,
    persist_directory=os.getenv("CHROMADB_PATH", ""),
)

text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,  # chunk size (characters)
    chunk_overlap=200,  # chunk overlap (characters)
    add_start_index=True,  # track index in original document
)


def date_to_epoch(date_str: str) -> float:
    split_date = date_str.split("-")
    date = datetime.datetime(
        int(split_date[0]),
        int(split_date[1]),
        int(split_date[2]),
        0,
        0,
        0,
    )

    return date.timestamp()
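The manual split above assumes ISO YYYY-MM-DD input; an equivalent sketch reusing this module's datetime import, shown only as a design alternative:

def date_to_epoch_strict(date_str: str) -> float:
    # strptime rejects malformed dates instead of silently mis-splitting them
    return datetime.datetime.strptime(date_str, "%Y-%m-%d").timestamp()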
async def fetch_documents_from_paperless_ngx() -> list[Document]:
    ppngx = PaperlessNGXService()
    data = ppngx.get_data()
    doctypes = ppngx.get_doctypes()
    documents = []
    for doc in data:
        metadata = {
            "created_date": date_to_epoch(doc["created_date"]),
            "filename": doc["original_file_name"],
            "document_type": doctypes.get(doc["document_type"], ""),
        }
        documents.append(Document(page_content=doc["content"], metadata=metadata))

    return documents


async def index_documents():
    documents = await fetch_documents_from_paperless_ngx()

    splits = text_splitter.split_documents(documents)
    await vector_store.aadd_documents(documents=splits)


async def query_vector_store(query: str):
    retrieved_docs = await vector_store.asimilarity_search(query, k=2)
    serialized = "\n\n".join(
        (f"Source: {doc.metadata}\nContent: {doc.page_content}")
        for doc in retrieved_docs
    )
    return serialized, retrieved_docs


def get_vector_store_stats():
    """Get statistics about the vector store."""
    collection = vector_store._collection
    count = collection.count()
    return {
        "total_documents": count,
        "collection_name": collection.name,
    }


def list_all_documents(limit: int = 10):
    """List documents in the vector store with their metadata."""
    collection = vector_store._collection
    results = collection.get(limit=limit, include=["metadatas", "documents"])

    documents = []
    for i, doc_id in enumerate(results["ids"]):
        documents.append(
            {
                "id": doc_id,
                "metadata": results["metadatas"][i]
                if results.get("metadatas")
                else None,
                "content_preview": results["documents"][i][:200]
                if results.get("documents")
                else None,
            }
        )

    return documents
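A retrieval sketch against this module, assuming the collection has already been indexed (run under asyncio, since top-level await is unavailable here):

import asyncio

async def demo():
    serialized, docs = await query_vector_store("when was Simba last vaccinated?")
    print(f"{len(docs)} hits")
    print(serialized)

asyncio.run(demo())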
0
services/raggr/blueprints/rag/models.py
Normal file
180
services/raggr/blueprints/users/__init__.py
Normal file
@@ -0,0 +1,180 @@
from quart import Blueprint, jsonify, request
from quart_jwt_extended import (
    create_access_token,
    create_refresh_token,
    jwt_refresh_token_required,
    get_jwt_identity,
)
from .models import User
from .oidc_service import OIDCUserService
from oidc_config import oidc_config
import secrets
import httpx
from urllib.parse import urlencode
import hashlib
import base64


user_blueprint = Blueprint("user_api", __name__, url_prefix="/api/user")

# In-memory storage for OIDC state/PKCE (production: use Redis or database)
# Format: {state: {"pkce_verifier": str, "redirect_after_login": str}}
_oidc_sessions = {}


@user_blueprint.route("/oidc/login", methods=["GET"])
async def oidc_login():
    """
    Initiate OIDC login flow
    Generates PKCE parameters and redirects to Authelia
    """
    if not oidc_config.validate_config():
        return jsonify({"error": "OIDC not configured"}), 500

    try:
        # Generate PKCE parameters
        code_verifier = secrets.token_urlsafe(64)

        # For PKCE, we need code_challenge = BASE64URL(SHA256(code_verifier))
        code_challenge = (
            base64.urlsafe_b64encode(hashlib.sha256(code_verifier.encode()).digest())
            .decode()
            .rstrip("=")
        )

        # Generate state for CSRF protection
        state = secrets.token_urlsafe(32)

        # Store PKCE verifier and state for callback validation
        _oidc_sessions[state] = {
            "pkce_verifier": code_verifier,
            "redirect_after_login": request.args.get("redirect", "/"),
        }

        # Get authorization endpoint from discovery
        discovery = await oidc_config.get_discovery_document()
        auth_endpoint = discovery.get("authorization_endpoint")

        # Build authorization URL
        params = {
            "client_id": oidc_config.client_id,
            "response_type": "code",
            "redirect_uri": oidc_config.redirect_uri,
            "scope": "openid email profile",
            "state": state,
            "code_challenge": code_challenge,
            "code_challenge_method": "S256",
        }

        auth_url = f"{auth_endpoint}?{urlencode(params)}"

        return jsonify({"auth_url": auth_url})
    except Exception as e:
        return jsonify({"error": f"OIDC login failed: {str(e)}"}), 500


@user_blueprint.route("/oidc/callback", methods=["GET"])
async def oidc_callback():
    """
    Handle OIDC callback from Authelia
    Exchanges authorization code for tokens, verifies ID token, and creates/updates user
    """
    # Get authorization code and state from callback
    code = request.args.get("code")
    state = request.args.get("state")
    error = request.args.get("error")

    if error:
        return jsonify({"error": f"OIDC error: {error}"}), 400

    if not code or not state:
        return jsonify({"error": "Missing code or state"}), 400

    # Validate state and retrieve PKCE verifier
    session = _oidc_sessions.pop(state, None)
    if not session:
        return jsonify({"error": "Invalid or expired state"}), 400

    pkce_verifier = session["pkce_verifier"]

    # Exchange authorization code for tokens
    discovery = await oidc_config.get_discovery_document()
    token_endpoint = discovery.get("token_endpoint")

    token_data = {
        "grant_type": "authorization_code",
        "code": code,
        "redirect_uri": oidc_config.redirect_uri,
        "client_id": oidc_config.client_id,
        "client_secret": oidc_config.client_secret,
        "code_verifier": pkce_verifier,
    }

    # Use client_secret_post method (credentials in POST body)
    async with httpx.AsyncClient() as client:
        token_response = await client.post(token_endpoint, data=token_data)

    if token_response.status_code != 200:
        return jsonify({"error": f"Failed to exchange code for token: {token_response.text}"}), 400

    tokens = token_response.json()

    id_token = tokens.get("id_token")
    if not id_token:
        return jsonify({"error": "No ID token received"}), 400

    # Verify ID token
    try:
        claims = await oidc_config.verify_id_token(id_token)
    except Exception as e:
        return jsonify({"error": f"ID token verification failed: {str(e)}"}), 400

    # Get or create user from OIDC claims
    user = await OIDCUserService.get_or_create_user_from_oidc(claims)

    # Issue backend JWT tokens
    access_token = create_access_token(identity=str(user.id))
    refresh_token = create_refresh_token(identity=str(user.id))

    # Return tokens to frontend
    # Frontend will handle storing these and redirecting
    return jsonify(
        access_token=access_token,
        refresh_token=refresh_token,
        user={"id": str(user.id), "username": user.username, "email": user.email},
    )


@user_blueprint.route("/refresh", methods=["POST"])
@jwt_refresh_token_required
async def refresh():
    """Refresh access token (unchanged from original)"""
    user_id = get_jwt_identity()
    new_token = create_access_token(identity=user_id)
    return jsonify(access_token=new_token)


# Legacy username/password login - kept for backward compatibility during migration
@user_blueprint.route("/login", methods=["POST"])
async def login():
    """
    Legacy username/password login
    This can be removed after full OIDC migration is complete
    """
    data = await request.get_json()
    username = data.get("username")
    password = data.get("password")

    user = await User.filter(username=username).first()

    if not user or not user.verify_password(password):
        return jsonify({"msg": "Invalid credentials"}), 401

    access_token = create_access_token(identity=str(user.id))
    refresh_token = create_refresh_token(identity=str(user.id))

    return jsonify(
        access_token=access_token,
        refresh_token=refresh_token,
        user={"id": str(user.id), "username": user.username},
    )
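The PKCE handshake above rests on a single equality: the challenge sent with the authorization request must match BASE64URL(SHA256(code_verifier)) recomputed by the provider when it receives the verifier at the token endpoint. A self-contained sketch of that check:

import base64
import hashlib
import secrets

def s256(verifier: str) -> str:
    # BASE64URL(SHA256(verifier)) with padding stripped, per RFC 7636
    digest = hashlib.sha256(verifier.encode()).digest()
    return base64.urlsafe_b64encode(digest).decode().rstrip("=")

verifier = secrets.token_urlsafe(64)
challenge = s256(verifier)          # sent in the authorization request
assert s256(verifier) == challenge  # what the provider verifies at token time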
@@ -8,8 +8,13 @@ import bcrypt
class User(Model):
    id = fields.UUIDField(primary_key=True)
    username = fields.CharField(max_length=255)
    password = fields.BinaryField()  # Hashed
    password = fields.BinaryField(null=True)  # Hashed - nullable for OIDC users
    email = fields.CharField(max_length=100, unique=True)

    # OIDC fields
    oidc_subject = fields.CharField(max_length=255, unique=True, null=True, index=True)  # "sub" claim from OIDC
    auth_provider = fields.CharField(max_length=50, default="local")  # "local" or "oidc"

    created_at = fields.DatetimeField(auto_now_add=True)
    updated_at = fields.DatetimeField(auto_now=True)

@@ -23,4 +28,6 @@ class User(Model):
    )

    def verify_password(self, plain_password: str):
        if not self.password:
            return False
        return bcrypt.checkpw(plain_password.encode("utf-8"), self.password)
76
services/raggr/blueprints/users/oidc_service.py
Normal file
@@ -0,0 +1,76 @@
"""
OIDC User Management Service
"""
from typing import Dict, Any, Optional
from uuid import uuid4
from .models import User


class OIDCUserService:
    """Service for managing OIDC user authentication and provisioning"""

    @staticmethod
    async def get_or_create_user_from_oidc(claims: Dict[str, Any]) -> User:
        """
        Get existing user by OIDC subject, or create new user from OIDC claims

        Args:
            claims: Decoded OIDC ID token claims

        Returns:
            User object (existing or newly created)
        """
        oidc_subject = claims.get("sub")
        if not oidc_subject:
            raise ValueError("No 'sub' claim in ID token")

        # Try to find existing user by OIDC subject
        user = await User.filter(oidc_subject=oidc_subject).first()

        if user:
            # Update user info from latest claims (optional)
            user.email = claims.get("email", user.email)
            user.username = (
                claims.get("preferred_username")
                or claims.get("name")
                or user.username
            )
            await user.save()
            return user

        # Check if user exists by email (migration case)
        email = claims.get("email")
        if email:
            user = await User.filter(email=email, auth_provider="local").first()
            if user:
                # Migrate existing local user to OIDC
                user.oidc_subject = oidc_subject
                user.auth_provider = "oidc"
                user.password = None  # Clear password
                await user.save()
                return user

        # Create new user from OIDC claims
        username = (
            claims.get("preferred_username")
            or claims.get("name")
            or claims.get("email", "").split("@")[0]
            or f"user_{oidc_subject[:8]}"
        )

        user = await User.create(
            id=uuid4(),
            username=username,
            email=email
            or f"{oidc_subject}@oidc.local",  # Fallback if no email claim
            oidc_subject=oidc_subject,
            auth_provider="oidc",
            password=None,
        )

        return user

    @staticmethod
    async def find_user_by_oidc_subject(oidc_subject: str) -> Optional[User]:
        """Find user by OIDC subject ID"""
        return await User.filter(oidc_subject=oidc_subject).first()
@@ -1,18 +1,16 @@
import httpx
import os
from pathlib import Path
import logging
import tempfile
import os
import sqlite3

import httpx
from dotenv import load_dotenv

from image_process import describe_simba_image
from request import PaperlessNGXService
import sqlite3

logging.basicConfig(level=logging.INFO)


from dotenv import load_dotenv

load_dotenv()

# Configuration from environment variables
@@ -89,7 +87,7 @@ if __name__ == "__main__":
    image_date = description.image_date

    description_filepath = os.path.join(
        "/Users/ryanchen/Programs/raggr", f"SIMBA_DESCRIBE_001.txt"
        "/Users/ryanchen/Programs/raggr", "SIMBA_DESCRIBE_001.txt"
    )
    file = open(description_filepath, "w+")
    file.write(image_description)
92
services/raggr/inspect_vector_store.py
Normal file
@@ -0,0 +1,92 @@
#!/usr/bin/env python3
"""CLI tool to inspect the vector store contents."""

import argparse
import os

from dotenv import load_dotenv

from blueprints.rag.logic import (
    get_vector_store_stats,
    index_documents,
    list_all_documents,
)

# Load .env from the root directory
root_dir = os.path.abspath(os.path.join(os.path.dirname(__file__), "../.."))
env_path = os.path.join(root_dir, ".env")
load_dotenv(env_path)


def print_stats():
    """Print vector store statistics."""
    stats = get_vector_store_stats()
    print("=== Vector Store Statistics ===")
    print(f"Collection Name: {stats['collection_name']}")
    print(f"Total Documents: {stats['total_documents']}")
    print()


def print_documents(limit: int = 10, show_content: bool = False):
    """Print documents in the vector store."""
    docs = list_all_documents(limit=limit)
    print(f"=== Documents (showing {len(docs)} of {limit} requested) ===\n")

    for i, doc in enumerate(docs, 1):
        print(f"Document {i}:")
        print(f"  ID: {doc['id']}")
        print(f"  Metadata: {doc['metadata']}")
        if show_content:
            print(f"  Content Preview: {doc['content_preview']}")
        print()


async def run_index():
    """Run the indexing process."""
    print("Starting indexing process...")
    await index_documents()
    print("Indexing complete!")
    print_stats()


def main():
    import asyncio

    parser = argparse.ArgumentParser(description="Inspect the vector store contents")
    parser.add_argument(
        "--stats", action="store_true", help="Show vector store statistics"
    )
    parser.add_argument(
        "--list", type=int, metavar="N", help="List N documents from the vector store"
    )
    parser.add_argument(
        "--show-content",
        action="store_true",
        help="Show content preview when listing documents",
    )
    parser.add_argument(
        "--index",
        action="store_true",
        help="Index documents from Paperless-NGX into the vector store",
    )

    args = parser.parse_args()

    # Handle indexing first if requested
    if args.index:
        asyncio.run(run_index())
        return

    # If no arguments provided, show stats by default
    if not any([args.stats, args.list]):
        args.stats = True

    if args.stats:
        print_stats()

    if args.list:
        print_documents(limit=args.list, show_content=args.show_content)


if __name__ == "__main__":
    main()
@@ -1,21 +1,19 @@
import argparse
import datetime
import logging
import os
import sqlite3
import time

import argparse
import chromadb
import ollama
from dotenv import load_dotenv


from request import PaperlessNGXService
import chromadb
from chunker import Chunker
from cleaner import pdf_to_image, summarize_pdf_image
from llm import LLMClient
from query import QueryGenerator


from dotenv import load_dotenv
from request import PaperlessNGXService

_dotenv_loaded = load_dotenv()

@@ -36,6 +34,7 @@ parser.add_argument("query", type=str, help="questions about simba's health")
parser.add_argument(
    "--reindex", action="store_true", help="re-index the simba documents"
)
parser.add_argument("--classify", action="store_true", help="test classification")
parser.add_argument("--index", help="index a file")

ppngx = PaperlessNGXService()
@@ -113,13 +112,22 @@ def chunk_text(texts: list[str], collection):
    )


def classify_query(query: str, transcript: str) -> bool:
    logging.info("Starting query generation")
    qg_start = time.time()
    qg = QueryGenerator()
    query_type = qg.get_query_type(input=query, transcript=transcript)
    logging.info(query_type)
    qg_end = time.time()
    logging.info(f"Query generation took {qg_end - qg_start:.2f} seconds")
    return query_type == "Simba"


def consult_oracle(
    input: str,
    collection,
    transcript: str = "",
):
    import time

    chunker = Chunker(collection)

    start_time = time.time()
@@ -171,6 +179,16 @@ def consult_oracle(
    return output


def llm_chat(input: str, transcript: str = "") -> str:
    system_prompt = "You are a helpful assistant that understands veterinary terms."
    transcript_prompt = f"Here is the message transcript thus far {transcript}."
    prompt = f"""Answer the user as if you were a cat named Simba. Don't act too catlike. Be assertive.
    {transcript_prompt if len(transcript) > 0 else ""}
    Respond to this prompt: {input}"""
    output = llm_client.chat(prompt=prompt, system_prompt=system_prompt)
    return output


def paperless_workflow(input):
    # Step 1: Get the text
    ppngx = PaperlessNGXService()
@@ -181,12 +199,20 @@ def paperless_workflow(input):


def consult_simba_oracle(input: str, transcript: str = ""):
    is_simba_related = classify_query(query=input, transcript=transcript)

    if is_simba_related:
        logging.info("Query is related to simba")
        return consult_oracle(
            input=input,
            collection=simba_docs,
            transcript=transcript,
        )

    logging.info("Query is NOT related to simba")

    return llm_chat(input=input, transcript=transcript)


def filter_indexed_files(docs):
    with sqlite3.connect("database/visited.db") as conn:
@@ -202,9 +228,17 @@ def filter_indexed_files(docs):
    return [doc for doc in docs if doc["id"] not in visited]


if __name__ == "__main__":
    args = parser.parse_args()
    if args.reindex:
def reindex():
    with sqlite3.connect("database/visited.db") as conn:
        c = conn.cursor()
        c.execute("DELETE FROM indexed_documents")
        conn.commit()

    # Delete all documents from the collection
    all_docs = simba_docs.get()
    if all_docs["ids"]:
        simba_docs.delete(ids=all_docs["ids"])

    logging.info("Fetching documents from Paperless-NGX")
    ppngx = PaperlessNGXService()
    docs = ppngx.get_data()
@@ -219,21 +253,20 @@ if __name__ == "__main__":

    # Chunk documents
    logging.info("Chunking documents now ...")
    tag_lookup = ppngx.get_tags()
    doctype_lookup = ppngx.get_doctypes()
    chunk_data(docs, collection=simba_docs, doctypes=doctype_lookup)
    logging.info("Done chunking documents")

    # if args.index:
    #     with open(args.index) as file:
    #         extension = args.index.split(".")[-1]
    #         if extension == "pdf":
    #             pdf_path = ppngx.download_pdf_from_id(id=document_id)
    #             image_paths = pdf_to_image(filepath=pdf_path)
    #             print(f"summarizing {file}")
    #             generated_summary = summarize_pdf_image(filepaths=image_paths)
    #         elif extension in [".md", ".txt"]:
    #             chunk_text(texts=[file.readall()], collection=simba_docs)

if __name__ == "__main__":
    args = parser.parse_args()
    if args.reindex:
        reindex()

    if args.classify:
        consult_simba_oracle(input="yohohoho testing")
        consult_simba_oracle(input="write an email")
        consult_simba_oracle(input="how much does simba weigh")

    if args.query:
        logging.info("Consulting oracle ...")
121
services/raggr/manage_vectorstore.py
Normal file
@@ -0,0 +1,121 @@
#!/usr/bin/env python3
"""Management script for vector store operations."""

import argparse
import asyncio
import sys

from blueprints.rag.logic import (
    get_vector_store_stats,
    index_documents,
    list_all_documents,
    vector_store,
)


def stats():
    """Show vector store statistics."""
    stats = get_vector_store_stats()
    print("=== Vector Store Statistics ===")
    print(f"Collection: {stats['collection_name']}")
    print(f"Total Documents: {stats['total_documents']}")


async def index():
    """Index documents from Paperless-NGX."""
    print("Starting indexing process...")
    print("Fetching documents from Paperless-NGX...")
    await index_documents()
    print("✓ Indexing complete!")
    stats()


async def reindex():
    """Clear and reindex all documents."""
    print("Clearing existing documents...")
    collection = vector_store._collection
    all_docs = collection.get()

    if all_docs["ids"]:
        print(f"Deleting {len(all_docs['ids'])} existing documents...")
        collection.delete(ids=all_docs["ids"])
        print("✓ Cleared")
    else:
        print("Collection is already empty")

    await index()


def list_docs(limit: int = 10, show_content: bool = False):
    """List documents in the vector store."""
    docs = list_all_documents(limit=limit)
    print(f"\n=== Documents (showing {len(docs)}) ===\n")

    for i, doc in enumerate(docs, 1):
        print(f"Document {i}:")
        print(f"  ID: {doc['id']}")
        print(f"  Metadata: {doc['metadata']}")
        if show_content:
            print(f"  Content: {doc['content_preview']}")
        print()


def main():
    parser = argparse.ArgumentParser(
        description="Manage vector store for RAG system",
        formatter_class=argparse.RawDescriptionHelpFormatter,
        epilog="""
Examples:
  %(prog)s stats                    # Show vector store statistics
  %(prog)s index                    # Index new documents from Paperless-NGX
  %(prog)s reindex                  # Clear and reindex all documents
  %(prog)s list 10                  # List first 10 documents
  %(prog)s list 20 --show-content   # List 20 documents with content preview
""",
    )

    subparsers = parser.add_subparsers(dest="command", help="Command to execute")

    # Stats command
    subparsers.add_parser("stats", help="Show vector store statistics")

    # Index command
    subparsers.add_parser("index", help="Index documents from Paperless-NGX")

    # Reindex command
    subparsers.add_parser("reindex", help="Clear and reindex all documents")

    # List command
    list_parser = subparsers.add_parser("list", help="List documents in vector store")
    list_parser.add_argument(
        "limit", type=int, default=10, nargs="?", help="Number of documents to list"
    )
    list_parser.add_argument(
        "--show-content", action="store_true", help="Show content preview"
    )

    args = parser.parse_args()

    if not args.command:
        parser.print_help()
        sys.exit(1)

    try:
        if args.command == "stats":
            stats()
        elif args.command == "index":
            asyncio.run(index())
        elif args.command == "reindex":
            asyncio.run(reindex())
        elif args.command == "list":
            list_docs(limit=args.limit, show_content=args.show_content)
    except KeyboardInterrupt:
        print("\n\nOperation cancelled by user")
        sys.exit(1)
    except Exception as e:
        print(f"\n❌ Error: {e}", file=sys.stderr)
        sys.exit(1)


if __name__ == "__main__":
    main()
71
services/raggr/migrations/models/0_20251225052005_init.py
Normal file
@@ -0,0 +1,71 @@
from tortoise import BaseDBAsyncClient

RUN_IN_TRANSACTION = True


async def upgrade(db: BaseDBAsyncClient) -> str:
    return """
        CREATE TABLE IF NOT EXISTS "users" (
            "id" UUID NOT NULL PRIMARY KEY,
            "username" VARCHAR(255) NOT NULL,
            "password" BYTEA,
            "email" VARCHAR(100) NOT NULL UNIQUE,
            "oidc_subject" VARCHAR(255) UNIQUE,
            "auth_provider" VARCHAR(50) NOT NULL DEFAULT 'local',
            "created_at" TIMESTAMPTZ NOT NULL DEFAULT CURRENT_TIMESTAMP,
            "updated_at" TIMESTAMPTZ NOT NULL DEFAULT CURRENT_TIMESTAMP
        );
        CREATE INDEX IF NOT EXISTS "idx_users_oidc_su_5aec5a" ON "users" ("oidc_subject");
        CREATE TABLE IF NOT EXISTS "conversations" (
            "id" UUID NOT NULL PRIMARY KEY,
            "name" VARCHAR(255) NOT NULL,
            "created_at" TIMESTAMPTZ NOT NULL DEFAULT CURRENT_TIMESTAMP,
            "updated_at" TIMESTAMPTZ NOT NULL DEFAULT CURRENT_TIMESTAMP,
            "user_id" UUID REFERENCES "users" ("id") ON DELETE CASCADE
        );
        CREATE TABLE IF NOT EXISTS "conversation_messages" (
            "id" UUID NOT NULL PRIMARY KEY,
            "text" TEXT NOT NULL,
            "created_at" TIMESTAMPTZ NOT NULL DEFAULT CURRENT_TIMESTAMP,
            "speaker" VARCHAR(10) NOT NULL,
            "conversation_id" UUID NOT NULL REFERENCES "conversations" ("id") ON DELETE CASCADE
        );
        COMMENT ON COLUMN "conversation_messages"."speaker" IS 'USER: user\nSIMBA: simba';
        CREATE TABLE IF NOT EXISTS "aerich" (
            "id" SERIAL NOT NULL PRIMARY KEY,
            "version" VARCHAR(255) NOT NULL,
            "app" VARCHAR(100) NOT NULL,
            "content" JSONB NOT NULL
        );"""


async def downgrade(db: BaseDBAsyncClient) -> str:
    return """
    """


MODELS_STATE = (
    "eJztmmtP4zgUhv9KlE+MxCLoUGaEViulpex0Z9qO2nR3LjuK3MRtvSROJnYGKsR/X9u5J0"
    "56AUqL+gXosU9sPz7OeY/Lveq4FrTJSdvFv6BPAEUuVi+VexUDB7I/pO3Higo8L23lBgom"
    "tnAwMz1FC5gQ6gOTssYpsAlkJgsS00deNBgObJsbXZN1RHiWmgKMfgbQoO4M0jn0WcP3H8"
    "yMsAXvIIk/ejfGFEHbys0bWXxsYTfowhO28bh7dS168uEmhunagYPT3t6Czl2cdA8CZJ1w"
    "H942gxj6gEIrsww+y2jZsSmcMTNQP4DJVK3UYMEpCGwOQ/19GmCTM1DESPzH+R/qGngYao"
    "4WYcpZ3D+Eq0rXLKwqH6r9QRsevb14I1bpEjrzRaMgoj4IR0BB6Cq4piDF7xLK9hz4cpRx"
    "/wJMNtFNMMaGlGMaQzHIGNBm1FQH3Bk2xDM6Zx8bzWYNxr+1oSDJegmULovrMOr7UVMjbO"
    "NIU4SmD/mSDUDLIK9YC0UOlMPMexaQWpHrSfzHjgJma7AG2F5Eh6CGr97tdUa61vvMV+IQ"
    "8tMWiDS9w1sawrooWI8uCluRPET5p6t/UPhH5dug3ynGftJP/6byOYGAugZ2bw1gZc5rbI"
    "3B5DY28KwNNzbvedjYF93YaPKZfSXQN9bLIBmXR6SRaG5b3MTNkwZPvdMbac7gMMrwrl0f"
    "ohn+CBcCYZfNA2BTliwi0TGOHrOr0FJrOgsf3CZqJBsUbHVsTZCG2VMbtbWrjioYToB5cw"
    "t8y6iA6UBCwAySMtBW5Hn9cQjtRJrJWWYFXC984m6+VarYClZuw80wytErNzkNp2gBmK3b"
    "isbmI9XQWaKCMxBXE8NGdiMPonivRTGFd5KUrzOrHGXcf19EcV0q73zRc1k8lr5HPe3Lm1"
    "wm/zTo/xl3z0jl9qdB66CQX6OQKitk4kFwIxMDvIDs4MApSYHc7mbcX/joqONRZ3ip8Iz+"
    "Lx51ey3tUiHImQB1tS3OVZlnpysUmWenlTUmbyocoGyiWe81L3F9ynf+nkpYs3Dh9UgpW7"
    "w/21mKSzWtJFzW1bbPqeREzSCRbnEtUa3V+NE+aLP912Z8H9e9tMz67ItG28LFpQcIuXV9"
    "SWS2EAb+Qg4z61WAOVnQsP7Z1ZJeBq/F9WpWbjFkrW5fG36VS964fzZuW1/1jlagCx2A7H"
    "WiNHF4mhBdfuKfMkDPTlcTPXWqpyR7XGSZBgkm/0FTUjlUkyz6bQS0GKTb5fksB55p+bnh"
    "+e4vZFWJdjnQkuP23qKq7ZrAfkQaynNtrhKmzeoobZa1+aG4fZ3F7eHrn1exscntcqlIWX"
    "Y1X/pfh6e5n99lgbTde3kN+sicq5J6Lmo5rqvoQNpnZ0q6Lq64IpZWdBxzIRiinX9RYSe+"
    "HfmtcXb+7vz924vz96yLmElieVfzMuj29SUVHD8I0muXav2RcTnUb6mcY0djHREXdt9PgM"
    "9SX7ARKcSS9P7XaNCvvE6NXQogx5gt8LuFTHqs2IjQH7uJtYYiX3X9dz/Fr3kKuZk/oCW7"
    "eN3mZeHD/9BpOYI="
)
113
services/raggr/oidc_config.py
Normal file
@@ -0,0 +1,113 @@
"""
OIDC Configuration for Authelia Integration
"""
import os
from typing import Dict, Any
from authlib.jose import jwt
from authlib.jose.errors import JoseError
import httpx


class OIDCConfig:
    """OIDC Configuration Manager"""

    def __init__(self):
        # Load from environment variables
        self.issuer = os.getenv("OIDC_ISSUER")  # e.g., https://auth.example.com
        self.client_id = os.getenv("OIDC_CLIENT_ID")
        self.client_secret = os.getenv("OIDC_CLIENT_SECRET")
        self.redirect_uri = os.getenv(
            "OIDC_REDIRECT_URI", "http://localhost:8080/api/user/oidc/callback"
        )

        # OIDC endpoints (can use discovery or manual config)
        self.use_discovery = os.getenv("OIDC_USE_DISCOVERY", "true").lower() == "true"

        # Manual endpoint configuration (fallback if discovery fails)
        self.authorization_endpoint = os.getenv("OIDC_AUTHORIZATION_ENDPOINT")
        self.token_endpoint = os.getenv("OIDC_TOKEN_ENDPOINT")
        self.userinfo_endpoint = os.getenv("OIDC_USERINFO_ENDPOINT")
        self.jwks_uri = os.getenv("OIDC_JWKS_URI")

        # Cached discovery document and JWKS
        self._discovery_doc: Dict[str, Any] | None = None
        self._jwks: Dict[str, Any] | None = None

    def validate_config(self) -> bool:
        """Validate that required configuration is present"""
        if not self.issuer or not self.client_id or not self.client_secret:
            return False
        return True

    async def get_discovery_document(self) -> Dict[str, Any]:
        """Fetch OIDC discovery document from .well-known endpoint"""
        if self._discovery_doc:
            return self._discovery_doc

        if not self.use_discovery:
            # Return manual configuration
            return {
                "issuer": self.issuer,
                "authorization_endpoint": self.authorization_endpoint,
                "token_endpoint": self.token_endpoint,
                "userinfo_endpoint": self.userinfo_endpoint,
                "jwks_uri": self.jwks_uri,
            }

        discovery_url = f"{self.issuer.rstrip('/')}/.well-known/openid-configuration"

        async with httpx.AsyncClient() as client:
            response = await client.get(discovery_url)
            response.raise_for_status()
            self._discovery_doc = response.json()
            return self._discovery_doc

    async def get_jwks(self) -> Dict[str, Any]:
        """Fetch JSON Web Key Set for token verification"""
        if self._jwks:
            return self._jwks

        discovery = await self.get_discovery_document()
        jwks_uri = discovery.get("jwks_uri")

        if not jwks_uri:
            raise ValueError("No jwks_uri found in discovery document")

        async with httpx.AsyncClient() as client:
            response = await client.get(jwks_uri)
            response.raise_for_status()
            self._jwks = response.json()
            return self._jwks

    async def verify_id_token(self, id_token: str) -> Dict[str, Any]:
        """
        Verify and decode ID token from OIDC provider

        Returns the decoded claims if valid
        Raises exception if invalid
        """
        jwks = await self.get_jwks()

        try:
            # Verify token signature and claims
            claims = jwt.decode(
                id_token,
                jwks,
                claims_options={
                    "iss": {"essential": True, "value": self.issuer},
                    "aud": {"essential": True, "value": self.client_id},
                    "exp": {"essential": True},
                },
            )

            # Additional validation
            claims.validate()

            return claims

        except JoseError as e:
            raise ValueError(f"Invalid ID token: {str(e)}")


# Global instance
oidc_config = OIDCConfig()
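A hedged usage sketch of this config object; the token string is a placeholder and network access to the issuer is assumed:

import asyncio

async def demo():
    if not oidc_config.validate_config():
        raise SystemExit("set OIDC_ISSUER / OIDC_CLIENT_ID / OIDC_CLIENT_SECRET first")
    discovery = await oidc_config.get_discovery_document()
    print("authorize at:", discovery["authorization_endpoint"])
    # verify_id_token raises ValueError on any bad signature or claim
    claims = await oidc_config.verify_id_token("<id_token from the token endpoint>")
    print("sub =", claims["sub"])

asyncio.run(demo())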
44
services/raggr/pyproject.toml
Normal file
@@ -0,0 +1,44 @@
[project]
name = "raggr"
version = "0.1.0"
description = "Add your description here"
readme = "README.md"
requires-python = ">=3.13"
dependencies = [
    "chromadb>=1.1.0",
    "python-dotenv>=1.0.0",
    "flask>=3.1.2",
    "httpx>=0.28.1",
    "ollama>=0.6.0",
    "openai>=2.0.1",
    "pydantic>=2.11.9",
    "pillow>=10.0.0",
    "pymupdf>=1.24.0",
    "black>=25.9.0",
    "pillow-heif>=1.1.1",
    "flask-jwt-extended>=4.7.1",
    "bcrypt>=5.0.0",
    "pony>=0.7.19",
    "flask-login>=0.6.3",
    "quart>=0.20.0",
    "tortoise-orm>=0.25.1",
    "quart-jwt-extended>=0.1.0",
    "pre-commit>=4.3.0",
    "tortoise-orm-stubs>=1.0.2",
    "aerich>=0.8.0",
    "tomlkit>=0.13.3",
    "authlib>=1.3.0",
    "asyncpg>=0.30.0",
    "langchain-openai>=1.1.6",
    "langchain>=1.2.0",
    "langchain-chroma>=1.0.0",
    "langchain-community>=0.4.1",
    "jq>=1.10.0",
    "langchain-ollama>=1.0.1",
    "tavily-python>=0.7.17",
]

[tool.aerich]
tortoise_orm = "app.TORTOISE_CONFIG"
location = "./migrations"
src_folder = "./."
@@ -49,11 +49,20 @@ DOCTYPE_OPTIONS = [
    "Letter",
]

QUERY_TYPE_OPTIONS = [
    "Simba",
    "Other",
]


class DocumentType(BaseModel):
    type: list[str] = Field(description="type of document", enum=DOCTYPE_OPTIONS)


class QueryType(BaseModel):
    type: str = Field(description="type of query", enum=QUERY_TYPE_OPTIONS)


PROMPT = """
You are an information specialist that processes user queries. The current year is 2025. The user queries are all about
a cat, Simba, and its records. The types of records are listed below. Using the query, extract the
@@ -111,6 +120,27 @@ Query: "Who does Simba know?"
Tags: ["Letter", "Documentation"]
"""

QUERY_TYPE_PROMPT = f"""You are an information specialist that processes user queries.
A query can have one tag attached from the following options. Based on the query and the transcript which is listed below, determine
which of the following options is most appropriate: {",".join(QUERY_TYPE_OPTIONS)}

### Example 1
Query: "Who is Simba's current vet?"
Tags: ["Simba"]


### Example 2
Query: "What is the capital of Tokyo?"
Tags: ["Other"]


### Example 3
Query: "Can you help me write an email?"
Tags: ["Other"]

TRANSCRIPT:
"""


class QueryGenerator:
    def __init__(self) -> None:
@@ -154,6 +184,33 @@ class QueryGenerator:
        metadata_query = {"document_type": {"$in": type_data["type"]}}
        return metadata_query

    def get_query_type(self, input: str, transcript: str):
        client = OpenAI()
        response = client.chat.completions.create(
            messages=[
                {
                    "role": "system",
                    "content": "You are an information specialist that is really good at deciding what tags a query should have",
                },
                {
                    "role": "user",
                    "content": f"{QUERY_TYPE_PROMPT}\nTRANSCRIPT:\n{transcript}\nQUERY:{input}",
                },
            ],
            model="gpt-4o",
            response_format={
                "type": "json_schema",
                "json_schema": {
                    "name": "query_type",
                    "schema": QueryType.model_json_schema(),
                },
            },
        )

        response_json_str = response.choices[0].message.content
        type_data = json.loads(response_json_str)
        return type_data["type"]

    def get_query(self, input: str):
        client = OpenAI()
        response = client.responses.parse(
9
services/raggr/raggr-frontend/.dockerignore
Normal file
@@ -0,0 +1,9 @@
.git
.gitignore
README.md
.DS_Store
node_modules
dist
.cache
coverage
*.log
@@ -6,6 +6,7 @@
# Dist
node_modules
dist/
.yarn

# Profile
.rspack-profile-*/
1
services/raggr/raggr-frontend/.yarnrc.yml
Normal file
@@ -0,0 +1 @@
nodeLinker: node-modules
18
services/raggr/raggr-frontend/Dockerfile.dev
Normal file
@@ -0,0 +1,18 @@
FROM node:20-slim

WORKDIR /app

# Copy package files
COPY package.json yarn.lock* ./

# Install dependencies
RUN yarn install

# Copy application source code
COPY . .

# Expose rsbuild dev server port (default 3000)
EXPOSE 3000

# Default command
CMD ["sh", "-c", "yarn build && yarn watch:build"]
@@ -20,6 +20,7 @@
    "watch": "^1.0.2"
  },
  "devDependencies": {
    "@biomejs/biome": "2.3.10",
    "@rsbuild/core": "^1.5.6",
    "@rsbuild/plugin-react": "^1.4.0",
    "@tailwindcss/postcss": "^4.0.0",
@@ -3,4 +3,5 @@
body {
  margin: 0;
  font-family: Inter, Avenir, Helvetica, Arial, sans-serif;
  background-color: #F9F5EB;
}
@@ -24,7 +24,7 @@ const AppContainer = () => {

    // Try to verify token by making a request
    try {
      await conversationService.getMessages();
      await conversationService.getAllConversations();
      // If successful, user is authenticated
      setAuthenticated(true);
    } catch (error) {
@@ -37,7 +37,7 @@ class ConversationService {
    conversation_id: string,
  ): Promise<QueryResponse> {
    const response = await userService.fetchWithRefreshToken(
      `${this.baseUrl}/query`,
      `${this.conversationBaseUrl}/query`,
      {
        method: "POST",
        body: JSON.stringify({ query, conversation_id }),
94
services/raggr/raggr-frontend/src/api/oidcService.ts
Normal file
@@ -0,0 +1,94 @@
/**
 * OIDC Authentication Service
 * Handles OAuth 2.0 Authorization Code flow with PKCE
 */

interface OIDCLoginResponse {
  auth_url: string;
}

interface OIDCCallbackResponse {
  access_token: string;
  refresh_token: string;
  user: {
    id: string;
    username: string;
    email: string;
  };
}

class OIDCService {
  private baseUrl = "/api/user/oidc";

  /**
   * Initiate OIDC login flow
   * Returns authorization URL to redirect user to
   */
  async initiateLogin(redirectAfterLogin: string = "/"): Promise<string> {
    const response = await fetch(
      `${this.baseUrl}/login?redirect=${encodeURIComponent(redirectAfterLogin)}`,
      {
        method: "GET",
        headers: { "Content-Type": "application/json" },
      }
    );

    if (!response.ok) {
      throw new Error("Failed to initiate OIDC login");
    }

    const data: OIDCLoginResponse = await response.json();
    return data.auth_url;
  }

  /**
   * Handle OIDC callback
   * Exchanges authorization code for tokens
   */
  async handleCallback(
    code: string,
    state: string
  ): Promise<OIDCCallbackResponse> {
    const response = await fetch(
      `${this.baseUrl}/callback?code=${encodeURIComponent(code)}&state=${encodeURIComponent(state)}`,
      {
        method: "GET",
        headers: { "Content-Type": "application/json" },
      }
    );

    if (!response.ok) {
      throw new Error("OIDC callback failed");
    }

    return await response.json();
  }

  /**
   * Extract OIDC callback parameters from URL
   */
  getCallbackParamsFromURL(): { code: string; state: string } | null {
    const params = new URLSearchParams(window.location.search);
    const code = params.get("code");
    const state = params.get("state");

    if (code && state) {
      return { code, state };
    }

    return null;
  }

  /**
   * Clear callback parameters from URL without reload
   */
  clearCallbackParams(): void {
    const url = new URL(window.location.href);
    url.searchParams.delete("code");
    url.searchParams.delete("state");
    url.searchParams.delete("error");
    window.history.replaceState({}, "", url.toString());
  }
}

export const oidcService = new OIDCService();
@@ -4,6 +4,7 @@ interface LoginResponse {
  user: {
    id: string;
    username: string;
    email?: string;
  };
}

@@ -55,6 +56,21 @@ class UserService {
    return data.access_token;
  }

  async validateToken(): Promise<boolean> {
    const refreshToken = localStorage.getItem("refresh_token");

    if (!refreshToken) {
      return false;
    }

    try {
      await this.refreshToken();
      return true;
    } catch (error) {
      return false;
    }
  }

  async fetchWithAuth(
    url: string,
    options: RequestInit = {},
BIN
services/raggr/raggr-frontend/src/assets/cat.png
Normal file
After Width: | Height: | Size: 5.8 KiB
Before Width: | Height: | Size: 163 B  After Width: | Height: | Size: 163 B
@@ -7,7 +7,7 @@ type AnswerBubbleProps = {

export const AnswerBubble = ({ text, loading }: AnswerBubbleProps) => {
  return (
    <div className="rounded-md bg-orange-100 p-3">
    <div className="rounded-md bg-orange-100 p-3 sm:p-4 w-2/3">
      {loading ? (
        <div className="flex flex-col w-full animate-pulse gap-2">
          <div className="flex flex-row gap-2 w-full">
@@ -20,8 +20,10 @@ export const AnswerBubble = ({ text, loading }: AnswerBubbleProps) => {
          </div>
        </div>
      ) : (
        <div className="flex flex-col">
          <ReactMarkdown>{"🐈: " + text}</ReactMarkdown>
        <div className=" flex flex-col break-words overflow-wrap-anywhere text-sm sm:text-base [&>*]:break-words">
          <ReactMarkdown>
            {"🐈: " + text}
          </ReactMarkdown>
        </div>
      )}
    </div>
310
services/raggr/raggr-frontend/src/components/ChatScreen.tsx
Normal file
@@ -0,0 +1,310 @@
import { useEffect, useState, useRef } from "react";
import { conversationService } from "../api/conversationService";
import { QuestionBubble } from "./QuestionBubble";
import { AnswerBubble } from "./AnswerBubble";
import { MessageInput } from "./MessageInput";
import { ConversationList } from "./ConversationList";
import catIcon from "../assets/cat.png";

type Message = {
  text: string;
  speaker: "simba" | "user";
};

type QuestionAnswer = {
  question: string;
  answer: string;
};

type Conversation = {
  title: string;
  id: string;
};

type ChatScreenProps = {
  setAuthenticated: (isAuth: boolean) => void;
};

export const ChatScreen = ({ setAuthenticated }: ChatScreenProps) => {
  const [query, setQuery] = useState<string>("");
  const [answer, setAnswer] = useState<string>("");
  const [simbaMode, setSimbaMode] = useState<boolean>(false);
  const [questionsAnswers, setQuestionsAnswers] = useState<QuestionAnswer[]>(
    [],
  );
  const [messages, setMessages] = useState<Message[]>([]);
  const [conversations, setConversations] = useState<Conversation[]>([
    { title: "simba meow meow", id: "uuid" },
  ]);
  const [showConversations, setShowConversations] = useState<boolean>(false);
  const [selectedConversation, setSelectedConversation] =
    useState<Conversation | null>(null);
  const [sidebarCollapsed, setSidebarCollapsed] = useState<boolean>(false);
  const [isLoading, setIsLoading] = useState<boolean>(false);

  const messagesEndRef = useRef<HTMLDivElement>(null);
  const simbaAnswers = ["meow.", "hiss...", "purrrrrr", "yowOWROWWowowr"];

  const scrollToBottom = () => {
    messagesEndRef.current?.scrollIntoView({ behavior: "smooth" });
  };

  const handleSelectConversation = (conversation: Conversation) => {
    setShowConversations(false);
    setSelectedConversation(conversation);
    const loadMessages = async () => {
      try {
        const fetchedConversation = await conversationService.getConversation(
          conversation.id,
        );
        setMessages(
          fetchedConversation.messages.map((message) => ({
            text: message.text,
            speaker: message.speaker,
          })),
        );
      } catch (error) {
        console.error("Failed to load messages:", error);
      }
    };
    loadMessages();
  };

  const loadConversations = async () => {
    try {
      const fetchedConversations =
        await conversationService.getAllConversations();
      const parsedConversations = fetchedConversations.map((conversation) => ({
        id: conversation.id,
        title: conversation.name,
      }));
      setConversations(parsedConversations);
      setSelectedConversation(parsedConversations[0]);
      console.log(parsedConversations);
      console.log("JELLYFISH@");
    } catch (error) {
      console.error("Failed to load messages:", error);
    }
  };

  const handleCreateNewConversation = async () => {
    const newConversation = await conversationService.createConversation();
    await loadConversations();
    setSelectedConversation({
      title: newConversation.name,
      id: newConversation.id,
    });
  };

  useEffect(() => {
    loadConversations();
  }, []);

  useEffect(() => {
    scrollToBottom();
  }, [messages]);

  useEffect(() => {
    const loadMessages = async () => {
      console.log(selectedConversation);
      console.log("JELLYFISH");
      if (selectedConversation == null) return;
      try {
        const conversation = await conversationService.getConversation(
          selectedConversation.id,
        );
        // Update the conversation title in case it changed
        setSelectedConversation({
          id: conversation.id,
          title: conversation.name,
        });
        setMessages(
          conversation.messages.map((message) => ({
            text: message.text,
            speaker: message.speaker,
          })),
        );
      } catch (error) {
        console.error("Failed to load messages:", error);
      }
    };
    loadMessages();
  }, [selectedConversation?.id]);

  const handleQuestionSubmit = async () => {
    if (!query.trim() || isLoading) return; // Don't submit empty messages or while loading

    const currMessages = messages.concat([{ text: query, speaker: "user" }]);
    setMessages(currMessages);
    setQuery(""); // Clear input immediately after submission
    setIsLoading(true);

    if (simbaMode) {
      console.log("simba mode activated");
      const randomIndex = Math.floor(Math.random() * simbaAnswers.length);
      const randomElement = simbaAnswers[randomIndex];
      setAnswer(randomElement);
      setQuestionsAnswers(
        questionsAnswers.concat([
          {
            question: query,
            answer: randomElement,
          },
        ]),
      );
      setIsLoading(false);
      return;
    }

    try {
      const result = await conversationService.sendQuery(
        query,
        selectedConversation.id,
      );
      setQuestionsAnswers(
        questionsAnswers.concat([{ question: query, answer: result.response }]),
      );
      setMessages(
        currMessages.concat([{ text: result.response, speaker: "simba" }]),
      );
    } catch (error) {
      console.error("Failed to send query:", error);
      // If session expired, redirect to login
      if (error instanceof Error && error.message.includes("Session expired")) {
        setAuthenticated(false);
      }
    } finally {
      setIsLoading(false);
    }
  };

  const handleQueryChange = (event: React.ChangeEvent<HTMLTextAreaElement>) => {
    setQuery(event.target.value);
  };

  const handleKeyDown = (event: React.KeyboardEvent<HTMLTextAreaElement>) => {
    // Submit on Enter, but allow Shift+Enter for new line
    if (event.key === "Enter" && !event.shiftKey) {
      event.preventDefault();
      handleQuestionSubmit();
    }
  };

  return (
    <div className="h-screen flex flex-row bg-[#F9F5EB]">
      {/* Sidebar - Expanded */}
      <aside
        className={`hidden md:flex md:flex-col bg-[#F9F5EB] border-r border-gray-200 p-4 overflow-y-auto transition-all duration-300 ${sidebarCollapsed ? "w-20" : "w-64"}`}
      >
        {!sidebarCollapsed ? (
          <div className="bg-[#F9F5EB]">
            <div className="flex flex-row items-center gap-2 mb-6">
              <img
                src={catIcon}
                alt="Simba"
                className="cursor-pointer hover:opacity-80"
                onClick={() => setSidebarCollapsed(true)}
              />
              <h2 className="text-3xl bg-[#F9F5EB] font-semibold">asksimba!</h2>
            </div>
            <ConversationList
              conversations={conversations}
              onCreateNewConversation={handleCreateNewConversation}
              onSelectConversation={handleSelectConversation}
            />
            <div className="mt-auto pt-4">
              <button
                className="w-full p-2 border border-red-400 bg-red-200 hover:bg-red-400 cursor-pointer rounded-md text-sm"
                onClick={() => setAuthenticated(false)}
              >
                logout
              </button>
            </div>
          </div>
        ) : (
          <div className="flex flex-col items-center gap-4">
            <img
              src={catIcon}
              alt="Simba"
              className="cursor-pointer hover:opacity-80"
              onClick={() => setSidebarCollapsed(false)}
            />
          </div>
        )}
      </aside>

      {/* Main chat area */}
      <div className="flex-1 flex flex-col h-screen overflow-hidden">
        {/* Mobile header */}
        <header className="md:hidden flex flex-row justify-between items-center gap-3 p-4 border-b border-gray-200 bg-white">
          <div className="flex flex-row items-center gap-2">
            <img src={catIcon} alt="Simba" className="w-10 h-10" />
            <h1 className="text-xl">asksimba!</h1>
          </div>
          <div className="flex flex-row gap-2">
            <button
              className="p-2 border border-green-400 bg-green-200 hover:bg-green-400 cursor-pointer rounded-md text-sm"
              onClick={() => setShowConversations(!showConversations)}
            >
              {showConversations ? "hide" : "show"}
            </button>
            <button
              className="p-2 border border-red-400 bg-red-200 hover:bg-red-400 cursor-pointer rounded-md text-sm"
              onClick={() => setAuthenticated(false)}
            >
              logout
            </button>
          </div>
        </header>

        {/* Messages area */}
        {selectedConversation && (
          <div className="sticky top-0 mx-auto w-full">
            <div className="bg-[#F9F5EB] text-black px-6 w-full py-3">
              <h2 className="text-lg font-semibold">
                {selectedConversation.title || "Untitled Conversation"}
              </h2>
            </div>
          </div>
        )}
        <div className="flex-1 overflow-y-auto relative px-4 py-6">
          {/* Floating conversation name */}

          <div className="max-w-2xl mx-auto flex flex-col gap-4">
            {showConversations && (
              <div className="md:hidden">
                <ConversationList
                  conversations={conversations}
                  onCreateNewConversation={handleCreateNewConversation}
                  onSelectConversation={handleSelectConversation}
                />
              </div>
            )}
            {messages.map((msg, index) => {
              if (msg.speaker === "simba") {
                return <AnswerBubble key={index} text={msg.text} />;
              }
              return <QuestionBubble key={index} text={msg.text} />;
            })}
            {isLoading && <AnswerBubble text="" loading={true} />}
            <div ref={messagesEndRef} />
          </div>
        </div>

        {/* Input area */}
        <footer className="p-4 bg-[#F9F5EB]">
          <div className="max-w-2xl mx-auto">
            <MessageInput
              query={query}
              handleQueryChange={handleQueryChange}
              handleKeyDown={handleKeyDown}
              handleQuestionSubmit={handleQuestionSubmit}
              setSimbaMode={setSimbaMode}
              isLoading={isLoading}
            />
          </div>
        </footer>
      </div>
    </div>
  );
};
@@ -22,8 +22,14 @@ export const ConversationList = ({
   useEffect(() => {
     const loadConversations = async () => {
       try {
-        const fetchedConversations =
+        let fetchedConversations =
           await conversationService.getAllConversations();
+
+        if (fetchedConversations.length === 0) {
+          await conversationService.createConversation();
+          fetchedConversations =
+            await conversationService.getAllConversations();
+        }
         setConversations(
           fetchedConversations.map((conversation) => ({
             id: conversation.id,
@@ -38,22 +44,25 @@ export const ConversationList = ({
   }, []);
 
   return (
-    <div className="bg-indigo-300 rounded-md p-3 flex flex-col">
+    <div className="bg-indigo-300 rounded-md p-3 sm:p-4 flex flex-col gap-1">
       {conversations.map((conversation) => {
         return (
           <div
-            className="border-blue-400 bg-indigo-300 hover:bg-indigo-200 cursor-pointer rounded-md p-2"
+            key={conversation.id}
+            className="border-blue-400 bg-indigo-300 hover:bg-indigo-200 cursor-pointer rounded-md p-3 min-h-[44px] flex items-center"
             onClick={() => onSelectConversation(conversation)}
           >
-            <p>{conversation.title}</p>
+            <p className="text-sm sm:text-base truncate w-full">
+              {conversation.title}
+            </p>
           </div>
         );
       })}
       <div
-        className="border-blue-400 bg-indigo-300 hover:bg-indigo-200 cursor-pointer rounded-md p-2"
+        className="border-blue-400 bg-indigo-300 hover:bg-indigo-200 cursor-pointer rounded-md p-3 min-h-[44px] flex items-center"
         onClick={() => onCreateNewConversation()}
       >
-        <p> + Start a new thread</p>
+        <p className="text-sm sm:text-base"> + Start a new thread</p>
       </div>
     </div>
   );
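The hunk above assumes a `conversationService` with two async methods: fetch all conversations, and create one when none exist yet. A rough sketch of the shape those calls imply; the endpoint paths, auth header, and response fields are assumptions, not taken from the diff:

```ts
// Hypothetical sketch of conversationService, inferred from the calls above.
// URLs, headers, and response shapes are illustrative assumptions.
type Conversation = { id: number; title: string };

export const conversationService = {
  async getAllConversations(): Promise<Conversation[]> {
    const res = await fetch("/api/conversations", {
      headers: {
        Authorization: `Bearer ${localStorage.getItem("access_token")}`,
      },
    });
    if (!res.ok) throw new Error(`Failed to load conversations: ${res.status}`);
    return res.json();
  },

  async createConversation(): Promise<Conversation> {
    const res = await fetch("/api/conversations", {
      method: "POST",
      headers: {
        Authorization: `Bearer ${localStorage.getItem("access_token")}`,
      },
    });
    if (!res.ok) throw new Error(`Failed to create conversation: ${res.status}`);
    return res.json();
  },
};
```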
130
services/raggr/raggr-frontend/src/components/LoginScreen.tsx
Normal file
@@ -0,0 +1,130 @@
import { useState, useEffect } from "react";
import { userService } from "../api/userService";
import { oidcService } from "../api/oidcService";

type LoginScreenProps = {
  setAuthenticated: (isAuth: boolean) => void;
};

export const LoginScreen = ({ setAuthenticated }: LoginScreenProps) => {
  const [error, setError] = useState<string>("");
  const [isChecking, setIsChecking] = useState<boolean>(true);
  const [isLoggingIn, setIsLoggingIn] = useState<boolean>(false);

  useEffect(() => {
    const initAuth = async () => {
      // First, check for OIDC callback parameters
      const callbackParams = oidcService.getCallbackParamsFromURL();

      if (callbackParams) {
        // Handle OIDC callback
        try {
          setIsLoggingIn(true);
          const result = await oidcService.handleCallback(
            callbackParams.code,
            callbackParams.state
          );

          // Store tokens
          localStorage.setItem("access_token", result.access_token);
          localStorage.setItem("refresh_token", result.refresh_token);

          // Clear URL parameters
          oidcService.clearCallbackParams();

          setAuthenticated(true);
          setIsChecking(false);
          return;
        } catch (err) {
          console.error("OIDC callback error:", err);
          setError("Login failed. Please try again.");
          oidcService.clearCallbackParams();
          setIsLoggingIn(false);
          setIsChecking(false);
          return;
        }
      }

      // Check if user is already authenticated
      const isValid = await userService.validateToken();
      if (isValid) {
        setAuthenticated(true);
      }
      setIsChecking(false);
    };

    initAuth();
  }, [setAuthenticated]);

  const handleOIDCLogin = async () => {
    try {
      setIsLoggingIn(true);
      setError("");

      // Get authorization URL from backend
      const authUrl = await oidcService.initiateLogin();

      // Redirect to Authelia
      window.location.href = authUrl;
    } catch (err) {
      setError("Failed to initiate login. Please try again.");
      console.error("OIDC login error:", err);
      setIsLoggingIn(false);
    }
  };

  // Show loading state while checking authentication or processing callback
  if (isChecking || isLoggingIn) {
    return (
      <div className="h-screen bg-opacity-20">
        <div className="bg-white/85 h-screen flex items-center justify-center">
          <div className="text-center">
            <p className="text-lg sm:text-xl">
              {isLoggingIn ? "Logging in..." : "Checking authentication..."}
            </p>
          </div>
        </div>
      </div>
    );
  }

  return (
    <div className="h-screen bg-opacity-20">
      <div className="bg-white/85 h-screen">
        <div className="flex flex-row justify-center py-4">
          <div className="flex flex-col gap-4 w-full px-4 sm:w-11/12 sm:max-w-2xl lg:max-w-4xl sm:px-0">
            <div className="flex flex-col gap-4">
              <div className="flex flex-grow justify-center w-full bg-amber-400 p-2">
                <h1 className="text-base sm:text-xl font-bold text-center">
                  I AM LOOKING FOR A DESIGNER. THIS APP WILL REMAIN UGLY UNTIL A
                  DESIGNER COMES.
                </h1>
              </div>
              <header className="flex flex-row justify-center gap-2 grow sticky top-0 z-10 bg-white">
                <h1 className="text-2xl sm:text-3xl">ask simba!</h1>
              </header>

              {error && (
                <div className="text-red-600 font-semibold text-sm sm:text-base bg-red-50 p-3 rounded-md">
                  {error}
                </div>
              )}

              <div className="text-center text-sm sm:text-base text-gray-600 py-2">
                Click below to login with Authelia
              </div>
            </div>

            <button
              className="p-3 sm:p-4 min-h-[44px] border border-blue-400 bg-blue-200 hover:bg-blue-400 cursor-pointer rounded-md flex-grow text-sm sm:text-base font-semibold"
              onClick={handleOIDCLogin}
              disabled={isLoggingIn}
            >
              {isLoggingIn ? "Redirecting..." : "Login with Authelia"}
            </button>
          </div>
        </div>
      </div>
    </div>
  );
};
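LoginScreen leans on an `oidcService` wrapper that is not part of this diff. A minimal sketch of the interface these calls imply, assuming the backend exposes login and callback endpoints under `/api/oidc/`; the paths and response shapes are guesses, only the method names and token fields come from the component above:

```ts
// Hypothetical sketch of the oidcService interface used by LoginScreen.
// Endpoint paths and response shapes are illustrative assumptions.
type TokenResult = { access_token: string; refresh_token: string };

export const oidcService = {
  // Ask the backend for the Authelia authorization URL to redirect to.
  async initiateLogin(): Promise<string> {
    const res = await fetch("/api/oidc/login");
    if (!res.ok) throw new Error(`OIDC login init failed: ${res.status}`);
    const { authorization_url } = await res.json();
    return authorization_url;
  },

  // Read ?code=...&state=... from the current URL after the redirect back.
  getCallbackParamsFromURL(): { code: string; state: string } | null {
    const params = new URLSearchParams(window.location.search);
    const code = params.get("code");
    const state = params.get("state");
    return code && state ? { code, state } : null;
  },

  // Exchange the authorization code for application tokens via the backend.
  async handleCallback(code: string, state: string): Promise<TokenResult> {
    const res = await fetch("/api/oidc/callback", {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ code, state }),
    });
    if (!res.ok) throw new Error(`OIDC callback failed: ${res.status}`);
    return res.json();
  },

  // Strip the code/state query parameters without reloading the page.
  clearCallbackParams(): void {
    window.history.replaceState({}, "", window.location.pathname);
  },
};
```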
@@ -0,0 +1,56 @@
import { useEffect, useState, useRef } from "react";

type MessageInputProps = {
  handleQueryChange: (event: React.ChangeEvent<HTMLTextAreaElement>) => void;
  handleKeyDown: (event: React.KeyboardEvent<HTMLTextAreaElement>) => void;
  handleQuestionSubmit: () => void;
  setSimbaMode: (enabled: boolean) => void;
  query: string;
  isLoading: boolean;
};

export const MessageInput = ({
  query,
  handleKeyDown,
  handleQueryChange,
  handleQuestionSubmit,
  setSimbaMode,
  isLoading,
}: MessageInputProps) => {
  return (
    <div className="flex flex-col gap-4 sticky bottom-0 bg-[#3D763A] p-6 rounded-xl">
      <div className="flex flex-row justify-between grow">
        <textarea
          className="p-3 sm:p-4 border border-blue-200 rounded-md grow bg-[#F9F5EB] min-h-[44px] resize-y"
          onChange={handleQueryChange}
          onKeyDown={handleKeyDown}
          value={query}
          rows={2}
          placeholder="Type your message... (Press Enter to send, Shift+Enter for new line)"
        />
      </div>
      <div className="flex flex-row justify-between gap-2 grow">
        <button
          className={`p-3 sm:p-4 min-h-[44px] border border-blue-400 rounded-md flex-grow text-sm sm:text-base ${
            isLoading
              ? "bg-gray-400 cursor-not-allowed opacity-50"
              : "bg-[#EDA541] hover:bg-blue-400 cursor-pointer"
          }`}
          onClick={() => handleQuestionSubmit()}
          type="submit"
          disabled={isLoading}
        >
          {isLoading ? "Sending..." : "Submit"}
        </button>
      </div>
      <div className="flex flex-row justify-center gap-2 grow items-center">
        <input
          type="checkbox"
          onChange={(event) => setSimbaMode(event.target.checked)}
          className="w-5 h-5 cursor-pointer"
        />
        <p className="text-sm sm:text-base">simba mode?</p>
      </div>
    </div>
  );
};
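The placeholder text promises Enter-to-send with Shift+Enter for a newline, which the parent's `handleKeyDown` prop has to implement. That handler is not shown in this diff; a sketch of what it could look like:

```tsx
// Hypothetical handleKeyDown matching the placeholder's promise:
// Enter submits, Shift+Enter inserts a newline. Not taken from the diff.
const handleKeyDown = (event: React.KeyboardEvent<HTMLTextAreaElement>) => {
  if (event.key === "Enter" && !event.shiftKey) {
    event.preventDefault(); // keep the newline out of the textarea
    handleQuestionSubmit();
  }
};
```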
@@ -0,0 +1,11 @@
type QuestionBubbleProps = {
  text: string;
};

export const QuestionBubble = ({ text }: QuestionBubbleProps) => {
  return (
    <div className="w-2/3 rounded-md bg-stone-200 p-3 sm:p-4 break-words overflow-wrap-anywhere text-sm sm:text-base ml-auto">
      🤦: {text}
    </div>
  );
};
(Two binary image files changed; sizes unchanged at 3.4 MiB and 2.1 MiB.)
2877
services/raggr/raggr-frontend/yarn.lock
Normal file
25
services/raggr/startup-dev.sh
Executable file
@@ -0,0 +1,25 @@
#!/bin/bash
set -e

echo "Initializing directories..."
mkdir -p /app/data/chromadb

echo "Rebuilding frontend..."
cd /app/raggr-frontend
yarn build
cd /app

echo "Setting up database..."
# Give PostgreSQL a moment to be ready (healthcheck in docker-compose handles this)
sleep 3

if ls migrations/models/0_*.py 1> /dev/null 2>&1; then
  echo "Running database migrations..."
  aerich upgrade
else
  echo "No migrations found, initializing database..."
  aerich init-db
fi

echo "Starting Flask application in debug mode..."
python app.py
39
services/raggr/test_query.py
Normal file
@@ -0,0 +1,39 @@
#!/usr/bin/env python3
"""Test the query_vector_store function."""

import asyncio
import os

from dotenv import load_dotenv

from blueprints.rag.logic import query_vector_store

# Load .env from the root directory
root_dir = os.path.abspath(os.path.join(os.path.dirname(__file__), "../.."))
env_path = os.path.join(root_dir, ".env")
load_dotenv(env_path)


async def test_query(query: str):
    """Test a query against the vector store."""
    print(f"Query: {query}\n")
    result, docs = await query_vector_store(query)
    print(f"Found {len(docs)} documents\n")
    print("Serialized result:")
    print(result)
    print("\n" + "=" * 80 + "\n")


async def main():
    queries = [
        "What is Simba's weight?",
        "What medications is Simba taking?",
        "Tell me about Simba's recent vet visits",
    ]

    for query in queries:
        await test_query(query)


if __name__ == "__main__":
    asyncio.run(main())