14 Commits

Author SHA1 Message Date
Ryan Chen
3ffc95a1b0 Switch to OpenAI embeddings for ChromaDB
Replace Ollama embedding function with OpenAI's text-embedding-3-small
model for improved embedding quality and consistency.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-02 21:05:17 -04:00
Ryan Chen
c5091dc07a Configure Docker for Linux host networking and add startup reindex
- Switch to host network mode for direct access to Ollama on host
- Update OLLAMA_URL to use localhost:11434
- Add startup.sh script to trigger reindex before app starts
- Update Dockerfile to execute startup script

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-02 21:02:55 -04:00
Ryan Chen
c140758560 asfd 2025-10-02 20:57:19 -04:00
Ryan Chen
ab3a0eb442 Reorganize Dockerfile to copy application code before frontend build
Move Python application code copy before frontend build step to improve
Dockerfile organization and ensure all app code is available earlier.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-02 20:48:52 -04:00
Ryan Chen
c619d78922 Adding axios 2025-10-02 20:46:10 -04:00
Ryan Chen
c20ae0a4b9 Add missing @tailwindcss/postcss dependency to frontend
Fix Docker build failure by adding @tailwindcss/postcss package
required by postcss.config.mjs

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-02 20:44:49 -04:00
Ryan Chen
26cc01b58b Add frontend build step to Dockerfile
Install Node.js and Yarn, then build the raggr-frontend during Docker image build process.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-02 20:42:01 -04:00
Ryan Chen
746b60e070 Switch to using torrtle/simbarag:latest Docker image
Replace local build with pre-built image from Docker Hub

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-02 20:39:36 -04:00
Ryan Chen
577c9144ac Switch Dockerfile to use uv for dependency management
- Install uv via official installer script
- Replace pip with uv pip install --system
- Add uv to PATH for container usage

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-02 20:36:45 -04:00
Ryan Chen
2b2891bd79 Fix and add missing dependencies to pyproject.toml
- Fix dotenv package name to python-dotenv
- Add pillow for image processing
- Add pymupdf for PDF handling

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-02 20:34:59 -04:00
Ryan Chen
03b033e9a4 Configure ollama to use external host instead of docker service
- Update all ollama clients to use configurable OLLAMA_URL environment variable
- Remove ollama service from docker-compose.yml to use external ollama instance
- Configure docker-compose to connect to host ollama via 172.17.0.1:11434 (Linux) or host.docker.internal (macOS/Windows)
- Add cross-platform compatibility with extra_hosts mapping
- Update embedding function fallback URL for consistency

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-02 20:29:48 -04:00
Ryan Chen
a640ae5fed Docker stuff 2025-10-02 20:21:48 -04:00
Ryan Chen
99c98b7e42 yeet 2025-10-02 19:21:24 -04:00
ryan
a69f7864f3 Merge pull request 'yeat' (#3) from rc/9-metadata-date-filtering into main
Reviewed-on: #3
2025-08-07 17:43:59 -04:00
23 changed files with 3625 additions and 38 deletions

16
.dockerignore Normal file

@@ -0,0 +1,16 @@
.git
.gitignore
README.md
.env
.DS_Store
chromadb/
chroma_db/
raggr-frontend/node_modules/
__pycache__/
*.pyc
*.pyo
*.pyd
.Python
.venv/
venv/
.pytest_cache/

1
.python-version Normal file

@@ -0,0 +1 @@
3.13

46
Dockerfile Normal file

@@ -0,0 +1,46 @@
FROM python:3.13-slim

WORKDIR /app

# Install system dependencies, Node.js, Yarn, and uv
RUN apt-get update && apt-get install -y \
    build-essential \
    curl \
    && curl -fsSL https://deb.nodesource.com/setup_20.x | bash - \
    && apt-get install -y nodejs \
    && npm install -g yarn \
    && rm -rf /var/lib/apt/lists/* \
    && curl -LsSf https://astral.sh/uv/install.sh | sh

# Add uv to PATH
ENV PATH="/root/.local/bin:$PATH"

# Copy dependency files
COPY pyproject.toml ./

# Install Python dependencies using uv
RUN uv pip install --system -e .

# Copy application code
COPY *.py ./
COPY startup.sh ./
RUN chmod +x startup.sh

# Copy frontend code and build
COPY raggr-frontend ./raggr-frontend
WORKDIR /app/raggr-frontend
RUN yarn install && yarn build
WORKDIR /app

# Create ChromaDB directory
RUN mkdir -p /app/chromadb

# Expose port
EXPOSE 8080

# Set environment variables
ENV PYTHONPATH=/app
ENV CHROMADB_PATH=/app/chromadb

# Run the startup script
CMD ["./startup.sh"]

37
app.py Normal file

@@ -0,0 +1,37 @@
import os

from flask import Flask, request, jsonify, render_template, send_from_directory

from main import consult_simba_oracle

app = Flask(__name__, static_folder="raggr-frontend/dist/static", template_folder="raggr-frontend/dist")


# Serve React static files
@app.route('/static/<path:filename>')
def static_files(filename):
    return send_from_directory(app.static_folder, filename)


# Serve the React app for all routes (catch-all)
@app.route('/', defaults={'path': ''})
@app.route('/<path:path>')
def serve_react_app(path):
    if path and os.path.exists(os.path.join(app.template_folder, path)):
        return send_from_directory(app.template_folder, path)
    return render_template('index.html')


@app.route("/api/query", methods=["POST"])
def query():
    data = request.get_json()
    query = data.get("query")
    return jsonify({"response": consult_simba_oracle(query)})


@app.route("/api/ingest", methods=["POST"])
def webhook():
    data = request.get_json()
    print(data)
    return jsonify({"status": "received"})


if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080, debug=True)
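A minimal smoke test for the `/api/query` route above, assuming the Flask app is listening on port 8080 as configured (request body is illustrative):

```bash
# The "query" key matches data.get("query") in the handler above.
curl -s -X POST http://localhost:8080/api/query \
  -H "Content-Type: application/json" \
  -d '{"query": "When was Simba last seen by the vet?"}'
```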

View File

@@ -4,8 +4,8 @@ import re
 from typing import Union
 from uuid import UUID, uuid4
-from chromadb.utils.embedding_functions.ollama_embedding_function import (
-    OllamaEmbeddingFunction,
+from chromadb.utils.embedding_functions.openai_embedding_function import (
+    OpenAIEmbeddingFunction,
 )
 from dotenv import load_dotenv
@@ -80,9 +80,9 @@ class Chunk:
 class Chunker:
-    embedding_fx = OllamaEmbeddingFunction(
-        url=os.getenv("OLLAMA_URL", ""),
-        model_name="mxbai-embed-large",
+    embedding_fx = OpenAIEmbeddingFunction(
+        api_key=os.getenv("OPENAI_API_KEY"),
+        model_name="text-embedding-3-small",
     )

     def __init__(self, collection) -> None:
@@ -96,7 +96,7 @@ class Chunker:
     ) -> list[Chunk]:
         doc_uuid = uuid4()
-        chunk_size = min(chunk_size, len(document))
+        chunk_size = min(chunk_size, len(document)) or 1
         chunks = []
         num_chunks = ceil(len(document) / chunk_size)
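The `or 1` guard matters because `num_chunks = ceil(len(document) / chunk_size)` divides by `chunk_size`: for an empty document, `min(chunk_size, len(document))` collapses to 0 and the division would raise ZeroDivisionError. A standalone sketch of the fixed behavior (hypothetical helper, not from the repo):

```python
from math import ceil

def num_chunks(document: str, chunk_size: int = 512) -> int:
    # min() caps chunk_size at the document length; "or 1" keeps an empty
    # document from driving chunk_size to 0 and crashing the ceil division.
    chunk_size = min(chunk_size, len(document)) or 1
    return ceil(len(document) / chunk_size)

print(num_chunks(""))        # -> 0, no ZeroDivisionError
print(num_chunks("a" * 10))  # -> 1
```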

View File

@@ -12,6 +12,9 @@ from request import PaperlessNGXService
 load_dotenv()

+# Configure ollama client with URL from environment or default to localhost
+ollama_client = ollama.Client(host=os.getenv("OLLAMA_URL", "http://localhost:11434"))
+
 parser = argparse.ArgumentParser(description="use llm to clean documents")
 parser.add_argument("document_id", type=str, help="questions about simba's health")
@@ -131,7 +134,7 @@ Someone will kill the innocent kittens if you don't extract the text exactly. So
 def summarize_pdf_image(filepaths: list[str]):
-    res = ollama.chat(
+    res = ollama_client.chat(
         model="gemma3:4b",
         messages=[
             {
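The visible hunk ends mid-call, so for orientation here is the general shape of an ollama-python vision request like this one; a sketch assuming the message attaches rendered PDF pages via the library's standard `images` field (file path is hypothetical):

```python
import ollama

client = ollama.Client(host="http://localhost:11434")
res = client.chat(
    model="gemma3:4b",
    messages=[
        {
            "role": "user",
            "content": "Extract the text from this document page exactly.",
            "images": ["/tmp/page-1.png"],  # hypothetical rendered PDF page
        }
    ],
)
print(res["message"]["content"])
```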

17
docker-compose.yml Normal file

@@ -0,0 +1,17 @@
version: '3.8'

services:
  raggr:
    image: torrtle/simbarag:latest
    network_mode: host
    environment:
      - PAPERLESS_TOKEN=${PAPERLESS_TOKEN}
      - BASE_URL=${BASE_URL}
      - OLLAMA_URL=${OLLAMA_URL:-http://localhost:11434}
      - CHROMADB_PATH=/app/chromadb
      - OPENAI_API_KEY=${OPENAI_API_KEY}
    volumes:
      - chromadb_data:/app/chromadb

volumes:
  chromadb_data:
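Because `network_mode: host` shares the host's network namespace, `localhost:11434` inside the container is the host's Ollama. A quick hypothetical check from inside the container (curl is available, since the Dockerfile installs it):

```bash
# /api/tags lists the installed models; any JSON response proves the
# container can reach the host's Ollama over the shared network.
curl -s http://localhost:11434/api/tags
```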

116
main.py

@@ -6,6 +6,7 @@ from typing import Any, Union
 import argparse
 import chromadb
 import ollama
+from openai import OpenAI
 from request import PaperlessNGXService
@@ -17,6 +18,9 @@ from dotenv import load_dotenv
 load_dotenv()

+# Configure ollama client with URL from environment or default to localhost
+ollama_client = ollama.Client(host=os.getenv("OLLAMA_URL", "http://localhost:11434"))
+
 client = chromadb.PersistentClient(path=os.getenv("CHROMADB_PATH", ""))
 simba_docs = client.get_or_create_collection(name="simba_docs")
 feline_vet_lookup = client.get_or_create_collection(name="feline_vet_lookup")
@@ -29,9 +33,13 @@ parser.add_argument("query", type=str, help="questions about simba's health")
 parser.add_argument(
     "--reindex", action="store_true", help="re-index the simba documents"
 )
+parser.add_argument(
+    "--index", help="index a file"
+)

 ppngx = PaperlessNGXService()
+openai_client = OpenAI()

 def index_using_pdf_llm():
     files = ppngx.get_data()
@@ -39,6 +47,7 @@ def index_using_pdf_llm():
         document_id = file["id"]
         pdf_path = ppngx.download_pdf_from_id(id=document_id)
         image_paths = pdf_to_image(filepath=pdf_path)
+        print(f"summarizing {file}")
         generated_summary = summarize_pdf_image(filepaths=image_paths)
         file["content"] = generated_summary
@@ -68,36 +77,78 @@ def chunk_data(docs: list[dict[str, Union[str, Any]]], collection):
     print(docs)
     texts: list[str] = [doc["content"] for doc in docs]
     for index, text in enumerate(texts):
+        print(docs[index]["original_file_name"])
         metadata = {
             "created_date": date_to_epoch(docs[index]["created_date"]),
+            "filename": docs[index]["original_file_name"]
         }
         chunker.chunk_document(
             document=text,
             metadata=metadata,
         )

+def chunk_text(texts: list[str], collection):
+    chunker = Chunker(collection)
+    for index, text in enumerate(texts):
+        metadata = {}
+        chunker.chunk_document(
+            document=text,
+            metadata=metadata,
+        )
+
 def consult_oracle(input: str, collection):
-    print(input)
+    import time
+    start_time = time.time()

     # Ask
-    qg = QueryGenerator()
-    metadata_filter = qg.get_query("input")
-    print(metadata_filter)
+    # print("Starting query generation")
+    # qg_start = time.time()
+    # qg = QueryGenerator()
+    # metadata_filter = qg.get_query(input)
+    # qg_end = time.time()
+    # print(f"Query generation took {qg_end - qg_start:.2f} seconds")
+    # print(metadata_filter)
+    print("Starting embedding generation")
+    embedding_start = time.time()
+    embeddings = Chunker.embedding_fx(input=[input])
+    embedding_end = time.time()
+    print(f"Embedding generation took {embedding_end - embedding_start:.2f} seconds")
+    print("Starting collection query")
+    query_start = time.time()
     results = collection.query(
-        query_texts=[input],
+        query_embeddings=embeddings,
-        where=metadata_filter,
+        #where=metadata_filter,
     )
+    print(results)
+    query_end = time.time()
+    print(f"Collection query took {query_end - query_start:.2f} seconds")

     # Generate
-    output = ollama.generate(
-        model="gemma3n:e4b",
-        prompt=f"You are a helpful assistant that understandings veterinary terms. Using the following data, help answer the user's query by providing as many details as possible. Using this data: {results}. Respond to this prompt: {input}",
-    )
-    print(output["response"])
+    print("Starting LLM generation")
+    llm_start = time.time()
+    # output = ollama_client.generate(
+    #     model="gemma3n:e4b",
+    #     prompt=f"You are a helpful assistant that understandings veterinary terms. Using the following data, help answer the user's query by providing as many details as possible. Using this data: {results}. Respond to this prompt: {input}",
+    # )
+    response = openai_client.chat.completions.create(
+        model="gpt-4o-mini",
+        messages=[
+            {"role": "system", "content": "You are a helpful assistant that understands veterinary terms."},
+            {"role": "user", "content": f"Using the following data, help answer the user's query by providing as many details as possible. Using this data: {results}. Respond to this prompt: {input}"}
+        ]
+    )
+    llm_end = time.time()
+    print(f"LLM generation took {llm_end - llm_start:.2f} seconds")
+
+    total_time = time.time() - start_time
+    print(f"Total consult_oracle execution took {total_time:.2f} seconds")
+    return response.choices[0].message.content

 def paperless_workflow(input):
@@ -109,24 +160,47 @@ def paperless_workflow(input):
     consult_oracle(input, simba_docs)

 def consult_simba_oracle(input: str):
     return consult_oracle(
         input=input,
         collection=simba_docs,
     )

 if __name__ == "__main__":
     args = parser.parse_args()
     if args.reindex:
-        # logging.info(msg="Fetching documents from Paperless-NGX")
-        # ppngx = PaperlessNGXService()
-        # docs = ppngx.get_data()
-        # logging.info(msg=f"Fetched {len(docs)} documents")
+        print("Fetching documents from Paperless-NGX")
+        ppngx = PaperlessNGXService()
+        docs = ppngx.get_data()
+        print(docs)
+        print(f"Fetched {len(docs)} documents")
-        #
-        # logging.info(msg="Chunking documents now ...")
-        # chunk_data(docs, collection=simba_docs)
-        # logging.info(msg="Done chunking documents")
-        index_using_pdf_llm()
+        print("Chunking documents now ...")
+        chunk_data(docs, collection=simba_docs)
+        print("Done chunking documents")
+        # index_using_pdf_llm()
+    if args.index:
+        with open(args.index) as file:
+            extension = args.index.split(".")[-1]
+            if extension == "pdf":
+                pdf_path = ppngx.download_pdf_from_id(id=document_id)
+                image_paths = pdf_to_image(filepath=pdf_path)
+                print(f"summarizing {file}")
+                generated_summary = summarize_pdf_image(filepaths=image_paths)
+            elif extension in [".md", ".txt"]:
+                chunk_text(texts=[file.readall()], collection=simba_docs)
     if args.query:
-        logging.info("Consulting oracle ...")
-        consult_oracle(
+        print("Consulting oracle ...")
+        print(consult_oracle(
             input=args.query,
             collection=simba_docs,
-        )
+        ))
     else:
         print("please provide a query")

pyproject.toml

@@ -4,4 +4,14 @@ version = "0.1.0"
description = "Add your description here"
readme = "README.md"
requires-python = ">=3.13"
dependencies = []
dependencies = [
"chromadb>=1.1.0",
"python-dotenv>=1.0.0",
"flask>=3.1.2",
"httpx>=0.28.1",
"ollama>=0.6.0",
"openai>=2.0.1",
"pydantic>=2.11.9",
"pillow>=10.0.0",
"pymupdf>=1.24.0",
]
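One way a list like this can be maintained going forward; `uv add` writes the pins to pyproject.toml and refreshes uv.lock (a sketch of the tooling, not necessarily how these entries were produced):

```bash
uv add chromadb python-dotenv flask httpx ollama openai pydantic pillow pymupdf
```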

View File

@@ -1,10 +1,16 @@
 import json
+import os
 from typing import Literal
 import datetime

-from ollama import chat, ChatResponse
+from ollama import chat, ChatResponse, Client
+from openai import OpenAI
 from pydantic import BaseModel, Field

+# Configure ollama client with URL from environment or default to localhost
+ollama_client = Client(host=os.getenv("OLLAMA_URL", "http://localhost:11434"))
+
 # This uses inferred filters — which means using LLM to create the metadata filters
@@ -27,11 +33,15 @@ class GeneratedQuery(BaseModel):
     fields: list[str]
     extracted_metadata_fields: str

+class Time(BaseModel):
+    time: int
+
 PROMPT = """
 You are an information specialist that processes user queries. The current year is 2025. The user queries are all about
 a cat, Simba, and its records. The types of records are listed below. Using the query, extract the
-the date range the user is trying to query. You should return the it as a JSON. The date tag is created_date. Return the date in epoch time
+the date range the user is trying to query. You should return it as a JSON. The date tag is created_date. Return the date in epoch time.
+If the created_date cannot be ascertained, set it to epoch time start.

 You have several operators at your disposal:
@@ -90,18 +100,31 @@ class QueryGenerator:
         return date.timestamp()

     def get_query(self, input: str):
-        response: ChatResponse = chat(
-            model="gemma3n:e4b",
-            messages=[
+        client = OpenAI()
+        print(input)
+        response = client.responses.parse(
+            model="gpt-4o",
+            input=[
                 {"role": "system", "content": PROMPT},
                 {"role": "user", "content": input},
             ],
-            format=GeneratedQuery.model_json_schema(),
+            text_format=Time,
         )
-        query = json.loads(
-            json.loads(response["message"]["content"])["extracted_metadata_fields"]
-        )
+        print(response)
+        query = json.loads(response.output_parsed.extracted_metadata_fields)
+        # response: ChatResponse = ollama_client.chat(
+        #     model="gemma3n:e4b",
+        #     messages=[
+        #         {"role": "system", "content": PROMPT},
+        #         {"role": "user", "content": input},
+        #     ],
+        #     format=GeneratedQuery.model_json_schema(),
+        # )
+        # query = json.loads(
+        #     json.loads(response["message"]["content"])["extracted_metadata_fields"]
+        # )
         date_key = list(query["created_date"].keys())[0]
         query["created_date"][date_key] = self.date_to_epoch(
             query["created_date"][date_key]

16
raggr-frontend/.gitignore vendored Normal file

@@ -0,0 +1,16 @@
# Local
.DS_Store
*.local
*.log*

# Dist
node_modules
dist/

# Profile
.rspack-profile-*/

# IDE
.vscode/*
!.vscode/extensions.json
.idea

36
raggr-frontend/README.md Normal file

@@ -0,0 +1,36 @@
# Rsbuild project

## Setup

Install the dependencies:

```bash
pnpm install
```

## Get started

Start the dev server, and the app will be available at [http://localhost:3000](http://localhost:3000).

```bash
pnpm dev
```

Build the app for production:

```bash
pnpm build
```

Preview the production build locally:

```bash
pnpm preview
```

## Learn more

To learn more about Rsbuild, check out the following resources:

- [Rsbuild documentation](https://rsbuild.rs) - explore Rsbuild features and APIs.
- [Rsbuild GitHub repository](https://github.com/web-infra-dev/rsbuild) - your feedback and contributions are welcome!

raggr-frontend/package.json

@@ -0,0 +1,26 @@
{
  "name": "raggr-frontend",
  "version": "1.0.0",
  "private": true,
  "type": "module",
  "scripts": {
    "build": "rsbuild build",
    "dev": "rsbuild dev --open",
    "preview": "rsbuild preview"
  },
  "dependencies": {
    "axios": "^1.12.2",
    "marked": "^16.3.0",
    "react": "^19.1.1",
    "react-dom": "^19.1.1",
    "react-markdown": "^10.1.0"
  },
  "devDependencies": {
    "@rsbuild/core": "^1.5.6",
    "@rsbuild/plugin-react": "^1.4.0",
    "@tailwindcss/postcss": "^4.0.0",
    "@types/react": "^19.1.13",
    "@types/react-dom": "^19.1.9",
    "typescript": "^5.9.2"
  }
}

raggr-frontend/postcss.config.mjs

@@ -0,0 +1,5 @@
export default {
  plugins: {
    "@tailwindcss/postcss": {},
  },
};

raggr-frontend/rsbuild.config.ts

@@ -0,0 +1,6 @@
import { defineConfig } from '@rsbuild/core';
import { pluginReact } from '@rsbuild/plugin-react';

export default defineConfig({
  plugins: [pluginReact()],
});

raggr-frontend/src/App.css

@@ -0,0 +1,6 @@
@import "tailwindcss";
body {
margin: 0;
font-family: Inter, Avenir, Helvetica, Arial, sans-serif;
}

raggr-frontend/src/App.tsx

@@ -0,0 +1,66 @@
import { useState } from "react";
import axios from "axios";
import ReactMarkdown from "react-markdown";
import "./App.css";

const App = () => {
  const [query, setQuery] = useState<string>("");
  const [answer, setAnswer] = useState<string>("");
  const [loading, setLoading] = useState<boolean>(false);

  const handleQuestionSubmit = () => {
    const payload = { query: query };
    setLoading(true);
    axios
      .post("/api/query", payload)
      .then((result) => setAnswer(result.data.response))
      .finally(() => setLoading(false));
  };

  const handleQueryChange = (event) => {
    setQuery(event.target.value);
  };

  return (
    <div className="flex flex-row justify-center py-4">
      <div className="flex flex-col gap-4 min-w-xl max-w-xl">
        <div className="flex flex-row justify-center gap-2 grow">
          <h1 className="text-3xl">ask simba!</h1>
        </div>
        <div className="flex flex-row justify-between gap-2 grow">
          <textarea
            type="text"
            className="p-4 border border-blue-200 rounded-md grow"
            onChange={handleQueryChange}
          />
        </div>
        <div className="flex flex-row justify-between gap-2 grow">
          <button
            className="p-4 border border-blue-400 bg-blue-200 hover:bg-blue-400 cursor-pointer rounded-md flex-grow"
            onClick={() => handleQuestionSubmit()}
            type="submit"
          >
            Submit
          </button>
        </div>
        {loading ? (
          <div className="flex flex-col w-full animate-pulse gap-2">
            <div className="flex flex-row gap-2 w-full">
              <div className="bg-gray-400 w-1/2 p-3 rounded-lg" />
              <div className="bg-gray-400 w-1/2 p-3 rounded-lg" />
            </div>
            <div className="flex flex-row gap-2 w-full">
              <div className="bg-gray-400 w-1/3 p-3 rounded-lg" />
              <div className="bg-gray-400 w-2/3 p-3 rounded-lg" />
            </div>
          </div>
        ) : (
          <div className="flex flex-col">
            <ReactMarkdown>{answer}</ReactMarkdown>
          </div>
        )}
      </div>
    </div>
  );
};

export default App;

11
raggr-frontend/src/env.d.ts vendored Normal file

@@ -0,0 +1,11 @@
/// <reference types="@rsbuild/core/types" />

/**
 * Imports the SVG file as a React component.
 * @requires [@rsbuild/plugin-svgr](https://npmjs.com/package/@rsbuild/plugin-svgr)
 */
declare module '*.svg?react' {
  import type React from 'react';
  const ReactComponent: React.FunctionComponent<React.SVGProps<SVGSVGElement>>;
  export default ReactComponent;
}

raggr-frontend/src/index.tsx

@@ -0,0 +1,13 @@
import React from 'react';
import ReactDOM from 'react-dom/client';
import App from './App';

const rootEl = document.getElementById('root');
if (rootEl) {
  const root = ReactDOM.createRoot(rootEl);
  root.render(
    <React.StrictMode>
      <App />
    </React.StrictMode>,
  );
}

raggr-frontend/tsconfig.json

@@ -0,0 +1,25 @@
{
  "compilerOptions": {
    "lib": ["DOM", "ES2020"],
    "jsx": "react-jsx",
    "target": "ES2020",
    "noEmit": true,
    "skipLibCheck": true,
    "useDefineForClassFields": true,

    /* modules */
    "module": "ESNext",
    "moduleDetection": "force",
    "moduleResolution": "bundler",
    "verbatimModuleSyntax": true,
    "resolveJsonModule": true,
    "allowImportingTsExtensions": true,
    "noUncheckedSideEffectImports": true,

    /* type checking */
    "strict": true,
    "noUnusedLocals": true,
    "noUnusedParameters": true
  },
  "include": ["src"]
}

1424
raggr-frontend/yarn.lock Normal file

File diff suppressed because it is too large.

7
startup.sh Normal file

@@ -0,0 +1,7 @@
#!/bin/bash

echo "Starting reindex process..."
python main.py "" --reindex

echo "Starting Flask application..."
python app.py

1719
uv.lock generated Normal file

File diff suppressed because it is too large.