simbarag

Files

ryan abb06b78e2 Sanitize document text before embedding to fix tokenizer errors

Strips null bytes, control characters, and excessive whitespace from
document content before sending to embedding models. Fixes 400 errors
from BERT-based tokenizers (e.g. nomic-embed-text) on PDF-extracted text.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

2026-05-11 23:35:25 -04:00
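
The sanitization described in the commit above could look roughly like the sketch below. It is illustrative only: the function name `sanitize_text`, the regular expressions, and the choice to cap runs of blank lines at one are assumptions, not the code in this repository.

```python
import re

# Hypothetical helper: the name and the exact rules are assumptions,
# not taken from the repository.

# C0 control characters except tab (\x09), newline (\x0a) and
# carriage return (\x0d), plus DEL (\x7f). This includes the null byte.
_CONTROL_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\x7f]")
_SPACE_RUNS = re.compile(r"[ \t]+")
_BLANK_LINE_RUNS = re.compile(r"\n{3,}")


def sanitize_text(text: str) -> str:
    """Return text with control characters and excess whitespace removed."""
    # Normalize line endings so the later rules only deal with "\n".
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # Drop null bytes and other control characters.
    text = _CONTROL_CHARS.sub("", text)
    # Collapse runs of spaces and tabs into a single space.
    text = _SPACE_RUNS.sub(" ", text)
    # Keep at most one blank line between paragraphs.
    text = _BLANK_LINE_RUNS.sub("\n\n", text)
    return text.strip()
```

A helper like this would be called on each document's text just before the embedding request, so the stored document stays untouched and only the embedding input is cleaned.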

conversation

Improve Simba system prompt for more helpful responses

2026-04-24 08:58:29 -04:00

Merge origin/main: resolve conflicts keeping both email/Mealie and WhatsApp/Mailgun/Obsidian work

2026-04-04 08:19:50 -04:00

rag

Sanitize document text before embedding to fix tokenizer errors

2026-05-11 23:35:25 -04:00

users

Fix OIDC login crash when groups claim is null

2026-04-05 10:12:12 -04:00

Add email channel via Mailgun for Ask Simba

2026-03-13 16:21:18 -04:00

__init__.py

reorganization

2026-01-31 17:13:27 -05:00