Disable tiktoken pre-encoding for custom embedding servers. LangChain
was encoding text into OpenAI token IDs then sending them to llama-server
which has a different vocabulary, causing "invalid tokens" errors.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Indexes chunks one at a time with error logging to identify which
document/chunk causes embedding failures. Also strips Unicode surrogates
and replacement characters.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Strips null bytes, control characters, and excessive whitespace from
document content before sending to embedding models. Fixes 400 errors
from BERT-based tokenizers (e.g. nomic-embed-text) on PDF-extracted text.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Adds EMBEDDING_SERVER_URL and EMBEDDING_MODEL_NAME env vars, mirroring
the existing LLAMA_SERVER_URL pattern for LLM configuration.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Extract logic from god components into custom hooks (useAuthCheck,
useConversations, useChat, usePresignedUrl, useAdminUsers, useOIDCAuth).
Eliminate unnecessary useEffects per React guidelines — scroll is now
imperative, isAdmin comes from useAuthCheck instead of a separate fetch.
ConversationList becomes a pure presentational component. Wrap bubble
components in React.memo.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Shift focus from cat persona to genuine helpfulness. Keep light
cat flavor but prioritize thorough, detailed answers over the
assertive cat act.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
_get_collection_id now catches the UndefinedTable error that occurs
before the first index operation creates the langchain tables.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Consolidate onto PostgreSQL by using pgvector instead of a separate
ChromaDB instance. This removes a Docker volume, a large dependency,
and simplifies the stack without meaningful performance impact at
our document scale.
- Swap langchain-chroma for langchain-postgres (PGVector)
- Use pgvector/pgvector:pg16 Docker image with init script
- Lazy-initialize vector store to avoid eager DB connections
- Add SQL helpers for stats/delete/list (replacing _collection access)
- Remove legacy main.py, chunker, petmd scraper, and /api/query endpoint
Re-index required after deploy (POST /api/rag/index + /index-obsidian).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Remove the useEffect on selectedConversation.id that race-conditions
with handleQuestionSubmit — it fetches the (still-empty) conversation
and wipes messages, sending the user back to the empty state. Refresh
conversation list after streaming completes instead to pick up the
auto-generated title.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Give the LangChain agent a save_user_memory tool so users can ask it to
remember preferences and personal facts. Memories are stored per-user in
a new user_memories table and injected into the system prompt on each
conversation turn.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Conversations are now returned sorted by most recently updated first.
New conversations are named using the first 100 characters of the
user's initial message instead of a username+timestamp placeholder.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Use `claims.get("groups") or []` instead of `claims.get("groups", [])`
so that an explicit `null` value is coerced to an empty list, preventing
a ValueError on the non-nullable ldap_groups field.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Access tokens now last 1 hour (up from default 15 min) and refresh
tokens last 30 days, reducing frequent re-authentication.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Memoize blob URL creation to prevent leak on every keystroke, wrap
MessageInput in React.memo with stable useCallback props, remove
expensive backdrop-blur-sm from chat footer, and use instant scroll
during streaming to avoid queuing smooth scroll animations.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add missing pendingImage, onImageSelect, and onClearImage props to the
MessageInput rendered in the active chat footer, matching the homepage version.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Remove unused image_url from upload response and TS type
- Remove bare except in serve_image that masked config errors as 404s
- Add error state and broken-image placeholder in QuestionBubble
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Browser <img> tags can't attach JWT headers, causing 401s. The image
endpoint now returns a time-limited presigned S3 URL via authenticated
API call, which the frontend fetches and uses directly.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Use the same llama-server (OpenAI-compatible API) for vision analysis
that the main agent uses, with OpenAI fallback. Sends images as base64
in the standard OpenAI vision message format.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Users can now attach images in the web chat for Simba to analyze using
Ollama's gemma3 vision model. Images are stored in Garage (S3-compatible)
and displayed in chat history.
Also fixes aerich migration config by extracting TORTOISE_CONFIG into a
standalone config/db.py module, removing the stale aerich_config.py, and
adding missing MODELS_STATE to migration 3.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The get_transactions() method was truncating results to 50 by default,
causing incomplete transaction data. The YNAB API returns all matching
transactions in a single response, so this limit was unnecessary and
caused count/total inconsistencies.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Adds manifest.json, service worker with static asset caching,
resized cat icons, and meta tags for iOS/Android installability.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Move blueprints.email.helpers import from module-level to inside the
endpoint functions that use it, breaking the circular dependency chain.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Users can now receive a unique email address (ask+<token>@domain) and
interact with Simba by sending emails. Inbound emails hit a Mailgun
webhook, are authenticated via HMAC token lookup, processed through the
LangChain agent, and replied to via the Mailgun API.
- Extract shared SIMBA_SYSTEM_PROMPT to blueprints/conversation/prompts.py
- Add email_enabled and email_hmac_token fields to User model
- Create blueprints/email with webhook, signature validation, rate limiting
- Add admin endpoints to enable/disable email per user
- Update AdminPanel with Email column, toggle, and copy-address button
- Add Mailgun env vars to .env.example
- Include database migration for new fields
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>