Replace Ollama with llama-server (OpenAI-compatible API)

- Update llm.py to use OpenAI client with custom base_url for llama-server - Update agents.py to use ChatOpenAI instead of ChatOllama - Remove unused ollama imports from main.py, chunker.py, query.py - Add LLAMA_SERVER_URL and LLAMA_MODEL_NAME env vars - Remove ollama and langchain-ollama dependencies Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-31 21:39:23 -05:00
parent 713a058c4f
commit 32020a6c60
7 changed files with 35 additions and 71 deletions
@@ -14,9 +14,10 @@ JWT_SECRET_KEY=your-secret-key-here
 PAPERLESS_TOKEN=your-paperless-token
 BASE_URL=192.168.1.5:8000

-# Ollama Configuration
-OLLAMA_URL=http://192.168.1.14:11434
-OLLAMA_HOST=http://192.168.1.14:11434
+# llama-server Configuration (OpenAI-compatible API)
+# If set, uses llama-server as the primary LLM backend with OpenAI as fallback
+LLAMA_SERVER_URL=http://192.168.1.213:8080/v1
+LLAMA_MODEL_NAME=llama-3.1-8b-instruct

 # ChromaDB Configuration
 # For Docker: This is automatically set to /app/data/chromadb