Replace Ollama with llama-server (OpenAI-compatible API)

- Update llm.py to use the OpenAI client with a custom base_url for llama-server (see the sketch below)
- Update agents.py to use ChatOpenAI instead of ChatOllama
- Remove unused ollama imports from main.py, chunker.py, query.py
- Add LLAMA_SERVER_URL and LLAMA_MODEL_NAME env vars
- Remove ollama and langchain-ollama dependencies

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
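
A minimal sketch of what the resulting llm.py configuration could look like. Only the LLAMA_SERVER_URL and LLAMA_MODEL_NAME variable names come from this commit; the fallback URL (llama-server's default port 8080 with its OpenAI-compatible /v1 endpoint), the fallback model name, and the placeholder API key are assumptions:

import os

from openai import OpenAI

# llama-server ignores the API key, but the OpenAI client requires a non-empty one.
client = OpenAI(
    base_url=os.getenv("LLAMA_SERVER_URL", "http://localhost:8080/v1"),  # assumed default
    api_key="not-needed",  # placeholder; llama-server does not validate it
)
model_name = os.getenv("LLAMA_MODEL_NAME", "local-model")  # assumed fallback

# Smoke-test the OpenAI-compatible chat endpoint.
response = client.chat.completions.create(
    model=model_name,
    messages=[{"role": "user", "content": "Say hello."}],
)
print(response.choices[0].message.content)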
commit 32020a6c60
parent 713a058c4f
Date: 2026-01-31 21:39:23 -05:00

7 changed files with 35 additions and 71 deletions


@@ -1,18 +1,11 @@
 import json
 import os
 from typing import Literal
 import datetime
-from ollama import Client
+from openai import OpenAI
 from pydantic import BaseModel, Field
-# Configure ollama client with URL from environment or default to localhost
-ollama_client = Client(
-    host=os.getenv("OLLAMA_URL", "http://localhost:11434"), timeout=10.0
-)
 # This uses inferred filters — which means using LLM to create the metadata filters
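
The diff above covers only llm.py; for agents.py, the ChatOllama-to-ChatOpenAI swap might look like the following. This is a hedged sketch under the same assumptions as above (defaults and placeholder key are not from the commit), not the commit's exact code:

import os

from langchain_openai import ChatOpenAI

# ChatOpenAI pointed at llama-server's OpenAI-compatible endpoint;
# env var names are from the commit message, defaults are assumptions.
llm = ChatOpenAI(
    model=os.getenv("LLAMA_MODEL_NAME", "local-model"),
    base_url=os.getenv("LLAMA_SERVER_URL", "http://localhost:8080/v1"),
    api_key="not-needed",
)
print(llm.invoke("ping").content)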