simbarag/.planning/codebase/STRUCTURE.md
Ryan Chen b0b02d24f4 docs: map existing codebase
- STACK.md - Technologies and dependencies
- ARCHITECTURE.md - System design and patterns
- STRUCTURE.md - Directory layout
- CONVENTIONS.md - Code style and patterns
- TESTING.md - Test structure
- INTEGRATIONS.md - External services
- CONCERNS.md - Technical debt and issues
2026-02-04 16:53:27 -05:00


# Codebase Structure
**Analysis Date:** 2026-02-04
## Directory Layout
```
raggr/
├── blueprints/ # API route modules (Quart blueprints)
│ ├── conversation/ # Chat conversation endpoints and logic
│ ├── rag/ # Document indexing and retrieval endpoints
│ └── users/ # Authentication and user management
├── config/ # Configuration modules
├── utils/ # Reusable service clients and utilities
├── scripts/ # Administrative CLI scripts
├── migrations/ # Database schema migrations (Aerich)
├── raggr-frontend/ # React SPA frontend
│ ├── src/
│ │ ├── components/ # React UI components
│ │ ├── api/ # Frontend API service clients
│ │ ├── contexts/ # React contexts (Auth)
│ │ └── assets/ # Static images
│ └── dist/ # Built frontend (served by backend)
├── chroma_db/ # ChromaDB persistent vector store
├── chromadb/ # Alternate ChromaDB path (legacy)
├── docs/ # Documentation files
├── app.py # Quart application entry point
├── main.py # RAG logic and CLI entry point
├── llm.py # LLM client with provider fallback
└── aerich_config.py # Database migration configuration
```
## Directory Purposes
**blueprints/**
- Purpose: API route organization using Quart blueprint pattern
- Contains: Python packages with `__init__.py` (routes), `models.py` (ORM), `logic.py` (business logic)
- Key files: `conversation/__init__.py` (chat API), `rag/__init__.py` (indexing API), `users/__init__.py` (auth API)
**blueprints/conversation/**
- Purpose: Chat conversation management
- Contains: Streaming chat endpoints, message persistence, conversation CRUD, agent orchestration
- Key files: `__init__.py` (endpoints), `agents.py` (LangChain agent + tools), `logic.py` (conversation operations), `models.py` (Conversation, ConversationMessage)
**blueprints/rag/**
- Purpose: Document indexing and vector search
- Contains: Admin-only indexing endpoints, vector store operations, Paperless-NGX integration
- Key files: `__init__.py` (endpoints), `logic.py` (indexing + query), `fetchers.py` (Paperless client)
**blueprints/users/**
- Purpose: User authentication and authorization
- Contains: OIDC login flow, JWT token management, RBAC decorators
- Key files: `__init__.py` (auth endpoints), `models.py` (User model), `decorators.py` (@admin_required), `oidc_service.py` (user provisioning)
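
The RBAC decorator pattern mentioned above can be sketched roughly as follows. This is a minimal illustration, not the project's `decorators.py`: the real `@admin_required` presumably resolves the user from the JWT on the incoming request, whereas this sketch takes the user as an explicit keyword argument. `CurrentUser`, `Forbidden`, and `reindex_documents` are hypothetical names introduced for the example.

```python
import asyncio
import functools
from dataclasses import dataclass


class Forbidden(Exception):
    """Raised when the calling user lacks the admin role (hypothetical)."""


@dataclass
class CurrentUser:
    # Hypothetical stand-in for the project's User model.
    username: str
    is_admin: bool = False


def admin_required(handler):
    """Wrap an async handler so it only runs for admin users."""

    @functools.wraps(handler)
    async def wrapper(*args, user: CurrentUser, **kwargs):
        if not user.is_admin:
            raise Forbidden(f"{user.username} is not an admin")
        return await handler(*args, user=user, **kwargs)

    return wrapper


@admin_required
async def reindex_documents(*, user: CurrentUser) -> str:
    # Placeholder body; a real handler would trigger indexing here.
    return f"reindex started by {user.username}"


# An admin passes the check; a non-admin would raise Forbidden instead.
result = asyncio.run(reindex_documents(user=CurrentUser("alice", is_admin=True)))
```

Keeping the check in a decorator means each admin-only route states its requirement in one line, rather than repeating the role test inside every handler.
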
**config/**
- Purpose: Configuration modules for external integrations
- Contains: OIDC configuration with JWKS verification
- Key files: `oidc_config.py`
**utils/**
- Purpose: Reusable utilities and external service clients
- Contains: Chunking, cleaning, API clients for YNAB/Mealie/Paperless
- Key files: `chunker.py`, `cleaner.py`, `ynab_service.py`, `mealie_service.py`, `request.py` (Paperless client), `image_process.py`
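
To make the chunking step concrete, here is a minimal sketch of fixed-size chunking with overlap. It is an assumption about the general technique, not the contents of `chunker.py`; the function name `chunk_text` and its parameters are hypothetical, and the real module may split on tokens or sentences rather than characters.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into character windows that overlap by `overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")

    step = chunk_size - overlap
    chunks: list[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window reaches the end, to avoid a redundant tail chunk.
        if start + chunk_size >= len(text):
            break
    return chunks


chunks = chunk_text("abcdefghij", chunk_size=4, overlap=2)
```

The overlap lets text that straddles a chunk boundary appear whole in at least one chunk, at the cost of indexing some characters twice.
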
**scripts/**
- Purpose: Administrative and maintenance CLI tools
- Contains: User management, statistics, vector store inspection
- Key files: `add_user.py`, `user_message_stats.py`, `manage_vectorstore.py`, `inspect_vector_store.py`, `query.py`
**migrations/**
- Purpose: Database schema version control (Aerich/Tortoise ORM)
- Contains: SQL migration files generated by `aerich migrate`
- Generated: Yes
- Committed: Yes
**raggr-frontend/**
- Purpose: React single-page application
- Contains: React 19 components, Rsbuild bundler config, Tailwind CSS, TypeScript
- Key files: `src/App.tsx` (root), `src/index.tsx` (entry), `src/components/ChatScreen.tsx` (main UI)
**raggr-frontend/src/components/**
- Purpose: React UI components
- Contains: Chat interface, login, conversation list, message bubbles
- Key files: `ChatScreen.tsx`, `LoginScreen.tsx`, `ConversationList.tsx`, `AnswerBubble.tsx`, `QuestionBubble.tsx`, `MessageInput.tsx`
**raggr-frontend/src/api/**
- Purpose: Frontend service layer for API communication
- Contains: TypeScript service clients with axios/fetch
- Key files: `conversationService.ts` (SSE streaming), `userService.ts`, `oidcService.ts`
**raggr-frontend/src/contexts/**
- Purpose: React contexts for global state
- Contains: Authentication context
- Key files: `AuthContext.tsx`
**raggr-frontend/dist/**
- Purpose: Built frontend assets served by backend
- Contains: Bundled JS, CSS, HTML
- Generated: Yes (by Rsbuild)
- Committed: No
**chroma_db/** and **chromadb/**
- Purpose: ChromaDB persistent vector store data
- Contains: SQLite database files and vector indices
- Generated: Yes (at runtime)
- Committed: No
**docs/**
- Purpose: Project documentation
- Contains: Integration documentation, technical specs
- Key files: `ynab_integration/`
## Key File Locations
**Entry Points:**
- `app.py`: Web server entry point (Quart application)
- `main.py`: CLI entry point for RAG operations
- `raggr-frontend/src/index.tsx`: Frontend entry point
**Configuration:**
- `.env`: Environment variables (not committed, see `.env.example`)
- `aerich_config.py`: Database migration configuration
- `config/oidc_config.py`: OIDC authentication configuration
- `raggr-frontend/rsbuild.config.ts`: Frontend build configuration
**Core Logic:**
- `blueprints/conversation/agents.py`: LangChain agent with tool definitions
- `blueprints/rag/logic.py`: Vector store indexing and query operations
- `main.py`: Original RAG implementation (legacy, partially superseded by blueprints)
- `llm.py`: LLM client abstraction with fallback logic
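
The fallback logic attributed to `llm.py` can be pictured with a sketch like the one below. It assumes each provider is a plain callable that takes a prompt and returns text; the names `complete_with_fallback`, `AllProvidersFailed`, and the provider labels are hypothetical and do not reflect the actual `LLMClient` interface.

```python
from typing import Callable, Sequence

# A provider is any callable that maps a prompt to a completion (assumption).
Provider = Callable[[str], str]


class AllProvidersFailed(RuntimeError):
    """Raised when every configured provider raised an error."""


def complete_with_fallback(
    prompt: str, providers: Sequence[tuple[str, Provider]]
) -> tuple[str, str]:
    """Try providers in order; return (provider_name, completion) from the first success."""
    errors: list[str] = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # broad on purpose: any failure triggers fallback
            errors.append(f"{name}: {exc}")
    raise AllProvidersFailed("; ".join(errors))


def unreachable_local(prompt: str) -> str:
    # Simulates a local server that is down.
    raise ConnectionError("local server unreachable")


def hosted_echo(prompt: str) -> str:
    # Simulates a hosted provider that answers.
    return prompt.upper()


used, answer = complete_with_fallback(
    "hello", [("local", unreachable_local), ("hosted", hosted_echo)]
)
```

Collecting the per-provider errors before raising keeps the failure message useful when every provider is down.
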
**Testing:**
- Not detected (no test files found)
## Naming Conventions
**Files:**
- snake_case for Python modules: `ynab_service.py`, `oidc_config.py`
- PascalCase for React components: `ChatScreen.tsx`, `AnswerBubble.tsx`
- Lowercase for config files: `docker-compose.yml`, `pyproject.toml`
**Directories:**
- Lowercase with underscores for Python packages: `blueprints/conversation/`, `utils/`
- Kebab-case for frontend: `raggr-frontend/`
**Python Classes:**
- PascalCase: `User`, `Conversation`, `ConversationMessage`, `LLMClient`, `YNABService`
**Python Functions:**
- snake_case: `get_conversation_by_id`, `query_vector_store`, `add_message_to_conversation`
**React Components:**
- PascalCase: `ChatScreen`, `LoginScreen`, `ConversationList`
**API Routes:**
- Kebab-case, all lowercase (the segments shown here are single words): `/api/conversation/query`, `/api/user/oidc/callback`
**Environment Variables:**
- SCREAMING_SNAKE_CASE: `DATABASE_URL`, `YNAB_ACCESS_TOKEN`, `LLAMA_SERVER_URL`
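
The conventions above can be expressed as regular expressions, which may be handy for a lint script. This is a rough sketch under assumed pattern definitions; the `PATTERNS` table and `matches` helper are hypothetical, and the PascalCase pattern is deliberately permissive so that acronym-heavy names such as `LLMClient` pass.

```python
import re

# One pattern per naming style listed in this section (assumed definitions).
PATTERNS = {
    "snake_case": re.compile(r"[a-z][a-z0-9]*(_[a-z0-9]+)*"),
    "PascalCase": re.compile(r"[A-Z][a-zA-Z0-9]*"),
    "kebab-case": re.compile(r"[a-z][a-z0-9]*(-[a-z0-9]+)*"),
    "SCREAMING_SNAKE_CASE": re.compile(r"[A-Z][A-Z0-9]*(_[A-Z0-9]+)*"),
}


def matches(style: str, name: str) -> bool:
    """Return True if `name` fully matches the pattern registered for `style`."""
    return PATTERNS[style].fullmatch(name) is not None


# Names taken from the examples in this section.
checks = [
    matches("snake_case", "ynab_service"),
    matches("PascalCase", "ChatScreen"),
    matches("kebab-case", "raggr-frontend"),
    matches("SCREAMING_SNAKE_CASE", "DATABASE_URL"),
]
```
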
## Where to Add New Code
**New API Endpoint:**
- Primary code: Create or extend blueprint in `blueprints/<domain>/__init__.py`
- Business logic: Add functions to `blueprints/<domain>/logic.py`
- Database models: Add to `blueprints/<domain>/models.py`
- Tests: Not established (no test directory exists)
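
As an illustration of the file layout a new domain would follow, the sketch below computes and creates the three files named above for a blueprint package. It is a hypothetical helper, not a script that exists in `scripts/`; `blueprint_paths`, `scaffold_blueprint`, and the `notes` domain are invented for the example.

```python
from pathlib import Path

# Files each blueprint package contains, per the layout described in this document.
BLUEPRINT_FILES = ("__init__.py", "logic.py", "models.py")


def blueprint_paths(root: Path, domain: str) -> list[Path]:
    """Return the file paths a blueprint for `domain` would occupy under `root`."""
    if not (domain.isidentifier() and domain.islower()):
        raise ValueError(f"domain must be a lowercase Python identifier: {domain!r}")
    package = root / "blueprints" / domain
    return [package / name for name in BLUEPRINT_FILES]


def scaffold_blueprint(root: Path, domain: str) -> list[Path]:
    """Create any missing blueprint files as empty files; return the ones created."""
    created: list[Path] = []
    for path in blueprint_paths(root, domain):
        path.parent.mkdir(parents=True, exist_ok=True)
        if not path.exists():
            path.touch()
            created.append(path)
    return created


# Paths for a hypothetical "notes" domain, relative to the repository root.
planned = [p.as_posix() for p in blueprint_paths(Path("."), "notes")]
```

Existing files are left untouched, so the helper is safe to re-run when extending a blueprint that already exists.
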
**New LangChain Tool:**
- Implementation: Add `@tool` decorated function in `blueprints/conversation/agents.py`
- Service client: If calling external API, create client in `utils/<service>_service.py`
- Add to tools list: Append to `tools` list at bottom of `agents.py` (line 709 onward as of this analysis)
**New External Service Integration:**
- Service client: Create `utils/<service>_service.py` with async methods
- Tool wrapper: Add tool function in `blueprints/conversation/agents.py`
- Configuration: Add env vars to `.env.example`
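
A service client with async methods might look roughly like the sketch below. Everything here is an assumption for illustration: `ExampleService`, the `EXAMPLE_BASE_URL` and `EXAMPLE_ACCESS_TOKEN` variables, the `/items` path, and the injected `transport` callable are hypothetical and are not taken from `ynab_service.py` or `mealie_service.py`. Injecting the transport keeps the sketch runnable without network access.

```python
import asyncio
import os
from typing import Any, Awaitable, Callable

# Async callable that performs a GET and returns decoded JSON (assumption).
Transport = Callable[[str, dict[str, str]], Awaitable[dict[str, Any]]]


class ExampleService:
    """Hypothetical client for an external API, configured from env vars."""

    def __init__(self, base_url: str, token: str, transport: Transport) -> None:
        self.base_url = base_url.rstrip("/")
        self._token = token
        self._transport = transport

    @classmethod
    def from_env(cls, transport: Transport) -> "ExampleService":
        # Variable names are placeholders; real ones would be listed in .env.example.
        try:
            base_url = os.environ["EXAMPLE_BASE_URL"]
            token = os.environ["EXAMPLE_ACCESS_TOKEN"]
        except KeyError as exc:
            raise RuntimeError(f"missing environment variable: {exc.args[0]}") from exc
        return cls(base_url, token, transport)

    async def list_items(self) -> list[Any]:
        headers = {"Authorization": f"Bearer {self._token}"}
        payload = await self._transport(f"{self.base_url}/items", headers)
        return payload.get("items", [])


async def fake_transport(url: str, headers: dict[str, str]) -> dict[str, Any]:
    # Canned response that echoes the requested URL instead of calling a server.
    return {"items": [url]}


service = ExampleService("https://example.invalid/api/", "secret", fake_transport)
items = asyncio.run(service.list_items())
```

A tool function in `agents.py` would then await methods like `list_items` and format the result for the agent.
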
**New React Component:**
- Component file: `raggr-frontend/src/components/<ComponentName>.tsx`
- API service: If the component needs backend access, add methods to `raggr-frontend/src/api/<domain>Service.ts`
- Import in: `raggr-frontend/src/App.tsx` or parent component
**New Database Table:**
- Model: Add Tortoise model to `blueprints/<domain>/models.py`
- Migration: Run `docker compose -f docker-compose.dev.yml exec raggr aerich migrate --name <description>`
- Apply: Run `docker compose -f docker-compose.dev.yml exec raggr aerich upgrade` (or restart container)
**Utilities:**
- Shared helpers: `utils/<utility_name>.py` for Python utilities
- Frontend utilities: `raggr-frontend/src/utils/` (does not exist yet; create it when first needed)
## Special Directories
**.git/**
- Purpose: Git version control metadata
- Generated: Yes
- Committed: No (managed by Git itself)
**.venv/**
- Purpose: Python virtual environment
- Generated: Yes (local dev only)
- Committed: No
**node_modules/**
- Purpose: NPM dependencies for frontend
- Generated: Yes (npm/yarn install)
- Committed: No
**__pycache__/**
- Purpose: Python bytecode cache
- Generated: Yes (Python runtime)
- Committed: No
**.planning/**
- Purpose: GSD (Get Stuff Done) codebase documentation
- Generated: Yes (by GSD commands)
- Committed: Yes (intended for project documentation)
**.claude/**
- Purpose: Claude Code session data
- Generated: Yes
- Committed: No
**.ruff_cache/**
- Purpose: Ruff linter cache
- Generated: Yes
- Committed: No
**.ropeproject/**
- Purpose: Rope Python refactoring library cache
- Generated: Yes
- Committed: No
---
*Structure analysis: 2026-02-04*