
# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Overview

`yottob` is a Flask-based web application that processes YouTube RSS feeds and persists them through the SQLAlchemy ORM. It provides a REST API and a CLI for fetching and parsing YouTube channel feeds, with optional filtering to exclude YouTube Shorts. By default, fetched feeds are saved to a SQLite database for historical tracking.

## Development Setup

This project uses `uv` for dependency management.

**Install dependencies:**
```bash
uv sync
```

**Activate virtual environment:**
```bash
source .venv/bin/activate  # On macOS/Linux
```

**Initialize/update database:**
```bash
# Run migrations to create or update database schema
source .venv/bin/activate && alembic upgrade head
```

## Running the Application

**Run the CLI feed parser:**
```bash
python main.py
```
This executes the `main()` function, which fetches and parses a YouTube channel RSS feed for testing.

**Run the Flask web application:**
```bash
flask --app main run
```
The web server exposes:
- `/` - Main page (renders `index.html`)
- `/api/feed` - API endpoint for fetching feeds and saving to database
- `/api/channels` - List all tracked channels
- `/api/history/<channel_id>` - Get video history for a specific channel

**API Usage Examples:**
```bash
# Fetch default channel feed (automatically saves to DB)
curl http://localhost:5000/api/feed

# Fetch specific channel with options
curl "http://localhost:5000/api/feed?channel_id=CHANNEL_ID&filter_shorts=false&save=true"

# List all tracked channels
curl http://localhost:5000/api/channels

# Get video history for a channel (limit 20 videos)
curl "http://localhost:5000/api/history/CHANNEL_ID?limit=20"
```
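
The same requests can be issued from Python. The following is a minimal sketch using only the standard library. The helper names (`feed_url`, `history_url`, `get_json`) are hypothetical and not part of the repository, and it assumes the endpoints return JSON and that the server listens on Flask's default development address.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen

# Assumption: Flask's default development server address.
BASE_URL = "http://localhost:5000"


def feed_url(channel_id=None, filter_shorts=None, save=None, base_url=BASE_URL):
    """Build an /api/feed URL, omitting any parameter left as None."""
    params = {}
    if channel_id is not None:
        params["channel_id"] = channel_id
    if filter_shorts is not None:
        params["filter_shorts"] = "true" if filter_shorts else "false"
    if save is not None:
        params["save"] = "true" if save else "false"
    query = urlencode(params)
    return f"{base_url}/api/feed" + (f"?{query}" if query else "")


def history_url(channel_id, limit=None, base_url=BASE_URL):
    """Build an /api/history/<channel_id> URL with an optional limit."""
    url = f"{base_url}/api/history/{quote(channel_id, safe='')}"
    if limit is not None:
        url += "?" + urlencode({"limit": limit})
    return url


def get_json(url):
    """Fetch a URL and decode its JSON body (the server must be running)."""
    with urlopen(url) as response:
        return json.load(response)
```

With the Flask server running, `get_json(history_url("CHANNEL_ID", limit=20))` corresponds to the last `curl` command above.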

## Architecture

The codebase is split into layers, each with a single responsibility:

### Database Layer
**`models.py`** - SQLAlchemy ORM models
- `Base`: Declarative base for all models
- `Channel`: Stores YouTube channel metadata (channel_id, title, link, last_fetched)
- `VideoEntry`: Stores individual video entries with foreign key to Channel
- Relationships: One Channel has many VideoEntry records

**`database.py`** - Database configuration and session management
- `DATABASE_URL`: SQLite database location (yottob.db)
- `engine`: SQLAlchemy engine instance
- `init_db()`: Creates all tables
- `get_db_session()`: Context manager for database sessions
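
A session context manager usually commits on success, rolls back on error, and always closes the connection. The sketch below illustrates that pattern with the standard-library `sqlite3` module rather than SQLAlchemy, so that it runs without dependencies. The name `session_scope` is hypothetical, and the commit/rollback behavior is an assumption about what `get_db_session()` does, not a copy of it.

```python
import sqlite3
from contextlib import contextmanager

# Assumption: the same file name that DATABASE_URL points at.
DATABASE_PATH = "yottob.db"


@contextmanager
def session_scope(path=DATABASE_PATH):
    """Yield a connection; commit on success, roll back on any exception."""
    conn = sqlite3.connect(path)
    try:
        yield conn
        conn.commit()
    except Exception:
        conn.rollback()
        raise
    finally:
        # Runs on both paths, so the connection is never leaked.
        conn.close()
```

Callers wrap their work in `with session_scope() as conn:` and never call `commit()` or `close()` themselves, which keeps transaction handling in one place.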

### Core Logic Layer
**`feed_parser.py`** - Reusable YouTube feed parsing module
- `YouTubeFeedParser`: Main parser class that encapsulates channel-specific logic
- `FeedEntry`: In-memory data model for feed entries
- `fetch_feed()`: Fetches and parses RSS feeds
- `save_to_db()`: Persists feed data to database with upsert logic
- Independent of Flask - can be imported and used in any Python context

### Web Server Layer
**`main.py`** - Flask application and routes
- `app`: Flask application instance (main.py:9)
- Database initialization on startup (main.py:16)
- `index()`: Homepage route handler (main.py:20)
- `get_feed()`: REST API endpoint (main.py:26) that fetches and saves to DB
- `get_channels()`: Lists all tracked channels (main.py:59)
- `get_history()`: Returns video history for a channel (main.py:86)
- `main()`: CLI entry point for testing (main.py:132)

### Templates
**`templates/index.html`** - Frontend HTML (currently static placeholder)

## Feed Parsing Implementation

The `YouTubeFeedParser` class in `feed_parser.py`:
- Constructs YouTube RSS feed URLs from channel IDs
- Uses feedparser to fetch and parse feeds
- Validates HTTP 200 status before processing
- Optionally filters out YouTube Shorts (any entry whose URL contains "shorts")
- Returns a structured dictionary with feed metadata and entries
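
The URL construction and Shorts filtering steps above can be sketched in a few lines. This is an illustration, not the code in `feed_parser.py`: the `FeedEntry` fields (`title`, `link`) and the helper names `build_feed_url`, `is_short`, and `filter_entries` are assumptions, and the network fetch through feedparser is left out.

```python
from dataclasses import dataclass
from urllib.parse import urlencode

FEED_BASE = "https://www.youtube.com/feeds/videos.xml"


@dataclass
class FeedEntry:
    """In-memory feed entry; the field names here are assumed."""
    title: str
    link: str


def build_feed_url(channel_id: str) -> str:
    """Return the RSS feed URL for a YouTube channel ID."""
    return f"{FEED_BASE}?{urlencode({'channel_id': channel_id})}"


def is_short(entry: FeedEntry) -> bool:
    """Treat any entry whose URL contains "shorts" as a YouTube Short."""
    return "shorts" in entry.link


def filter_entries(entries, filter_shorts=True):
    """Drop Shorts when filter_shorts is true; otherwise keep every entry."""
    if not filter_shorts:
        return list(entries)
    return [entry for entry in entries if not is_short(entry)]
```

The substring test is deliberately simple. It would also drop a regular video whose URL happens to contain "shorts", a trade-off that follows from the rule as stated above.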

**YouTube RSS Feed URL Format:**
```
https://www.youtube.com/feeds/videos.xml?channel_id={CHANNEL_ID}
```

## Database Migrations

This project uses Alembic for database schema migrations.

**Create a new migration after model changes:**
```bash
source .venv/bin/activate && alembic revision --autogenerate -m "Description of changes"
```

**Apply migrations:**
```bash
source .venv/bin/activate && alembic upgrade head
```

**View migration history:**
```bash
source .venv/bin/activate && alembic history
```

**Rollback to previous version:**
```bash
source .venv/bin/activate && alembic downgrade -1
```

**Migration files location:** `alembic/versions/`

**Important notes:**
- Always review auto-generated migrations before applying
- The database is automatically initialized on Flask app startup via `init_db()`
- Migration configuration is in `alembic.ini` and `alembic/env.py`
- Models are imported in `alembic/env.py` for autogenerate support

## Database Schema

**channels table:**
- `id`: Primary key
- `channel_id`: YouTube channel ID (unique, indexed)
- `title`: Channel title
- `link`: Channel URL
- `last_fetched`: Timestamp of last feed fetch

**video_entries table:**
- `id`: Primary key
- `channel_id`: Foreign key to channels.id
- `title`: Video title
- `link`: Video URL (unique)
- `created_at`: Timestamp when video was first recorded
- Index: `idx_channel_created` on (channel_id, created_at) for fast per-channel, time-ordered queries
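
To make the schema and the upsert behavior concrete, here is a sketch that uses the standard-library `sqlite3` module in place of SQLAlchemy. Column types, nullability, and the function names `save_feed` and `history` are assumptions made for illustration. The project's actual tables are defined by the ORM models and Alembic migrations.

```python
import sqlite3
from datetime import datetime, timezone

# Assumed column types; UNIQUE on channel_id also creates its index.
SCHEMA = """
CREATE TABLE IF NOT EXISTS channels (
    id INTEGER PRIMARY KEY,
    channel_id TEXT UNIQUE,
    title TEXT,
    link TEXT,
    last_fetched TEXT
);
CREATE TABLE IF NOT EXISTS video_entries (
    id INTEGER PRIMARY KEY,
    channel_id INTEGER REFERENCES channels(id),
    title TEXT,
    link TEXT UNIQUE,
    created_at TEXT
);
CREATE INDEX IF NOT EXISTS idx_channel_created
    ON video_entries (channel_id, created_at);
"""


def save_feed(conn, channel_id, title, link, videos):
    """Upsert the channel, then insert only videos whose link is new."""
    now = datetime.now(timezone.utc).isoformat()
    # ON CONFLICT ... DO UPDATE requires SQLite 3.24 or newer.
    conn.execute(
        """
        INSERT INTO channels (channel_id, title, link, last_fetched)
        VALUES (?, ?, ?, ?)
        ON CONFLICT(channel_id) DO UPDATE SET
            title = excluded.title,
            link = excluded.link,
            last_fetched = excluded.last_fetched
        """,
        (channel_id, title, link, now),
    )
    (channel_pk,) = conn.execute(
        "SELECT id FROM channels WHERE channel_id = ?", (channel_id,)
    ).fetchone()
    # The unique link column makes repeated fetches idempotent.
    conn.executemany(
        "INSERT OR IGNORE INTO video_entries "
        "(channel_id, title, link, created_at) VALUES (?, ?, ?, ?)",
        [(channel_pk, v_title, v_link, now) for v_title, v_link in videos],
    )
    conn.commit()
    return channel_pk


def history(conn, channel_id, limit=20):
    """Return the newest (title, link) rows for a YouTube channel ID."""
    return conn.execute(
        """
        SELECT v.title, v.link
        FROM video_entries AS v
        JOIN channels AS c ON c.id = v.channel_id
        WHERE c.channel_id = ?
        ORDER BY v.created_at DESC, v.id DESC
        LIMIT ?
        """,
        (channel_id, limit),
    ).fetchall()
```

Saving the same feed twice updates the channel row and leaves existing video rows untouched, so `created_at` keeps recording when each video was first seen.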

## Dependencies

- **Flask 3.1.2+**: Web framework
- **feedparser 6.0.12+**: RSS/Atom feed parsing
- **SQLAlchemy 2.0.0+**: ORM for database operations
- **Alembic 1.13.0+**: Database migration tool
- **Python 3.14+**: Required runtime version