Ryan Chen 4892bec986 Add SQLAlchemy ORM with Alembic migrations

- Added SQLAlchemy 2.0 and Alembic 1.13 dependencies
- Created models.py with Channel and VideoEntry ORM models
- Created database.py for database configuration and session management
- Initialized Alembic migration system with initial migration
- Updated feed_parser.py with save_to_db() method for persistence
- Updated main.py with database initialization and new API routes:
  - /api/feed now saves to database by default
  - /api/channels lists all tracked channels
  - /api/history/<channel_id> returns video history
- Updated .gitignore to exclude database files
- Updated CLAUDE.md with comprehensive ORM and migration documentation

Database uses SQLite (yottob.db) with upsert logic to avoid duplicates.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

2025-11-26 13:58:10 -05:00

5.3 KiB

CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Project Overview

yottob is a Flask-based web application that processes YouTube RSS feeds and persists them with the SQLAlchemy ORM. It provides a REST API and a CLI for fetching and parsing YouTube channel feeds, with optional filtering to exclude YouTube Shorts. Fetched feeds are saved to a SQLite database (yottob.db) by default for historical tracking.

Development Setup

This project uses uv for dependency management.

Install dependencies:

uv sync

Activate virtual environment:

source .venv/bin/activate  # On macOS/Linux

Initialize/update database:

# Run migrations to create or update database schema
source .venv/bin/activate && alembic upgrade head

Running the Application

Run the CLI feed parser:

python main.py

This runs main(), which fetches and parses a YouTube channel RSS feed as a quick test.

Run the Flask web application:

flask --app main run

The web server exposes:

/ - Main page (renders index.html)
/api/feed - API endpoint for fetching feeds and saving to database
/api/channels - List all tracked channels
/api/history/<channel_id> - Get video history for a specific channel

API Usage Examples:

# Fetch default channel feed (automatically saves to DB)
curl http://localhost:5000/api/feed

# Fetch specific channel with options
curl "http://localhost:5000/api/feed?channel_id=CHANNEL_ID&filter_shorts=false&save=true"

# List all tracked channels
curl http://localhost:5000/api/channels

# Get video history for a channel (limit 20 videos)
curl "http://localhost:5000/api/history/CHANNEL_ID?limit=20"
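The same request URLs can be assembled from Python. The sketch below uses only the standard library and stops at building the URLs. The helper name build_api_url and the BASE_URL constant are illustrative and not part of the project, and it assumes the development server listens on localhost:5000 as in the curl examples.

```python
from urllib.parse import quote, urlencode

# Assumed address of the Flask development server (matches the curl examples).
BASE_URL = "http://localhost:5000"


def build_api_url(path: str, **params: object) -> str:
    """Return BASE_URL + path, appending any query parameters that are not None."""
    query = urlencode({k: v for k, v in params.items() if v is not None})
    return f"{BASE_URL}{path}?{query}" if query else f"{BASE_URL}{path}"


# Placeholder channel ID, as in the curl examples.
channel_id = "CHANNEL_ID"

feed_url = build_api_url(
    "/api/feed", channel_id=channel_id, filter_shorts="false", save="true"
)
history_url = build_api_url(f"/api/history/{quote(channel_id)}", limit=20)
channels_url = build_api_url("/api/channels")
```

With the server running, any of these URLs can be passed to urllib.request.urlopen() to perform the request.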

Architecture

The codebase is organized into layers, each in its own module, so that persistence, feed parsing, and HTTP routing stay independent of one another:

Database Layer

models.py - SQLAlchemy ORM models

Base: Declarative base for all models
Channel: Stores YouTube channel metadata (channel_id, title, link, last_fetched)
VideoEntry: Stores individual video entries with foreign key to Channel
Relationships: One Channel has many VideoEntry records

database.py - Database configuration and session management

DATABASE_URL: SQLite database location (yottob.db)
engine: SQLAlchemy engine instance
init_db(): Creates all tables
get_db_session(): Context manager for database sessions

Core Logic Layer

feed_parser.py - Reusable YouTube feed parsing module

YouTubeFeedParser: Main parser class that encapsulates channel-specific logic
FeedEntry: In-memory data model for feed entries
fetch_feed(): Fetches and parses RSS feeds
save_to_db(): Persists feed data to database with upsert logic
Independent of Flask - can be imported and used in any Python context

Web Server Layer

main.py - Flask application and routes

app: Flask application instance (main.py:9)
Database initialization on startup (main.py:16)
index(): Homepage route handler (main.py:20)
get_feed(): REST API endpoint (main.py:26) that fetches and saves to DB
get_channels(): Lists all tracked channels (main.py:59)
get_history(): Returns video history for a channel (main.py:86)
main(): CLI entry point for testing (main.py:132)

Templates

templates/index.html - Frontend HTML (currently static placeholder)

Feed Parsing Implementation

The YouTubeFeedParser class in feed_parser.py:

Constructs YouTube RSS feed URLs from channel IDs
Uses feedparser to fetch and parse feeds
Validates HTTP 200 status before processing
Optionally filters out YouTube Shorts (any entry whose URL contains "shorts")
Returns structured dictionary with feed metadata and entries

YouTube RSS Feed URL Format:

https://www.youtube.com/feeds/videos.xml?channel_id={CHANNEL_ID}
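The URL construction and the Shorts filter described above can be sketched without any network access. The function names build_feed_url and filter_shorts, the dictionary shape of an entry, and the sample links are illustrative assumptions; the real logic lives in YouTubeFeedParser and may be structured differently.

```python
from typing import Dict, List

FEED_URL_TEMPLATE = "https://www.youtube.com/feeds/videos.xml?channel_id={channel_id}"


def build_feed_url(channel_id: str) -> str:
    """Build the RSS feed URL for a YouTube channel ID."""
    return FEED_URL_TEMPLATE.format(channel_id=channel_id)


def filter_shorts(entries: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Drop every entry whose link contains the substring "shorts"."""
    return [entry for entry in entries if "shorts" not in entry["link"]]


# Hypothetical entries standing in for a parsed feed.
sample_entries = [
    {"title": "Regular upload", "link": "https://www.youtube.com/watch?v=abc123"},
    {"title": "A short", "link": "https://www.youtube.com/shorts/xyz789"},
]

feed_url = build_feed_url("CHANNEL_ID")
kept = filter_shorts(sample_entries)  # only the "Regular upload" entry remains
```

In the project the entries come from feedparser rather than a hand-written list. A plain substring test like this one would also drop a regular video whose URL happens to contain "shorts", which is a trade-off of keeping the rule this simple.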

Database Migrations

This project uses Alembic for database schema migrations.

Create a new migration after model changes:

source .venv/bin/activate && alembic revision --autogenerate -m "Description of changes"

Apply migrations:

source .venv/bin/activate && alembic upgrade head

View migration history:

source .venv/bin/activate && alembic history

Rollback to previous version:

source .venv/bin/activate && alembic downgrade -1

Migration files location: alembic/versions/

Important notes:

Always review auto-generated migrations before applying
The database is automatically initialized on Flask app startup via init_db()
Migration configuration is in alembic.ini and alembic/env.py
Models are imported in alembic/env.py for autogenerate support

Database Schema

channels table:

id: Primary key
channel_id: YouTube channel ID (unique, indexed)
title: Channel title
link: Channel URL
last_fetched: Timestamp of last feed fetch

video_entries table:

id: Primary key
channel_id: Foreign key to channels.id
title: Video title
link: Video URL (unique)
created_at: Timestamp when video was first recorded
Index: idx_channel_created on (channel_id, created_at) to speed up per-channel history queries ordered by time
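To make the schema and the upsert behavior concrete, the sketch below recreates both tables with the standard library's sqlite3 module instead of SQLAlchemy. The column types, the helper names upsert_channel and upsert_video, and the choice of ON CONFLICT ... DO UPDATE are assumptions; save_to_db() may implement its upsert differently, for example with a query followed by an insert or update. The ON CONFLICT clause needs SQLite 3.24 or newer.

```python
import sqlite3
from datetime import datetime, timezone

SCHEMA = """
CREATE TABLE channels (
    id INTEGER PRIMARY KEY,
    channel_id TEXT NOT NULL UNIQUE,
    title TEXT,
    link TEXT,
    last_fetched TEXT
);
CREATE TABLE video_entries (
    id INTEGER PRIMARY KEY,
    channel_id INTEGER NOT NULL REFERENCES channels(id),
    title TEXT,
    link TEXT NOT NULL UNIQUE,
    created_at TEXT NOT NULL
);
CREATE INDEX idx_channel_created ON video_entries (channel_id, created_at);
"""


def _now() -> str:
    """Current UTC time as an ISO 8601 string."""
    return datetime.now(timezone.utc).isoformat()


def upsert_channel(conn: sqlite3.Connection, channel_id: str, title: str, link: str) -> int:
    """Insert the channel or refresh its metadata; return its primary key."""
    conn.execute(
        """
        INSERT INTO channels (channel_id, title, link, last_fetched)
        VALUES (?, ?, ?, ?)
        ON CONFLICT(channel_id) DO UPDATE SET
            title = excluded.title,
            link = excluded.link,
            last_fetched = excluded.last_fetched
        """,
        (channel_id, title, link, _now()),
    )
    row = conn.execute(
        "SELECT id FROM channels WHERE channel_id = ?", (channel_id,)
    ).fetchone()
    return row[0]


def upsert_video(conn: sqlite3.Connection, channel_pk: int, title: str, link: str) -> None:
    """Insert the video, or update its title if the link is already stored."""
    conn.execute(
        """
        INSERT INTO video_entries (channel_id, title, link, created_at)
        VALUES (?, ?, ?, ?)
        ON CONFLICT(link) DO UPDATE SET title = excluded.title
        """,
        (channel_pk, title, link, _now()),
    )


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
pk = upsert_channel(conn, "CHANNEL_ID", "Example", "https://www.youtube.com/channel/CHANNEL_ID")
upsert_video(conn, pk, "First title", "https://www.youtube.com/watch?v=abc123")
# Same link again: the existing row is updated instead of duplicated.
upsert_video(conn, pk, "Edited title", "https://www.youtube.com/watch?v=abc123")
```

Because created_at is not touched on conflict, it keeps the time the video was first recorded, which matches the column's description above.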

Dependencies

Flask 3.1.2+: Web framework
feedparser 6.0.12+: RSS/Atom feed parsing
SQLAlchemy 2.0.0+: ORM for database operations
Alembic 1.13.0+: Database migration tool
Python 3.14+: Required runtime version
