CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
Project Overview
yottob is a Flask-based web application for processing YouTube RSS feeds with SQLAlchemy ORM persistence and async video downloads. The project provides both a REST API and CLI interface for fetching and parsing YouTube channel feeds, with filtering logic to exclude YouTube Shorts. All fetched feeds are automatically saved to a PostgreSQL database for historical tracking. Videos can be downloaded asynchronously as MP4 files using Celery workers and yt-dlp.
The application is containerized with Docker and uses docker-compose to orchestrate multiple services: PostgreSQL, Redis, Flask web app, and Celery worker.
Quick Start with Docker Compose (Recommended)
Prerequisites:
- Docker and Docker Compose installed
- No additional dependencies needed
Start all services:
# Copy environment variables template
cp .env.example .env
# Start all services (postgres, redis, app, celery)
docker-compose up -d
# View logs
docker-compose logs -f
# Stop all services
docker-compose down
# Stop and remove volumes (deletes database data)
docker-compose down -v
Run database migrations (first time setup or after model changes):
docker-compose exec app alembic upgrade head
Access the application:
- Web API: http://localhost:5000
- PostgreSQL: localhost:5432
- Redis: localhost:6379
Development Setup (Local Without Docker)
This project uses uv for dependency management.
Install dependencies:
uv sync
Activate virtual environment:
source .venv/bin/activate # On macOS/Linux
Set up environment variables:
cp .env.example .env
# Edit .env with your local configuration
Start PostgreSQL (choose one):
# Using Docker
docker run -d -p 5432:5432 \
-e POSTGRES_USER=yottob \
-e POSTGRES_PASSWORD=yottob_password \
-e POSTGRES_DB=yottob \
postgres:16-alpine
# Or use existing PostgreSQL installation
Start Redis:
# macOS with Homebrew
brew services start redis
# Linux
sudo systemctl start redis
# Docker
docker run -d -p 6379:6379 redis:alpine
Initialize/update database:
source .venv/bin/activate && alembic upgrade head
Start Celery worker (required for video downloads):
source .venv/bin/activate && celery -A celery_app worker --loglevel=info
Running the Application
With Docker Compose:
docker-compose up
Local development:
# Run the CLI feed parser
python main.py
# Run the Flask web application
flask --app main run
User Authentication
The application includes a complete user authentication system using Flask-Login and bcrypt:
Authentication Pages:
- /register - User registration with email and password
- /login - User login with "remember me" functionality
- /logout - User logout
Security Features:
- Passwords hashed with bcrypt and salt
- Session-based authentication via Flask-Login
- Protected routes with the @login_required decorator
- User-specific data isolation (multi-tenant architecture)
- Secure password requirements (minimum 8 characters)
- Flash messages for all auth actions
- Redirect to requested page after login
First Time Setup:
- Start the application
- Navigate to http://localhost:5000/register
- Create an account
- Log in and start subscribing to channels
User Data Isolation:
- Each user can only see their own channels and videos
- Channels are scoped by user_id
- All routes filter data by current_user.id
- Users cannot access other users' content
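For illustration, here is a minimal sketch of a login-protected, user-scoped route under these conventions. The model and helper names follow the sections below; the real routes live in main.py and may differ in detail.

```python
# Hypothetical sketch of a user-scoped route; model and helper names follow
# this document, not necessarily the exact code in main.py.
from flask import Flask, render_template
from flask_login import LoginManager, current_user, login_required

from database import get_db_session      # see Database Layer below
from models import Channel, VideoEntry   # see Database Layer below

app = Flask(__name__)                    # lives in main.py in the real project
login_manager = LoginManager(app)
login_manager.login_view = "login"       # where @login_required redirects
# (user_loader callback omitted for brevity)

@app.route("/")
@login_required
def index():
    with get_db_session() as session:
        # Every query is filtered by current_user.id, so a user only ever
        # sees videos belonging to their own channels.
        videos = (
            session.query(VideoEntry)
            .join(Channel)
            .filter(Channel.user_id == current_user.id)
            .order_by(VideoEntry.published_at.desc())
            .all()
        )
        return render_template("dashboard.html", videos=videos)
```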
Frontend Interface
The application includes a full-featured web interface built with Jinja2 templates:
Pages (all require authentication):
- / - Dashboard showing the user's videos sorted by date (newest first)
- /channels - User's channel management page with refresh functionality
- /add-channel - Form to subscribe to new YouTube channels
- /watch/<video_id> - Video player page for watching downloaded videos
Features:
- User registration and login system
- Video grid with thumbnails and metadata
- Real-time download status indicators (pending, downloading, completed, failed)
- Inline video downloads from dashboard
- HTML5 video player for streaming downloaded videos
- Channel subscription and management
- Refresh individual channels to fetch new videos
- Responsive design for mobile and desktop
- User-specific navigation showing username
API Endpoints:
- /api/feed - Fetch YouTube channel feed and save to database (GET)
- /api/channels - List all tracked channels (GET)
- /api/history/<channel_id> - Get video history for a specific channel (GET)
- /api/download/<video_id> - Trigger video download (POST)
- /api/download/status/<video_id> - Check download status (GET)
- /api/download/batch - Batch download multiple videos (POST)
- /api/videos/refresh/<channel_id> - Refresh videos for a channel (POST)
- /api/video/stream/<video_id> - Stream or download video file (GET)
API Usage Examples:
# Fetch default channel feed (automatically saves to DB)
curl http://localhost:5000/api/feed
# Fetch specific channel with options
curl "http://localhost:5000/api/feed?channel_id=CHANNEL_ID&filter_shorts=false&save=true"
# List all tracked channels
curl http://localhost:5000/api/channels
# Get video history for a channel (limit 20 videos)
curl "http://localhost:5000/api/history/CHANNEL_ID?limit=20"
# Trigger download for a specific video
curl -X POST http://localhost:5000/api/download/123
# Check download status
curl http://localhost:5000/api/download/status/123
# Batch download all pending videos for a channel
curl -X POST "http://localhost:5000/api/download/batch?channel_id=CHANNEL_ID&status=pending"
# Batch download specific video IDs
curl -X POST http://localhost:5000/api/download/batch \
-H "Content-Type: application/json" \
-d '{"video_ids": [1, 2, 3, 4, 5]}'
Architecture
The codebase follows a clean layered architecture with separation of concerns:
Database Layer
models.py - SQLAlchemy ORM models
- Base: Declarative base for all models
- User: User model with Flask-Login integration
  - Stores username, email, password_hash, created_at
  - Methods: set_password(), check_password() for bcrypt password handling
  - Implements UserMixin for Flask-Login compatibility
- DownloadStatus: Enum for download states (pending, downloading, completed, failed)
- Channel: Stores YouTube channel metadata per user
  - Fields: user_id, channel_id, title, link, rss_url, last_fetched_at
  - Unique constraint: (user_id, channel_id) - a user can't subscribe to the same channel twice
- VideoEntry: Stores individual video entries with full metadata
  - Fields: video_id, title, video_url, thumbnail_url, description, published_at
  - Download tracking: download_status, download_path, download_started_at, download_completed_at, download_error, file_size
  - Unique constraint: (video_id, channel_id) - prevents duplicate videos
- Relationships:
  - One User has many Channels
  - One Channel has many VideoEntries
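A hedged sketch of how the User model's bcrypt helpers might look; column sizes and options are assumptions, and models.py is the source of truth.

```python
# Illustrative version of the User model's password handling; models.py is
# authoritative, the exact column options here are assumptions.
from datetime import datetime, timezone

import bcrypt
from flask_login import UserMixin
from sqlalchemy import Column, DateTime, Integer, String
from sqlalchemy.orm import declarative_base

Base = declarative_base()

class User(UserMixin, Base):
    __tablename__ = "users"

    id = Column(Integer, primary_key=True)
    username = Column(String(80), unique=True, index=True, nullable=False)
    email = Column(String(255), unique=True, index=True, nullable=False)
    password_hash = Column(String(128), nullable=False)
    created_at = Column(DateTime, default=lambda: datetime.now(timezone.utc))

    def set_password(self, password: str) -> None:
        # bcrypt embeds the salt inside the resulting hash
        self.password_hash = bcrypt.hashpw(
            password.encode("utf-8"), bcrypt.gensalt()
        ).decode("utf-8")

    def check_password(self, password: str) -> bool:
        return bcrypt.checkpw(
            password.encode("utf-8"), self.password_hash.encode("utf-8")
        )
```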
database.py - Database configuration and session management
- DATABASE_URL: Database URL from environment variable (PostgreSQL in production, SQLite fallback for local dev)
- engine: SQLAlchemy engine instance
- init_db(): Creates all tables
- get_db_session(): Context manager for database sessions
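A plausible shape for this module, assuming a standard sessionmaker-based context manager; the real engine options live in database.py.

```python
# Sketch of database.py; engine options and session settings may differ.
import os
from contextlib import contextmanager

from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker

# PostgreSQL in production, SQLite fallback for local development
DATABASE_URL = os.environ.get("DATABASE_URL", "sqlite:///yottob.db")
engine = create_engine(DATABASE_URL)
SessionLocal = sessionmaker(bind=engine)

@contextmanager
def get_db_session():
    """Yield a session, commit on success, roll back on error, always close."""
    session = SessionLocal()
    try:
        yield session
        session.commit()
    except Exception:
        session.rollback()
        raise
    finally:
        session.close()
```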
Async Task Queue Layer
celery_app.py - Celery configuration
- Celery instance configured with Redis broker
- Task serialization and worker configuration
- 1-hour task timeout with automatic retries
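An illustrative configuration matching that description; the option names are standard Celery settings, but the exact values in celery_app.py may differ.

```python
# Illustrative celery_app.py; values here are assumptions based on the
# description above (Redis broker, JSON serialization, 1-hour timeout).
import os

from celery import Celery

celery = Celery(
    "yottob",
    broker=os.environ.get("CELERY_BROKER_URL", "redis://localhost:6379/0"),
    backend=os.environ.get("CELERY_RESULT_BACKEND", "redis://localhost:6379/0"),
)

celery.conf.update(
    task_serializer="json",
    accept_content=["json"],
    result_serializer="json",
    task_time_limit=3600,  # 1-hour hard timeout per task
)
```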
download_service.py - Video download tasks
- download_video(video_id): Celery task to download a single video as MP4
  - Uses yt-dlp with MP4 format preference
  - Updates database with download progress and status
  - Automatic retry on failure (max 3 attempts)
- download_videos_batch(video_ids): Queue multiple downloads
- Downloads are saved to the downloads/ directory
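A condensed sketch of the single-video task under these assumptions: the Celery instance is exposed as celery, the DownloadStatus members are lowercase, and the output template is a guess. The real download_service.py also records timestamps, file size, and per-chunk progress via yt-dlp hooks.

```python
# Condensed, hypothetical sketch of the download task; see download_service.py
# for the real implementation.
import yt_dlp

from celery_app import celery
from database import get_db_session
from models import DownloadStatus, VideoEntry

@celery.task(bind=True, max_retries=3)  # retried up to 3 times on failure
def download_video(self, video_id: int):
    with get_db_session() as session:
        video = session.get(VideoEntry, video_id)
        video.download_status = DownloadStatus.downloading
        url = video.video_url

    ydl_opts = {
        "format": "bestvideo[ext=mp4]+bestaudio[ext=m4a]/best[ext=mp4]/best",
        "outtmpl": "downloads/%(id)s_%(title)s.%(ext)s",
    }
    try:
        with yt_dlp.YoutubeDL(ydl_opts) as ydl:
            ydl.download([url])
    except Exception as exc:
        with get_db_session() as session:
            failed = session.get(VideoEntry, video_id)
            failed.download_status = DownloadStatus.failed
            failed.download_error = str(exc)
        raise self.retry(exc=exc, countdown=60)

    with get_db_session() as session:
        session.get(VideoEntry, video_id).download_status = DownloadStatus.completed
```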
Core Logic Layer
feed_parser.py - Reusable YouTube feed parsing module
- YouTubeFeedParser: Main parser class that encapsulates channel-specific logic
- FeedEntry: In-memory data model for feed entries
- fetch_feed(): Fetches and parses RSS feeds
- save_to_db(): Persists feed data to database with upsert logic
- Independent of Flask - can be imported and used in any Python context
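Because the parser is Flask-independent, it can be used from any script. A usage sketch, assuming the constructor takes a channel ID; the exact signature and return keys may differ from feed_parser.py.

```python
# Hypothetical standalone usage; see feed_parser.py for the real interface.
from feed_parser import YouTubeFeedParser

parser = YouTubeFeedParser("UCxxxxxxxxxxxxxxxxxxxxxx")  # a YouTube channel ID
feed = parser.fetch_feed()   # dict with feed metadata and entries
parser.save_to_db()          # upserts the channel and its videos
print(feed)
```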
Web Server Layer
main.py - Flask application and routes
Frontend Routes:
- index(): Dashboard page with all videos sorted by date (main.py:24)
- channels_page(): Channel management page (main.py:40)
- add_channel_page(): Add channel form and subscription handler (main.py:52)
- watch_video(): Video player page (main.py:94)
API Routes:
- get_feed(): Fetch YouTube feed and save to database (main.py:110)
- get_channels(): List all tracked channels (main.py:145)
- get_history(): Video history for a channel (main.py:172)
- trigger_download(): Queue video download task (main.py:216)
- get_download_status(): Check download status (main.py:258)
- trigger_batch_download(): Queue multiple downloads (main.py:290)
- refresh_channel_videos(): Refresh videos for a channel (main.py:347)
- stream_video(): Stream or download video file (main.py:391)
Frontend Templates
templates/base.html - Base template with navigation and common layout
- Navigation bar with logo and menu
- Flash message display system
- Common styles and responsive design
templates/dashboard.html - Main video listing page
- Video grid sorted by published date (newest first)
- Thumbnail display with download status badges
- Inline download buttons for pending videos
- Empty state for new installations
templates/channels.html - Channel management interface
- List of subscribed channels with metadata
- Refresh button to fetch new videos per channel
- Link to add new channels
- Video count and last updated timestamps
templates/add_channel.html - Channel subscription form
- Form to input YouTube RSS feed URL
- Help section with instructions on finding RSS URLs
- Examples and format guidance
templates/watch.html - Video player page
- HTML5 video player for downloaded videos
- Download status placeholders (downloading, failed, pending)
- Video metadata (title, channel, publish date)
- Download button for pending videos
- Auto-refresh when video is downloading
static/style.css - Application styles
- Dark theme inspired by YouTube
- Responsive grid layout
- Video card components
- Form styling
- Badge and button components
Feed Parsing Implementation
The YouTubeFeedParser class in feed_parser.py:
- Constructs YouTube RSS feed URLs from channel IDs
- Uses feedparser to fetch and parse feeds
- Validates HTTP 200 status before processing
- Optionally filters out YouTube Shorts (any entry with "shorts" in URL)
- Returns structured dictionary with feed metadata and entries
YouTube RSS Feed URL Format:
https://www.youtube.com/feeds/videos.xml?channel_id={CHANNEL_ID}
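For reference, a standalone sketch of the fetch-and-filter rules above using feedparser directly; this is illustrative only, the project's own implementation is YouTubeFeedParser in feed_parser.py.

```python
# Standalone illustration of the parsing rules above; not the project's code.
import feedparser

channel_id = "UCxxxxxxxxxxxxxxxxxxxxxx"  # placeholder channel ID
feed_url = f"https://www.youtube.com/feeds/videos.xml?channel_id={channel_id}"

feed = feedparser.parse(feed_url)
if feed.get("status") != 200:
    raise RuntimeError(f"Feed request failed with status {feed.get('status')}")

# Filter out YouTube Shorts: any entry whose URL contains "shorts"
videos = [entry for entry in feed.entries if "shorts" not in entry.link]
for entry in videos:
    print(entry.title, entry.link)
```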
Database Migrations
This project uses Alembic for database schema migrations.
Create a new migration after model changes:
source .venv/bin/activate && alembic revision --autogenerate -m "Description of changes"
Apply migrations:
source .venv/bin/activate && alembic upgrade head
View migration history:
source .venv/bin/activate && alembic history
Rollback to previous version:
source .venv/bin/activate && alembic downgrade -1
Migration files location: alembic/versions/
Important notes:
- Always review auto-generated migrations before applying
- The database is automatically initialized on Flask app startup via init_db()
- Migration configuration is in alembic.ini and alembic/env.py
- Models are imported in alembic/env.py for autogenerate support
Database Schema
users table:
- id: Primary key
- username: Unique username (indexed)
- email: Unique email address (indexed)
- password_hash: Bcrypt-hashed password
- created_at: Timestamp when user registered
channels table:
- id: Primary key
- user_id: Foreign key to users.id (indexed)
- channel_id: YouTube channel ID (indexed)
- title: Channel title
- link: Channel URL
- rss_url: YouTube RSS feed URL
- last_fetched_at: Timestamp of last feed fetch
- Unique index: idx_user_channel on (user_id, channel_id) - prevents duplicate subscriptions
video_entries table:
- id: Primary key
- channel_id: Foreign key to channels.id
- video_id: YouTube video ID (indexed)
- title: Video title
- video_url: YouTube video URL (indexed)
- thumbnail_url: Video thumbnail URL
- description: Video description
- published_at: When video was published on YouTube (indexed)
- created_at: Timestamp when video was first recorded
- download_status: Enum (pending, downloading, completed, failed)
- download_path: Local file path to downloaded MP4
- download_started_at: When download began
- download_completed_at: When download finished
- download_error: Error message if download failed
- file_size: Size in bytes of downloaded file
- Unique index: idx_video_id_channel on (video_id, channel_id) - prevents duplicates
- Index: idx_channel_created on (channel_id, created_at) for fast queries
- Index: idx_download_status on download_status for filtering
- Index: idx_published_at on published_at for date sorting
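In SQLAlchemy terms, the video_entries indexes map onto __table_args__ roughly like this; only the indexed columns are shown and the column types are assumptions, models.py is authoritative.

```python
# Sketch of how the indexes above could be declared; not the project's code.
from datetime import datetime

from sqlalchemy import Column, DateTime, Index, Integer, String
from sqlalchemy.orm import declarative_base

Base = declarative_base()

class VideoEntry(Base):
    __tablename__ = "video_entries"

    id = Column(Integer, primary_key=True)
    channel_id = Column(Integer, nullable=False)  # FK to channels.id in the real model
    video_id = Column(String(32), nullable=False)
    created_at = Column(DateTime, default=datetime.utcnow)
    download_status = Column(String(16))          # Enum in the real model
    published_at = Column(DateTime)
    # ... remaining columns omitted ...

    __table_args__ = (
        Index("idx_video_id_channel", "video_id", "channel_id", unique=True),
        Index("idx_channel_created", "channel_id", "created_at"),
        Index("idx_download_status", "download_status"),
        Index("idx_published_at", "published_at"),
    )
```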
Video Download System
The application uses Celery with Redis for asynchronous video downloads:
Download Workflow:
- User triggers download via POST /api/download/<video_id>
- VideoEntry status changes to "downloading"
- Celery worker picks up task and uses yt-dlp to download as MP4
- Progress updates written to database
- On completion, status changes to "completed" with file path
- On failure, status changes to "failed" with error message (auto-retry 3x)
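As a client-side illustration of this workflow using the documented endpoints; the JSON field names are assumptions, an authenticated session may be required, and requests is not a project dependency.

```python
# Illustrative polling client for the download workflow; not project code.
import time

import requests

BASE = "http://localhost:5000"
video_pk = 123  # database id of a VideoEntry

requests.post(f"{BASE}/api/download/{video_pk}")  # queue the download

while True:  # poll until the worker finishes or fails
    status = requests.get(f"{BASE}/api/download/status/{video_pk}").json()
    if status.get("download_status") in ("completed", "failed"):
        print(status)
        break
    time.sleep(5)
```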
yt-dlp Configuration:
- Format: bestvideo[ext=mp4]+bestaudio[ext=m4a]/best[ext=mp4]/best
- Output format: MP4 (converted if necessary using FFmpeg)
- Output location: downloads/<video_id>_<title>.mp4
- Progress hooks for real-time status updates
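A sketch of wiring these options and a progress hook into yt-dlp; the hook below just prints, whereas the real hook in download_service.py persists status to the database, and the output template is an assumption.

```python
# Illustrative yt-dlp options with a progress hook; not the project's code.
import yt_dlp

def on_progress(d):
    # d["status"] is "downloading" or "finished" during a download
    if d["status"] == "downloading":
        print(d.get("_percent_str", ""), d.get("_eta_str", ""))

ydl_opts = {
    "format": "bestvideo[ext=mp4]+bestaudio[ext=m4a]/best[ext=mp4]/best",
    "outtmpl": "downloads/%(id)s_%(title)s.%(ext)s",
    "merge_output_format": "mp4",   # FFmpeg remux/merge to MP4 when needed
    "progress_hooks": [on_progress],
}

with yt_dlp.YoutubeDL(ydl_opts) as ydl:
    ydl.download(["https://www.youtube.com/watch?v=VIDEO_ID"])
```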
Requirements:
- Redis server must be running (localhost:6379)
- Celery worker must be running to process downloads
- FFmpeg recommended for format conversion (yt-dlp will use it if available)
Environment Variables
All environment variables can be configured in the .env file (see .env.example for a template):
- DATABASE_URL: PostgreSQL connection string (default: sqlite:///yottob.db for local dev)
- CELERY_BROKER_URL: Redis URL for Celery broker (default: redis://localhost:6379/0)
- CELERY_RESULT_BACKEND: Redis URL for Celery results (default: redis://localhost:6379/0)
- FLASK_ENV: Flask environment (development or production)
- POSTGRES_USER: PostgreSQL username (for docker-compose)
- POSTGRES_PASSWORD: PostgreSQL password (for docker-compose)
- POSTGRES_DB: PostgreSQL database name (for docker-compose)
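At startup these resolve to something like the following; a sketch of the defaults listed above, with the variables read straight from the environment (docker-compose loads .env automatically).

```python
# How the application might read these values; defaults follow the list above.
import os

DATABASE_URL = os.environ.get("DATABASE_URL", "sqlite:///yottob.db")
CELERY_BROKER_URL = os.environ.get("CELERY_BROKER_URL", "redis://localhost:6379/0")
CELERY_RESULT_BACKEND = os.environ.get("CELERY_RESULT_BACKEND", "redis://localhost:6379/0")
FLASK_ENV = os.environ.get("FLASK_ENV", "development")
```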
Docker Compose Services
The application consists of 4 services defined in docker-compose.yml:
- postgres: PostgreSQL 16 database with persistent volume
- redis: Redis 7 message broker for Celery
- app: Flask web application (exposed on port 5000)
- celery: Celery worker for async video downloads
All services have health checks and automatic restarts configured.
Dependencies
- Flask 3.1.2+: Web framework
- Flask-Login 0.6.0+: User session management
- bcrypt 4.0.0+: Password hashing
- feedparser 6.0.12+: RSS/Atom feed parsing
- SQLAlchemy 2.0.0+: ORM for database operations
- psycopg2-binary 2.9.0+: PostgreSQL database driver
- Alembic 1.13.0+: Database migration tool
- Celery 5.3.0+: Distributed task queue for async jobs
- Redis 5.0.0+: Message broker for Celery
- yt-dlp 2024.0.0+: YouTube video downloader
- Python 3.14+: Required runtime version