diff --git a/.env.example b/.env.example
new file mode 100644
index 0000000..b38ce18
--- /dev/null
+++ b/.env.example
@@ -0,0 +1,14 @@
+# Database Configuration
+DATABASE_URL=postgresql://yottob:yottob_password@postgres:5432/yottob
+
+# Celery Configuration
+CELERY_BROKER_URL=redis://redis:6379/0
+CELERY_RESULT_BACKEND=redis://redis:6379/0
+
+# Flask Configuration
+FLASK_ENV=development
+
+# PostgreSQL Configuration (for docker-compose)
+POSTGRES_USER=yottob
+POSTGRES_PASSWORD=yottob_password
+POSTGRES_DB=yottob
diff --git a/.gitignore b/.gitignore
index 3ae621d..87e888b 100644
--- a/.gitignore
+++ b/.gitignore
@@ -17,3 +17,9 @@ wheels/
 
 # Downloaded videos
 downloads/
+
+# Environment variables
+.env
+
+# Docker
+.dockerignore
diff --git a/CLAUDE.md b/CLAUDE.md
index dbef353..86425f4 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -4,9 +4,45 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co
 
 ## Project Overview
 
-`yottob` is a Flask-based web application for processing YouTube RSS feeds with SQLAlchemy ORM persistence and async video downloads. The project provides both a REST API and CLI interface for fetching and parsing YouTube channel feeds, with filtering logic to exclude YouTube Shorts. All fetched feeds are automatically saved to a SQLite database for historical tracking. Videos can be downloaded asynchronously as MP4 files using Celery workers and yt-dlp.
+`yottob` is a Flask-based web application for processing YouTube RSS feeds with SQLAlchemy ORM persistence and async video downloads. The project provides both a REST API and CLI interface for fetching and parsing YouTube channel feeds, with filtering logic to exclude YouTube Shorts. All fetched feeds are automatically saved to a PostgreSQL database for historical tracking. Videos can be downloaded asynchronously as MP4 files using Celery workers and yt-dlp.
 
-## Development Setup
+The application is containerized with Docker and uses docker-compose to orchestrate multiple services: PostgreSQL, Redis, Flask web app, and Celery worker.
+
+## Quick Start with Docker Compose (Recommended)
+
+**Prerequisites:**
+- Docker and Docker Compose installed
+- No additional dependencies needed
+
+**Start all services:**
+```bash
+# Copy environment variables template
+cp .env.example .env
+
+# Start all services (postgres, redis, app, celery)
+docker-compose up -d
+
+# View logs
+docker-compose logs -f
+
+# Stop all services
+docker-compose down
+
+# Stop and remove volumes (deletes database data)
+docker-compose down -v
+```
+
+**Run database migrations (first time setup or after model changes):**
+```bash
+docker-compose exec app alembic upgrade head
+```
+
+**Access the application:**
+- Web API: http://localhost:5000
+- PostgreSQL: localhost:5432
+- Redis: localhost:6379
+
+## Development Setup (Local Without Docker)
 
 This project uses `uv` for dependency management.
 
@@ -20,13 +56,25 @@ uv sync
 source .venv/bin/activate # On macOS/Linux
 ```
 
-**Initialize/update database:**
+**Set up environment variables:**
 ```bash
-# Run migrations to create or update database schema
-source .venv/bin/activate && alembic upgrade head
+cp .env.example .env
+# Edit .env with your local configuration
 ```
 
-**Start Redis (required for Celery):**
+**Start PostgreSQL (choose one):**
+```bash
+# Using Docker
+docker run -d -p 5432:5432 \
+  -e POSTGRES_USER=yottob \
+  -e POSTGRES_PASSWORD=yottob_password \
+  -e POSTGRES_DB=yottob \
+  postgres:16-alpine
+
+# Or use existing PostgreSQL installation
+```
+
+**Start Redis:**
 ```bash
 # macOS with Homebrew
 brew services start redis
@@ -36,9 +84,11 @@ sudo systemctl start redis
 
 # Docker
 docker run -d -p 6379:6379 redis:alpine
+```
 
-# Verify Redis is running
-redis-cli ping  # Should return "PONG"
+**Initialize/update database:**
+```bash
+source .venv/bin/activate && alembic upgrade head
 ```
 
 **Start Celery worker (required for video downloads):**
@@ -48,14 +98,17 @@ source .venv/bin/activate && celery -A celery_app worker --loglevel=info
 
 ## Running the Application
 
-**Run the CLI feed parser:**
+**With Docker Compose:**
 ```bash
-python main.py
+docker-compose up
 ```
-This executes the `main()` function which fetches and parses a YouTube channel RSS feed for testing.
 
-**Run the Flask web application:**
+**Local development:**
 ```bash
+# Run the CLI feed parser
+python main.py
+
+# Run the Flask web application
 flask --app main run
 ```
 The web server exposes:
@@ -110,7 +163,7 @@ The codebase follows a clean layered architecture with separation of concerns:
 - Relationships: One Channel has many VideoEntry records
 
 **`database.py`** - Database configuration and session management
-- `DATABASE_URL`: SQLite database location (yottob.db)
+- `DATABASE_URL`: Database URL from environment variable (PostgreSQL in production, SQLite fallback for local dev)
 - `engine`: SQLAlchemy engine instance
 - `init_db()`: Creates all tables
 - `get_db_session()`: Context manager for database sessions
@@ -246,11 +299,35 @@ The application uses Celery with Redis for asynchronous video downloads:
 - Celery worker must be running to process downloads
 - FFmpeg recommended for format conversion (yt-dlp will use it if available)
 
+## Environment Variables
+
+All environment variables can be configured in `.env` file (see `.env.example` for template):
+
+- `DATABASE_URL`: PostgreSQL connection string (default: `sqlite:///yottob.db` for local dev)
+- `CELERY_BROKER_URL`: Redis URL for Celery broker (default: `redis://localhost:6379/0`)
+- `CELERY_RESULT_BACKEND`: Redis URL for Celery results (default: `redis://localhost:6379/0`)
+- `FLASK_ENV`: Flask environment (development or production)
+- `POSTGRES_USER`: PostgreSQL username (for docker-compose)
+- `POSTGRES_PASSWORD`: PostgreSQL password (for docker-compose)
+- `POSTGRES_DB`: PostgreSQL database name (for docker-compose)
+
+## Docker Compose Services
+
+The application consists of 4 services defined in `docker-compose.yml`:
+
+1. **postgres**: PostgreSQL 16 database with persistent volume
+2. **redis**: Redis 7 message broker for Celery
+3. **app**: Flask web application (exposed on port 5000)
+4. **celery**: Celery worker for async video downloads
+
+All services have health checks and automatic restarts configured.
+
 ## Dependencies
 
 - **Flask 3.1.2+**: Web framework
 - **feedparser 6.0.12+**: RSS/Atom feed parsing
 - **SQLAlchemy 2.0.0+**: ORM for database operations
+- **psycopg2-binary 2.9.0+**: PostgreSQL database driver
 - **Alembic 1.13.0+**: Database migration tool
 - **Celery 5.3.0+**: Distributed task queue for async jobs
 - **Redis 5.0.0+**: Message broker for Celery
diff --git a/Dockerfile b/Dockerfile
new file mode 100644
index 0000000..54d60d1
--- /dev/null
+++ b/Dockerfile
@@ -0,0 +1,33 @@
+FROM python:3.14-slim
+
+# Install system dependencies
+RUN apt-get update && apt-get install -y \
+    ffmpeg \
+    postgresql-client \
+    curl \
+    && rm -rf /var/lib/apt/lists/*
+
+# Install uv for faster Python package management
+RUN curl -LsSf https://astral.sh/uv/install.sh | sh
+ENV PATH="/root/.cargo/bin:$PATH"
+
+# Set working directory
+WORKDIR /app
+
+# Copy dependency files
+COPY pyproject.toml uv.lock ./
+
+# Install Python dependencies
+RUN uv sync --frozen
+
+# Copy application code
+COPY . .
+
+# Create downloads directory
+RUN mkdir -p downloads
+
+# Expose Flask port
+EXPOSE 5000
+
+# Default command (can be overridden in docker-compose)
+CMD ["flask", "--app", "main", "run", "--host=0.0.0.0"]
diff --git a/celery_app.py b/celery_app.py
index 985dcc8..69fcb2b 100644
--- a/celery_app.py
+++ b/celery_app.py
@@ -1,12 +1,17 @@
 """Celery application configuration."""
 
+import os
 from celery import Celery
 
+# Get configuration from environment variables
+CELERY_BROKER_URL = os.getenv("CELERY_BROKER_URL", "redis://localhost:6379/0")
+CELERY_RESULT_BACKEND = os.getenv("CELERY_RESULT_BACKEND", "redis://localhost:6379/0")
+
 # Configure Celery
 celery_app = Celery(
     "yottob",
-    broker="redis://localhost:6379/0",
-    backend="redis://localhost:6379/0",
+    broker=CELERY_BROKER_URL,
+    backend=CELERY_RESULT_BACKEND,
     include=["download_service"]
 )
 
diff --git a/database.py b/database.py
index 3129d61..33665f9 100644
--- a/database.py
+++ b/database.py
@@ -1,5 +1,6 @@
 """Database configuration and session management."""
 
+import os
 from contextlib import contextmanager
 from typing import Generator
 
@@ -9,15 +10,20 @@ from sqlalchemy.orm import sessionmaker, Session
 
 from models import Base
 
-# Database configuration
-DATABASE_URL = "sqlite:///yottob.db"
+# Database configuration from environment variable
+# Falls back to SQLite for local development if DATABASE_URL not set
+DATABASE_URL = os.getenv("DATABASE_URL", "sqlite:///yottob.db")
 
-# Create engine
-engine = create_engine(
-    DATABASE_URL,
-    echo=False,  # Set to True for SQL query logging
-    connect_args={"check_same_thread": False}  # Needed for SQLite
-)
+# Create engine with appropriate configuration
+engine_kwargs = {
+    "echo": False,  # Set to True for SQL query logging
+}
+
+# SQLite-specific configuration
+if DATABASE_URL.startswith("sqlite"):
+    engine_kwargs["connect_args"] = {"check_same_thread": False}
+
+engine = create_engine(DATABASE_URL, **engine_kwargs)
 
 # Session factory
 SessionLocal = sessionmaker(autocommit=False, autoflush=False, bind=engine)
diff --git a/docker-compose.yml b/docker-compose.yml
new file mode 100644
index 0000000..b5add2b
--- /dev/null
+++ b/docker-compose.yml
@@ -0,0 +1,72 @@
+version: '3.8'
+
+services:
+  postgres:
+    image: postgres:16-alpine
+    container_name: yottob-postgres
+    environment:
+      POSTGRES_USER: ${POSTGRES_USER:-yottob}
+      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD:-yottob_password}
+      POSTGRES_DB: ${POSTGRES_DB:-yottob}
+    ports:
+      - "5432:5432"
+    volumes:
+      - postgres_data:/var/lib/postgresql/data
+    healthcheck:
+      test: ["CMD-SHELL", "pg_isready -U ${POSTGRES_USER:-yottob}"]
+      interval: 5s
+      timeout: 5s
+      retries: 5
+
+  redis:
+    image: redis:7-alpine
+    container_name: yottob-redis
+    ports:
+      - "6379:6379"
+    healthcheck:
+      test: ["CMD", "redis-cli", "ping"]
+      interval: 5s
+      timeout: 3s
+      retries: 5
+
+  app:
+    build: .
+    container_name: yottob-app
+    command: flask --app main run --host=0.0.0.0 --port=5000
+    environment:
+      DATABASE_URL: postgresql://${POSTGRES_USER:-yottob}:${POSTGRES_PASSWORD:-yottob_password}@postgres:5432/${POSTGRES_DB:-yottob}
+      CELERY_BROKER_URL: redis://redis:6379/0
+      CELERY_RESULT_BACKEND: redis://redis:6379/0
+      FLASK_ENV: ${FLASK_ENV:-development}
+    ports:
+      - "5000:5000"
+    volumes:
+      - ./downloads:/app/downloads
+      - ./:/app
+    depends_on:
+      postgres:
+        condition: service_healthy
+      redis:
+        condition: service_healthy
+    restart: unless-stopped
+
+  celery:
+    build: .
+    container_name: yottob-celery
+    command: celery -A celery_app worker --loglevel=info
+    environment:
+      DATABASE_URL: postgresql://${POSTGRES_USER:-yottob}:${POSTGRES_PASSWORD:-yottob_password}@postgres:5432/${POSTGRES_DB:-yottob}
+      CELERY_BROKER_URL: redis://redis:6379/0
+      CELERY_RESULT_BACKEND: redis://redis:6379/0
+    volumes:
+      - ./downloads:/app/downloads
+      - ./:/app
+    depends_on:
+      postgres:
+        condition: service_healthy
+      redis:
+        condition: service_healthy
+    restart: unless-stopped
+
+volumes:
+  postgres_data:
diff --git a/pyproject.toml b/pyproject.toml
index 546956e..fe73982 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -9,6 +9,7 @@ dependencies = [
     "celery>=5.3.0",
     "feedparser>=6.0.12",
     "flask>=3.1.2",
+    "psycopg2-binary>=2.9.0",
     "redis>=5.0.0",
     "sqlalchemy>=2.0.0",
     "yt-dlp>=2024.0.0",
diff --git a/uv.lock b/uv.lock
index 3642d41..6207f19 100644
--- a/uv.lock
+++ b/uv.lock
@@ -268,6 +268,25 @@ wheels = [
     { url = "https://files.pythonhosted.org/packages/84/03/0d3ce49e2505ae70cf43bc5bb3033955d2fc9f932163e84dc0779cc47f48/prompt_toolkit-3.0.52-py3-none-any.whl", hash = "sha256:9aac639a3bbd33284347de5ad8d68ecc044b91a762dc39b7c21095fcd6a19955", size = 391431, upload-time = "2025-08-27T15:23:59.498Z" },
 ]
 
+[[package]]
+name = "psycopg2-binary"
+version = "2.9.11"
+source = { registry = "https://pypi.org/simple" }
+sdist = { url = "https://files.pythonhosted.org/packages/ac/6c/8767aaa597ba424643dc87348c6f1754dd9f48e80fdc1b9f7ca5c3a7c213/psycopg2-binary-2.9.11.tar.gz", hash = "sha256:b6aed9e096bf63f9e75edf2581aa9a7e7186d97ab5c177aa6c87797cd591236c", size = 379620, upload-time = "2025-10-10T11:14:48.041Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/64/12/93ef0098590cf51d9732b4f139533732565704f45bdc1ffa741b7c95fb54/psycopg2_binary-2.9.11-cp314-cp314-macosx_10_13_x86_64.whl", hash = "sha256:92e3b669236327083a2e33ccfa0d320dd01b9803b3e14dd986a4fc54aa00f4e1", size = 3756567, upload-time = "2025-10-10T11:13:11.885Z" },
+    { url = "https://files.pythonhosted.org/packages/7c/a9/9d55c614a891288f15ca4b5209b09f0f01e3124056924e17b81b9fa054cc/psycopg2_binary-2.9.11-cp314-cp314-macosx_11_0_arm64.whl", hash = "sha256:e0deeb03da539fa3577fcb0b3f2554a97f7e5477c246098dbb18091a4a01c16f", size = 3864755, upload-time = "2025-10-10T11:13:17.727Z" },
+    { url = "https://files.pythonhosted.org/packages/13/1e/98874ce72fd29cbde93209977b196a2edae03f8490d1bd8158e7f1daf3a0/psycopg2_binary-2.9.11-cp314-cp314-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:9b52a3f9bb540a3e4ec0f6ba6d31339727b2950c9772850d6545b7eae0b9d7c5", size = 4411646, upload-time = "2025-10-10T11:13:24.432Z" },
+    { url = "https://files.pythonhosted.org/packages/5a/bd/a335ce6645334fb8d758cc358810defca14a1d19ffbc8a10bd38a2328565/psycopg2_binary-2.9.11-cp314-cp314-manylinux2014_ppc64le.manylinux_2_17_ppc64le.whl", hash = "sha256:db4fd476874ccfdbb630a54426964959e58da4c61c9feba73e6094d51303d7d8", size = 4468701, upload-time = "2025-10-10T11:13:29.266Z" },
+    { url = "https://files.pythonhosted.org/packages/44/d6/c8b4f53f34e295e45709b7568bf9b9407a612ea30387d35eb9fa84f269b4/psycopg2_binary-2.9.11-cp314-cp314-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:47f212c1d3be608a12937cc131bd85502954398aaa1320cb4c14421a0ffccf4c", size = 4166293, upload-time = "2025-10-10T11:13:33.336Z" },
+    { url = "https://files.pythonhosted.org/packages/4b/e0/f8cc36eadd1b716ab36bb290618a3292e009867e5c97ce4aba908cb99644/psycopg2_binary-2.9.11-cp314-cp314-manylinux_2_38_riscv64.manylinux_2_39_riscv64.whl", hash = "sha256:e35b7abae2b0adab776add56111df1735ccc71406e56203515e228a8dc07089f", size = 3983184, upload-time = "2025-10-30T02:55:32.483Z" },
+    { url = "https://files.pythonhosted.org/packages/53/3e/2a8fe18a4e61cfb3417da67b6318e12691772c0696d79434184a511906dc/psycopg2_binary-2.9.11-cp314-cp314-musllinux_1_2_aarch64.whl", hash = "sha256:fcf21be3ce5f5659daefd2b3b3b6e4727b028221ddc94e6c1523425579664747", size = 3652650, upload-time = "2025-10-10T11:13:38.181Z" },
+    { url = "https://files.pythonhosted.org/packages/76/36/03801461b31b29fe58d228c24388f999fe814dfc302856e0d17f97d7c54d/psycopg2_binary-2.9.11-cp314-cp314-musllinux_1_2_ppc64le.whl", hash = "sha256:9bd81e64e8de111237737b29d68039b9c813bdf520156af36d26819c9a979e5f", size = 3298663, upload-time = "2025-10-10T11:13:44.878Z" },
+    { url = "https://files.pythonhosted.org/packages/97/77/21b0ea2e1a73aa5fa9222b2a6b8ba325c43c3a8d54272839c991f2345656/psycopg2_binary-2.9.11-cp314-cp314-musllinux_1_2_riscv64.whl", hash = "sha256:32770a4d666fbdafab017086655bcddab791d7cb260a16679cc5a7338b64343b", size = 3044737, upload-time = "2025-10-30T02:55:35.69Z" },
+    { url = "https://files.pythonhosted.org/packages/67/69/f36abe5f118c1dca6d3726ceae164b9356985805480731ac6712a63f24f0/psycopg2_binary-2.9.11-cp314-cp314-musllinux_1_2_x86_64.whl", hash = "sha256:c3cb3a676873d7506825221045bd70e0427c905b9c8ee8d6acd70cfcbd6e576d", size = 3347643, upload-time = "2025-10-10T11:13:53.499Z" },
+    { url = "https://files.pythonhosted.org/packages/e1/36/9c0c326fe3a4227953dfb29f5d0c8ae3b8eb8c1cd2967aa569f50cb3c61f/psycopg2_binary-2.9.11-cp314-cp314-win_amd64.whl", hash = "sha256:4012c9c954dfaccd28f94e84ab9f94e12df76b4afb22331b1f0d3154893a6316", size = 2803913, upload-time = "2025-10-10T11:13:57.058Z" },
+]
+
 [[package]]
 name = "python-dateutil"
 version = "2.9.0.post0"
@@ -374,6 +393,7 @@ dependencies = [
     { name = "celery" },
     { name = "feedparser" },
     { name = "flask" },
+    { name = "psycopg2-binary" },
     { name = "redis" },
     { name = "sqlalchemy" },
     { name = "yt-dlp" },
@@ -385,6 +405,7 @@ requires-dist = [
     { name = "celery", specifier = ">=5.3.0" },
     { name = "feedparser", specifier = ">=6.0.12" },
     { name = "flask", specifier = ">=3.1.2" },
+    { name = "psycopg2-binary", specifier = ">=2.9.0" },
     { name = "redis", specifier = ">=5.0.0" },
     { name = "sqlalchemy", specifier = ">=2.0.0" },
     { name = "yt-dlp", specifier = ">=2024.0.0" },