AI-powered tool to automatically generate engaging YouTube Shorts from long-form videos. Uses GPT-4o-mini and Whisper to extract highlights, add subtitles, and crop videos vertically for social media.
- 🎬 Flexible Input: Supports both YouTube URLs and local video files
- 🎤 GPU-Accelerated Transcription: CUDA-enabled Whisper for fast speech-to-text
- 🤖 AI Highlight Selection: GPT-4o-mini automatically finds the most engaging 2-minute segments
- ✅ Interactive Approval: Review and approve/regenerate selections with 15-second auto-approve timeout
- 📝 Auto Subtitles: Stylized captions with Franklin Gothic font burned into video
- 🎯 Smart Cropping:
  - Face videos: static face-centered crop (no jerky movement)
  - Screen recordings: half-width display with smooth motion tracking (1 shift/second max)
- 📱 Vertical Format: Perfect 9:16 aspect ratio for TikTok/YouTube Shorts/Instagram Reels
- ⚙️ Automation Ready: CLI arguments, auto-quality selection, timeout-based approvals
- 🔄 Concurrent Execution: Unique session IDs allow multiple instances to run simultaneously
- 📦 Clean Output: Slugified filenames (e.g., `my-video-title_short.mp4`) and automatic temp file cleanup
- Python 3.10+
- FFmpeg with development headers
- NVIDIA GPU with CUDA support (optional, but recommended for faster transcription)
- ImageMagick (for subtitle rendering)
- OpenAI API key
1. Clone the repository:

   ```bash
   git clone https://github.com/SamurAIGPT/AI-Youtube-Shorts-Generator.git
   cd AI-Youtube-Shorts-Generator
   ```

2. Install system dependencies:

   ```bash
   sudo apt install -y ffmpeg libavdevice-dev libavfilter-dev libopus-dev \
       libvpx-dev pkg-config libsrtp2-dev imagemagick
   ```

3. Fix the ImageMagick security policy (required for subtitle rendering):

   ```bash
   sudo sed -i 's/rights="none" pattern="@\*"/rights="read|write" pattern="@*"/' /etc/ImageMagick-6/policy.xml
   ```

4. Create and activate a virtual environment:

   ```bash
   python3.10 -m venv venv
   source venv/bin/activate
   ```

5. Install Python dependencies:

   ```bash
   pip install -r requirements.txt
   ```

6. Set up environment variables by creating a `.env` file in the project root:

   ```
   OPENAI_API=your_openai_api_key_here
   ```
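At startup the pipeline reads the key from this variable. A minimal sketch of a loader, assuming the variable name `OPENAI_API` as shown above (the project may simply use python-dotenv; this hand-rolled reader is only illustrative):

```python
import os

def read_api_key(env_file=".env"):
    """Return OPENAI_API from the environment, falling back to a .env file."""
    key = os.getenv("OPENAI_API")
    if key:
        return key
    if os.path.exists(env_file):
        with open(env_file) as f:
            for line in f:
                line = line.strip()
                if line.startswith("OPENAI_API="):
                    return line.split("=", 1)[1]
    raise RuntimeError("OPENAI_API is not set; check your .env file")
```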
Interactive mode:

```bash
./run.sh
# Then enter the YouTube URL when prompted
# You'll be able to select the video resolution (5s timeout, auto-selects highest)
```

From a YouTube URL:

```bash
./run.sh "https://youtu.be/VIDEO_ID"
```

From a local file:

```bash
./run.sh "/path/to/your/video.mp4"
```

For batch processing, create a `urls.txt` file with one URL per line, then:

```bash
# Process all URLs sequentially with auto-approve
xargs -a urls.txt -I{} ./run.sh --auto-approve {}
```

Or without auto-approve (will prompt for each):

```bash
xargs -a urls.txt -I{} ./run.sh {}
```

When downloading from YouTube, you'll see:
```
Available video streams:
0. Resolution: 1080p, Size: 45.2 MB, Type: Adaptive
1. Resolution: 720p, Size: 28.1 MB, Type: Adaptive
2. Resolution: 480p, Size: 15.3 MB, Type: Adaptive

Select resolution number (0-2) or wait 5s for auto-select...
Auto-selecting highest quality in 5 seconds...
```
- Enter a number to select that resolution immediately
- Wait 5 seconds to auto-select highest quality (1080p)
- Invalid input falls back to highest quality
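The timed prompt above amounts to a read-with-timeout on stdin. A POSIX-only sketch using `select` (the function name is illustrative, not the project's code):

```python
import select
import sys

def choose_stream(n_streams, timeout=5.0, stdin=sys.stdin):
    """Return the chosen stream index, or 0 (highest quality) on timeout
    or invalid input. select() on stdin works on POSIX, not Windows."""
    ready, _, _ = select.select([stdin], [], [], timeout)
    if ready:
        raw = stdin.readline().strip()
        if raw.isdigit() and int(raw) < n_streams:
            return int(raw)
    return 0  # stream 0 is listed first, i.e. the highest resolution
```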
- Download/Load: Fetches from YouTube or loads local file
- Resolution Selection: Choose video quality (5s timeout, auto-selects highest)
- Extract Audio: Converts to WAV format
- Transcribe: GPU-accelerated Whisper transcription (~30s for 5min video)
- AI Analysis: GPT-4o-mini selects most engaging 2-minute segment
- Interactive Approval: Review selection, regenerate if needed, or auto-approve in 15s
- Extract Clip: Crops selected timeframe
- Smart Crop:
  - Detects faces → static face-centered vertical crop
  - No faces → half-width screen recording with motion tracking
- Add Subtitles: Burns Franklin Gothic captions with blue text/black outline
- Combine Audio: Merges audio track with final video
- Cleanup: Removes all temporary files
Output: `{video-title}_{session-id}_short.mp4` with slugified filename and unique identifier
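The extract-audio step typically maps onto an ffmpeg call. A sketch that only builds the command (the flags shown are common Whisper-friendly settings, not necessarily the project's exact ones):

```python
def wav_extract_cmd(video_path, wav_path, sample_rate=16000):
    """Build an ffmpeg command that strips video and writes mono 16-bit WAV.
    16 kHz mono PCM is the input format Whisper expects."""
    return [
        "ffmpeg", "-y", "-i", video_path,
        "-vn",                      # drop the video stream
        "-ac", "1",                 # mono
        "-ar", str(sample_rate),    # resample
        "-c:a", "pcm_s16le",        # 16-bit PCM WAV
        wav_path,
    ]

# Run with: subprocess.run(wav_extract_cmd("in.mp4", "audio.wav"), check=True)
```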
After AI selects a highlight, you'll see:
```
============================================================
SELECTED SEGMENT DETAILS:
Time: 68s - 187s (119s duration)
============================================================

Options:
  [Enter/y] Approve and continue
  [r] Regenerate selection
  [n] Cancel

Auto-approving in 15 seconds if no input...
```
- Press Enter or y to approve
- Press r to regenerate a different selection (can repeat multiple times)
- Press n to cancel
- Wait 15 seconds to auto-approve (perfect for automation)
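The prompt reduces to a small dispatch on the raw input, with `None` standing for the 15-second timeout (a sketch; how the project treats unrecognized input is an assumption here):

```python
def approval_action(raw):
    """Map approval-prompt input to an action; raw is None when the timeout fires."""
    if raw is None:
        return "approve"          # timeout auto-approves (automation-friendly)
    choice = raw.strip().lower()
    if choice in ("", "y"):
        return "approve"
    if choice == "r":
        return "regenerate"       # caller loops back to the AI selection step
    if choice == "n":
        return "cancel"
    return "approve"              # assumed: unrecognized input falls through
```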
Edit `Components/Subtitles.py`:
- Font: line 51 (`font='Franklin-Gothic'`)
- Size: line 47 (`fontsize=80`)
- Color: line 48 (`color='#2699ff'`)
- Outline: lines 49-50 (`stroke_color='black'`, `stroke_width=2`)
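These parameters line up with MoviePy's `TextClip` keyword arguments (rendered through ImageMagick, hence the policy fix during installation). A sketch collecting them in one place so a restyle is a single edit (the `method="caption"` choice is an assumption, not taken from the file):

```python
# Style values mirror Components/Subtitles.py as documented above.
SUBTITLE_STYLE = dict(
    font="Franklin-Gothic",
    fontsize=80,
    color="#2699ff",
    stroke_color="black",
    stroke_width=2,
)

def make_caption(text, width):
    """Build one subtitle clip; needs MoviePy + ImageMagick at runtime."""
    from moviepy.editor import TextClip
    return TextClip(text, method="caption", size=(width, None), **SUBTITLE_STYLE)
```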
Edit `Components/LanguageTasks.py`:
- Prompt: line 29 (adjust what counts as "interesting, useful, surprising, controversial, or thought-provoking")
- Model: line 54 (`model="gpt-4o-mini"`)
- Temperature: line 55 (`temperature=1.0`)
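These settings plug into a standard Chat Completions call. A sketch of the shape of that call (the prompt wording and message layout are assumptions, not the file's actual contents):

```python
def build_messages(transcript):
    system = (
        "From this transcript, pick the most engaging ~2-minute segment: "
        "interesting, useful, surprising, controversial, or thought-provoking. "
        "Reply with start and end times in seconds."
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": transcript},
    ]

def select_highlight(client, transcript):
    """client is an openai.OpenAI() instance authenticated via OPENAI_API."""
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        temperature=1.0,
        messages=build_messages(transcript),
    )
    return resp.choices[0].message.content
```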
Edit `Components/FaceCrop.py`:
- Update frequency: line 93 (`update_interval = int(fps)`), currently 1 shift/second
- Smoothing: line 115 (`0.90 * smoothed_x + 0.10 * target_x`), currently 90%/10%
- Motion threshold: line 107 (`motion_threshold = 2.0`)
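The smoothing rule above is exponential averaging gated by a motion threshold; per update it looks roughly like this (the function name is illustrative):

```python
def smooth_center(smoothed_x, target_x, alpha=0.90, motion_threshold=2.0):
    """Move the crop centre a fraction of the way toward the detected target.
    Moves below the threshold are ignored, which suppresses jitter."""
    if abs(target_x - smoothed_x) < motion_threshold:
        return smoothed_x            # hold position: the change is just noise
    return alpha * smoothed_x + (1 - alpha) * target_x
```

With `alpha=0.90`, each update closes 10% of the remaining gap, so large shifts ease in smoothly instead of snapping.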
Edit `Components/FaceCrop.py`:
- Sensitivity: line 37 (`minNeighbors=8`), higher = fewer false positives
- Minimum size: line 37 (`minSize=(30, 30)`), minimum face size in pixels
Edit `Components/Subtitles.py` and `Components/FaceCrop.py`:
- Bitrate: Subtitles.py line 74 (`bitrate='3000k'`)
- Preset: Subtitles.py line 73 (`preset='medium'`)
Final videos are named: `{video-title}_{session-id}_short.mp4`
Example: `my-awesome-video_a1b2c3d4_short.mp4`
- Slugified title: Lowercase, hyphens instead of spaces
- Session ID: 8-character unique identifier for traceability
- Resolution: Matches source video height (720p → 404x720, 1080p → 607x1080)
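The naming scheme can be sketched as follows (the regex-based slugify and uuid-derived session ID are assumptions about the implementation, not code from the repository):

```python
import re
import uuid

def slugify(title):
    """Lowercase; runs of non-alphanumeric characters become single hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")

def output_name(title):
    session_id = uuid.uuid4().hex[:8]   # 8-char unique ID for traceability
    return f"{slugify(title)}_{session_id}_short.mp4"
```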
Run multiple instances simultaneously:
```bash
./run.sh "https://youtu.be/VIDEO1" &
./run.sh "https://youtu.be/VIDEO2" &
./run.sh "/path/to/video3.mp4" &
```

Each instance gets a unique session ID and temporary files, preventing conflicts.
```bash
# Verify CUDA libraries
export LD_LIBRARY_PATH=$(find $(pwd)/venv/lib/python3.10/site-packages/nvidia -name "lib" -type d | paste -sd ":" -)
```

The run.sh script handles this automatically.
Ensure the ImageMagick policy allows file operations:

```bash
grep 'pattern="@\*"' /etc/ImageMagick-6/policy.xml
# Should show: rights="read|write"
```

If faces are not detected:
- The video needs visible faces in the first 30 frames
- For screen recordings, automatic motion tracking applies
- Low-resolution videos may have less reliable detection
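The first-30-frames requirement can be checked with a small helper; here the detector is injected so the sketch stays library-agnostic (in the pipeline it would wrap OpenCV's `detectMultiScale` with the `minNeighbors`/`minSize` values shown earlier):

```python
def has_face_early(frames, detect, max_frames=30):
    """True if detect(frame) reports a face within the first max_frames frames.

    A real detector might be built as:
        cascade = cv2.CascadeClassifier(
            cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
        detect = lambda f: len(cascade.detectMultiScale(
            cv2.cvtColor(f, cv2.COLOR_BGR2GRAY),
            minNeighbors=8, minSize=(30, 30))) > 0
    """
    for i, frame in enumerate(frames):
        if i >= max_frames:
            return False
        if detect(frame):
            return True
    return False
```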
Contributions are welcome! Please fork the repository and submit a pull request.
This project is licensed under the MIT License.
