youtube-transcript
💡 Summary
Downloads transcripts from YouTube videos using yt-dlp, with fallback to Whisper transcription when subtitles are unavailable.
Risks: Executes arbitrary shell commands from untrusted README, installs packages via pip/apt/brew (supply chain risk), downloads and processes arbitrary user-provided URLs. Mitigation: Run in a sandboxed container; use fixed, verified versions of yt-dlp and whisper; validate YouTube URLs before processing.
name: youtube-transcript
description: Download YouTube video transcripts when user provides a YouTube URL or asks to download/get/fetch a transcript from YouTube. Also use when user wants to transcribe or get captions/subtitles from a YouTube video.
allowed-tools: Bash,Read,Write
YouTube Transcript Downloader
This skill helps download transcripts (subtitles/captions) from YouTube videos using yt-dlp.
When to Use This Skill
Activate this skill when the user:
- Provides a YouTube URL and wants the transcript
- Asks to "download transcript from YouTube"
- Wants to "get captions" or "get subtitles" from a video
- Asks to "transcribe a YouTube video"
- Needs text content from a YouTube video
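Before acting on a URL the user provides, it can help to confirm that it really points at a YouTube video. The following is a minimal sketch, not part of the original skill: the function name `extract_video_id` and the `YOUTUBE_HOSTS` set are illustrative assumptions, and the set is not an exhaustive list of YouTube hostnames.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse

# Hostnames treated as YouTube in this sketch (an assumption, not an official list).
YOUTUBE_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com", "youtu.be"}


def extract_video_id(url: str) -> Optional[str]:
    """Return the video ID for common YouTube URL shapes, or None if unrecognized."""
    parsed = urlparse(url.strip())
    host = (parsed.hostname or "").lower()
    if parsed.scheme not in ("http", "https") or host not in YOUTUBE_HOSTS:
        return None
    if host == "youtu.be":
        # Short links carry the ID as the first path segment.
        video_id = parsed.path.lstrip("/").split("/")[0]
    elif parsed.path == "/watch":
        # Standard watch pages carry the ID in the "v" query parameter.
        video_id = parse_qs(parsed.query).get("v", [""])[0]
    elif parsed.path.startswith(("/shorts/", "/embed/", "/live/")):
        video_id = parsed.path.split("/")[2]
    else:
        return None
    return video_id or None
```

A URL that returns `None` here is probably worth confirming with the user before any command is run against it.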
How It Works
Priority Order:
- Check if yt-dlp is installed - install if needed
- List available subtitles - see what's actually available
- Try manual subtitles first (--write-sub) - highest quality
- Fallback to auto-generated (--write-auto-sub) - usually available
- Last resort: Whisper transcription - if no subtitles exist (requires user confirmation)
- Confirm the download and show the user where the file is saved
- Optionally clean up the VTT format if the user wants plain text
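The priority order above boils down to a small decision function. This sketch only illustrates the ordering; `choose_strategy` and its return labels are hypothetical names, not something the skill defines.

```python
def choose_strategy(has_manual: bool, has_auto: bool,
                    user_allows_whisper: bool = False) -> str:
    """Pick a transcript source following the manual -> auto -> Whisper order."""
    if has_manual:
        return "manual"    # corresponds to: yt-dlp --write-sub
    if has_auto:
        return "auto"      # corresponds to: yt-dlp --write-auto-sub
    if user_allows_whisper:
        return "whisper"   # last resort, only after the user has confirmed
    return "ask_user"      # no subtitles and no confirmation yet
```

Keeping the Whisper branch behind an explicit flag mirrors the requirement that the user confirms before any audio is downloaded.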
Installation Check
IMPORTANT: Always check if yt-dlp is installed first:
which yt-dlp || command -v yt-dlp
If Not Installed
Attempt automatic installation based on the system:
macOS (Homebrew):
brew install yt-dlp
Linux (apt/Debian/Ubuntu):
sudo apt update && sudo apt install -y yt-dlp
Alternative (pip - works on all systems):
pip3 install yt-dlp # or python3 -m pip install yt-dlp
If installation fails: Inform the user they need to install yt-dlp manually and provide them with installation instructions from https://github.com/yt-dlp/yt-dlp#installation
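The installation logic can also be expressed as a function that decides which commands to run without running them. This is a sketch under assumptions: `pick_install_commands` is a hypothetical helper, and the lookup function is injectable so the choice can be checked without touching the system.

```python
import shutil
from typing import Callable, List, Optional


def pick_install_commands(
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[List[str]]:
    """Return the commands needed to install yt-dlp, or [] if it is already present."""
    if which("yt-dlp"):
        return []
    if which("brew"):
        return [["brew", "install", "yt-dlp"]]
    if which("apt"):
        return [["sudo", "apt", "update"],
                ["sudo", "apt", "install", "-y", "yt-dlp"]]
    # Fallback that works wherever Python and pip are available.
    return [["python3", "-m", "pip", "install", "yt-dlp"]]
```

Each inner list can be passed to `subprocess.run`. Returning the commands first also makes it easy to show the user what will be installed.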
Check Available Subtitles
ALWAYS do this before attempting to download:
yt-dlp --list-subs "YOUTUBE_URL"
This shows what subtitle types are available without downloading anything. Look for:
- Manual subtitles (better quality)
- Auto-generated subtitles (usually available)
- Available languages
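If the listing needs to be read by a program rather than by eye, one option is to work from yt-dlp's JSON metadata instead of the `--list-subs` table. The sketch below assumes the JSON comes from `yt-dlp --dump-json "YOUTUBE_URL"` and relies on its `subtitles` and `automatic_captions` fields; the function name `subtitle_languages` is hypothetical.

```python
import json
from typing import Dict, List


def subtitle_languages(info_json: str) -> Dict[str, List[str]]:
    """Summarize manual and auto-generated subtitle languages from yt-dlp JSON."""
    info = json.loads(info_json)
    return {
        # Both fields map language codes to lists of available formats.
        "manual": sorted(info.get("subtitles") or {}),
        "auto": sorted(info.get("automatic_captions") or {}),
    }
```

An empty `manual` list together with a non-empty `auto` list is the case where the auto-generated fallback applies.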
Download Strategy
Option 1: Manual Subtitles (Preferred)
Try this first - highest quality, human-created:
yt-dlp --write-sub --skip-download --output "OUTPUT_NAME" "YOUTUBE_URL"
Option 2: Auto-Generated Subtitles (Fallback)
If manual subtitles aren't available:
yt-dlp --write-auto-sub --skip-download --output "OUTPUT_NAME" "YOUTUBE_URL"
Both commands create a .vtt file (WebVTT subtitle format).
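When these commands are launched from a script, building them as argument lists avoids shell quoting problems with unusual URLs or output names. This is a minimal sketch using only the flags shown above; `build_subtitle_command` and `expected_vtt_name` are hypothetical helper names.

```python
from typing import List


def build_subtitle_command(url: str, output_name: str, auto: bool = False) -> List[str]:
    """Build the yt-dlp argument list for manual or auto-generated subtitles."""
    flag = "--write-auto-sub" if auto else "--write-sub"
    return ["yt-dlp", flag, "--skip-download", "--output", output_name, url]


def expected_vtt_name(output_name: str, language_code: str = "en") -> str:
    """Name of the subtitle file, following {output_name}.{language_code}.vtt."""
    return f"{output_name}.{language_code}.vtt"
```

The list can be handed to `subprocess.run(cmd)` directly, so the URL is never interpreted by a shell.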
Option 3: Whisper Transcription (Last Resort)
ONLY use this if both manual and auto-generated subtitles are unavailable.
Step 1: Show File Size and Ask for Confirmation
# Get audio file size estimate
yt-dlp --print "%(filesize,filesize_approx)s" -f "bestaudio" "YOUTUBE_URL"

# Or get duration to estimate
yt-dlp --print "%(duration)s %(title)s" "YOUTUBE_URL"
IMPORTANT: Display the file size to the user and ask: "No subtitles are available. I can download the audio (approximately X MB) and transcribe it using Whisper. Would you like to proceed?"
Wait for user confirmation before continuing.
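The size printed by yt-dlp is a byte count, or the placeholder `NA` when the size is unknown, so it needs converting before it is shown to the user. The sketch below builds the confirmation message from that raw output; `parse_size_bytes` and `confirmation_prompt` are hypothetical names.

```python
from typing import Optional


def parse_size_bytes(raw: str) -> Optional[int]:
    """Parse yt-dlp's printed size; return None for 'NA' or anything non-numeric."""
    raw = raw.strip()
    if raw.replace(".", "", 1).isdigit():
        return int(float(raw))
    return None


def confirmation_prompt(raw_size: str) -> str:
    """Build the question to show the user before downloading audio."""
    size = parse_size_bytes(raw_size)
    if size is None:
        approx = "size unknown"
    else:
        approx = f"approximately {size / (1024 * 1024):.1f} MB"
    return (f"No subtitles are available. I can download the audio ({approx}) "
            "and transcribe it using Whisper. Would you like to proceed?")
```

Handling the `NA` case explicitly avoids an arithmetic error when the size cannot be determined in advance.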
Step 2: Check for Whisper Installation
command -v whisper
If not installed, ask user: "Whisper is not installed. Install it with pip install openai-whisper (requires ~1-3GB for models)? This is a one-time installation."
Wait for user confirmation before installing.
Install if approved:
pip3 install openai-whisper
Step 3: Download Audio Only
yt-dlp -x --audio-format mp3 --output "audio_%(id)s.%(ext)s" "YOUTUBE_URL"
Step 4: Transcribe with Whisper
# Auto-detect language (recommended)
whisper audio_VIDEO_ID.mp3 --model base --output_format vtt

# Or specify language if known
whisper audio_VIDEO_ID.mp3 --model base --language en --output_format vtt
Model Options (stick to base for now):
- tiny - fastest, least accurate (~1GB)
- base - good balance (~1GB) ← USE THIS
- small - better accuracy (~2GB)
- medium - very good (~5GB)
- large - best accuracy (~10GB)
Step 5: Cleanup
After transcription completes, ask user: "Transcription complete! Would you like me to delete the audio file to save space?"
If yes:
rm audio_VIDEO_ID.mp3
Getting Video Information
Extract Video Title (for filename)
yt-dlp --print "%(title)s" "YOUTUBE_URL"
Use this to create meaningful filenames based on the video title. Clean the title for filesystem compatibility:
- Replace / with -
- Replace special characters that might cause issues
- Consider using sanitized version: $(yt-dlp --print "%(title)s" "URL" | tr '/' '-' | tr ':' '-')
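The same cleanup can be done in Python when more than a couple of characters need handling. This is a sketch, not the skill's own method: `sanitize_title`, the set of stripped characters, and the 120-character limit are all assumptions chosen for illustration.

```python
import re


def sanitize_title(title: str, max_length: int = 120) -> str:
    """Turn a video title into a filename-friendly string."""
    # Mirror the tr-based approach: slashes and colons become hyphens.
    cleaned = title.replace("/", "-").replace(":", "-")
    # Drop characters that are commonly rejected by filesystems or shells.
    cleaned = re.sub(r'[\\?*"<>|\x00-\x1f]', "", cleaned)
    # Collapse runs of whitespace and trim stray spaces and dots at the ends.
    cleaned = re.sub(r"\s+", " ", cleaned).strip(" .")
    # Fall back to a generic name if nothing usable remains.
    return cleaned[:max_length] or "transcript"
```

The fallback name avoids producing a file called just `.txt` when a title consists entirely of stripped characters.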
Post-Processing
Convert to Plain Text (Recommended)
YouTube's auto-generated VTT files contain duplicate lines because captions are shown progressively with overlapping timestamps. Always deduplicate when converting to plain text while preserving the original speaking order.
python3 -c "
import re
seen = set()
with open('transcript.en.vtt', 'r') as f:
    for line in f:
        line = line.strip()
        # Skip blank lines, the WebVTT header fields, and timestamp lines
        if line and not line.startswith('WEBVTT') and not line.startswith('Kind:') and not line.startswith('Language:') and '-->' not in line:
            # Remove inline timing/style tags, then decode HTML entities (&amp; last)
            clean = re.sub('<[^>]*>', '', line)
            clean = clean.replace('&gt;', '>').replace('&lt;', '<').replace('&amp;', '&')
            if clean and clean not in seen:
                print(clean)
                seen.add(clean)
" > transcript.txt
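One side effect of deduplicating against every line seen so far is that a phrase the speaker genuinely repeats later in the video is dropped as well. A possible variant, sketched here under the assumption that the progressive-caption repeats are always adjacent, compares each line only with the previous one. The name `vtt_to_lines` is hypothetical, and `html.unescape` from the standard library replaces the manual entity handling.

```python
import html
import re

TAG_RE = re.compile(r"<[^>]*>")
HEADER_PREFIXES = ("WEBVTT", "Kind:", "Language:")


def vtt_to_lines(vtt_text: str) -> list:
    """Extract caption text from WebVTT, dropping only adjacent duplicate lines."""
    lines = []
    for raw in vtt_text.splitlines():
        line = raw.strip()
        # Skip blanks, timestamp lines, and the header fields.
        if not line or "-->" in line or line.startswith(HEADER_PREFIXES):
            continue
        # Strip inline tags, then decode entities such as &amp;.
        clean = html.unescape(TAG_RE.sub("", line)).strip()
        if clean and (not lines or lines[-1] != clean):
            lines.append(clean)
    return lines
```

For example, a file whose cues read "hello world", "hello world", "tom &amp; jerry", "hello world" would come out as three lines, with the final repeat kept because it is not adjacent to the first.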
Complete Post-Processing with Video Title
# Get video title and remove characters that are awkward in filenames
VIDEO_TITLE=$(yt-dlp --print "%(title)s" "YOUTUBE_URL" | tr '/' '_' | tr ':' '-' | tr -d '?"')

# Find the VTT file
VTT_FILE=$(ls *.vtt | head -n 1)

# Convert with deduplication (the file name is passed as an argument, not pasted into the code)
python3 -c "
import sys, re
seen = set()
with open(sys.argv[1], 'r') as f:
    for line in f:
        line = line.strip()
        if line and not line.startswith('WEBVTT') and not line.startswith('Kind:') and not line.startswith('Language:') and '-->' not in line:
            clean = re.sub('<[^>]*>', '', line)
            clean = clean.replace('&gt;', '>').replace('&lt;', '<').replace('&amp;', '&')
            if clean and clean not in seen:
                print(clean)
                seen.add(clean)
" "$VTT_FILE" > "${VIDEO_TITLE}.txt"

echo "✓ Saved to: ${VIDEO_TITLE}.txt"

# Clean up VTT file
rm "$VTT_FILE"
echo "✓ Cleaned up temporary VTT file"
Output Formats
- VTT format (.vtt): Includes timestamps and formatting, good for video players
- Plain text (.txt): Just the text content, good for reading or analysis
Tips
- The filename will be {output_name}.{language_code}.vtt (e.g., transcript.en.vtt)
- Most YouTube videos have auto-generated English subtitles
- Some videos may have multiple language options
- If auto-subtitles aren't available, try --write-sub instead for manual subtitles
Complete Workflow Example
VIDEO_URL="https://www.youtube.com/watch?v=dQw4w9WgXcQ"
OUTPUT_NAME="transcript_temp"

# ============================================
# STEP 1: Check if yt-dlp is installed
# ============================================
if ! command -v yt-dlp &> /dev/null; then
    echo "yt-dlp not found, attempting to install..."
    if command -v brew &> /dev/null; then
        brew install yt-dlp
    elif command -v apt &> /dev/null; then
        sudo apt update && sudo apt install -y yt-dlp
    else
        pip3 install yt-dlp
    fi
fi

# Get video title for filename (after the install check, since this needs yt-dlp)
VIDEO_TITLE=$(yt-dlp --print "%(title)s" "$VIDEO_URL" | tr '/' '_' | tr ':' '-' | tr -d '?"')

# ============================================
# STEP 2: List available subtitles
# ============================================
echo "Checking available subtitles..."
yt-dlp --list-subs "$VIDEO_URL"

# ============================================
# STEP 3: Try manual subtitles first
# ============================================
echo "Attempting to download manual subtitles..."
# yt-dlp can exit 0 without writing a subtitle file, so also check that one exists
if yt-dlp --write-sub --skip-download --output "$OUTPUT_NAME" "$VIDEO_URL" 2>/dev/null && ls "${OUTPUT_NAME}".*.vtt &> /dev/null; then
    echo "✓ Manual subtitles downloaded successfully!"
    ls -lh "${OUTPUT_NAME}".*
else
    # ============================================
    # STEP 4: Fallback to auto-generated
    # ============================================
    echo "Manual subtitles not available. Trying auto-generated..."
    if yt-dlp --write-auto-sub --skip-download --output "$OUTPUT_NAME" "$VIDEO_URL" 2>/dev/null && ls "${OUTPUT_NAME}".*.vtt &> /dev/null; then
        echo "✓ Auto-generated subtitles downloaded successfully!"
        ls -lh "${OUTPUT_NAME}".*
    else
        # ============================================
        # STEP 5: Last resort - Whisper transcription
        # ============================================
        echo "⚠ No subtitles available for this video."

        # Get file size
        FILE_SIZE=$(yt-dlp --print "%(filesize_approx)s" -f "bestaudio" "$VIDEO_URL")
        DURATION=$(yt-dlp --print "%(duration)s" "$VIDEO_URL")
        TITLE=$(yt-dlp --print "%(title)s" "$VIDEO_URL")

        echo "Video: $TITLE"
        echo "Duration: $((DURATION / 60)) minutes"
        echo "Audio size: ~$((FILE_SIZE / 1024 / 1024)) MB"
        echo ""
        echo "Would you like to download and transcribe with Whisper? (y/n)"
        read -r RESPONSE

        if [[ "$RESPONSE" =~ ^[Yy]$ ]]; then
            # Check for Whisper
            if ! command -v whisper &> /dev/null; then
                echo "Whisper not installed. Install now? (requires ~1-3GB) (y/n)"
                read -r INSTALL_RESPONSE
                if [[ "$INSTALL_RESPONSE" =~ ^[Yy]$ ]]; then
                    pip3 install openai-whisper
                else
                    echo "Cannot proceed without Whisper. Exiting."
                    exit 1
                fi
Pros
- Comprehensive multi-fallback strategy (manual -> auto -> Whisper)
- Includes post-processing to clean VTT format into plain text
- Provides clear user confirmation steps for large downloads/installs
Cons
- Relies heavily on external tools (yt-dlp, Whisper) with complex installation paths
- Whisper fallback requires significant disk space and user patience
- Bash script logic can be fragile across different user environments
Disclaimer: This content is sourced from GitHub open source projects for display and rating purposes only.
Copyright belongs to the original author michalparkola.
