Back to Skills
    šŸ¦ž

    tldw

    **too long; didn't watch**

    By @vovavvk
    View on GitHub
    SKILL.md
    # tldw - YouTube Video Summarizer
    
    **too long; didn't watch**
    
    Extract and summarize YouTube video transcripts quickly and efficiently.
    
    ## Overview
    
    The tldw skill takes a YouTube URL, extracts the video transcript, and provides a comprehensive summary of the content. This allows you to quickly understand video content without watching the entire video.
    
    ### Purpose
    
    This skill solves the problem of information overload from video content. Instead of spending 10-60 minutes watching a video, you can get the key points, main arguments, and conclusions in a concise summary within seconds.
    
    ### When to Use
    
    Use this skill when:
    - A user provides a YouTube video URL and asks for a summary
    - You need to quickly understand video content without watching it
    - You want to analyze or reference specific video content
    - You need to extract information from educational, news, or documentary videos
    
    ### How It Works
    
    1. **Extraction**: Uses yt-dlp to download video transcripts (captions/subtitles)
    2. **Cleaning**: Applies deduplication to remove artifacts from auto-generated captions
    3. **Processing**: Analyzes the cleaned transcript directly in the main agent session
    4. **Summary**: Returns a structured summary with main points, key arguments, and conclusions
    
    ### Key Features
    
    - **Caching**: Downloaded transcripts are cached locally to avoid re-downloading
    - **Deduplication**: Removes duplicate lines common in auto-generated captions
    - **Multi-format support**: Works with VTT, SRT, and JSON caption formats
    - **Cookie support**: Can access age-restricted content with a cookie file
    - **Comprehensive summaries**: Provides thesis, key examples, comparisons, and conclusions
    - **Fast processing**: Typical videos summarized in seconds
    
    ### Attribution
    
    This skill is based on the [tldw project](https://github.com/rmusser01/tldw) by stong. Full attribution and licensing details are available in [ATTRIBUTION.md](ATTRIBUTION.md).
    
    ---
    
    ## Requirements
    
    ### System Requirements
    
    - **Python**: 3.8 or higher
    - **Disk space**: ~60MB for virtual environment and dependencies, plus additional space for transcript cache
    
    ### Required Dependencies
    
    The skill uses a Python virtual environment with the following dependencies:
    
    - **yt-dlp**: Video transcript downloader (installed via pip)
    - **Python standard library**: json, re, argparse (built-in)
    
    All dependencies are installed in the local virtual environment at `venv/`.
    
    ### Optional Dependencies
    
    - **Cookie file**: For accessing age-restricted or members-only content
      - Format: Netscape cookie format (can be exported from browser)
      - Place in skill directory and reference with `--cookies` flag
    
    ### Directory Structure
    
    ```
    tldw/
    ā”œā”€ā”€ SKILL.md              # This documentation
    ā”œā”€ā”€ ATTRIBUTION.md        # Credit to original project
    ā”œā”€ā”€ LICENSE               # AGPL-3.0 license
    ā”œā”€ā”€ scripts/
    │   └── extract_transcript.py   # Main extraction script
    ā”œā”€ā”€ cache/                # Cached transcripts (auto-created)
    └── venv/                 # Python virtual environment
        ā”œā”€ā”€ bin/
        │   └── yt-dlp        # Video transcript downloader
        └── lib/              # Python packages
    ```
    
    ### Setup
    
    Follow these steps to set up the tldw skill:
    
    1. **Navigate to the skill directory:**
       ```bash
       cd tldw/
       ```
    
    2. **Create a Python virtual environment:**
       ```bash
       python3 -m venv venv
       ```
    
    3. **Install dependencies:**
       ```bash
       venv/bin/pip install yt-dlp webvtt-py
       ```
    
    4. **Verify installation:**
       ```bash
       venv/bin/yt-dlp --version
       ```
    
    5. **The cache directory will be created automatically on first use**
    
    The skill is now ready to use!
    
    ---
    
    ## Usage Instructions
    
    ### Basic Usage
    
    When a user provides a YouTube URL and requests a summary, use the following workflow:
    
    1. Extract the transcript using the extraction script
    2. Parse the JSON output to get the cleaned transcript
    3. Summarize the transcript directly (do not use sub-agents for large transcripts)
    4. Return a structured summary to the user
    
    ### Command Syntax
    
    ```bash
    cd tldw/ && \
    venv/bin/python scripts/extract_transcript.py \
      --json --cache-dir cache "YOUTUBE_URL"
    ```
    
    ### Processing the Output
    
    The script returns JSON with the following structure:
    
    ```json
    {
      "transcript": "Full cleaned transcript text...",
      "video_id": "video_id_here",
      "title": "Video Title",
      "description": "Video description...",
      "duration": 1234,
      "uploader": "Channel Name",
      "upload_date": "20260101",
      "view_count": 12345,
      "webpage_url": "https://www.youtube.com/watch?v=..."
    }
    ```
    
    Extract the `transcript` field and process it directly to create a comprehensive summary.
    
    ### Command Options
    
    - `--json`: Output in JSON format (recommended for parsing)
    - `--cache-dir <path>`: Specify cache directory (default: `cache/`)
    - `--cookies <file>`: Path to Netscape-format cookie file for age-restricted content
    
    ### Example Workflow
    
    ```bash
    # 1. Extract transcript
    cd tldw/ && \
    venv/bin/python scripts/extract_transcript.py \
      --json --cache-dir cache "https://www.youtube.com/watch?v=VIDEO_ID"
    
    # 2. Parse the JSON output and extract the transcript field
    
    # 3. Summarize the transcript directly (include main points, key arguments, conclusions)
    
    # 4. Return formatted summary to user
    ```
    
    ### Accessing Age-Restricted Content
    
    For age-restricted or members-only videos, export cookies from your browser:
    
    1. Install a browser extension like "Get cookies.txt LOCALLY"
    2. Navigate to YouTube while logged in
    3. Export cookies in Netscape format
    4. Save to the tldw directory (e.g., `youtube_cookies.txt`)
    5. Use with: `--cookies youtube_cookies.txt`
    
    ---
    
    ## Error Handling
    
    ### No Captions Available
    
    **Error message:** `"No subtitles/captions found"`
    
    **What it means:** The video has no auto-generated or manual captions available.
    
    **Solution:** Inform the user that the video cannot be transcribed because it lacks captions.
    
    ### Invalid URL
    
    **Error message:** `"ERROR: unable to download video data"`
    
    **What it means:** The URL is malformed, the video doesn't exist, or it's private/deleted.
    
    **Solution:** Verify the URL is correct and check if the video is publicly accessible.
    
    ### Age-Restricted Content
    
    **Error message:** `"Sign in to confirm your age"` or similar authentication errors
    
    **What it means:** The video requires age verification or YouTube login.
    
    **Solution:** Use the `--cookies` flag with exported browser cookies (see "Accessing Age-Restricted Content" above).
    
    ### Network/Connection Errors
    
    **Error messages:** `"Unable to download"`, `"Connection timeout"`, extraction failures
    
    **What it means:** Network issues, YouTube blocking the request, or **outdated yt-dlp** that's incompatible with current YouTube.
    
    **Solution:**
    1. **First, update yt-dlp:** 
       ```bash
       cd tldw/ && \
       venv/bin/pip install --upgrade yt-dlp
       ```
    2. Retry the extraction
    3. If still failing: check internet connection or wait and try later
    
    YouTube frequently changes their API, so keeping yt-dlp updated is essential.
    
    ### Cache Issues
    
    **Symptoms:** Permission errors, disk full errors
    
    **What it means:** The cache directory has permission problems or insufficient disk space.
    
    **Solution:** Check available disk space with `df -h` and verify write permissions on the `cache/` directory.
    
    ### Large Transcript Handling
    
    **Note:** Transcripts over 50,000 characters may take longer to process.
    
    **Best practice:** Process large transcripts directly in the main agent session. Do not delegate to sub-agents, as they have been found unreliable with large payloads.
    
    ### Debugging
    
    To see full error output (not just the last 100 lines):
    
    ```bash
    cd tldw/ && \
    venv/bin/python scripts/extract_transcript.py \
      --json --cache-dir cache "YOUTUBE_URL"
    ```
    
    To inspect cached transcripts:
    
    ```bash
    ls -lh tldw/cache/
    ```
    
    ---
    
    ## Limitations
    
    ### Caption Dependency
    
    - The skill **only works with videos that have captions/subtitles** available
    - Cannot transcribe videos with only audio (no built-in speech-to-text capability)
    - Auto-generated captions may contain errors, typos, or timing artifacts
    - Deduplication helps clean up auto-generated caption issues but isn't perfect
    
    ### Language Support
    
    - Depends on available caption languages provided by YouTube
    - The script extracts whatever captions are available (auto-generated or manual)
    - Non-English captions work, but summarization quality depends on the language model's capabilities
    - English captions typically provide the best results
    
    ### Video Length
    
    - Very long videos (2+ hours) may produce extremely large transcripts (100k+ characters)
    - Processing time increases proportionally with transcript length
    - No hard limit, but practical considerations for context window and processing time apply
    
    ### YouTube-Only
    
    - Currently **only supports YouTube URLs**
    - Does not work with other video platforms (Vimeo, Dailymotion, TikTok, etc.)
    - While yt-dlp supports many platforms, this script is optimized specifically for YouTube
    
    ### Private/Restricted Content
    
    - Cannot access truly private videos (not shared publicly)
    - Members-only or channel membership content requires cookies from an authenticated session
    - Live streams may not have captions available until after the stream has ended
    - Some geographic restrictions cannot be bypassed even with cookies
    
    ### Deduplication Limitations
    
    - The deduplication logic removes **consecutive duplicate lines**
    - May occasionally remove legitimate repeated phrases or refrains
    - Designed primarily for auto-generated caption artifacts, not all repetition scenarios
    - Manual captions typically don't need deduplication
    
    ### No Audio Extraction
    
    - This skill extracts **text transcripts only**, not audio files
    - For audio extraction or processing, other tools (like yt-dlp directly with audio flags) would be needed
    - Focus is on text-based content analysis, not media files
    
    ---