🦞
tldw

**too long; didn't watch**
SKILL.md
# tldw - YouTube Video Summarizer

**too long; didn't watch**

Extract and summarize YouTube video transcripts quickly and efficiently.

## Overview

The tldw skill takes a YouTube URL, extracts the video transcript, and provides a comprehensive summary of the content. This allows you to quickly understand video content without watching the entire video.

### Purpose

This skill solves the problem of information overload from video content. Instead of spending 10-60 minutes watching a video, you can get the key points, main arguments, and conclusions in a concise summary within seconds.

### When to Use

Use this skill when:
- A user provides a YouTube video URL and asks for a summary
- You need to quickly understand video content without watching it
- You want to analyze or reference specific video content
- You need to extract information from educational, news, or documentary videos

### How It Works

1. **Extraction**: Uses yt-dlp to download video transcripts (captions/subtitles)
2. **Cleaning**: Applies deduplication to remove artifacts from auto-generated captions
3. **Processing**: Analyzes the cleaned transcript directly in the main agent session
4. **Summary**: Returns a structured summary with main points, key arguments, and conclusions

### Key Features

- **Caching**: Downloaded transcripts are cached locally to avoid re-downloading
- **Deduplication**: Removes duplicate lines common in auto-generated captions
- **Multi-format support**: Works with VTT, SRT, and JSON caption formats
- **Cookie support**: Can access age-restricted content with a cookie file
- **Comprehensive summaries**: Provides thesis, key examples, comparisons, and conclusions
- **Fast processing**: Typical videos summarized in seconds

### Attribution

This skill is based on the [tldw project](https://github.com/rmusser01/tldw) by stong. Full attribution and licensing details are available in [ATTRIBUTION.md](ATTRIBUTION.md).

---

## Requirements

### System Requirements

- **Python**: 3.8 or higher
- **Disk space**: ~60MB for virtual environment and dependencies, plus additional space for transcript cache

### Required Dependencies

The skill uses a Python virtual environment with the following dependencies:

- **yt-dlp**: Video transcript downloader (installed via pip)
- **Python standard library**: json, re, argparse (built-in)

All dependencies are installed in the local virtual environment at `venv/`.

### Optional Dependencies

- **Cookie file**: For accessing age-restricted or members-only content
  - Format: Netscape cookie format (can be exported from browser)
  - Place in skill directory and reference with `--cookies` flag

### Directory Structure

```
tldw/
├── SKILL.md              # This documentation
├── ATTRIBUTION.md        # Credit to original project
├── LICENSE               # AGPL-3.0 license
├── scripts/
│   └── extract_transcript.py   # Main extraction script
├── cache/                # Cached transcripts (auto-created)
└── venv/                 # Python virtual environment
    ├── bin/
    │   └── yt-dlp        # Video transcript downloader
    └── lib/              # Python packages
```

### Setup

Follow these steps to set up the tldw skill:

1. **Navigate to the skill directory:**
   ```bash
   cd tldw/
   ```

2. **Create a Python virtual environment:**
   ```bash
   python3 -m venv venv
   ```

3. **Install dependencies:**
   ```bash
   venv/bin/pip install yt-dlp webvtt-py
   ```

4. **Verify installation:**
   ```bash
   venv/bin/yt-dlp --version
   ```

5. **The cache directory will be created automatically on first use**

The skill is now ready to use!

---

## Usage Instructions

### Basic Usage

When a user provides a YouTube URL and requests a summary, use the following workflow:

1. Extract the transcript using the extraction script
2. Parse the JSON output to get the cleaned transcript
3. Summarize the transcript directly (do not use sub-agents for large transcripts)
4. Return a structured summary to the user

### Command Syntax

```bash
cd tldw/ && \
venv/bin/python scripts/extract_transcript.py \
  --json --cache-dir cache "YOUTUBE_URL"
```

### Processing the Output

The script returns JSON with the following structure:

```json
{
  "transcript": "Full cleaned transcript text...",
  "video_id": "video_id_here",
  "title": "Video Title",
  "description": "Video description...",
  "duration": 1234,
  "uploader": "Channel Name",
  "upload_date": "20260101",
  "view_count": 12345,
  "webpage_url": "https://www.youtube.com/watch?v=..."
}
```

Extract the `transcript` field and process it directly to create a comprehensive summary.

### Command Options

- `--json`: Output in JSON format (recommended for parsing)
- `--cache-dir <path>`: Specify cache directory (default: `cache/`)
- `--cookies <file>`: Path to Netscape-format cookie file for age-restricted content

### Example Workflow

```bash
# 1. Extract transcript
cd tldw/ && \
venv/bin/python scripts/extract_transcript.py \
  --json --cache-dir cache "https://www.youtube.com/watch?v=VIDEO_ID"

# 2. Parse the JSON output and extract the transcript field

# 3. Summarize the transcript directly (include main points, key arguments, conclusions)

# 4. Return formatted summary to user
```

### Accessing Age-Restricted Content

For age-restricted or members-only videos, export cookies from your browser:

1. Install a browser extension like "Get cookies.txt LOCALLY"
2. Navigate to YouTube while logged in
3. Export cookies in Netscape format
4. Save to the tldw directory (e.g., `youtube_cookies.txt`)
5. Use with: `--cookies youtube_cookies.txt`

---

## Error Handling

### No Captions Available

**Error message:** `"No subtitles/captions found"`

**What it means:** The video has no auto-generated or manual captions available.

**Solution:** Inform the user that the video cannot be transcribed because it lacks captions.

### Invalid URL

**Error message:** `"ERROR: unable to download video data"`

**What it means:** The URL is malformed, the video doesn't exist, or it's private/deleted.

**Solution:** Verify the URL is correct and check if the video is publicly accessible.

### Age-Restricted Content

**Error message:** `"Sign in to confirm your age"` or similar authentication errors

**What it means:** The video requires age verification or YouTube login.

**Solution:** Use the `--cookies` flag with exported browser cookies (see "Accessing Age-Restricted Content" above).

### Network/Connection Errors

**Error messages:** `"Unable to download"`, `"Connection timeout"`, extraction failures

**What it means:** Network issues, YouTube blocking the request, or **outdated yt-dlp** that's incompatible with current YouTube.

**Solution:**
1. **First, update yt-dlp:** 
   ```bash
   cd tldw/ && \
   venv/bin/pip install --upgrade yt-dlp
   ```
2. Retry the extraction
3. If still failing: check internet connection or wait and try later

YouTube frequently changes their API, so keeping yt-dlp updated is essential.

### Cache Issues

**Symptoms:** Permission errors, disk full errors

**What it means:** The cache directory has permission problems or insufficient disk space.

**Solution:** Check available disk space with `df -h` and verify write permissions on the `cache/` directory.

### Large Transcript Handling

**Note:** Transcripts over 50,000 characters may take longer to process.

**Best practice:** Process large transcripts directly in the main agent session. Do not delegate to sub-agents, as they have been found unreliable with large payloads.

### Debugging

To see full error output (not just the last 100 lines):

```bash
cd tldw/ && \
venv/bin/python scripts/extract_transcript.py \
  --json --cache-dir cache "YOUTUBE_URL"
```

To inspect cached transcripts:

```bash
ls -lh tldw/cache/
```

---

## Limitations

### Caption Dependency

- The skill **only works with videos that have captions/subtitles** available
- Cannot transcribe videos with only audio (no built-in speech-to-text capability)
- Auto-generated captions may contain errors, typos, or timing artifacts
- Deduplication helps clean up auto-generated caption issues but isn't perfect

### Language Support

- Depends on available caption languages provided by YouTube
- The script extracts whatever captions are available (auto-generated or manual)
- Non-English captions work, but summarization quality depends on the language model's capabilities
- English captions typically provide the best results

### Video Length

- Very long videos (2+ hours) may produce extremely large transcripts (100k+ characters)
- Processing time increases proportionally with transcript length
- No hard limit, but practical considerations for context window and processing time apply

### YouTube-Only

- Currently **only supports YouTube URLs**
- Does not work with other video platforms (Vimeo, Dailymotion, TikTok, etc.)
- While yt-dlp supports many platforms, this script is optimized specifically for YouTube

### Private/Restricted Content

- Cannot access truly private videos (not shared publicly)
- Members-only or channel membership content requires cookies from an authenticated session
- Live streams may not have captions available until after the stream has ended
- Some geographic restrictions cannot be bypassed even with cookies

### Deduplication Limitations

- The deduplication logic removes **consecutive duplicate lines**
- May occasionally remove legitimate repeated phrases or refrains
- Designed primarily for auto-generated caption artifacts, not all repetition scenarios
- Manual captions typically don't need deduplication

### No Audio Extraction

- This skill extracts **text transcripts only**, not audio files
- For audio extraction or processing, other tools (like yt-dlp directly with audio flags) would be needed
- Focus is on text-based content analysis, not media files

---