Back to Skills
    🦞

    asr

    Fast, accurate, and incredibly inexpensive automatic speech-to-text.

    By @ilyakam
    View on GitHub
    SKILL.md
    # Speech is Cheap (SIC) Skill
    
    Fast, accurate, and incredibly inexpensive automatic speech-to-text transcription service.
    
    ## 🚀 Why use this skill?
    - **Disruptive Pricing:** $0.06 - $0.12 per hour (2-15x cheaper than Deepgram or OpenAI).
    - **Extreme Speed:** 100 minutes of audio transcribes in ~1 minute.
    - **Multilingual:** Supports 100 languages with auto-detection.
    - **Agent-Ready:** Designed for high-volume, automated pipelines.
    
    ## 🛠 Setup
    
    ### 1. Get an API Key
    Sign up at [speechischeap.com](https://speechischeap.com). Use code `CH5` for $5 off.
    
    ### 2. Configure Authentication
    This skill looks for your API key in the `SIC_API_KEY` environment variable.
    
    Add this to your `.env` or agent config:
    ```bash
    SIC_API_KEY=your_key_here
    ```
    
    ## 📖 Usage
    ### 🤖 TL;DR for Agents
    When this skill is installed, you can transcribe any URL from an OpenClaw session and get the JSON results immediately by running:
    `./skills/asr/scripts/asr.sh transcribe --url "https://example.com/audio.mp3"`
    
    ### Transcribe a URL
    ```bash
    # Basic transcription
    ./skills/asr/scripts/asr.sh transcribe --url "https://example.com/audio.mp3"
    
    # Advanced transcription with options
    ./skills/asr/scripts/asr.sh transcribe --url "https://example.com/audio.mp3" \
      --speakers --words --labels \
      --language "en" \
      --format "srt" \
      --private
    ```
    
    ### Transcribe a Local File
    Perfect for processing audio already on your disk. This handles the upload automatically.
    ```bash
    # Upload and transcribe local media
    ./skills/asr/scripts/asr.sh transcribe --file "./local-audio.wav"
    
    # Upload with webhook callback
    ./skills/asr/scripts/asr.sh transcribe --file "./local-audio.wav" --webhook "https://mysite.com/callback"
    
    # Note: For local files, the skill handles the multi-part upload to
    # https://upload.speechischeap.com before starting the transcription.
    ```
    
    ### Supported Options
    - `--speakers`: Enable speaker diarization
    - `--words`: Enable word-level timestamps
    - `--labels`: Enable audio labeling (music, noise, etc.)
    - `--stream`: Enable streaming output
    - `--private`: Do not store audio/transcript (privacy mode)
    - `--language <code>`: ISO language code (e.g., 'en', 'es')
    - `--confidence <float>`: Minimum confidence threshold (default 0.5)
    - `--format <fmt>`: Output format (json, srt, vtt, webvtt)
    - `--webhook <url>`: URL to receive job completion payload
    - `--segment-duration <n>`: Segment duration in seconds (default 30)
    
    ### Check Job Status
    ```bash
    ./skills/asr/scripts/asr.sh status "job-id-here"
    ```
    
    ## 🤖 For Agents
    The `asr.sh` command-line tool returns JSON by default when successful, making it easy to pipe into other tools or parse directly.
    
    If the `SIC_API_KEY` is missing, the tool will provide a clear error message and a direct link to the signup page.