---
name: audiopod
description: Use AudioPod AI's API for audio processing tasks including AI music generation (text-to-music, text-to-rap, instrumentals, samples, vocals), stem separation, text-to-speech, noise reduction, speech-to-text transcription, speaker separation, and media extraction. Use when the user needs to generate music/songs/rap from text, split a song into stems/vocals/instruments, generate speech from text, clean up noisy audio, transcribe audio/video, or extract audio from YouTube/URLs. Requires AUDIOPOD_API_KEY env var or pass api_key directly.
---

# AudioPod AI

Full audio processing API: music generation, stem separation, TTS, noise reduction, transcription, speaker separation, wallet management.

## Setup

```bash
pip install audiopod  # Python
npm install audiopod  # Node.js
```

Auth: set the `AUDIOPOD_API_KEY` environment variable, or pass `api_key` to the client constructor.

### Getting an API Key
1. Sign up at https://audiopod.ai/auth/signup (free, no credit card required)
2. Go to https://www.audiopod.ai/dashboard/account/api-keys
3. Click "Create API Key" and copy the key (starts with `ap_`)
4. Add funds to your wallet at https://www.audiopod.ai/dashboard/account/wallet (pay-as-you-go, no subscription)

```python
from audiopod import AudioPod
client = AudioPod()  # uses AUDIOPOD_API_KEY env var
# or: client = AudioPod(api_key="ap_...")
```
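
To fail early on a missing or mistyped key, the lookup order described above can be sketched as a small helper. `resolve_api_key` is a hypothetical function for illustration, not part of the SDK, and the `ap_` prefix check simply mirrors the key format mentioned in the steps above; whether the SDK performs the same validation is not stated here.

```python
import os


def resolve_api_key(api_key=None, env=None):
    """Return an explicit key if given, else the AUDIOPOD_API_KEY env var.

    Hypothetical helper: the check on the 'ap_' prefix is an assumption
    based on the key format described in this document.
    """
    env = os.environ if env is None else env
    key = api_key or env.get("AUDIOPOD_API_KEY")
    if not key:
        raise RuntimeError("Set AUDIOPOD_API_KEY or pass api_key explicitly")
    if not key.startswith("ap_"):
        raise ValueError("AudioPod API keys are expected to start with 'ap_'")
    return key
```

The returned value can then be passed as `AudioPod(api_key=resolve_api_key())`.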

---

## AI Music Generation

Generate songs, rap, instrumentals, samples, and vocals from text prompts.

**Tasks:** `text2music` (song with vocals), `text2rap` (rap), `prompt2instrumental` (instrumental), `lyric2vocals` (vocals only), `text2samples` (loops/samples), `audio2audio` (style transfer), `songbloom`

### Python SDK

```python
# Generate a full song with lyrics
result = client.music.song(
    prompt="Upbeat pop, synth, drums, 120 bpm, female vocals, radio-ready",
    lyrics="Verse 1:\nWalking down the street on a sunny day\n\nChorus:\nWe're on fire tonight!",
    duration=60
)
print(result["output_url"])

# Generate rap
result = client.music.rap(
    prompt="Lo-Fi Hip Hop, 100 BPM, male rap, melancholy, keyboard chords",
    lyrics="Verse 1:\nStarted from the bottom, now we climbing...",
    duration=60
)

# Generate instrumental (no lyrics needed)
result = client.music.instrumental(
    prompt="Atmospheric ambient soundscape, uplifting, driving mood",
    duration=30
)

# Generic generate with explicit task
result = client.music.generate(
    prompt="Electronic dance music, high energy",
    task="text2samples",  # any task type
    duration=30
)

# Async: submit then poll
job = client.music.create(
    prompt="Chill lofi beat",
    duration=30,
    task="prompt2instrumental"
)
result = client.music.wait_for_completion(job["id"], timeout=600)

# Get available genre presets
presets = client.music.get_presets()

# List/manage jobs
jobs = client.music.list(skip=0, limit=50)
job = client.music.get(job_id=123)
client.music.delete(job_id=123)
```

### cURL

```bash
# Song with lyrics
curl -X POST "https://api.audiopod.ai/api/v1/music/text2music" \
  -H "X-API-Key: $AUDIOPOD_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"prompt":"upbeat pop, synth, 120bpm, female vocals", "lyrics":"Walking down the street...", "audio_duration":60}'

# Rap
curl -X POST "https://api.audiopod.ai/api/v1/music/text2rap" \
  -H "X-API-Key: $AUDIOPOD_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"prompt":"Lo-Fi Hip Hop, male rap, 100 BPM", "lyrics":"Started from the bottom...", "audio_duration":60}'

# Instrumental
curl -X POST "https://api.audiopod.ai/api/v1/music/prompt2instrumental" \
  -H "X-API-Key: $AUDIOPOD_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"prompt":"ambient soundscape, uplifting", "audio_duration":30}'

# Samples/loops
curl -X POST "https://api.audiopod.ai/api/v1/music/text2samples" \
  -H "X-API-Key: $AUDIOPOD_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"prompt":"drum loop, sad mood", "audio_duration":15}'

# Vocals only
curl -X POST "https://api.audiopod.ai/api/v1/music/lyric2vocals" \
  -H "X-API-Key: $AUDIOPOD_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"prompt":"clean vocals, happy", "lyrics":"Eternal chorus of unity...", "audio_duration":30}'

# Check job status / get result
curl "https://api.audiopod.ai/api/v1/music/jobs/JOB_ID" \
  -H "X-API-Key: $AUDIOPOD_API_KEY"

# Get genre presets
curl "https://api.audiopod.ai/api/v1/music/presets" \
  -H "X-API-Key: $AUDIOPOD_API_KEY"

# List jobs
curl "https://api.audiopod.ai/api/v1/music/jobs?skip=0&limit=50" \
  -H "X-API-Key: $AUDIOPOD_API_KEY"

# Delete job
curl -X DELETE "https://api.audiopod.ai/api/v1/music/jobs/JOB_ID" \
  -H "X-API-Key: $AUDIOPOD_API_KEY"
```

### Parameters

| Field | Required | Description |
|-------|----------|-------------|
| prompt | yes | Style/genre description |
| lyrics | for `text2music`, `text2rap`, `lyric2vocals` | Song lyrics with verse/chorus structure |
| audio_duration | no | Duration in seconds (default: 30). The Python SDK examples pass this as `duration`. |
| genre_preset | no | Genre preset name (from presets endpoint) |
| display_name | no | Track display name |
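
For callers that use neither the SDK nor cURL, the table above can be turned into a request builder using only the Python standard library. This is a minimal sketch: `build_music_request` and `TASKS_NEEDING_LYRICS` are hypothetical names, the client-side lyrics check is an assumption drawn from the "Required" column, and only the five task endpoints shown in the cURL examples are confirmed by this document. The request is built but not sent.

```python
import json
import urllib.request

BASE_URL = "https://api.audiopod.ai/api/v1/music"
# Tasks the Parameters table marks as needing lyrics (song, rap, vocals).
TASKS_NEEDING_LYRICS = {"text2music", "text2rap", "lyric2vocals"}


def build_music_request(task, prompt, api_key, lyrics=None,
                        audio_duration=30, **optional):
    """Build (but do not send) a POST request for a music generation task."""
    if not prompt:
        raise ValueError("prompt is required")
    if task in TASKS_NEEDING_LYRICS and not lyrics:
        raise ValueError(f"lyrics are required for task {task!r}")
    payload = {"prompt": prompt, "audio_duration": audio_duration}
    if lyrics:
        payload["lyrics"] = lyrics
    # Optional table fields such as genre_preset or display_name.
    payload.update({k: v for k, v in optional.items() if v is not None})
    return urllib.request.Request(
        f"{BASE_URL}/{task}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` would submit the job; the response can then be polled through the job status endpoint shown in the cURL section.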

---

## Stem Separation

Split audio into individual instrument/vocal tracks.

### Modes

| Mode | Stems | Output | Use Case |
|------|-------|--------|----------|
| single | 1 | Specified stem only | Vocal isolation, drum extraction |
| two | 2 | vocals + instrumental | Karaoke tracks |
| four | 4 | vocals, drums, bass, other | Standard remixing (default) |
| six | 6 | + guitar, piano | Full instrument separation |
| producer | 8 | + kick, snare, hihat | Beat production |
| studio | 12 | + cymbals, sub_bass, synth | Professional mixing |
| mastering | 16 | Maximum detail | Forensic analysis |

**Single stem options:** vocals, drums, bass, guitar, piano, other
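
The mode table and single-stem options above can be encoded as a small validation step that runs before a job is submitted. This is a sketch only: `extraction_params` is a hypothetical helper, and rejecting `stem` for multi-stem modes is an assumption, since this document does not say how the API treats that combination.

```python
# Stem counts copied from the Modes table.
STEM_COUNTS = {
    "single": 1, "two": 2, "four": 4, "six": 6,
    "producer": 8, "studio": 12, "mastering": 16,
}
SINGLE_STEM_OPTIONS = {"vocals", "drums", "bass", "guitar", "piano", "other"}


def extraction_params(mode="four", stem=None):
    """Validate a mode/stem pair and return the fields to send."""
    if mode not in STEM_COUNTS:
        raise ValueError(f"unknown mode {mode!r}")
    if mode == "single":
        if stem not in SINGLE_STEM_OPTIONS:
            raise ValueError("mode 'single' needs a stem from SINGLE_STEM_OPTIONS")
        return {"mode": mode, "stem": stem}
    if stem is not None:
        # Assumption: stem only applies to single-stem extraction.
        raise ValueError("stem is only used with mode 'single'")
    return {"mode": mode}
```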

### Python SDK

```python
# Sync: extract and wait for result
result = client.stems.separate(
    url="https://youtube.com/watch?v=VIDEO_ID",
    mode="six",
    timeout=600
)
for stem, url in result["download_urls"].items():
    print(f"{stem}: {url}")

# From local file
result = client.stems.separate(file="/path/to/song.mp3", mode="four")

# Single stem extraction
result = client.stems.separate(
    url="https://youtube.com/watch?v=ID",
    mode="single",
    stem="vocals"
)

# Async: submit then poll
job = client.stems.extract(url="https://youtube.com/watch?v=ID", mode="six")
print(f"Job ID: {job['id']}")
status = client.stems.status(job["id"])
# or wait:
result = client.stems.wait_for_completion(job["id"], timeout=600)

# List available modes
modes = client.stems.modes()

# Job management
jobs = client.stems.list(skip=0, limit=50, status="COMPLETED")
job = client.stems.get(job_id=1234)
client.stems.delete(job_id=1234)
```

### cURL

```bash
# Extract from URL
curl -X POST "https://api.audiopod.ai/api/v1/stem-extraction/api/extract" \
  -H "X-API-Key: $AUDIOPOD_API_KEY" \
  -F "url=https://youtube.com/watch?v=VIDEO_ID" \
  -F "mode=six"

# Extract from file
curl -X POST "https://api.audiopod.ai/api/v1/stem-extraction/api/extract" \
  -H "X-API-Key: $AUDIOPOD_API_KEY" \
  -F "file=@/path/to/song.mp3" \
  -F "mode=four"

# Single stem
curl -X POST "https://api.audiopod.ai/api/v1/stem-extraction/api/extract" \
  -H "X-API-Key: $AUDIOPOD_API_KEY" \
  -F "url=URL" \
  -F "mode=single" \
  -F "stem=vocals"

# Check job status
curl "https://api.audiopod.ai/api/v1/stem-extraction/status/JOB_ID" \
  -H "X-API-Key: $AUDIOPOD_API_KEY"

# List available modes
curl "https://api.audiopod.ai/api/v1/stem-extraction/modes" \
  -H "X-API-Key: $AUDIOPOD_API_KEY"

# List jobs (filter by status: PENDING, PROCESSING, COMPLETED, FAILED)
curl "https://api.audiopod.ai/api/v1/stem-extraction/jobs?skip=0&limit=50&status=COMPLETED" \
  -H "X-API-Key: $AUDIOPOD_API_KEY"

# Get specific job
curl "https://api.audiopod.ai/api/v1/stem-extraction/jobs/JOB_ID" \
  -H "X-API-Key: $AUDIOPOD_API_KEY"

# Delete job
curl -X DELETE "https://api.audiopod.ai/api/v1/stem-extraction/jobs/JOB_ID" \
  -H "X-API-Key: $AUDIOPOD_API_KEY"
```
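
The job-listing query string can also be assembled programmatically. The sketch below uses the endpoint and the four status values from the cURL comments above; `jobs_url` is a hypothetical helper, and checking the status client-side is a convenience rather than documented API behavior.

```python
from urllib.parse import urlencode

JOBS_URL = "https://api.audiopod.ai/api/v1/stem-extraction/jobs"
JOB_STATUSES = ("PENDING", "PROCESSING", "COMPLETED", "FAILED")


def jobs_url(skip=0, limit=50, status=None):
    """Return the list-jobs URL with pagination and an optional status filter."""
    if status is not None and status not in JOB_STATUSES:
        raise ValueError(f"status must be one of {JOB_STATUSES}")
    params = {"skip": skip, "limit": limit}
    if status is not None:
        params["status"] = status
    return f"{JOBS_URL}?{urlencode(params)}"
```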

### Response Format

```json
{
  "id": 1234,
  "status": "COMPLETED",
  "download_urls": {
    "vocals": "https://...",
    "drums": "https://...",
    "bass": "https://...",
    "other": "https://..."
  },
  "quality_scores": {
    "vocals": 0.95,
    "drums": 0.88
  }
}
```
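
When polling the status endpoint directly instead of calling `wait_for_completion`, a response of this shape can be handled roughly as follows. This is a sketch under stated assumptions: `wait_for_job` and `stems_by_quality` are hypothetical helpers, `fetch_status` is any callable you supply that returns the parsed JSON for a job, and treating a higher `quality_scores` value as better is an assumption not confirmed by this document.

```python
import time


def wait_for_job(fetch_status, job_id, timeout=600, interval=5, sleep=time.sleep):
    """Poll fetch_status(job_id) until the job is COMPLETED or FAILED."""
    deadline = time.monotonic() + timeout
    while True:
        job = fetch_status(job_id)
        if job["status"] == "COMPLETED":
            return job
        if job["status"] == "FAILED":
            raise RuntimeError(f"job {job_id} failed")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} not finished after {timeout}s")
        sleep(interval)


def stems_by_quality(job):
    """Return (stem, url, score) tuples, highest score first."""
    scores = job.get("quality_scores", {})
    rows = [(stem, url, scores.get(stem))
            for stem, url in job["download_urls"].items()]
    # Stems without a score sort last, keeping their original order.
    return sorted(rows, key=lambda r: (r[2] is None, -(r[2] or 0)))
```

Injecting `fetch_status` and `sleep` keeps the loop independent of any HTTP library and makes it easy to exercise with canned responses.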

---

## Text to Speech

Generate speech from text with 50+ voices in 60+ languages. Supports voice cloning.

### Voice Types

- **50+ production-ready voices**: multilingual, covering 60+ languages with auto-detection
- **Custom clones**: clone a voice from a ~5-second audio sample

### Python SDK

```python
# Generate speech and wait for result
result = client.voice.generate(
    text="Hello, world! This is a test.",
    voice_id=123,
    speed=1.0
)
print(result["output_url"])

# Async: submit then poll
job = client.voice.speak(
    text="Hello world",
    voice_id=123,
    speed=1.0
)
status = client.voice.get_job(job["id"])
result = client.voice.wait_for_completion(job["id"], timeout=300)

# List all available voices
voices = client.voice.list()
for v in voices:
    print(f"{v['id']}: {v['name']}")

# Clone a voice (needs ~5 sec audio sample)
new_voice = client.voice.create(
    name="My Voice Clone",
    audio_file="./sample.mp3",
    description="Cloned from recording"
)

# Get/delete voice
voice = client.voice.get(voice_id=123)
client.voice.delete(voice_id=123)
```

### cURL (Raw HTTP — most reliable)

```bash
# List all voices
curl "https://api.audiopod.ai/api/v1/voice/voice-profiles" \
  -H "X-API-Key: $AUDIOPOD_API_KEY"

# Generate speech (FORM DATA, not JSON!)
curl -X POST "https://api.audiopod.ai/api/v1/voice/voices/{VOICE_UUID}/generate" \
  -H "Authorization: Bearer $AUDIOPOD_API_KEY" \
  -d "input_text=Hello world, this is a test" \
  -d "audio_format=mp3" \
  -d "speed=1.0"

# Poll job status
curl "https://api.audiopod.ai/api/v1/voice/tts-jobs/{JOB_ID}/status" \
  -H "Authorization: Bearer $AUDIOPOD_API_KEY"

# SDK-style endpoints (alternative)
# Generate via SDK endpoint
curl -X POST 

... (truncated)