# Ralph Loops Skill

> **First time?** Read [SETUP.md](./SETUP.md) first to install dependencies and verify your setup.

Autonomous AI agent loops for iterative development. Based on Geoffrey Huntley's Ralph Wiggum technique, as documented by Clayton Farr.

**Script:** `skills/ralph-loops/scripts/ralph-loop.mjs`
**Dashboard:** `skills/ralph-loops/dashboard/` (run with `node server.mjs`)
**Templates:** `skills/ralph-loops/templates/`
**Archive:** `~/clawd/logs/ralph-archive/`

---

## ⚠️ Known Issues

### Claude Code Version Compatibility

**Claude Code 2.1.29 has a critical bug** that spawns orphaned sub-agents consuming 99% CPU. Iterations fail with "exit code null" on first run.

**Fix:** Downgrade to 2.1.25:
```bash
npm install -g @anthropic-ai/claude-code@2.1.25
```

**Verify:**
```bash
claude --version  # Should show 2.1.25
```

This was discovered on 2026-02-01. Before upgrading, check whether a newer release fixes the issue.
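The check above can be automated before starting a loop. The following is a minimal sketch, not part of the skill: the helper names are hypothetical, and it assumes only that the output of `claude --version` contains an `x.y.z` version number somewhere in the string.

```bash
# Hypothetical helpers: not shipped with ralph-loops.
KNOWN_BAD_VERSION="2.1.29"

# Extract the first x.y.z number from a version string.
extract_version() {
  printf '%s\n' "$1" | grep -oE '[0-9]+\.[0-9]+\.[0-9]+' | head -n 1
}

# Return 0 unless the string is empty or reports the known-bad release.
version_is_safe() {
  local version
  version="$(extract_version "$1")"
  [ -n "$version" ] && [ "$version" != "$KNOWN_BAD_VERSION" ]
}

# Usage:
#   version_is_safe "$(claude --version)" || echo "Downgrade to 2.1.25 first" >&2
```

Passing the version string as an argument keeps the check testable without the CLI installed.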

---

## ⚠️ Don't Block the Conversation!

When running a Ralph loop, **don't monitor it synchronously**. The loop runs as a separate Claude CLI process — you can keep chatting.

**❌ Wrong (blocks conversation):**
```
Start loop → sleep 60 → poll → sleep 60 → poll → ... (6 minutes of silence)
```

**✅ Right (stays responsive):**
```
Start loop → "It's running, I'll check periodically" → keep chatting → check on heartbeats
```

**How to monitor without blocking:**
1. Start the loop with `node ralph-loop.mjs ...` (runs in background)
2. Tell human: "Loop running. I'll check progress periodically or you can ask."
3. Check via `process poll <sessionId>` when asked or during heartbeats
4. Use the dashboard at http://localhost:3939 for real-time visibility

**The loop is autonomous** — that's the whole point. Don't babysit it at the cost of ignoring your human.
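If you launch the loop from a plain shell rather than through the agent's `process` tool, the same non-blocking pattern might look like the sketch below. The helper names and the log path are hypothetical; the point is that the status check returns immediately instead of sleeping and polling.

```bash
# Hypothetical helpers: names and file layout are illustrative only.

# Start a command in the background, log its output to $1, print its PID.
start_detached() {
  local log_file="$1"
  shift
  nohup "$@" >"$log_file" 2>&1 &
  echo "$!"
}

# Non-blocking status check: prints one word and returns immediately.
loop_status() {
  if kill -0 "$1" 2>/dev/null; then
    echo "running"
  else
    echo "finished"
  fi
}

# Usage (prompt path is a placeholder):
#   pid="$(start_detached /tmp/ralph.log node skills/ralph-loops/scripts/ralph-loop.mjs --prompt /tmp/ralph-prompt-demo.md)"
#   loop_status "$pid"   # call when asked or on a heartbeat, never in a sleep loop
```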

---

## Trigger Phrases

When the human says:

| Phrase | Action |
|--------|--------|
| **"Interview me about system X"** | Start Phase 1 requirements interview |
| **"Start planning system X"** | Run `./loop.sh plan` (needs specs first) |
| **"Start building system X"** | Run `./loop.sh build` (needs plan first) |
| **"Ralph loop over X"** | **ASK which phase** (see below) |

### When Human Says "Ralph Loop" — Clarify the Phase!

Don't assume which phase. Ask:

> "Which type of Ralph loop are we doing?
> 
> 1️⃣ **Interview** — I'll ask you questions to build specs (Phase 1)
> 2️⃣ **Planning** — I'll iterate on an implementation plan (Phase 2)  
> 3️⃣ **Building** — I'll implement from a plan, one task per iteration (Phase 3)
> 4️⃣ **Generic** — Simple iterative refinement on a single topic"

**Then proceed based on their answer:**

| Choice | Action |
|--------|--------|
| Interview | Use `templates/requirements-interview.md` protocol |
| Planning | Need specs first → run planning loop with `PROMPT_plan.md` |
| Building | Need plan first → run build loop with `PROMPT_build.md` |
| Generic | Create prompt file, run `ralph-loop.mjs` directly |

### Generic Ralph Loop Flow (Option 4)

For simple iterative refinement (not full system builds):

1. **Clarify the task** — What exactly should be improved/refined?
2. **Create a prompt file** — Save to `/tmp/ralph-prompt-<task>.md`
3. **Set completion criteria** — What signals "done"?
4. **Run the loop:**
   ```bash
   node skills/ralph-loops/scripts/ralph-loop.mjs \
     --prompt "/tmp/ralph-prompt-<task>.md" \
     --model opus \
     --max 10 \
     --done "RALPH_DONE"
   ```
5. **Or spawn as sub-agent** for long-running tasks
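Steps 2 and 3 can be combined by writing the completion signal into the prompt file itself. This is a sketch under assumptions: the helper name and the prompt wording are hypothetical, and it assumes the loop stops when the agent's output contains the string passed to `--done`.

```bash
# Hypothetical helper: the prompt wording is an illustration, not a shipped template.

# Write a generic-loop prompt for task slug $1 with goal text $2; print the file path.
write_ralph_prompt() {
  local task="$1" goal="$2"
  local prompt_file="/tmp/ralph-prompt-${task}.md"
  cat >"$prompt_file" <<EOF
# Task: ${task}

${goal}

## Each iteration
1. Review the current state of the work.
2. Make one focused improvement.
3. If nothing meaningful is left to improve, print RALPH_DONE on its own line.
EOF
  echo "$prompt_file"
}
```

For example, `write_ralph_prompt readme-intro "Tighten the README intro."` would create `/tmp/ralph-prompt-readme-intro.md`, ready to pass to `--prompt`.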

---

## Core Philosophy

> "Human roles shift from 'telling the agent what to do' to 'engineering conditions where good outcomes emerge naturally through iteration.'"
> — Clayton Farr

Three principles drive everything:

1. **Context is scarce** — With ~176K usable tokens from a 200K window, keep each iteration lean
2. **Plans are disposable** — A drifting plan is cheaper to regenerate than salvage
3. **Backpressure beats direction** — Engineer environments where wrong outputs get rejected automatically

---

## Three-Phase Workflow

```
┌─────────────────────────────────────────────────────────────────────┐
│  Phase 1: REQUIREMENTS                                              │
│  Human + LLM conversation → JTBD → Topics → specs/*.md              │
├─────────────────────────────────────────────────────────────────────┤
│  Phase 2: PLANNING                                                  │
│  Gap analysis (specs vs code) → IMPLEMENTATION_PLAN.md              │
├─────────────────────────────────────────────────────────────────────┤
│  Phase 3: BUILDING                                                  │
│  One task per iteration → fresh context → backpressure → commit     │
└─────────────────────────────────────────────────────────────────────┘
```

### Phase 1: Requirements (Talk to Human)

**Goal:** Understand what to build BEFORE building it.

This is the most important phase. Use structured conversation to:

1. **Identify Jobs to Be Done (JTBD)**
   - What user need or outcome are we solving?
   - Not features — outcomes

2. **Break JTBD into Topics of Concern**
   - Each topic = one distinct aspect/component
   - Use the "one sentence without 'and'" test
   - ✓ "The color extraction system analyzes images to identify dominant colors"
   - ✗ "The user system handles authentication, profiles, and billing" → 3 topics

3. **Create Specs for Each Topic**
   - One markdown file per topic in `specs/`
   - Capture requirements, acceptance criteria, edge cases

**Template:** `templates/requirements-interview.md`
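Once the interview has produced a topic list, each spec file can be scaffolded with the three sections named above. A minimal sketch follows; `new_spec` is a hypothetical helper, and the headings are taken from the list above rather than from the shipped template.

```bash
# Hypothetical helper: headings mirror the list above, not the shipped template.

# Create specs/<slug>.md with empty sections unless it already exists; print the path.
new_spec() {
  local slug="$1" summary="$2"
  local spec_file="specs/${slug}.md"
  mkdir -p specs
  if [ ! -e "$spec_file" ]; then
    printf '# %s\n\n%s\n\n## Requirements\n\n## Acceptance Criteria\n\n## Edge Cases\n' \
      "$slug" "$summary" >"$spec_file"
  fi
  echo "$spec_file"
}

# Usage:
#   new_spec color-extraction "The color extraction system analyzes images to identify dominant colors"
```

The existence check means rerunning the helper never overwrites a spec the human has already edited.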

### Phase 2: Planning (Gap Analysis)

**Goal:** Create a prioritized task list without implementing anything.

Uses `PROMPT_plan.md` in the loop:
- Study all specs
- Study existing codebase
- Compare specs vs code (gap analysis)
- Generate `IMPLEMENTATION_PLAN.md` with prioritized tasks
- **NO implementation** — planning only

Usually completes in 1-2 iterations.

### Phase 3: Building (One Task Per Iteration)

**Goal:** Implement tasks one at a time with fresh context.

Uses `PROMPT_build.md` in the loop:
1. Read `IMPLEMENTATION_PLAN.md`
2. Pick the most important task
3. Investigate the codebase first (don't assume the task is not already implemented)
4. Implement
5. Run validation (backpressure)
6. Update plan, commit
7. Exit → fresh context → next iteration
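Step 5 is where backpressure happens: the iteration's work is committed only if every validation command passes. The sketch below shows one way such a gate could look. `run_backpressure` is a hypothetical name, and the commands in the usage comment are placeholders for whatever your project's `AGENTS.md` lists.

```bash
# Hypothetical helper: the real commands belong in your project's AGENTS.md.

# Run each validation command in order; stop at the first failure.
run_backpressure() {
  local check
  for check in "$@"; do
    echo "backpressure: $check"
    if ! bash -c "$check"; then
      echo "REJECTED by: $check" >&2
      return 1
    fi
  done
  echo "all checks passed"
}

# Usage (commands are placeholders):
#   run_backpressure "npm test" "npm run lint" && git commit -am "Complete task"
```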

**Key insight:** One task per iteration keeps context lean. The agent stays in the "smart zone" instead of accumulating cruft.

**Why fresh context matters:**
- **No accumulated mistakes** — Each iteration starts clean; previous errors don't compound
- **Full context budget** — 200K tokens for THIS task, not shared with finished work
- **Reduced hallucination** — Shorter contexts = more grounded responses
- **Natural checkpoints** — Each commit is a save point; easy to revert single iterations

---

## File Structure

```
project/
├── loop.sh                    # Ralph loop script
├── PROMPT_plan.md             # Planning mode instructions
├── PROMPT_build.md            # Building mode instructions  
├── AGENTS.md                  # Operational guide (~60 lines max)
├── IMPLEMENTATION_PLAN.md     # Prioritized task list (generated)
└── specs/                     # Requirement specs
    ├── topic-a.md
    ├── topic-b.md
    └── ...
```
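The contents of `loop.sh` are not shown here, so the following is only a sketch of what a minimal version might contain. It assumes the agent CLI reads its prompt from stdin in print mode (`claude -p`). `RALPH_AGENT_CMD` and `RALPH_MAX` are hypothetical variable names introduced for illustration.

```bash
# Hypothetical loop.sh sketch. RALPH_AGENT_CMD and RALPH_MAX are illustrative names.

# Map a mode (plan|build) to its prompt file.
prompt_for_mode() {
  case "$1" in
    plan)  echo "PROMPT_plan.md" ;;
    build) echo "PROMPT_build.md" ;;
    *)     echo "usage: loop.sh plan|build" >&2; return 1 ;;
  esac
}

# Feed the prompt to a fresh agent process each iteration, up to a cap.
ralph_loop() {
  local mode="$1" max="${RALPH_MAX:-10}" prompt_file i=1
  prompt_file="$(prompt_for_mode "$mode")" || return 1
  [ -f "$prompt_file" ] || { echo "missing $prompt_file" >&2; return 1; }
  while [ "$i" -le "$max" ]; do
    echo "=== iteration $i/$max ($mode) ==="
    # Assumed invocation: prompt on stdin, one new process (fresh context) per iteration.
    ${RALPH_AGENT_CMD:-claude -p} <"$prompt_file" || return 1
    i=$((i + 1))
  done
}

# As a script, the last line would be: ralph_loop "${1:-build}"
```

Starting a new process on every pass is what gives each iteration a fresh context. A fuller script would also stop early once the plan shows no remaining tasks.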

### File Purposes

| File | Purpose | Who Creates |
|------|---------|-------------|
| `specs/*.md` | Source of truth for requirements | Human + Phase 1 |
| `PROMPT_plan.md` | Instructions for planning mode | Copy from template |
| `PROMPT_build.md` | Instructions for building mode | Copy from template |
| `AGENTS.md` | Build/test/lint commands | Human + Ralph |
| `IMPLEMENTATION_PLAN.md` | Task list with priorities | Ralph (Phase 2) |

### Project Organization (Systems)

For Clawdbot systems, each Ralph project lives in `<workspace>/systems/<name>/`:

```
systems/
├── health-tracker/           # Example system
│   ├── specs/
│   │   ├── daily-tracking.md
│   │   └── test-scheduling.md
│   ├── PROMPT_plan.md
│   ├── PROMPT_build.md
│   ├── AGENTS.md
│   ├── IMPLEMENTATION_PLAN.md  # ← exists = past Phase 1
│   └── src/
└── activity-planner/
    ├── specs/                  # ← empty = still in Phase 1
    └── ...
```

### Phase Detection (Auto)

Detect current phase by checking what files exist:

| What Exists | Current Phase | Next Action |
|-------------|---------------|-------------|
| Nothing / empty `specs/` | Phase 1: Requirements | Run requirements interview |
| `specs/*.md` but no `IMPLEMENTATION_PLAN.md` | Ready for Phase 2 | Run `./loop.sh plan` |
| `specs/*.md` + `IMPLEMENTATION_PLAN.md` | Phase 2 or 3 | Review plan, run `./loop.sh build` |
| Plan shows all tasks complete | Done | Archive or iterate |

**Quick check:**
```bash
# What phase are we in?
# if/elif instead of `&& exit`, so pasting this into an interactive shell does not close it
if [ -z "$(ls specs/ 2>/dev/null)" ]; then
  echo "Phase 1: Need specs"
elif [ ! -f IMPLEMENTATION_PLAN.md ]; then
  echo "Phase 2: Need plan"
else
  echo "Phase 3: Ready to build (or done)"
fi
```
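The quick check cannot tell "ready to build" from "done". A slightly fuller sketch that covers all four rows of the table is shown below. It assumes the plan tracks tasks as Markdown checkboxes (`- [ ]` and `- [x]`), which this document does not specify, and `detect_phase` is a hypothetical name.

```bash
# Hypothetical helper: assumes the plan tracks tasks as "- [ ]" / "- [x]" checkboxes.

# Print the phase for the project directory $1 (default: current directory).
detect_phase() {
  local dir="${1:-.}"
  if ! ls "$dir"/specs/*.md >/dev/null 2>&1; then
    echo "Phase 1: Need specs"
  elif [ ! -f "$dir/IMPLEMENTATION_PLAN.md" ]; then
    echo "Phase 2: Need plan"
  elif grep -q '^[[:space:]]*- \[ \]' "$dir/IMPLEMENTATION_PLAN.md"; then
    echo "Phase 3: Tasks remaining"
  else
    echo "Done: No open tasks"
  fi
}

# Usage across a workspace:
#   for s in systems/*/; do printf '%s: %s\n' "$s" "$(detect_phase "$s")"; done
```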

---

## JTBD Breakdown

The hierarchy matters:

```
JTBD (Job to Be Done)
└── Topic of Concern (1 per spec file)
    └── Tasks (many per topic, in IMPLEMENTATION_PLAN.md)
```

**Example:**
- **JTBD:** "Help designers create mood boards"
- **Topics:**
  - Image collection → `specs/image-collection.md`
  - Color extraction → `specs/color-extraction.md`
  - Layout system → `specs/layout-system.md`
  - Sharing → `specs/sharing.md`
- **Tasks:** Each spec generates multiple implementation tasks

### Topic Scope Test

> Can you describe the topic in one sentence without "and"?

If you need "and" or "also", it's probably multiple topics. Split it.

**When to split:**
- Multiple verbs in the description → separate topics
- Different user personas involved → separate topics
- Could be implemented by different teams → separate topics
- Has its own failure modes → probably its own topic
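The one-sentence test is easy to lint mechanically. The sketch below is a rough first-pass filter, not a replacement for the judgment calls in the list above, and `topic_scope_ok` is a hypothetical name.

```bash
# Hypothetical helper: a rough lint, not a substitute for judgment.

# Return 0 when the description contains neither "and" nor "also" as a whole word.
topic_scope_ok() {
  ! printf '%s\n' "$1" | grep -qiwE 'and|also'
}

# Usage:
#   topic_scope_ok "The user system handles authentication, profiles, and billing" \
#     || echo "Probably multiple topics: split it"
```

Whole-word matching (`-w`) keeps words such as "android" or "sandbox" from triggering a false positive.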

**Example split:**
```
❌ "User management handles registration, authentication, profiles, and permissions"

✅ Split into:
   - "Registration creates new user accounts from email/password"
   - "Authentic

... (truncated)