
    camoufox-stealth

    C++ level anti-bot browser automation using Camoufox

    By @kesslerio
    SKILL.md
    ---
    name: camoufox-stealth
    description: C++ level anti-bot browser automation using Camoufox (patched Firefox) in isolated containers. Bypasses Cloudflare Turnstile, Datadome, Airbnb, Yelp. Superior to Chrome-based solutions (undetected-chromedriver, puppeteer-stealth) which only patch at JS level. Use when standard Playwright/Selenium gets blocked.
    metadata:
      openclaw:
        emoji: "🦊"
        requires:
          bins: ["distrobox"]
          env: []
    ---
    
    # Camoufox Stealth Browser 🦊
    
    **C++ level** anti-bot evasion using Camoufox — a custom Firefox fork with stealth patches compiled into the browser itself, not bolted on via JavaScript.
    
    ## Why Camoufox > Chrome-based Solutions
    
    | Approach | Patch Level | Notes |
    |----------|-------------|-------|
    | **Camoufox (this skill)** | C++ compiled patches | Undetectable fingerprints baked into browser |
    | undetected-chromedriver | JS runtime patches | Can be detected by timing analysis |
    | puppeteer-stealth | JS injection | Patches applied after page load = detectable |
    | playwright-stealth | JS injection | Same limitations |
    
    **Camoufox patches Firefox at the source code level** — WebGL, Canvas, AudioContext fingerprints are genuinely spoofed, not masked by JavaScript overrides that anti-bot systems can detect.
    
    ## Key Advantages
    
    1. **C++ Level Stealth** — Fingerprint spoofing compiled into the browser, not JS hacks
    2. **Container Isolation** — Runs in distrobox, keeping your host system clean
    3. **Dual-Tool Approach** — Camoufox for browsers, curl_cffi for API-only (no browser overhead)
    4. **Firefox-Based** — Less fingerprinted than Chrome (everyone uses Chrome for bots)
    
    ## When to Use
    
    - Standard Playwright/Selenium gets blocked
    - Site shows Cloudflare challenge or "checking your browser"
    - Need to scrape Airbnb, Yelp, or similar protected sites
    - `puppeteer-stealth` or `undetected-chromedriver` stopped working
    - You need **actual** stealth, not JS band-aids
    
    ## Tool Selection
    
    | Tool | Level | Best For |
    |------|-------|----------|
    | **Camoufox** | C++ patches | All protected sites - Cloudflare, Datadome, Yelp, Airbnb |
    | **curl_cffi** | TLS spoofing | API endpoints only - no JS needed, very fast |
    
    ## Quick Start
    
    All scripts run in the `pybox` distrobox container for isolation.
    
    ⚠️ **Use `python3.14` explicitly** - pybox may have multiple Python versions with different packages installed.
    
    ### 1. Setup (First Time)
    
    ```bash
    # Install tools in pybox (use python3.14)
    distrobox-enter pybox -- python3.14 -m pip install camoufox curl_cffi
    
    # Camoufox browser downloads automatically on first run (~700MB Firefox fork)
    ```
    
    ### 2. Fetch a Protected Page
    
    **Browser (Camoufox):**
    ```bash
    distrobox-enter pybox -- python3.14 scripts/camoufox-fetch.py "https://example.com" --headless
    ```
    
    **API only (curl_cffi):**
    ```bash
    distrobox-enter pybox -- python3.14 scripts/curl-api.py "https://api.example.com/endpoint"
    ```
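
The internals of `scripts/camoufox-fetch.py` are not shown in this document. As a rough sketch of what such a script could look like, the following assumes Camoufox's Playwright-style sync API (`camoufox.sync_api.Camoufox`) and reuses the flags that appear in the examples here (`--headless`, `--wait`, `--screenshot`, `--output`). The function names and the 5-second default wait are illustrative assumptions, not the script's actual code.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """CLI mirroring the flags used in this document (hypothetical re-implementation)."""
    parser = argparse.ArgumentParser(description="Fetch a page with Camoufox")
    parser.add_argument("url")
    parser.add_argument("--headless", action="store_true")
    parser.add_argument("--wait", type=float, default=5.0,
                        help="seconds to wait after navigation (default is an assumption)")
    parser.add_argument("--screenshot", help="path for a PNG screenshot")
    parser.add_argument("--output", help="path for the page HTML")
    return parser


def fetch(args: argparse.Namespace) -> str:
    """Load args.url in Camoufox and return the rendered HTML."""
    # Imported lazily so the parser can be used without camoufox installed.
    from camoufox.sync_api import Camoufox

    with Camoufox(headless=args.headless) as browser:
        page = browser.new_page()
        page.goto(args.url)
        # Give challenge pages time to resolve before reading the DOM.
        page.wait_for_timeout(args.wait * 1000)
        if args.screenshot:
            page.screenshot(path=args.screenshot)
        html = page.content()

    if args.output:
        with open(args.output, "w", encoding="utf-8") as fh:
            fh.write(html)
    return html
```

A caller would run `fetch(build_parser().parse_args())` from the script's entry point, inside `pybox`.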
    
    ## Architecture
    
    ```
    ┌──────────────────────────────────────────────────────────┐
    │                      OpenClaw Agent                      │
    ├──────────────────────────────────────────────────────────┤
    │  distrobox-enter pybox -- python3.14 scripts/xxx.py      │
    ├──────────────────────────────────────────────────────────┤
    │                     pybox Container                      │
    │             ┌─────────────┐  ┌─────────────┐             │
    │             │  Camoufox   │  │  curl_cffi  │             │
    │             │  (Firefox)  │  │  (TLS spoof)│             │
    │             └─────────────┘  └─────────────┘             │
    └──────────────────────────────────────────────────────────┘
    ```
    
    ## Tool Details
    
    ### Camoufox  
    - **What:** Custom Firefox build with C++ level stealth patches
    - **Pros:** Best fingerprint evasion, passes Turnstile automatically
    - **Cons:** ~700MB download, Firefox-based
    - **Best for:** All protected sites - Cloudflare, Datadome, Yelp, Airbnb
    
    ### curl_cffi
    - **What:** Python HTTP client with browser TLS fingerprint spoofing
    - **Pros:** No browser overhead, very fast
    - **Cons:** No JS execution, API endpoints only
    - **Best for:** Known API endpoints, mobile app reverse engineering
    
    ## Critical: Proxy Requirements
    
    **Datacenter IPs (AWS, DigitalOcean) = INSTANT BLOCK on Airbnb/Yelp**
    
    You MUST use residential or mobile proxies:
    
    ```python
    # Example proxy config
    proxy = "http://user:pass@residential-proxy.example.com:8080"
    ```
    
    See **[references/proxy-setup.md](references/proxy-setup.md)** for proxy configuration.
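
Playwright-style browser APIs usually take proxy settings as a dict with separate `server`, `username`, and `password` keys rather than a single URL. A minimal sketch of that conversion follows; the helper name `proxy_settings` is hypothetical, and passing the resulting dict to Camoufox's `proxy` option is an assumption to check against `references/proxy-setup.md`.

```python
from urllib.parse import unquote, urlsplit


def proxy_settings(proxy_url: str) -> dict:
    """Split a proxy URL into a Playwright-style proxy settings dict."""
    parts = urlsplit(proxy_url)
    if not parts.hostname or not parts.port:
        raise ValueError(f"proxy URL needs a host and a port: {proxy_url!r}")

    settings = {"server": f"{parts.scheme}://{parts.hostname}:{parts.port}"}
    if parts.username:
        # Credentials may be percent-encoded inside a URL.
        settings["username"] = unquote(parts.username)
        settings["password"] = unquote(parts.password or "")
    return settings
```

Keeping credentials out of the `server` string also makes it harder to leak them into logs by accident.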
    
    ## Behavioral Tips
    
    Sites like Airbnb/Yelp use behavioral analysis. To avoid detection:
    
    1. **Warm up:** Don't hit target URL directly. Visit homepage first, scroll, click around.
    2. **Mouse movements:** Inject random mouse movements (Camoufox handles this).
    3. **Timing:** Add random delays (2-5s between actions), not fixed intervals.
    4. **Session stickiness:** Use same proxy IP for 10-30 min sessions, don't rotate every request.
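
Tips 3 and 4 can be sketched with the standard library alone. The names `human_delay` and `StickyProxy` below are hypothetical helpers, not part of this skill's scripts; the 2-5 s delay and 10-30 min session window come from the list above.

```python
import random
import time


def human_delay(rng: random.Random, low: float = 2.0, high: float = 5.0) -> float:
    """Pause length in seconds, drawn uniformly from [low, high]."""
    return rng.uniform(low, high)


class StickyProxy:
    """Keep one proxy for a whole session window, then move to the next one."""

    def __init__(self, proxies, min_s=600, max_s=1800, clock=time.monotonic, rng=None):
        self._proxies = list(proxies)
        if not self._proxies:
            raise ValueError("at least one proxy is required")
        self._min_s, self._max_s = min_s, max_s
        self._clock = clock
        self._rng = rng or random.Random()
        self._index = -1
        self._expires = float("-inf")  # forces a pick on first use

    def current(self) -> str:
        now = self._clock()
        if now >= self._expires:
            # Session window elapsed: advance and draw a new window length.
            self._index = (self._index + 1) % len(self._proxies)
            self._expires = now + self._rng.uniform(self._min_s, self._max_s)
        return self._proxies[self._index]
```

Between page actions, a script would call `time.sleep(human_delay(rng))` and read `StickyProxy.current()` only when opening a new browser session.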
    
    ## Headless Mode Warning
    
    āš ļø Old `--headless` flag is DETECTED. Options:
    
    1. **New Headless:** Use `headless="new"` (Chrome 109+)
    2. **Xvfb:** Run headed browser in virtual display
    3. **Headed:** Just run headed if you can (most reliable)
    
    ```bash
    # Xvfb approach (Linux)
    Xvfb :99 -screen 0 1920x1080x24 &
    export DISPLAY=:99
    python3.14 scripts/camoufox-fetch.py "https://example.com"
    ```
    
    ## Troubleshooting
    
    | Problem | Solution |
    |---------|----------|
    | "Access Denied" immediately | Use residential proxy |
    | Cloudflare challenge loops | Try Camoufox instead of Nodriver |
    | Browser crashes in pybox | Install missing deps: `sudo dnf install gtk3 libXt` |
    | TLS fingerprint blocked | Use curl_cffi with `impersonate="chrome120"` |
    | Turnstile checkbox appears | Add mouse movement, increase wait time |
    | `ModuleNotFoundError: camoufox` | Use `python3.14` not `python` or `python3` |
    | `greenlet` segfault (exit 139) | Python version mismatch - use `python3.14` explicitly |
    | `libstdc++.so.6` errors | NixOS lib path issue - use `python3.14` in pybox |
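
The "exit 139" in the `greenlet` row follows the usual shell convention: a status above 128 means the process was killed by signal number `status - 128`, and signal 11 is `SIGSEGV`. A small illustrative helper (the name `describe_exit` is hypothetical) for decoding such codes in wrapper scripts:

```python
import signal


def describe_exit(code: int) -> str:
    """Explain a shell exit status; values above 128 mean 'killed by signal code - 128'."""
    if code > 128:
        try:
            return f"killed by {signal.Signals(code - 128).name}"
        except ValueError:
            return f"killed by unknown signal {code - 128}"
    return "ok" if code == 0 else f"exited with error status {code}"
```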
    
    ### Python Version Issues (NixOS/pybox)
    
    The `pybox` container may have multiple Python versions with separate site-packages:
    
    ```bash
    # Check which Python has camoufox
    distrobox-enter pybox -- python3.14 -c "import camoufox; print('OK')"
    
    # Wrong (may use different Python)
    distrobox-enter pybox -- python3 scripts/camoufox-session.py ...
    
    # Correct (explicit version)
    distrobox-enter pybox -- python3.14 scripts/camoufox-session.py ...
    ```
    
    If you get segfaults or import errors, always use `python3.14` explicitly.
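
To see at a glance which interpreters in the container can import `camoufox`, the check above can be looped over several candidates. This is a sketch only; the function name and the candidate list are assumptions, and it should be run inside `pybox`.

```python
import shutil
import subprocess


def interpreters_with(module: str, candidates=("python3.14", "python3", "python")):
    """Return the candidate interpreters on PATH that can import `module`."""
    working = []
    for name in candidates:
        path = shutil.which(name)
        if path is None:
            continue  # this interpreter is not installed
        # `module` is interpolated into code, so pass trusted names only.
        result = subprocess.run(
            [path, "-c", f"import {module}"],
            capture_output=True,
            timeout=60,
        )
        if result.returncode == 0:
            working.append(name)
    return working
```

Calling `interpreters_with("camoufox")` returns the subset of interpreter names that are safe to use with the scripts.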
    
    ## Examples
    
    ### Scrape Airbnb Listing
    
    ```bash
    distrobox-enter pybox -- python3.14 scripts/camoufox-fetch.py \
      "https://www.airbnb.com/rooms/12345" \
      --headless --wait 10 \
      --screenshot airbnb.png
    ```
    
    ### Scrape Yelp Business
    
    ```bash
    distrobox-enter pybox -- python3.14 scripts/camoufox-fetch.py \
      "https://www.yelp.com/biz/some-restaurant" \
      --headless --wait 8 \
      --output yelp.html
    ```
    
    ### API Scraping with TLS Spoofing
    
    ```bash
    distrobox-enter pybox -- python3.14 scripts/curl-api.py \
      "https://api.yelp.com/v3/businesses/search?term=coffee&location=SF" \
      --headers '{"Authorization": "Bearer xxx"}'
    ```
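
`scripts/curl-api.py` itself is not reproduced here. A minimal sketch of the same idea, assuming curl_cffi's requests-compatible interface (`curl_cffi.requests.get` with the `impersonate` argument mentioned under Troubleshooting), might look like this; `parse_headers` and `api_get` are hypothetical names.

```python
import json
from typing import Dict, Optional


def parse_headers(raw: Optional[str]) -> Dict[str, str]:
    """Parse the JSON object passed to --headers into a header dict."""
    if not raw:
        return {}
    headers = json.loads(raw)
    if not isinstance(headers, dict):
        raise ValueError("--headers must be a JSON object")
    return {str(key): str(value) for key, value in headers.items()}


def api_get(url: str, raw_headers: Optional[str] = None, impersonate: str = "chrome120"):
    """GET `url` with a browser-like TLS fingerprint and return the response."""
    # Imported lazily so parse_headers works without curl_cffi installed.
    from curl_cffi import requests

    return requests.get(
        url,
        headers=parse_headers(raw_headers),
        impersonate=impersonate,
        timeout=30,
    )
```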
    
    ## Session Management
    
    Persistent sessions allow reusing authenticated state across runs without re-logging in.
    
    ### Quick Start
    
    ```bash
    # 1. Login interactively (headed browser opens)
    distrobox-enter pybox -- python3.14 scripts/camoufox-session.py \
      --profile airbnb --login "https://www.airbnb.com/account-settings"
    
    # Complete login in browser, then press Enter to save session
    
    # 2. Reuse session in headless mode
    distrobox-enter pybox -- python3.14 scripts/camoufox-session.py \
      --profile airbnb --headless "https://www.airbnb.com/trips"
    
    # 3. Check session status
    distrobox-enter pybox -- python3.14 scripts/camoufox-session.py \
      --profile airbnb --status "https://www.airbnb.com"
    ```
    
    ### Flags
    
    | Flag | Description |
    |------|-------------|
    | `--profile NAME` | Named profile for session storage (required) |
    | `--login` | Interactive login mode - opens headed browser |
    | `--headless` | Use saved session in headless mode |
    | `--status` | Check if session appears valid |
    | `--export-cookies FILE` | Export cookies to JSON for backup |
    | `--import-cookies FILE` | Import cookies from JSON file |
    
    ### Storage
    
    - **Location:** `~/.stealth-browser/profiles/<name>/`
    - **Permissions:** Directory `700`, files `600`
    - **Profile names:** Letters, numbers, `_`, `-` only (1-63 chars)
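
The naming and permission rules above could be enforced along these lines. This is a sketch, not the script's actual code: `profile_dir` is a hypothetical helper, and the default root is taken from the location listed above.

```python
import os
import re
from pathlib import Path
from typing import Optional

# Letters, numbers, "_" and "-" only, 1 to 63 characters.
PROFILE_NAME = re.compile(r"[A-Za-z0-9_-]{1,63}")


def profile_dir(name: str, root: Optional[Path] = None) -> Path:
    """Validate a profile name and return its owner-only directory."""
    if not PROFILE_NAME.fullmatch(name):
        raise ValueError(f"invalid profile name: {name!r}")
    if root is None:
        root = Path.home() / ".stealth-browser" / "profiles"

    path = root / name
    path.mkdir(parents=True, exist_ok=True)
    # mkdir's mode argument is filtered by the umask; chmod is not.
    os.chmod(path, 0o700)
    return path
```

Rejecting `/` and `.` in names also rules out path traversal such as `../other-user`.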
    
    ### Cookie Handling
    
    - **Save:** All cookies from all domains stored in browser profile
    - **Restore:** Only cookies matching target URL domain are used
    - **SSO:** If redirected to Google/auth domain, re-authenticate once and profile updates
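
The "Restore" rule amounts to a domain match between each stored cookie and the target host. A sketch of that filter, assuming cookies are dicts with a `domain` key as in Playwright-style cookie exports (both function names are hypothetical):

```python
from urllib.parse import urlsplit


def cookie_matches(cookie_domain: str, host: str) -> bool:
    """True if the cookie domain is the host itself or a parent domain of it."""
    domain = cookie_domain.lstrip(".").lower()
    host = host.lower()
    if not domain:
        return False
    # The leading dot in the suffix check prevents "evilairbnb.com"
    # from matching "airbnb.com".
    return host == domain or host.endswith("." + domain)


def cookies_for_url(cookies: list, url: str) -> list:
    """Keep only the cookies whose domain matches the URL's host."""
    host = urlsplit(url).hostname or ""
    return [c for c in cookies if cookie_matches(c["domain"], host)]
```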
    
    ### Login Wall Detection
    
    The script detects session expiry using multiple signals:
    
    1. **HTTP status:** 401, 403
    2. **URL patterns:** `/login`, `/signin`, `/auth`
    3. **Title patterns:** "login", "sign in", etc.
    4. **Content keywords:** "captcha", "verify", "authenticate"
    5. **Form detection:** Password input fields
    
    If detected during `--headless` mode, you'll see:
    ```
    🔒 Login wall signals: url-path, password-form
    ```
    
    Re-run with `--login` to refresh the session.
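
The five signals can be combined in a single function. The sketch below is illustrative rather than the script's real implementation: only the signal names `url-path` and `password-form` appear in the sample output above, the other three names are made up here, and the title list covers only the two patterns the document spells out.

```python
import re
from urllib.parse import urlsplit

LOGIN_PATHS = ("/login", "/signin", "/auth")
TITLE_WORDS = ("login", "sign in")  # the document's list ends with "etc."
CONTENT_WORDS = ("captcha", "verify", "authenticate")
PASSWORD_INPUT = re.compile(r"<input\b[^>]*\btype\s*=\s*['\"]?password", re.IGNORECASE)


def login_wall_signals(status: int, url: str, title: str, html: str) -> list:
    """Return the names of all login-wall signals that fire for a response."""
    signals = []
    if status in (401, 403):
        signals.append("http-status")  # hypothetical signal name
    path = urlsplit(url).path.lower()
    if any(fragment in path for fragment in LOGIN_PATHS):
        signals.append("url-path")
    if any(word in title.lower() for word in TITLE_WORDS):
        signals.append("title")  # hypothetical signal name
    lowered = html.lower()
    if any(word in lowered for word in CONTENT_WORDS):
        signals.append("content-keyword")  # hypothetical signal name
    if PASSWORD_INPUT.search(html):
        signals.append("password-form")
    return signals
```

Keyword checks such as "verify" will produce false positives on ordinary pages, so a real implementation would likely require more than one signal before declaring the session expired.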
    
    ### Remote Login (SSH)
    
    Since `--login` requires a visible browser, you need display forwarding:
    
    **X11 Forwarding (Preferred):**
    ```bash
    # Connect with X11 forwarding
    ssh -X user@server
    
    # Run login (opens browser on your local machine)
    distrobox-enter pybox -- python3.14 scripts/camoufox-session.py \
      --profile mysite --login "https://example.com"
    ```
    
    **VNC Alternative:**
    ```bash
    # On server: start VNC session
    vncserver :1
    
    # On client: connect to VNC
    vncviewer server:1
    
    # In VNC session: run login
    distrobox-enter pybox -- python3.14 scripts/camoufox-session.py \
      --profile mysite --login "https://example.com"
    ```
    
    ### Security Notes
    
    āš ļø **Cookies are credentials.** Treat profile directories like passwords:
    - Profile dirs have `chmod 700` (owner only)
    - Cookie exports have `chmod 600`
    - Don't share profiles or exported cookies over insecure channels
    - Consider encrypting backups
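
For the `600` rule on cookie exports, the file is best created with restrictive permissions from the start rather than tightened afterwards. A minimal sketch on Linux, with `export_cookies` as a hypothetical helper name:

```python
import json
import os


def export_cookies(cookies: list, path: str) -> None:
    """Write cookies as JSON to a file that only the owner can read or write."""
    # Creating with mode 0o600 avoids a window where the file is world-readable.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        # The mode argument is ignored if the file already existed, so tighten it too.
        os.fchmod(fh.fileno(), 0o600)
        json.dump(cookies, fh, indent=2)
```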
    
    ### Limitat
    
    ... (truncated)