    amazon-competitor-analyzer

    Scrapes Amazon product data from ASINs

    By @phheng
    SKILL.md
    ---
    name: amazon-competitor-analyzer
    description: Scrapes Amazon product data from ASINs using browseract.com automation API and performs surgical competitive analysis. Compares specifications, pricing, review quality, and visual strategies to identify competitor moats and vulnerabilities.
    ---
    
    # Amazon Competitor Analyzer
    
    This skill scrapes Amazon product data from user-provided ASINs using browseract.com's browser automation API and performs deep competitive analysis. It compares specifications, pricing, review quality, and visual strategies to identify competitor moats and vulnerabilities.
    
    ## When to Use This Skill
    
    - Competitive research: Input multiple ASINs to understand market landscape
    - Pricing strategy analysis: Compare price bands across similar products
    - Specification benchmarking: Deep dive into technical specs and feature differences
    - Review insights: Analyze review quality, quantity, and sentiment patterns
    - Visual strategy research: Evaluate main images, A+ content, and brand visuals
    - Market opportunity discovery: Identify gaps and potential threats
    - Product optimization: Develop optimization strategies based on competitor analysis
    - New product research: Support new product development with market data
    
    ## What This Skill Does
    
    1. **ASIN Data Collection**: Automatically extract product title, price, rating, review count, images, and core data using BrowserAct workflow templates
    2. **Specification Extraction**: Deep extraction of technical specs, features, and materials
    3. **Review Quality Analysis**: Analyze review patterns, keywords, and sentiment
    4. **Visual Strategy Assessment**: Evaluate main images, A+ page design, and brand consistency
    5. **Multi-Dimensional Comparison**: Side-by-side comparison of key metrics across products
    6. **Moat Identification**: Identify core competitive advantages and barriers
    7. **Vulnerability Discovery**: Find competitor weaknesses and market opportunities
    8. **Structured Output**: Generate JSON and Markdown analysis reports
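
The comparison and structured-output steps above can be sketched as follows. This is a minimal illustration, not the skill's actual report format: the field names (`asin`, `title`, `price`, `rating`, `review_count`) and the helper `build_comparison` are hypothetical, and the real keys depend on what the workflow template returns.

```python
import json

# Hypothetical field names; the real keys depend on the workflow template output.
COLUMNS = ["asin", "title", "price", "rating", "review_count"]


def _cell(value):
    """Render one table cell, escaping pipes so they do not break the row."""
    return str(value).replace("|", "\\|")


def build_comparison(products):
    """Return (markdown_table, json_report) for a list of product dicts."""
    header = "| " + " | ".join(COLUMNS) + " |"
    divider = "| " + " | ".join("---" for _ in COLUMNS) + " |"
    rows = [
        "| " + " | ".join(_cell(p.get(col, "n/a")) for col in COLUMNS) + " |"
        for p in products
    ]
    # Only products with numeric values take part in the min/max summary.
    priced = [p for p in products if isinstance(p.get("price"), (int, float))]
    rated = [p for p in products if isinstance(p.get("rating"), (int, float))]
    summary = {
        "product_count": len(products),
        "lowest_price_asin": min(priced, key=lambda p: p["price"]).get("asin") if priced else None,
        "highest_rated_asin": max(rated, key=lambda p: p["rating"]).get("asin") if rated else None,
    }
    report = json.dumps({"summary": summary, "products": products}, indent=2)
    return "\n".join([header, divider, *rows]), report
```

Missing fields are rendered as `n/a` rather than dropped, so a gap in one competitor's listing stays visible in the side-by-side table.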
    
    ## Features
    1. **No hallucinations, ensuring stable and accurate data extraction**: Pre-set workflows eliminate AI-generated hallucinations.
    2. **No CAPTCHA challenges**: Built-in bypass mechanisms eliminate the need to handle reCAPTCHA or other verification challenges.
    3. **No IP access restrictions or geofencing**: Overcomes geographic IP limitations for stable global access.
    4. **Faster execution speed**: Tasks complete more rapidly than purely AI-driven browser automation solutions.
    5. **Exceptional cost efficiency**: Significantly reduces data acquisition costs compared to token-intensive AI solutions.
    
    ## Prerequisites
    
    ### 1. BrowserAct.com Account Setup
    
    You need a BrowserAct.com account and API key:
    
    1. Visit [browseract.com](https://browseract.com)
    2. Sign up for an account
    3. Navigate to API settings
    4. Generate an API key
    5. Store your API key securely (environment variables recommended)
    
    ### 2. Environment Configuration
    
    Set your API key as an environment variable:
    
    ```bash
    export BROWSERACT_API_KEY="your-api-key-here"
    ```
    
    Or create a `.env` file:
    
    ```
    BROWSERACT_API_KEY=your-api-key-here
    ```
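
One way to read the key at startup is sketched below, using only the standard library. The helpers `parse_env_file` and `load_api_key` are hypothetical names introduced for illustration, and the parser assumes a simple `KEY=value` file; it is not a full `.env` implementation. The sketch fails fast when the key is missing or still set to the placeholder, instead of sending an invalid key to the API.

```python
import os
from pathlib import Path


def parse_env_file(path):
    """Parse simple KEY=value lines; blank lines and # comments are ignored."""
    values = {}
    env_path = Path(path)
    if not env_path.is_file():
        return values
    for line in env_path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_api_key(env_file=".env", environ=None):
    """Return the API key from the environment, falling back to a .env file."""
    environ = os.environ if environ is None else environ
    key = environ.get("BROWSERACT_API_KEY") or parse_env_file(env_file).get("BROWSERACT_API_KEY")
    if not key or key == "your-api-key-here":
        raise RuntimeError("BROWSERACT_API_KEY is not set")
    return key
```

The environment variable takes precedence over the file, so a key exported in the shell overrides a stale `.env` entry.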
    
    ## How to Use
    
    ### Basic Competitor Analysis
    
    ```
    Analyze the following Amazon ASIN: B09XYZ1234
    ```
    
    ```
    Compare these three products: B07ABC1111, B07DEF2222, B07GHI3333
    ```
    
    ### Deep Specification Comparison
    
    ```
    Analyze the technical specification differences: B09XYZ1234, B09ABC1111
    ```
    
    ### Review Quality Analysis
    
    ```
    Analyze review quality and feedback: B09XYZ1234, B07DEF2222
    ```
    
    ### Visual Strategy Research
    
    ```
    Research main image and visual presentation strategies: B09XYZ1234, B09ABC1111
    ```
    
    ### Complete Competitive Analysis
    
    ```
    Analyze competitor landscape: B09XYZ1234, B07DEF2222, B07GHI3333, B09JKL4444
    ```
    
    ## Instructions
    
    When a user requests Amazon competitor analysis:
    
    ### 1. ASIN Identification and Validation
    
    Identify ASINs from user input:
    
    - **ASIN Format**: 10-character alphanumeric string (e.g., B09XYZ1234)
    - **Validation**: Check that each candidate is exactly 10 characters long and contains only uppercase letters and digits
    - **URL Parsing**: Extract the ASIN from Amazon product URLs (for example, the segment after `/dp/`)
    - **Error Handling**: Prompt the user to correct invalid ASINs
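
The identification and validation rules above can be sketched as follows. This is a minimal sketch under stated assumptions: an ASIN is treated as exactly 10 uppercase letters or digits, and product URLs are assumed to carry the ASIN after `/dp/` or `/gp/product/`. The function names `extract_asin` and `split_valid_invalid` are hypothetical and not part of the skill.

```python
import re
from typing import Optional

# Assumption: an ASIN is exactly 10 uppercase letters or digits.
ASIN_PATTERN = re.compile(r"[A-Z0-9]{10}")
# Assumption: product URLs carry the ASIN after /dp/ or /gp/product/.
URL_ASIN_PATTERN = re.compile(r"/(?:dp|gp/product)/([A-Za-z0-9]{10})(?=[/?#]|$)")


def extract_asin(text: str) -> Optional[str]:
    """Return a normalized ASIN from a raw ASIN or product URL, or None if invalid."""
    candidate = text.strip()
    match = URL_ASIN_PATTERN.search(candidate)
    if match:
        candidate = match.group(1)
    candidate = candidate.upper()
    return candidate if ASIN_PATTERN.fullmatch(candidate) else None


def split_valid_invalid(inputs):
    """Separate inputs into unique valid ASINs and entries the user should correct."""
    valid, invalid = [], []
    for item in inputs:
        asin = extract_asin(item)
        if asin is None:
            invalid.append(item)
        elif asin not in valid:
            valid.append(asin)
    return valid, invalid
```

Entries returned in the `invalid` list are the ones to hand back to the user for correction before any scraping task is started.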
    
    ### 2. BrowserAct API Implementation
    
    ```python
    """
    BrowserAct API - Run Template Task and Wait for Completion
    Beginner scenario: synchronous task execution with an official template
    """
    import os
    import time
    import traceback
    import json
    import requests
    
    # ============ Configuration Area ============
    # API Key - Get from: https://www.browseract.com/reception/integrations
    API_KEY = os.getenv("BROWSERACT_API_KEY", "your-api-key-here")
    
    # Workflow Template ID for Amazon product scraping
    # You can get it from:
    # - Run: python Workflow-Python/11.list_official_workflow_templates.py
    # - Or visit: https://www.browseract.com/template?platformType=0
    WORKFLOW_TEMPLATE_ID = "77814333389670716"
    
    # Polling configuration
    POLL_INTERVAL = 5  # Check task status every 5 seconds
    MAX_WAIT_TIME = 1800  # Maximum wait time: 30 minutes (1800 seconds)
    
    API_BASE_URL = "https://api.browseract.com/v2/workflow"
    
    
    def create_input_parameters(asins):
        """Create input parameters for the workflow template"""
        return [
            {
                "name": "ASIN",
                "value": asin.strip()
            }
            for asin in asins if asin.strip()
        ]
    
    
    def run_task_by_template(workflow_template_id, input_parameters):
        """Start a task using template"""
        headers = {
            "Authorization": f"Bearer {API_KEY}"
        }
        
        data = {
            "workflow_template_id": workflow_template_id,
            "input_parameters": input_parameters,
        }
        
        api_url = f"{API_BASE_URL}/run-task-by-template"
        # Set a timeout so a stalled connection cannot hang the script
        response = requests.post(api_url, json=data, headers=headers, timeout=30)
        
        if response.status_code == 200:
            result = response.json()
            task_id = result["id"]
            print(f"Task started successfully, Task ID: {task_id}")
            if "profileId" in result:
                print(f"   Profile ID: {result['profileId']}")
            return task_id
        else:
            # Print the raw body: error responses are not guaranteed to be JSON
            print(f"Failed to start task (HTTP {response.status_code}): {response.text}")
            return None
    
    
    def get_task_status(task_id):
        """Get task status"""
        headers = {
            "Authorization": f"Bearer {API_KEY}"
        }
        
        api_url = f"{API_BASE_URL}/get-task-status?task_id={task_id}"
        try:
            response = requests.get(api_url, headers=headers, timeout=30)
            
            if response.status_code == 200:
                return response.json().get("status")
            else:
                # Print the raw body: error responses are not guaranteed to be JSON
                print(f"Failed to get task status (HTTP {response.status_code}): {response.text}")
                return None
        except requests.exceptions.RequestException:
            # Covers SSL, connection, and timeout errors; retried in the next polling cycle
            return None
    
    
    def get_task(task_id):
        """Get detailed task information and results"""
        headers = {
            "Authorization": f"Bearer {API_KEY}"
        }
        
        api_url = f"{API_BASE_URL}/get-task?task_id={task_id}"
        try:
            response = requests.get(api_url, headers=headers, timeout=30)
            
            if response.status_code == 200:
                return response.json()
            else:
                # Print the raw body: error responses are not guaranteed to be JSON
                print(f"Failed to get task details (HTTP {response.status_code}): {response.text}")
                return None
        except requests.exceptions.RequestException as e:
            # Covers SSL, connection, and timeout errors
            print(f"Network error while getting task details: {type(e).__name__}")
            return None
    
    
    def wait_for_task_completion(task_id):
        """Wait for task completion with progress updates"""
        start_time = time.time()
        previous_status = None
        
        print(f"Waiting for task completion (max wait time: {MAX_WAIT_TIME // 60} minutes)...")
        
        while True:
            # Check if timeout
            elapsed_time = time.time() - start_time
            if elapsed_time > MAX_WAIT_TIME:
                print(f"Wait timeout (waited {elapsed_time:.0f} seconds)")
                return None
            
            # Get task status
            status = get_task_status(task_id)
            
            if status is None:
                # Network or API error: keep polling until the timeout is reached
                elapsed = int(elapsed_time)
                print(f"   Status check failed, retrying... (waited {elapsed} seconds)", end="\r")
            elif status == "finished":
                print("Task completed successfully!")
                return "finished"
            elif status == "failed":
                print("Task execution failed")
                return "failed"
            elif status == "canceled":
                print("Task canceled")
                return "canceled"
            else:
                # running, created, paused, etc.
                elapsed = int(elapsed_time)
                print(f"   Status: {status} (waited {elapsed} seconds)", end="\r")
                previous_status = status
            
            # Wait before checking again
            time.sleep(POLL_INTERVAL)
    
    
    def scrape_amazon_products(asins):
        """
        Main function to scrape Amazon product data
        
        Args:
            asins: List of Amazon ASINs to scrape
            
        Returns:
            dict: Task result containing product data
        """
        if not asins:
            raise ValueError("No ASINs provided for scraping")
        
        # Create input parameters
        input_parameters = create_input_parameters(asins)
        
        # Count the parameters actually sent; blank entries were filtered out above
        print(f"Starting Amazon product scraping for {len(input_parameters)} ASIN(s)...")
        print(f"ASINs: {[p['value'] for p in input_parameters]}")
        
        # Step 1: Start task using template
        task_id = run_task_by_template(WORKFLOW_TEMPLATE_ID, input_parameters)
        
        if task_id is None:
            raise RuntimeError("Unable to start scraping task")
        
        # Step 2: Wait for task completion
        final_status = wait_for_task_completion(task_id)
        
        if final_statu
    
    ... (truncated)