    vgl

    Write structured VGL (Visual Generation Language) JSON prompts for Bria's FIBO.

    By @galbria
    SKILL.md
    ---
    name: vgl
    description: Write structured VGL (Visual Generation Language) JSON prompts for Bria's FIBO image generation models. Use this skill when creating detailed image descriptions in JSON format for text-to-image generation, image editing, inpainting, outpainting, background generation, or captioning. Triggers include requests to write structured prompts, create VGL JSON, describe images for AI generation, or work with Bria/FIBO's structured_prompt format. Also use when converting natural language image requests into the deterministic JSON schema required by FIBO models.
    ---
    
    # Bria VGL Prompt Writing
    
    Generate structured JSON prompts for Bria's FIBO models using Visual Generation Language (VGL).
    
    > **Related Skill**: Use **[bria-ai](../bria-ai/SKILL.md)** to execute these VGL prompts via the Bria API. VGL defines the structured prompt format; bria-ai handles generation, editing, and background removal.
    
    ## Core Concept
    
    VGL replaces ambiguous natural language prompts with deterministic JSON that explicitly declares every visual attribute: objects, lighting, camera settings, composition, and style. This ensures reproducible, controllable image generation.
    
    ## Operation Modes
    
    | Mode | Input | Output | Use Case |
    |------|-------|--------|----------|
    | **Generate** | Text prompt | VGL JSON | Create new image from description |
    | **Edit** | Image + instruction | VGL JSON | Modify reference image |
    | **Edit_with_Mask** | Masked image + instruction | VGL JSON | Fill grey masked regions |
    | **Caption** | Image only | VGL JSON | Describe existing image |
    | **Refine** | Existing JSON + edit | Updated VGL JSON | Modify existing prompt |
    
    ## JSON Schema
    
    Output a single valid JSON object with these required keys:
    
    ### 1. `short_description` (String)
    Concise summary of image content, max 200 words. Include key subjects, actions, setting, and mood.
    
    ### 2. `objects` (Array, max 5 items)
    Each object requires:
    
    ```json
    {
      "description": "Detailed description, max 100 words",
      "location": "center | top-left | bottom-right foreground | etc.",
      "relative_size": "small | medium | large within frame",
      "shape_and_color": "Basic shape and dominant color",
      "texture": "smooth | rough | metallic | furry | fabric | etc.",
      "appearance_details": "Notable visual details",
      "relationship": "Relationship to other objects",
      "orientation": "upright | tilted 45 degrees | facing left | horizontal | etc."
    }
    ```
    
    **Human subjects** add:
    ```json
    {
      "pose": "Body position description",
      "expression": "winking | joyful | serious | surprised | calm",
      "clothing": "Attire description",
      "action": "What the person is doing",
      "gender": "Gender description",
      "skin_tone_and_texture": "Skin appearance"
    }
    ```
    
    **Object clusters** add:
    ```json
    {
      "number_of_objects": 3
    }
    ```
    
    **Size guidance**: If a person is the main subject, use `"medium-to-large"` or `"large within frame"`.
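
The object rules above can be checked mechanically before a prompt is sent anywhere. The following is a minimal sketch, not part of VGL or any Bria tooling: `check_objects` and its constants are hypothetical names, and the rule that an object is treated as human once it sets any human field is an assumption made for illustration.

```python
# Field names are taken from the schema above; the helper itself is hypothetical.
BASE_FIELDS = (
    "description", "location", "relative_size", "shape_and_color",
    "texture", "appearance_details", "relationship", "orientation",
)
HUMAN_FIELDS = (
    "pose", "expression", "clothing", "action", "gender",
    "skin_tone_and_texture",
)
MAX_OBJECTS = 5
MAX_DESCRIPTION_WORDS = 100


def check_objects(objects):
    """Return a list of human-readable problems found in an `objects` array."""
    problems = []
    if len(objects) > MAX_OBJECTS:
        problems.append(f"too many objects: {len(objects)} > {MAX_OBJECTS}")
    for i, obj in enumerate(objects):
        # Every object needs all base fields with a non-empty value.
        missing = [name for name in BASE_FIELDS if not obj.get(name)]
        if missing:
            problems.append(f"object {i} is missing: {', '.join(missing)}")
        words = str(obj.get("description") or "").split()
        if len(words) > MAX_DESCRIPTION_WORDS:
            problems.append(
                f"object {i} description exceeds {MAX_DESCRIPTION_WORDS} words"
            )
        # Assumption: setting any human field means all of them are expected.
        present = [name for name in HUMAN_FIELDS if obj.get(name) is not None]
        if present and len(present) < len(HUMAN_FIELDS):
            absent = [name for name in HUMAN_FIELDS if obj.get(name) is None]
            problems.append(
                f"object {i} has partial human fields, missing: {', '.join(absent)}"
            )
    return problems
```

An empty list means the array passed these checks. Human fields explicitly set to `null` (`None` in Python) count as absent, so non-human objects that carry null placeholders pass.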
    
    ### 3. `background_setting` (String)
    Overall environment, setting, and background elements not in `objects`.
    
    ### 4. `lighting` (Object)
    ```json
    {
      "conditions": "bright daylight | dim indoor | studio lighting | golden hour | blue hour | overcast",
      "direction": "front-lit | backlit | side-lit from left | top-down",
      "shadows": "long, soft shadows | sharp, defined shadows | minimal shadows"
    }
    ```
    
    ### 5. `aesthetics` (Object)
    ```json
    {
      "composition": "rule of thirds | symmetrical | centered | leading lines | medium shot | close-up",
      "color_scheme": "monochromatic blue | warm complementary | high contrast | pastel",
      "mood_atmosphere": "serene | energetic | mysterious | joyful | dramatic | peaceful"
    }
    ```
    When a person is the main subject, specify the shot type in `composition`: `"medium shot"`, `"close-up"`, `"portrait composition"`.
    
    ### 6. `photographic_characteristics` (Object)
    ```json
    {
      "depth_of_field": "shallow | deep | bokeh background",
      "focus": "sharp focus on subject | soft focus | motion blur",
      "camera_angle": "eye-level | low angle | high angle | dutch angle | bird's-eye",
      "lens_focal_length": "wide-angle | 50mm standard | 85mm portrait | telephoto | macro"
    }
    ```
    **For people**: Prefer `"standard lens (35mm-50mm)"` or `"portrait lens (50mm-85mm)"`. Avoid wide-angle unless specified.
    
    ### 7. `style_medium` (String)
    `"photograph"` | `"oil painting"` | `"watercolor"` | `"3D render"` | `"digital illustration"` | `"pencil sketch"`
    
    Default to `"photograph"` unless explicitly requested otherwise.
    
    ### 8. `artistic_style` (String)
    If the medium is not a photograph, describe the style in at most three words: `"impressionistic, vibrant, textured"`
    
    For photographs, use `"realistic"` or similar.
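
The defaulting rules for `style_medium` and `artistic_style` can be expressed as a small helper. This is a sketch under stated assumptions: `resolve_style` is a hypothetical function, and counting words by splitting on commas and whitespace is one possible reading of the three-word limit.

```python
def resolve_style(style_medium=None, artistic_style=None):
    """Apply the documented defaults and the three-word limit."""
    # Default to "photograph" unless another medium was requested.
    medium = style_medium or "photograph"
    if medium == "photograph":
        style = artistic_style or "realistic"
    else:
        if not artistic_style:
            raise ValueError("non-photograph media need an artistic_style")
        # Assumption: words are separated by commas and/or whitespace.
        words = artistic_style.replace(",", " ").split()
        if len(words) > 3:
            raise ValueError(
                f"artistic_style has {len(words)} words, expected at most 3"
            )
        style = artistic_style
    return {"style_medium": medium, "artistic_style": style}
```

Calling `resolve_style()` with no arguments yields the photograph defaults. A non-photograph medium with no style, or with a style longer than three words, raises `ValueError` rather than silently guessing.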
    
    ### 9. `context` (String)
    Describe the image type/purpose:
    - `"High-fashion editorial photograph for magazine spread"`
    - `"Concept art for fantasy video game"`
    - `"Commercial product photography for e-commerce"`
    
    ### 10. `text_render` (Array)
    **Default: empty array `[]`**
    
    Populate it only if the user explicitly provides the exact text to render. Each entry has this shape:
    ```json
    {
      "text": "Exact text from user (never placeholder)",
      "location": "center | top-left | bottom",
      "size": "small | medium | large",
      "color": "white | red | blue",
      "font": "serif typeface | sans-serif | handwritten | bold impact",
      "appearance_details": "Metallic finish | 3D effect | etc."
    }
    ```
    Exception: Universal text integral to objects (e.g., "STOP" on stop sign).
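
A sketch of the "empty unless the user supplied text" rule follows. `build_text_render` is a hypothetical helper, and its default attribute values are illustrative picks from the option lists above, not defaults defined by VGL.

```python
def build_text_render(user_text=None, location="center", size="medium",
                      color="white", font="sans-serif",
                      appearance_details=""):
    """Return a `text_render` array: empty unless exact text was provided."""
    # No text (or only whitespace) from the user means the default: [].
    if user_text is None or not user_text.strip():
        return []
    return [{
        "text": user_text,  # copied verbatim, never a placeholder
        "location": location,
        "size": size,
        "color": color,
        "font": font,
        "appearance_details": appearance_details,
    }]
```

For instance, `build_text_render()` returns `[]`, while `build_text_render("SALE", color="red")` returns a single entry whose `text` is exactly `"SALE"`.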
    
    ### 11. `edit_instruction` (String)
    Single imperative command describing the edit/generation.
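
With all eleven keys defined, a structural check of a finished prompt might look like the sketch below. `validate_prompt`, `REQUIRED_KEYS`, and `NESTED_KEYS` are hypothetical names. The key names and JSON types come from the schema above, but this is not an official validator.

```python
import json

# Top-level keys and the Python type each JSON value should decode to.
REQUIRED_KEYS = {
    "short_description": str,
    "objects": list,
    "background_setting": str,
    "lighting": dict,
    "aesthetics": dict,
    "photographic_characteristics": dict,
    "style_medium": str,
    "artistic_style": str,
    "context": str,
    "text_render": list,
    "edit_instruction": str,
}
NESTED_KEYS = {
    "lighting": ("conditions", "direction", "shadows"),
    "aesthetics": ("composition", "color_scheme", "mood_atmosphere"),
    "photographic_characteristics": (
        "depth_of_field", "focus", "camera_angle", "lens_focal_length",
    ),
}


def validate_prompt(raw_json):
    """Parse a JSON string and list structural problems (empty list if none)."""
    prompt = json.loads(raw_json)
    if not isinstance(prompt, dict):
        return ["top-level value must be a JSON object"]
    problems = []
    for key, expected in REQUIRED_KEYS.items():
        if key not in prompt:
            problems.append(f"missing key: {key}")
        elif not isinstance(prompt[key], expected):
            problems.append(f"{key} should be of type {expected.__name__}")
    for key, subkeys in NESTED_KEYS.items():
        section = prompt.get(key)
        if isinstance(section, dict):
            for sub in subkeys:
                if sub not in section:
                    problems.append(f"missing key: {key}.{sub}")
    return problems
```

The function only checks shape, not content quality; invalid JSON surfaces as the `json.JSONDecodeError` raised by `json.loads`.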
    
    ## Edit Instruction Formats
    
    ### For Standard Edits (no mask)
    Start with an action verb, describe the change, and never reference the "original image":
    
    | Category | Rewritten Instruction |
    |----------|----------------------|
    | Style change | `Turn the image into the cartoon style.` |
    | Object attribute | `Change the dog's color to black and white.` |
    | Add element | `Add a wide-brimmed felt hat to the subject.` |
    | Remove object | `Remove the book from the subject's hands.` |
    | Replace object | `Change the rose to a bright yellow sunflower.` |
    | Lighting | `Change the lighting from dark and moody to bright and vibrant.` |
    | Composition | `Change the perspective to a wider shot.` |
    | Text change | `Change the text "Happy Anniversary" to "Hello".` |
    | Quality | `Refine the image to obtain increased clarity and sharpness.` |
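
The rules for standard edit instructions lend themselves to a quick lint. The sketch below is a heuristic only: `check_standard_instruction` is a hypothetical helper, and `KNOWN_VERBS` contains just the verbs that appear in the table above, so other valid action verbs would be flagged.

```python
# Assumption: only the leading verbs seen in the table above are recognized.
KNOWN_VERBS = {"turn", "change", "add", "remove", "refine"}


def check_standard_instruction(instruction):
    """Return a list of problems with a standard (unmasked) edit instruction."""
    words = instruction.strip().split()
    if not words:
        return ["instruction is empty"]
    problems = []
    if words[0].lower() not in KNOWN_VERBS:
        problems.append(f"does not start with a known action verb: {words[0]!r}")
    if "original image" in instruction.lower():
        problems.append('references "original image"')
    return problems
```

An instruction such as `Add a wide-brimmed felt hat to the subject.` passes, while `Make the original image brighter.` is flagged on both counts.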
    
    ### For Masked Region Edits
    Reference the "masked region" or "masked area" as the target:
    
    | Intent | Rewritten Instruction |
    |--------|----------------------|
    | Object generation | `Generate a white rose with a blue center in the masked region.` |
    | Extension | `Extend the image into the masked region to create a scene featuring...` |
    | Background fill | `Create the following background in the masked region: A vast ocean extending to horizon.` |
    | Atmospheric fill | `Fill the background masked area with a clear, bright blue sky with wispy clouds.` |
    | Subject restoration | `Restore the area in the mask with a young woman.` |
    | Environment infill | `Create inside the masked area: a greenhouse with rows of plants under glass ceiling.` |
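
A masked-edit instruction can be checked for an explicit mask reference with a single regular expression. This is a rough sketch with a hypothetical helper name; matching the words "mask", "masks", or "masked" is an assumption chosen so that every row in the table above passes.

```python
import re

# Matches "mask", "masks", or "masked" as whole words, case-insensitively.
MASK_REFERENCE = re.compile(r"\bmask(ed|s)?\b", re.IGNORECASE)


def targets_mask(instruction):
    """Return True if the instruction names the mask as its target."""
    return MASK_REFERENCE.search(instruction) is not None
```

It returns `True` for `Restore the area in the mask with a young woman.` and `False` for an unmasked instruction such as `Add a wide-brimmed felt hat to the subject.`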
    
    ## Fidelity Rules
    
    ### Standard Edit Mode
    Preserve ALL visual properties unless explicitly changed by instruction:
    - Subject identity, pose, appearance
    - Object existence, location, size, orientation
    - Composition, camera angle, lens characteristics
    - Style/medium
    
    Only change what the edit strictly requires.
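
One way to honor this rule when refining a prompt is to copy the existing JSON, touch only the named fields, and then diff the result. The sketch below assumes a dotted-path convention (`objects.0.description`) that is purely illustrative; `apply_edit` and `changed_paths` are hypothetical helpers, not part of VGL.

```python
import copy


def apply_edit(prompt, changes, instruction):
    """Return a copy of `prompt` with only the dotted paths in `changes` updated."""
    updated = copy.deepcopy(prompt)  # the input prompt is never mutated
    for path, value in changes.items():
        *parents, leaf = path.split(".")
        node = updated
        for key in parents:
            node = node[int(key)] if isinstance(node, list) else node[key]
        if isinstance(node, list):
            node[int(leaf)] = value
        else:
            node[leaf] = value
    updated["edit_instruction"] = instruction
    return updated


def changed_paths(before, after, prefix=""):
    """List the dotted paths whose values differ between two prompts."""
    if isinstance(before, dict) and isinstance(after, dict):
        paths = []
        for key in sorted(set(before) | set(after)):
            paths += changed_paths(before.get(key), after.get(key), f"{prefix}{key}.")
        return paths
    if isinstance(before, list) and isinstance(after, list) and len(before) == len(after):
        paths = []
        for i, (old, new) in enumerate(zip(before, after)):
            paths += changed_paths(old, new, f"{prefix}{i}.")
        return paths
    return [] if before == after else [prefix.rstrip(".")]
```

After an edit, `changed_paths(original, edited)` should list only `edit_instruction` and the fields the instruction strictly requires; anything else in the list signals an unintended change.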
    
    ### Masked Edit Mode
    - Preserve all visible (non-masked) portions exactly
    - Fill grey masked regions to blend seamlessly with unmasked areas
    - Match existing style, lighting, and subject matter
    - Never describe grey masks—describe content that fills them
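
The last rule can be approximated by scanning every descriptive string for mask wording. This is a heuristic sketch with a hypothetical helper name: it skips `edit_instruction`, where mask references are expected, and it will also flag legitimate uses of the word, such as a subject wearing a mask.

```python
import re

MASK_WORDS = re.compile(r"\bmask(ed|s)?\b", re.IGNORECASE)


def find_mask_mentions(prompt):
    """Return dotted paths of descriptive strings that mention a mask."""
    hits = []

    def walk(value, path):
        if isinstance(value, dict):
            for key, child in value.items():
                walk(child, f"{path}.{key}")
        elif isinstance(value, list):
            for i, child in enumerate(value):
                walk(child, f"{path}.{i}")
        elif isinstance(value, str) and MASK_WORDS.search(value):
            hits.append(path)

    for key, value in prompt.items():
        # The instruction itself is allowed to reference the masked region.
        if key != "edit_instruction":
            walk(value, key)
    return hits
```

Any path returned is a candidate for rewriting so that it describes the content filling the region rather than the mask itself.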
    
    ## Example Output
    
    ```json
    {
      "short_description": "A professional businesswoman in a navy blazer stands confidently in a modern glass office, holding a tablet. Natural daylight streams through floor-to-ceiling windows, creating a warm, productive atmosphere.",
      "objects": [
        {
          "description": "A confident businesswoman in her 30s with shoulder-length dark hair, wearing a tailored navy blazer over a white blouse. She holds a tablet in her left hand while gesturing naturally with her right.",
          "location": "center-right",
          "relative_size": "large within frame",
          "shape_and_color": "Human figure, navy and white clothing",
          "texture": "smooth fabric, professional attire",
          "appearance_details": "Minimal jewelry, well-groomed professional appearance",
          "relationship": "Main subject, interacting with tablet",
          "orientation": "facing slightly left, three-quarter view",
          "pose": "Standing upright, relaxed professional stance",
          "expression": "confident, approachable smile",
          "clothing": "Tailored navy blazer, white silk blouse, dark trousers",
          "action": "Presenting or reviewing information on tablet",
          "gender": "female",
          "skin_tone_and_texture": "Medium warm skin tone, healthy smooth complexion"
        },
        {
          "description": "A modern tablet device with a bright display showing charts and graphs",
          "location": "center, held by subject",
          "relative_size": "small",
          "shape_and_color": "Rectangular, silver frame with illuminated screen",
          "texture": "smooth glass and metal",
          "appearance_details": "Thin profile, business application visible on screen",
          "relationship": "Held by businesswoman, focus of her attention",
          "orientation": "vertical, screen facing viewer at slight angle",
          "pose": null,
          "expression": null,
          "clothing": null,
          "action": null,
          "gender": null,
          "skin_tone_and_texture": null,
          "number_of_objects": null
        }
      ],
      "background_setting": "Modern corporate office interior with floor-to-ceiling windows overlooking a city skyline. Minimalist furniture in neutral tones, potted plants adding touches of green.",
      "lighting": {
        "conditions": "bright natural daylight",
        "direction": "side-lit from left through windows",
        "shadows": "soft, natural shadows"
      },
      "aesthetics": {
        "composition": "rule of thirds, medi
    
    ... (truncated)