🦞
vgl

Write structured VGL (Visual Generation Language) JSON prompts for Bria's FIBO.
SKILL.md
---
name: vgl
description: Write structured VGL (Visual Generation Language) JSON prompts for Bria's FIBO image generation models. Use this skill when creating detailed image descriptions in JSON format for text-to-image generation, image editing, inpainting, outpainting, background generation, or captioning. Triggers include requests to write structured prompts, create VGL JSON, describe images for AI generation, or work with Bria/FIBO's structured_prompt format. Also use when converting natural language image requests into the deterministic JSON schema required by FIBO models.
---

# Bria VGL Prompt Writing

Generate structured JSON prompts for Bria's FIBO models using Visual Generation Language (VGL).

> **Related Skill**: Use **[bria-ai](../bria-ai/SKILL.md)** to execute these VGL prompts via the Bria API. VGL defines the structured prompt format; bria-ai handles generation, editing, and background removal.

## Core Concept

VGL replaces ambiguous natural language prompts with deterministic JSON that explicitly declares every visual attribute: objects, lighting, camera settings, composition, and style. This ensures reproducible, controllable image generation.

## Operation Modes

| Mode | Input | Output | Use Case |
|------|-------|--------|----------|
| **Generate** | Text prompt | VGL JSON | Create new image from description |
| **Edit** | Image + instruction | VGL JSON | Modify reference image |
| **Edit_with_Mask** | Masked image + instruction | VGL JSON | Fill grey masked regions |
| **Caption** | Image only | VGL JSON | Describe existing image |
| **Refine** | Existing JSON + edit | Updated VGL JSON | Modify existing prompt |

## JSON Schema

Output a single valid JSON object with these required keys:

### 1. `short_description` (String)
Concise summary of image content, max 200 words. Include key subjects, actions, setting, and mood.

### 2. `objects` (Array, max 5 items)
Each object requires:

```json
{
  "description": "Detailed description, max 100 words",
  "location": "center | top-left | bottom-right foreground | etc.",
  "relative_size": "small | medium | large within frame",
  "shape_and_color": "Basic shape and dominant color",
  "texture": "smooth | rough | metallic | furry | fabric | etc.",
  "appearance_details": "Notable visual details",
  "relationship": "Relationship to other objects",
  "orientation": "upright | tilted 45 degrees | facing left | horizontal | etc."
}
```

**Human subjects** add:
```json
{
  "pose": "Body position description",
  "expression": "winking | joyful | serious | surprised | calm",
  "clothing": "Attire description",
  "action": "What the person is doing",
  "gender": "Gender description",
  "skin_tone_and_texture": "Skin appearance"
}
```

**Object clusters** add:
```json
{
  "number_of_objects": 3
}
```

**Size guidance**: If a person is the main subject, use `"medium-to-large"` or `"large within frame"`.

### 3. `background_setting` (String)
Overall environment, setting, and background elements not in `objects`.

### 4. `lighting` (Object)
```json
{
  "conditions": "bright daylight | dim indoor | studio lighting | golden hour | blue hour | overcast",
  "direction": "front-lit | backlit | side-lit from left | top-down",
  "shadows": "long, soft shadows | sharp, defined shadows | minimal shadows"
}
```

### 5. `aesthetics` (Object)
```json
{
  "composition": "rule of thirds | symmetrical | centered | leading lines | medium shot | close-up",
  "color_scheme": "monochromatic blue | warm complementary | high contrast | pastel",
  "mood_atmosphere": "serene | energetic | mysterious | joyful | dramatic | peaceful"
}
```
For people as main subject, specify shot type in composition: `"medium shot"`, `"close-up"`, `"portrait composition"`.

### 6. `photographic_characteristics` (Object)
```json
{
  "depth_of_field": "shallow | deep | bokeh background",
  "focus": "sharp focus on subject | soft focus | motion blur",
  "camera_angle": "eye-level | low angle | high angle | dutch angle | bird's-eye",
  "lens_focal_length": "wide-angle | 50mm standard | 85mm portrait | telephoto | macro"
}
```
**For people**: Prefer `"standard lens (35mm-50mm)"` or `"portrait lens (50mm-85mm)"`. Avoid wide-angle unless specified.

### 7. `style_medium` (String)
`"photograph"` | `"oil painting"` | `"watercolor"` | `"3D render"` | `"digital illustration"` | `"pencil sketch"`

Default to `"photograph"` unless explicitly requested otherwise.

### 8. `artistic_style` (String)
If not photograph, describe characteristics in max 3 words: `"impressionistic, vibrant, textured"`

For photographs, use `"realistic"` or similar.

### 9. `context` (String)
Describe the image type/purpose:
- `"High-fashion editorial photograph for magazine spread"`
- `"Concept art for fantasy video game"`
- `"Commercial product photography for e-commerce"`

### 10. `text_render` (Array)
**Default: empty array `[]`**

Only populate if user explicitly provides exact text content:
```json
{
  "text": "Exact text from user (never placeholder)",
  "location": "center | top-left | bottom",
  "size": "small | medium | large",
  "color": "white | red | blue",
  "font": "serif typeface | sans-serif | handwritten | bold impact",
  "appearance_details": "Metallic finish | 3D effect | etc."
}
```
Exception: Universal text integral to objects (e.g., "STOP" on stop sign).

### 11. `edit_instruction` (String)
Single imperative command describing the edit/generation.

## Edit Instruction Formats

### For Standard Edits (no mask)
Start with action verb, describe changes, never reference "original image":

| Category | Rewritten Instruction |
|----------|----------------------|
| Style change | `Turn the image into the cartoon style.` |
| Object attribute | `Change the dog's color to black and white.` |
| Add element | `Add a wide-brimmed felt hat to the subject.` |
| Remove object | `Remove the book from the subject's hands.` |
| Replace object | `Change the rose to a bright yellow sunflower.` |
| Lighting | `Change the lighting from dark and moody to bright and vibrant.` |
| Composition | `Change the perspective to a wider shot.` |
| Text change | `Change the text "Happy Anniversary" to "Hello".` |
| Quality | `Refine the image to obtain increased clarity and sharpness.` |

### For Masked Region Edits
Reference "masked regions" or "masked area" as target:

| Intent | Rewritten Instruction |
|--------|----------------------|
| Object generation | `Generate a white rose with a blue center in the masked region.` |
| Extension | `Extend the image into the masked region to create a scene featuring...` |
| Background fill | `Create the following background in the masked region: A vast ocean extending to horizon.` |
| Atmospheric fill | `Fill the background masked area with a clear, bright blue sky with wispy clouds.` |
| Subject restoration | `Restore the area in the mask with a young woman.` |
| Environment infill | `Create inside the masked area: a greenhouse with rows of plants under glass ceiling.` |

## Fidelity Rules

### Standard Edit Mode
Preserve ALL visual properties unless explicitly changed by instruction:
- Subject identity, pose, appearance
- Object existence, location, size, orientation
- Composition, camera angle, lens characteristics
- Style/medium

Only change what the edit strictly requires.

### Masked Edit Mode
- Preserve all visible (non-masked) portions exactly
- Fill grey masked regions to blend seamlessly with unmasked areas
- Match existing style, lighting, and subject matter
- Never describe grey masks—describe content that fills them

## Example Output

```json
{
  "short_description": "A professional businesswoman in a navy blazer stands confidently in a modern glass office, holding a tablet. Natural daylight streams through floor-to-ceiling windows, creating a warm, productive atmosphere.",
  "objects": [
    {
      "description": "A confident businesswoman in her 30s with shoulder-length dark hair, wearing a tailored navy blazer over a white blouse. She holds a tablet in her left hand while gesturing naturally with her right.",
      "location": "center-right",
      "relative_size": "large within frame",
      "shape_and_color": "Human figure, navy and white clothing",
      "texture": "smooth fabric, professional attire",
      "appearance_details": "Minimal jewelry, well-groomed professional appearance",
      "relationship": "Main subject, interacting with tablet",
      "orientation": "facing slightly left, three-quarter view",
      "pose": "Standing upright, relaxed professional stance",
      "expression": "confident, approachable smile",
      "clothing": "Tailored navy blazer, white silk blouse, dark trousers",
      "action": "Presenting or reviewing information on tablet",
      "gender": "female",
      "skin_tone_and_texture": "Medium warm skin tone, healthy smooth complexion"
    },
    {
      "description": "A modern tablet device with a bright display showing charts and graphs",
      "location": "center, held by subject",
      "relative_size": "small",
      "shape_and_color": "Rectangular, silver frame with illuminated screen",
      "texture": "smooth glass and metal",
      "appearance_details": "Thin profile, business application visible on screen",
      "relationship": "Held by businesswoman, focus of her attention",
      "orientation": "vertical, screen facing viewer at slight angle",
      "pose": null,
      "expression": null,
      "clothing": null,
      "action": null,
      "gender": null,
      "skin_tone_and_texture": null,
      "number_of_objects": null
    }
  ],
  "background_setting": "Modern corporate office interior with floor-to-ceiling windows overlooking a city skyline. Minimalist furniture in neutral tones, potted plants adding touches of green.",
  "lighting": {
    "conditions": "bright natural daylight",
    "direction": "side-lit from left through windows",
    "shadows": "soft, natural shadows"
  },
  "aesthetics": {
    "composition": "rule of thirds, medi

... (truncated)