Agent Skill · OpenAI

imagegen

Generate or edit raster images when the task benefits from AI-created bitmap visuals such as photos, illustrations, textures, sprites, mockups, or transparent-background cutouts. Use when Codex should create a brand-new image, transform an existing image, or derive visual variants from references, and the output should be a bitmap asset rather than repo-native code or vector. Do not use when the task is better handled by editing existing SVG/vector/code-native assets, extending an established icon or logo system, or building the visual directly in HTML/CSS/canvas.

Provider: OpenAI
Path in repo: skills/.system/imagegen/SKILL.md

Skill body

Image Generation Skill

Generates or edits images for the current project (for example website assets, game assets, UI mockups, product mockups, wireframes, logo design, photorealistic images, or infographics).

Top-level modes and rules

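This skill has exactly two top-level modes: the built-in image_gen tool, which is the default, and a fallback CLI, which is used only when the user explicitly requests it.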

Within the explicit CLI fallback only, the CLI exposes three subcommands:

Rules:

Built-in save-path policy:

Shared prompt guidance for both modes lives in references/prompting.md and references/sample-prompts.md.

Fallback-only docs/resources for CLI mode:

When to use

When not to use

Decision tree

Think about two separate questions:

  1. Intent: is this a new image or an edit of an existing image?
  2. Execution strategy: is this one asset or many assets/variants?

Intent:

Built-in edit semantics:

Execution strategy:

Assume the user wants a new image unless they clearly ask to change an existing one.
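
As a rough illustration, the two questions above can be sketched as a small planning function. This is a hypothetical helper, not part of the skill: the names `Plan` and `plan_request`, the strategy labels, and the rule that `generate-batch` is chosen only on the CLI fallback path are assumptions inferred from the workflow steps in this document.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    mode: str      # "built-in" or "cli-fallback" (labels are illustrative)
    intent: str    # "generate" or "edit"
    strategy: str  # "single", "repeated-built-in-calls", or "cli-generate-batch"


def plan_request(
    asks_to_change_existing: bool,
    asset_count: int,
    cli_explicitly_requested: bool = False,
) -> Plan:
    """Hypothetical sketch of the decision tree; not the skill's own code."""
    # Built-in is the default; the CLI is used only on explicit request.
    mode = "cli-fallback" if cli_explicitly_requested else "built-in"

    # Default to a new image unless the user clearly asks for a change.
    intent = "edit" if asks_to_change_existing else "generate"

    # One asset vs many assets/variants.
    if asset_count <= 1:
        strategy = "single"
    elif mode == "cli-fallback":
        strategy = "cli-generate-batch"  # assumed to apply to CLI mode only
    else:
        strategy = "repeated-built-in-calls"

    return Plan(mode, intent, strategy)
```

For example, `plan_request(False, 1)` yields a built-in, single-asset generate plan, which matches the default assumption stated above.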

Workflow

  1. Decide the top-level mode: built-in by default, fallback CLI only if explicitly requested.
  2. Decide the intent: generate or edit.
  3. Decide whether the output is preview-only or meant to be consumed by the current project.
  4. Decide the execution strategy: single asset vs repeated built-in calls vs CLI generate-batch.
  5. Collect inputs up front: prompt(s), exact text (verbatim), constraints/avoid list, and any input images.
  6. For every input image, label its role explicitly:
    • reference image
    • edit target
    • supporting insert/style/compositing input
  7. If the edit target is only on the local filesystem and you are staying on the built-in path, inspect it with view_image first so the image is available in conversation context.
  8. If the user asked for a photo, illustration, sprite, product image, banner, or other explicitly raster-style asset, use image_gen rather than substituting SVG/HTML/CSS placeholders. If the request is for an icon, logo, or UI graphic that should match existing repo-native SVG/vector/code assets, prefer editing those directly instead.
  9. Augment the prompt based on specificity:
    • If the user’s prompt is already specific and detailed, normalize it into a clear spec without adding creative requirements.
    • If the user’s prompt is generic, add tasteful augmentation only when it materially improves output quality.
  10. Use the built-in image_gen tool by default.
  11. If the user explicitly chooses the CLI fallback, then and only then use the fallback-only docs for quality, input_fidelity, masks, output format, output paths, and network setup.
  12. Inspect outputs and validate: subject, style, composition, text accuracy, and invariants/avoid items.
  13. Iterate with a single targeted change, then re-check.
  14. For preview-only work, render the image inline; the underlying file may remain at the default $CODEX_HOME/generated_images/... path.
  15. For project-bound work, move or copy the selected artifact into the workspace and update any consuming code or references. Never leave a project-referenced asset only at the default $CODEX_HOME/generated_images/... path.
  16. For batches, persist only the selected finals in the workspace unless the user explicitly asked to keep discarded variants.
  17. Always report the final saved path for any workspace-bound asset, plus the final prompt and whether the built-in tool or fallback CLI mode was used.
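
Steps 15 and 17 can be sketched as a small copy helper, assuming the selected artifact is an ordinary file on disk. The function name `persist_asset` and the example paths are hypothetical; the skill itself does not prescribe how the copy is performed.

```python
import shutil
from pathlib import Path


def persist_asset(generated: Path, workspace_dest: Path) -> Path:
    """Copy a selected artifact into the workspace and return its final path.

    Hypothetical helper: `generated` stands for a file under the default
    generated-images location, and `workspace_dest` for the path that
    project code will reference.
    """
    if not generated.is_file():
        raise FileNotFoundError(f"Generated artifact not found: {generated}")

    # Create the destination folder if the project does not have it yet.
    workspace_dest.parent.mkdir(parents=True, exist_ok=True)

    # Copy rather than move, so the original stays available for previews.
    shutil.copy2(generated, workspace_dest)

    # The resolved path is what gets reported back to the user.
    return workspace_dest.resolve()
```

Copying instead of moving is a design choice made for this sketch; step 15 allows either, as long as the project never references the default path directly.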

Prompt augmentation

Reformat user prompts into a structured, production-oriented spec. Make the user’s goal clearer and more actionable, but do not blindly add detail.

Treat this as prompt-shaping guidance, not a closed schema. Use only the lines that help, and add a short extra labeled line when it materially improves clarity.

Specificity policy

Use the user’s prompt specificity to decide how much augmentation is appropriate:

Allowed augmentations:

Not allowed augmentations:

Use-case taxonomy (exact slugs)

Classify each request into one of these buckets and keep the slug consistent across prompts and references.

Generate:

Edit:

Shared prompt schema

Use the following labeled spec as shared prompt scaffolding for both top-level modes:

Use case: <taxonomy slug>
Asset type: <where the asset will be used>
Primary request: <user's main prompt>
Input images: <Image 1: role; Image 2: role> (optional)
Scene/backdrop: <environment>
Subject: <main subject>
Style/medium: <photo/illustration/3D/etc>
Composition/framing: <wide/close/top-down; placement>
Lighting/mood: <lighting + mood>
Color palette: <palette notes>
Materials/textures: <surface details>
Text (verbatim): "<exact text>"
Constraints: <must keep/must avoid>
Avoid: <negative constraints>
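
One way to assemble such a spec programmatically is sketched below. The helper `render_spec` and the `SCHEMA_ORDER` constant are hypothetical names introduced for illustration. The sketch emits only the labels that have values, keeps the order shown above, quotes the verbatim text, and appends any extra labeled lines at the end, since the schema is not closed.

```python
# Labels in the order used by the shared prompt schema.
SCHEMA_ORDER = [
    "Use case",
    "Asset type",
    "Primary request",
    "Input images",
    "Scene/backdrop",
    "Subject",
    "Style/medium",
    "Composition/framing",
    "Lighting/mood",
    "Color palette",
    "Materials/textures",
    "Text (verbatim)",
    "Constraints",
    "Avoid",
]


def render_spec(fields: dict) -> str:
    """Render a labeled prompt spec, skipping labels with no value."""
    lines = []
    for label in SCHEMA_ORDER:
        value = fields.get(label, "")
        if not value.strip():
            continue  # use only the lines that help
        if label == "Text (verbatim)":
            value = f'"{value}"'  # exact text is wrapped in quotes
        lines.append(f"{label}: {value}")

    # Extra labeled lines are allowed; append them after the known labels.
    for label, value in fields.items():
        if label not in SCHEMA_ORDER and value.strip():
            lines.append(f"{label}: {value}")

    return "\n".join(lines)


if __name__ == "__main__":
    print(render_spec({
        "Use case": "product-mockup",
        "Asset type": "landing page hero",
        "Primary request": "a minimal hero image of a ceramic coffee mug",
        "Style/medium": "clean product photography",
        "Lighting/mood": "soft studio lighting",
        "Constraints": "no logos, no text, no watermark",
    }))
```

Keeping the label order fixed makes specs easier to compare between iterations, which helps when applying one targeted change at a time.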

Notes:

Augmentation rules:

Examples

Generation example (hero image)

Use case: product-mockup
Asset type: landing page hero
Primary request: a minimal hero image of a ceramic coffee mug
Style/medium: clean product photography
Composition/framing: wide composition with usable negative space for page copy if needed
Lighting/mood: soft studio lighting
Constraints: no logos, no text, no watermark

Edit example (invariants)

Use case: precise-object-edit
Asset type: product photo background replacement
Primary request: replace only the background with a warm sunset gradient
Constraints: change only the background; keep the product and its edges unchanged; no text; no watermark

Prompting best practices

More principles shared by both modes: references/prompting.md.
Copy/paste specs shared by both modes: references/sample-prompts.md.

Guidance by asset type

Asset-type templates (website assets, game assets, wireframes, logo) are consolidated in references/sample-prompts.md.

Fallback CLI mode only

Temp and output conventions

These conventions apply only to the explicit CLI fallback. They do not describe built-in image_gen output behavior.

Dependencies

Prefer uv for dependency management in this repo.

Required Python package:

uv pip install openai

Optional for downscaling only:

uv pip install pillow

Portability note:

Environment

If the OPENAI_API_KEY environment variable is missing, give the user these steps:

  1. Create an API key in the OpenAI platform UI: https://platform.openai.com/api-keys
  2. Set OPENAI_API_KEY as an environment variable in their system.
  3. Offer to guide them through setting the environment variable for their OS/shell if needed.

If installation is not possible in this environment, tell the user which dependency is missing and how to install it into their active environment.
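
A preflight check along these lines could detect both conditions before the CLI fallback runs. This is a minimal sketch, not the skill's bundled script: the function name `preflight` and the message wording are assumptions, while the environment variable name, the package name, and the install command come from this document.

```python
import importlib.util
import os


def preflight(env=None) -> list:
    """Return human-readable problems that would block CLI fallback mode."""
    env = os.environ if env is None else env
    problems = []

    # The CLI fallback needs an API key in the environment.
    if not env.get("OPENAI_API_KEY"):
        problems.append(
            "OPENAI_API_KEY is not set. Create a key at "
            "https://platform.openai.com/api-keys and set it as an "
            "environment variable."
        )

    # find_spec returns None when a top-level package is not installed.
    if importlib.util.find_spec("openai") is None:
        problems.append(
            "Missing Python package 'openai'. Install it with: "
            "uv pip install openai"
        )

    return problems


if __name__ == "__main__":
    for problem in preflight():
        print(problem)
```

Passing a dictionary as `env` makes the check easy to exercise without touching the real environment.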

Script-mode notes

Reference map