Pick a model
| Model | Strong for | Notes |
|---|---|---|
| dall-e-3 | Following long prompts literally | OpenAI revises your prompt; pass "prompt_revision": false to keep it raw |
| gemini-2-5-flash-image (Nano Banana) | Photorealism, edits, fast | Cheap; supports image-to-image edits |
| imagen-4 | High aesthetic quality | Slower; great for marketing visuals |
| qwen-image | Multilingual prompts (CN/EN), text in image | PRC opt-in required |
| flux-1-pro | Open-source FLUX, uncensored prompts | Self-hosted, no policy filter |
| grok-image | Realistic + low refusal rate | xAI |
List the available image models with GET /v1/models?modality=image. Each model exposes allowedParams.supported_sizes, so you know which sizes you can request.
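As a rough sketch, the listing can be turned into a lookup of supported sizes per model. The base URL, the bearer-token header, and the response shape (a top-level `data` list whose entries carry an `id`) are assumptions made for illustration; only the endpoint path and `allowedParams.supported_sizes` come from the text above.

```python
import json
import urllib.request

BASE_URL = "https://api.example.com"  # hypothetical; substitute your provider's base URL


def fetch_image_models(api_key: str) -> dict:
    """Call GET /v1/models?modality=image and return the parsed JSON body."""
    req = urllib.request.Request(
        f"{BASE_URL}/v1/models?modality=image",
        headers={"Authorization": f"Bearer {api_key}"},  # assumed auth scheme
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


def supported_sizes_by_model(listing: dict) -> dict[str, list[str]]:
    """Map each model id to its allowedParams.supported_sizes (empty list if absent)."""
    # Assumed response shape: {"data": [{"id": ..., "allowedParams": {...}}, ...]}
    return {
        model["id"]: model.get("allowedParams", {}).get("supported_sizes", [])
        for model in listing.get("data", [])
    }
```

Calling `supported_sizes_by_model(fetch_image_models(api_key))` would then give a dictionary you can consult before choosing a size.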
Prompt patterns that work
Across all current image models, structured prompts beat freeform descriptions:
- Subject first. Models weight early tokens more.
- Be concrete. “Cinematic lighting” is vague; “low-key chiaroscuro from a single window” isn’t.
- Specify what to avoid with negative phrasing: “no text, no watermark, no humans”.
- Anchor style with reference styles or photographers (“in the style of National Geographic”, “shot on Hasselblad H6D”).
- Iterate small. Change one variable at a time — model, then prompt, then size.
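The patterns above can be sketched as a small prompt builder that puts the subject first, appends concrete details and a style anchor, and ends with negative phrasing. The function name and its parameters are hypothetical helpers for illustration, not part of any API.

```python
def build_prompt(subject, details=(), style=None, avoid=()):
    """Assemble a structured prompt: subject, concrete details, style anchor, negatives."""
    parts = [subject]              # subject first, so it occupies the earliest tokens
    parts.extend(details)          # concrete descriptions rather than vague adjectives
    if style:
        parts.append(style)        # reference style or camera
    if avoid:
        # negative phrasing: "no text, no watermark, ..."
        parts.append(", ".join(f"no {item}" for item in avoid))
    return ", ".join(parts)


prompt = build_prompt(
    "an old fisherman mending a net",
    details=["low-key chiaroscuro from a single window"],
    style="in the style of National Geographic",
    avoid=["text", "watermark", "humans"],
)
```

Keeping each element as a separate argument also makes it easier to change one variable at a time between iterations.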
Sizes and aspect ratios
Common, almost universally supported:
| Use | Size | Aspect |
|---|---|---|
| Square thumbnail | 1024x1024 | 1:1 |
| Hero / blog | 1792x1024 | 7:4 (≈16:9) |
| Portrait / mobile | 1024x1792 | 4:7 (≈9:16) |
| Square small / icon | 512x512 | 1:1 |
Check GET /v1/models/{slug} before assuming a size is available; some models only support 1024x1024.
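A minimal sketch of that check, assuming you have already read the model's supported sizes into a list of `WIDTHxHEIGHT` strings. Both helper names are hypothetical; the fallback rule (pick the supported size with the closest aspect ratio) is one reasonable choice, not something the API prescribes.

```python
from math import gcd


def aspect_ratio(size: str) -> tuple[int, int]:
    """Reduce a 'WIDTHxHEIGHT' string to its simplest integer ratio."""
    width, height = (int(part) for part in size.lower().split("x"))
    divisor = gcd(width, height)
    return width // divisor, height // divisor


def choose_size(wanted: str, supported: list[str]) -> str:
    """Return `wanted` if supported, else the supported size with the closest aspect."""
    if wanted in supported:
        return wanted
    if not supported:
        raise ValueError("model reports no supported sizes")
    w, h = aspect_ratio(wanted)
    target = w / h

    def distance(size: str) -> float:
        sw, sh = aspect_ratio(size)
        return abs(sw / sh - target)

    return min(supported, key=distance)
```

For example, `choose_size("1792x1024", ["1024x1024"])` falls back to `"1024x1024"`, and `aspect_ratio("1792x1024")` reduces to `(7, 4)`.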
Persistence
Generated URLs are ephemeral and typically expire after 1 hour. If you want to keep an image, do one of:
file_id.
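One straightforward way to keep an image is to download it to your own storage before the URL expires. The sketch below uses only the standard library; the function name, the example URL, and the destination path are placeholders, and it is not necessarily the approach the original examples showed.

```python
import shutil
import urllib.request
from pathlib import Path


def save_image(url: str, dest: str) -> Path:
    """Stream the image at `url` to `dest`, creating parent folders as needed."""
    path = Path(dest)
    path.parent.mkdir(parents=True, exist_ok=True)
    with urllib.request.urlopen(url, timeout=30) as resp, open(path, "wb") as out:
        shutil.copyfileobj(resp, out)  # copy in chunks rather than loading it all in memory
    return path


# Usage (placeholder URL taken from a generation response):
# save_image("https://example.com/generated/abc123.png", "images/abc123.png")
```

Run this as soon as the generation call returns, rather than storing the URL and fetching it later.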
Image edits
Models with edit support (Nano Banana, Qwen Image, FLUX) take a source image plus a prompt.
Sample: generate vs. edit


gemini-2.5-flash-image (Nano Banana). Notice how the edit preserves the composition — bridge, lantern, pond — while swapping season, lighting and palette.
Quality vs. cost
Generation cost varies by model and size; check pricing in GET /v1/models. Rules of thumb:
- DALL·E 3 standard ≈ 4 credits, HD ≈ 8 credits
- Imagen 4 ≈ 8 credits
- Nano Banana ≈ 1.5 credits
- FLUX (self-hosted) ≈ 1 credit
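The rules of thumb above can be turned into a quick batch estimate. The dictionary keys are informal labels chosen for this sketch, not model slugs, and the figures are the approximate ones listed above; actual pricing should come from GET /v1/models.

```python
# Approximate credits per image, copied from the rules of thumb (not authoritative).
CREDITS_PER_IMAGE = {
    "dall-e-3-standard": 4.0,
    "dall-e-3-hd": 8.0,
    "imagen-4": 8.0,
    "nano-banana": 1.5,
    "flux-self-hosted": 1.0,
}


def estimate_credits(tier: str, n_images: int) -> float:
    """Estimate the credits needed to generate `n_images` at the given tier."""
    if n_images < 0:
        raise ValueError("n_images must be non-negative")
    return CREDITS_PER_IMAGE[tier] * n_images


# 20 drafts on Nano Banana vs. 20 finals on DALL·E 3 HD
draft_cost = estimate_credits("nano-banana", 20)   # 30.0
final_cost = estimate_credits("dall-e-3-hd", 20)   # 160.0
```

At these rates, iterating on a cheap model and switching to a more expensive one only for the final renders keeps most of the spend on images you intend to keep.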

