file_id.
When to inline vs. upload
| You… | Use |
|---|---|
| Send a one-off image/PDF you already have at a URL | Inline image_url / file (data URI) |
| Re-use the same document across many calls | Upload once, reference the file_id |
| Need workspace-wide access (multiple keys, members) | Upload — files are workspace-scoped |
| Care about idempotency on retries | Upload + Idempotency-Key |
| Pass >20 MB | Upload — gateway request body cap is 20 MB |
Inline content blocks
messages[i].content can be an array of typed blocks:
Image (URL or base64)
detail (optional): "low" (faster, cheaper, ~85 tokens) or "high" (default for most models).
Audio (inline base64 only)
format: wav, mp3, pcm16, webm. The model must support audio input — check supportsAudioInput on GET /v1/models.
File — inline
File — by file_id
PDFs
Models withsupportsPdf: true (Anthropic Claude, Google Gemini, OpenAI gpt-4o) read PDFs natively. For others, the gateway transparently converts each page to an image and prepends the extracted text — you don’t change a thing, you just see a small pdf_processing line item on the next invoice.
Vision
Models withsupportsVision: true accept arbitrary images. URL fetches happen on the gateway with a 10-second timeout — if your URL is slow or behind auth, prefer base64 or upload.
Quick recipes
Multi-image diff
python
Reusable contract
python
file_id is referenced from many calls; you upload once.
Limits
- Per-call inline payload: 20 MB (sum of all base64 blocks)
- Per-file upload: plan-based (see Plans)
- Image dimensions: rescaled by the provider — no need to pre-resize
- PDF pages: practical cap ~100 (model context window limits dominate)

