Sample output
30-second clip generated with lyria-3-clip-preview (Google Lyria). Prompt: “Uplifting modern electronic track with warm synths, gentle beat, optimistic mood.”
Pick a model
| Model | Strong for | Duration | Notes |
|---|---|---|---|
| lyria-3-clip (Google) | Instrumentals, atmosphere, demos | Fixed 30 s | English-only prompts; fast; no vocals |
| lyria-3-pro (Google) | Longer, higher-fidelity instrumentals | Up to ~2 min | English-only; richer mix |
| suno-v4 | Full songs with vocals, any genre | 30 s – 5 min | Lyrics + singing supported |
| suno-v5 | Latest Suno: better vocals, sound effects | 30 s – 5 min | Supports sounds operation |
List the available music models with GET /v1/models?modality=music. Each model exposes allowedParams.response_formats, allowedParams.operations, and allowedParams.default_params.
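The capability fields can be used to pick a model programmatically. The following is a minimal sketch, assuming the endpoint's JSON has already been parsed into a list of model dicts. The id key and the sample data are assumptions for illustration, not the gateway's documented response shape.

```python
from typing import Any


def models_supporting(models: list[dict[str, Any]], operation: str) -> list[str]:
    """Return ids of models whose allowedParams.operations lists `operation`."""
    matches = []
    for model in models:
        allowed = model.get("allowedParams") or {}
        if operation in (allowed.get("operations") or []):
            matches.append(model["id"])
    return matches


# Hypothetical sample shaped like the fields described above.
sample_models = [
    {"id": "lyria-3-clip", "allowedParams": {"operations": ["generate"]}},
    {"id": "suno-v5", "allowedParams": {"operations": ["generate", "sounds"]}},
]
print(models_supporting(sample_models, "sounds"))
```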
Rule of thumb:
- Background / UI / ad beds → Lyria (instrumental, cheap, fast).
- Songs with lyrics, creator content → Suno.
- Sound effects, short stingers → Suno V5 sounds operation.
Prompting that renders
Music models are literal about genre, instruments and mood, but imprecise about tempo and key unless forced.
- Say “no vocals” explicitly when you want an instrumental; Suno defaults to vocals.
- Reference era + region for style anchoring (“90s British trip-hop”, “early-2000s West Coast hip-hop”).
- Describe production, not just genre — “lo-fi tape warmth, sidechained pad, shuffled hi-hat” lands better than “chill beat”.
- Specify BPM when it matters (ads, workouts). Models don’t always honour it, but without it you get unpredictable tempo.
- Avoid copyrighted artist names. Use stylistic descriptors instead (“operatic rock anthem in the vein of stadium classics”, not “in the style of Queen”).
Async like video
Music generation is asynchronous and follows the same submit → poll pattern as video. Typical generation times:
- Lyria clip (30 s): ~20–40 s
- Lyria pro (2 min): ~60–120 s
- Suno full song (3 min): ~60–90 s
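The poll half of the pattern can be sketched as follows. This is an illustration under stated assumptions, not the gateway's client: fetch_status is a caller-supplied function that performs the actual status request, and the "status" key with its "completed" and "failed" values are hypothetical names. The 5-second default matches the interval recommended under Pitfalls.

```python
import time
from typing import Any, Callable


def poll_until_done(
    fetch_status: Callable[[], dict[str, Any]],
    interval_s: float = 5.0,
    timeout_s: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict[str, Any]:
    """Call fetch_status every interval_s seconds until the job finishes.

    The "status" key and its values are assumed names, not confirmed ones.
    """
    waited = 0.0
    while True:
        job = fetch_status()
        status = job.get("status")
        if status == "completed":
            return job
        if status == "failed":
            raise RuntimeError(f"generation failed: {job.get('error')}")
        if waited >= timeout_s:
            raise TimeoutError(f"job not finished after {waited:.0f} s")
        sleep(interval_s)
        waited += interval_s


# Simulated job that completes on the third poll (no network needed).
responses = iter([
    {"status": "queued"},
    {"status": "processing"},
    {"status": "completed", "data": [{"url": "https://example.invalid/track.mp3"}]},
])
job = poll_until_done(lambda: next(responses), sleep=lambda s: None)
print(job["data"][0]["url"])
```

Passing sleep as a parameter keeps the loop testable without real waiting; in production the default time.sleep applies.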
Suno custom mode
By default Suno writes the lyrics for you from your prompt. For full control, enable custom_mode:
- custom_mode: true unlocks title, style, and lyrics.
- style: genre/style string (max 1000 chars).
- lyrics: bring your own lyrics (max ~5000 chars); use [Verse] / [Chorus] / [Bridge] section tags for structure.
- instrumental: true generates without vocals even when lyrics are supplied.
- vocal_gender: m or f.
- negative_tags: styles to steer away from.
- style_weight, weirdness_constraint, audio_weight: 0.0–1.0 dials.
- persona_id + persona_model: reuse a stylistic or vocal persona.
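A custom-mode request body might be assembled as in the sketch below. The field names come from the list above; the model key, the flat payload shape, and the build_custom_payload helper itself are assumptions for illustration. Rejecting instrumental together with lyrics is a design choice of this sketch (the gateway would simply ignore the lyrics).

```python
from typing import Any

MAX_STYLE_CHARS = 1000   # limit stated above
MAX_LYRICS_CHARS = 5000  # approximate limit stated above


def build_custom_payload(model: str, title: str, style: str,
                         lyrics: str = "", **extras: Any) -> dict[str, Any]:
    """Assemble a custom-mode body and check the documented length limits."""
    if len(style) > MAX_STYLE_CHARS:
        raise ValueError("style exceeds 1000 characters")
    if len(lyrics) > MAX_LYRICS_CHARS:
        raise ValueError("lyrics exceed ~5000 characters")
    if extras.get("instrumental") and lyrics:
        raise ValueError("instrumental ignores lyrics; set one or the other")
    payload: dict[str, Any] = {
        "model": model,  # assumed key name
        "custom_mode": True,
        "title": title,
        "style": style,
    }
    if lyrics:
        payload["lyrics"] = lyrics
    payload.update(extras)  # e.g. vocal_gender, negative_tags, style_weight
    return payload


payload = build_custom_payload(
    "suno-v5",
    title="Harbour Lights",
    style="90s British trip-hop, lo-fi tape warmth",
    lyrics="[Verse]\nFog rolls in over the quay\n[Chorus]\nHarbour lights, carry me home",
    vocal_gender="f",
    style_weight=0.7,
)
print(sorted(payload))
```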
Operation modes (Suno)
Suno supports several operations beyond plain generation, dispatched via the operation field:
| Operation | What it does | Required fields |
|---|---|---|
| generate (default) | Text-to-music | prompt |
| extend | Continue an existing track | audio_id, optional continue_at |
| upload_cover | Cover song from uploaded audio | upload_url |
| upload_extend | Extend uploaded audio | upload_url |
| add_instrumental | Make instrumental version | upload_url, tags |
| add_vocals | Add vocals to instrumental | upload_url |
| vocal_removal | Separate vocals / stems | audio_id, task_id, separation_type |
| sounds (V5) | Sound effect with BPM/key/loop | prompt, sound_loop, sound_tempo, sound_key |
| lyrics | Generate lyrics only (no audio) | prompt |
Extra outputs (for example, stems from vocal_removal) are returned as additional items in result.data.
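The required fields in the table lend themselves to a client-side check before submitting. The sketch below mirrors the table as a lookup; the missing_fields helper is hypothetical, and it treats every field the table lists for sounds as required, which is a literal reading of the table rather than confirmed behaviour.

```python
# Required fields per operation, copied from the table above.
REQUIRED_FIELDS = {
    "generate": ["prompt"],
    "extend": ["audio_id"],  # continue_at is optional
    "upload_cover": ["upload_url"],
    "upload_extend": ["upload_url"],
    "add_instrumental": ["upload_url", "tags"],
    "add_vocals": ["upload_url"],
    "vocal_removal": ["audio_id", "task_id", "separation_type"],
    "sounds": ["prompt", "sound_loop", "sound_tempo", "sound_key"],
    "lyrics": ["prompt"],
}


def missing_fields(payload: dict) -> list[str]:
    """Return the required fields absent from payload for its operation."""
    operation = payload.get("operation", "generate")  # generate is the default
    if operation not in REQUIRED_FIELDS:
        raise ValueError(f"unknown operation: {operation}")
    return [name for name in REQUIRED_FIELDS[operation] if name not in payload]


print(missing_fields({"operation": "add_instrumental", "upload_url": "https://example.invalid/a.mp3"}))
```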
Persistence
result.data[].url is a signed URL into our private storage with a 7-day expiry. For long-term keeping, download the file and copy it to your own storage before the link expires.
When a track carries b64_audio instead, decode it and save the file yourself.
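Both persistence paths can be handled by one helper. This is a sketch using only the Python standard library; save_track is a hypothetical name, and it assumes each result.data item is a dict that may hold url, b64_audio, or both. It prefers url when both are present and reads the whole file into memory, which is acceptable for tracks of a few minutes.

```python
import base64
import tempfile
import urllib.request
from pathlib import Path


def save_track(track: dict, dest: Path) -> Path:
    """Write one result.data item to dest, preferring url over b64_audio."""
    if track.get("url"):
        # Signed URL: fetch the bytes before the 7-day expiry.
        with urllib.request.urlopen(track["url"], timeout=60) as response:
            dest.write_bytes(response.read())
    elif track.get("b64_audio"):
        dest.write_bytes(base64.b64decode(track["b64_audio"]))
    else:
        raise ValueError("track has neither url nor b64_audio")
    return dest


# Offline demo using the base64 path only.
demo = {"b64_audio": base64.b64encode(b"ID3 demo bytes").decode("ascii")}
saved = save_track(demo, Path(tempfile.gettempdir()) / "demo_track.mp3")
print(saved.name)
```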
Costs at a glance
Per 30-second clip / per 1-minute song:
- Lyria 3 clip: ~10 credits per 30 s
- Lyria 3 pro: ~25 credits per minute
- Suno v4: ~30 credits per minute
- Suno v5: ~40 credits per minute
- Vocal removal (separate_vocal): ~10 credits; split_stem (up to 12 stems): ~30 credits
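For budgeting, the approximate rates above can be turned into a rough estimate. The sketch assumes per-minute models are billed pro rata by duration; actual rounding and billing rules are not stated here, so treat the result as a ballpark. The estimate_credits helper is hypothetical.

```python
# Approximate rates from the list above.
CREDITS_PER_MINUTE = {"lyria-3-pro": 25, "suno-v4": 30, "suno-v5": 40}
LYRIA_CLIP_CREDITS = 10  # flat, per fixed 30 s clip


def estimate_credits(model: str, duration_s: float = 30.0) -> float:
    """Rough credit estimate, assuming pro-rata billing (an assumption)."""
    if model == "lyria-3-clip":
        return float(LYRIA_CLIP_CREDITS)
    if model not in CREDITS_PER_MINUTE:
        raise KeyError(f"no rate listed for {model}")
    return CREDITS_PER_MINUTE[model] * duration_s / 60.0


print(estimate_credits("suno-v4", 180))  # 3-minute song → 90.0
```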
Formats
- Default: mp3 (~128–192 kbps, universally supported, small enough to stream).
- wav is supported on Lyria 3 Pro and all Suno models; use it for post-production editing only, since files are ~10× larger.
Pitfalls
- Non-English prompts to Lyria → 400 with a clear message. Translate or use Suno instead.
- Copyrighted artists/songs — refusal from upstream; not retryable by fallback. Describe the sound, not the artist.
- Suno instrumental: true with lyrics provided. The lyrics are ignored and the track is instrumental. Don’t pay for generation that ignores a key input; set one or the other.
- Polling too fast. Stick to 5-second intervals. Faster polling won’t speed up generation; it’ll just eat your RPM.
- Storing 5-minute tracks as base64 in JSON. The gateway already offloads to GCS and returns url; always prefer url over b64_audio when both are present.

