POST /v1/music/generations
Generate music from text prompt
curl --request POST \
  --url https://api.infery.ai/v1/music/generations \
  --header 'Authorization: <api-key>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "model": "<string>",
  "prompt": "<string>",
  "operation": "generate",
  "images": [
    {
      "data": "<string>",
      "mime_type": "image/jpeg"
    }
  ],
  "response_format": "mp3",
  "custom_mode": true,
  "instrumental": true,
  "title": "<string>",
  "style": "<string>",
  "lyrics": "<string>",
  "negative_tags": "<string>",
  "vocal_gender": "m",
  "style_weight": 0.5,
  "weirdness_constraint": 0.5,
  "audio_weight": 0.5,
  "persona_id": "<string>",
  "persona_model": "style_persona",
  "audio_id": "<string>",
  "task_id": "<string>",
  "upload_url": "<string>",
  "continue_at": 123,
  "default_param_flag": true,
  "separation_type": "separate_vocal",
  "tags": "<string>",
  "sound_loop": true,
  "sound_tempo": 150,
  "sound_key": "<string>"
}
'
{
  "created": 1713204900,
  "data": [
    {
      "url": "https://storage.googleapis.com/.../music/...mp3?X-Goog-Signature=...",
      "b64_audio": "<string>",
      "content_type": "audio/mpeg",
      "duration_seconds": 30,
      "lyrics": "<string>"
    }
  ],
  "credits_used": 180
}
curl https://api.infery.ai/v1/music/generations \
  -H "Authorization: Bearer $INFERY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "suno-v4",
    "prompt": "Upbeat indie folk about a sunrise coastal drive",
    "duration_seconds": 60
  }'
Music generation is asynchronous: the request returns a job_id, and you poll GET /v1/music/generations/:job_id for progress. When the job completes, you get a URL to the MP3, plus separate stems if the model exposes them (Suno supports vocals and instrumentals).
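
The polling flow described above can be sketched in Python as follows. This is a minimal sketch under assumptions, not the documented client: the shape of the job-status response is not specified on this page, so the `status` field and the `completed` / `failed` values are hypothetical, as are the helper names `http_fetch_status` and `poll_job`.

```python
import json
import time
import urllib.request

BASE_URL = "https://api.infery.ai/v1/music/generations"

# Assumption: the status payload has a "status" field with these terminal values.
TERMINAL_STATES = ("completed", "failed")


def http_fetch_status(job_id, api_key):
    """GET /v1/music/generations/:job_id and return the parsed JSON body."""
    req = urllib.request.Request(
        f"{BASE_URL}/{job_id}",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def poll_job(job_id, fetch_status, interval_s=2.0, timeout_s=300.0, sleep=time.sleep):
    """Call fetch_status(job_id) until a terminal state is seen or the timeout elapses."""
    waited = 0.0
    while True:
        job = fetch_status(job_id)
        if job.get("status") in TERMINAL_STATES:
            return job
        if waited >= timeout_s:
            raise TimeoutError(f"job {job_id} not finished after {timeout_s}s")
        sleep(interval_s)
        waited += interval_s
```

A caller would typically pass the HTTP helper in, for example `poll_job(job_id, lambda j: http_fetch_status(j, api_key))`. Injecting `fetch_status` and `sleep` keeps the loop testable without network access.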

Sample output

30-second clip generated with lyria-3-clip-preview (Google Lyria).

Parameters

  • prompt: natural-language description
  • style: optional style tag (lofi, synthwave, indie, etc.)
  • lyrics: optional explicit lyrics
  • vocal_gender: male, female, none (the Body schema below lists the accepted values as m and f)
  • duration_seconds: 30–180 (model-dependent)
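
As an illustration of the parameters listed above, the request body can be assembled and checked before sending. This is a sketch only: `build_simple_request` is a hypothetical helper, and the 30–180 check mirrors the range stated above, which the page says is model-dependent, so a given model may accept a narrower range.

```python
def build_simple_request(model, prompt, style=None, lyrics=None, duration_seconds=None):
    """Build a JSON-serializable body for POST /v1/music/generations."""
    if not prompt:
        raise ValueError("prompt is required")
    if duration_seconds is not None and not 30 <= duration_seconds <= 180:
        # Range taken from the parameter list; individual models may differ.
        raise ValueError("duration_seconds must be between 30 and 180")

    payload = {"model": model, "prompt": prompt}
    # Optional fields are sent only when the caller provides them.
    if style is not None:
        payload["style"] = style
    if lyrics is not None:
        payload["lyrics"] = lyrics
    if duration_seconds is not None:
        payload["duration_seconds"] = duration_seconds
    return payload
```

Calling `build_simple_request("suno-v4", "Upbeat indie folk about a sunrise coastal drive", duration_seconds=60)` reproduces the body of the curl example above.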

Supported models

Suno is the primary provider. Google Lyria is available in beta. See Models → Music.

Authorizations

Authorization
string
header
required

API key in format: Bearer inf_***
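
A small sketch of building the request headers from the format above. The `auth_headers` helper is hypothetical, and the `inf_` prefix check is only inferred from the `Bearer inf_***` pattern; drop it if your keys look different.

```python
def auth_headers(api_key):
    """Return the headers for a JSON request authenticated with a Bearer key."""
    # Assumption: keys start with "inf_", as suggested by "Bearer inf_***".
    if not api_key.startswith("inf_"):
        raise ValueError("expected an API key starting with 'inf_'")
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
```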

Body

application/json
model
string
required

Model ID to use for music generation

prompt
string
required

Text prompt describing the music. For Lyria 3 Clip, always produces 30s. For Lyria 3 Pro, control duration via prompt or timestamps. For Suno, max 500 chars in non-custom mode, up to 5000 in custom mode.
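
The Suno prompt limits above can be checked client-side before a request is sent. This is a minimal sketch; `check_suno_prompt` is a hypothetical helper, and it counts Python string characters, which may not match how the service counts them.

```python
SUNO_PROMPT_LIMIT = 500           # non-custom mode
SUNO_CUSTOM_PROMPT_LIMIT = 5000   # custom mode


def check_suno_prompt(prompt, custom_mode=False):
    """Return the prompt unchanged, or raise if it exceeds the Suno limit."""
    limit = SUNO_CUSTOM_PROMPT_LIMIT if custom_mode else SUNO_PROMPT_LIMIT
    if len(prompt) > limit:
        raise ValueError(f"prompt has {len(prompt)} chars, limit is {limit}")
    return prompt
```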

operation
enum<string>
default:generate

Suno operation type. Default = generate. Other operations require additional fields (audio_id, upload_url, etc).

Available options: generate, extend, upload_cover, upload_extend, add_instrumental, add_vocals, sounds, vocal_removal, lyrics

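
The relationship between an operation and its additional fields can be sketched as a lookup table. The mapping below is inferred from the field descriptions on this page (audio_id, task_id, upload_url); whether the server treats each one as strictly required is an assumption, and `missing_fields` is a hypothetical helper.

```python
# Inferred from the field descriptions; not an authoritative list.
FIELDS_BY_OPERATION = {
    "extend": ["audio_id"],
    "vocal_removal": ["audio_id", "task_id"],
    "upload_cover": ["upload_url"],
    "upload_extend": ["upload_url"],
    "add_instrumental": ["upload_url"],
    "add_vocals": ["upload_url"],
}


def missing_fields(payload):
    """List the operation-specific fields that are absent or empty in payload."""
    operation = payload.get("operation", "generate")  # documented default
    expected = FIELDS_BY_OPERATION.get(operation, [])
    return [name for name in expected if not payload.get(name)]
```

For example, a `vocal_removal` payload that carries only `audio_id` would report `["task_id"]`.
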
images
object[]

Up to 10 base64-encoded images to inspire the music (Lyria 3 only)
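
A sketch of preparing the `images` array from raw bytes. It assumes the `data` field carries bare base64 text without a `data:` URL prefix, which this page does not confirm; `encode_images` is a hypothetical helper.

```python
import base64

MAX_IMAGES = 10  # limit stated above


def encode_images(images):
    """Convert (raw_bytes, mime_type) pairs into the objects used by `images`."""
    if len(images) > MAX_IMAGES:
        raise ValueError(f"at most {MAX_IMAGES} images are allowed")
    return [
        {"data": base64.b64encode(raw).decode("ascii"), "mime_type": mime_type}
        for raw, mime_type in images
    ]
```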

response_format
enum<string>
default:mp3

Output format. WAV only supported by Lyria 3 Pro and Suno.

Available options: mp3, wav

custom_mode
boolean

Suno: Enable custom mode (full control over style/title/lyrics)

instrumental
boolean

Suno: Generate instrumental track only (no vocals)

title
string

Suno: Track title (custom mode, max 100 chars)

style
string

Suno: Music genre/style (custom mode, max 1000 chars)

lyrics
string

Suno: Lyrics text (custom mode, when not instrumental)

negative_tags
string

Suno: Styles to exclude (e.g. "Heavy Metal, Upbeat Drums")

vocal_gender
enum<string>

Suno: Preferred vocal gender

Available options: m, f

style_weight
number

Suno: Style adherence weight (0.0-1.0)

Required range: 0 <= x <= 1
weirdness_constraint
number

Suno: Creativity/novelty constraint (0.0-1.0)

Required range: 0 <= x <= 1
audio_weight
number

Suno: Input audio influence weight (0.0-1.0)

Required range: 0 <= x <= 1
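
The custom-mode fields and the three weights above can be combined into one validated payload. This is a sketch under assumptions: `build_custom_payload` is a hypothetical helper, the limits are the ones stated in the field descriptions, and the server may apply further rules not listed here.

```python
def build_custom_payload(model, prompt, title, style, lyrics=None, instrumental=False,
                         style_weight=None, weirdness_constraint=None, audio_weight=None):
    """Build a Suno custom-mode request body with client-side checks."""
    if len(title) > 100:
        raise ValueError("title is limited to 100 characters")
    if len(style) > 1000:
        raise ValueError("style is limited to 1000 characters")

    payload = {
        "model": model,
        "prompt": prompt,
        "custom_mode": True,
        "instrumental": instrumental,
        "title": title,
        "style": style,
    }
    # Lyrics apply only when the track is not instrumental.
    if lyrics is not None and not instrumental:
        payload["lyrics"] = lyrics

    weights = {
        "style_weight": style_weight,
        "weirdness_constraint": weirdness_constraint,
        "audio_weight": audio_weight,
    }
    for name, value in weights.items():
        if value is None:
            continue
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be within 0.0-1.0")
        payload[name] = value
    return payload
```
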
persona_id
string

Suno: Persona ID to apply (custom mode)

persona_model
enum<string>

Suno: Persona model type

Available options: style_persona, voice_persona

audio_id
string

Suno: Source audio ID (for extend, vocal_removal)

task_id
string

Suno: Task ID (vocal_removal — references original generation task)

upload_url
string

Suno: Audio file URL (for upload_cover, upload_extend, add_instrumental, add_vocals)

continue_at
number

Suno: Continue from this second mark (extend operations)

default_param_flag
boolean

Suno: Use default params (extend operations)

separation_type
enum<string>

Suno: Vocal removal type (2 stems vs up to 12 stems)

Available options: separate_vocal, split_stem

tags
string

Suno: Tags for add_instrumental operation

sound_loop
boolean

Suno sounds: Loop the generated sound

sound_tempo
integer

Suno sounds: BPM (1-300)

Required range: 1 <= x <= 300
sound_key
string

Suno sounds: Musical key
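
A sketch of a request body for the `sounds` operation, using the three sound fields above. `build_sounds_payload` is a hypothetical helper; the accepted values for `sound_key` are not listed on this page, so the key is passed through unchecked.

```python
def build_sounds_payload(model, prompt, tempo=None, loop=None, key=None):
    """Build a body for operation "sounds" with the BPM range checked."""
    payload = {"model": model, "prompt": prompt, "operation": "sounds"}
    if tempo is not None:
        # sound_tempo is an integer BPM in the range 1-300.
        if isinstance(tempo, bool) or not isinstance(tempo, int) or not 1 <= tempo <= 300:
            raise ValueError("sound_tempo must be an integer between 1 and 300")
        payload["sound_tempo"] = tempo
    if loop is not None:
        payload["sound_loop"] = bool(loop)
    if key is not None:
        payload["sound_key"] = key  # format not documented here
    return payload
```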

Response

Music generation result. Audio is uploaded to private storage and returned as a signed url (7-day expiry). If the storage upload fails, the response falls back to inline b64_audio.
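
Because the audio may arrive either as a signed `url` or as inline `b64_audio`, a client needs to handle both. The sketch below shows one way to do that; `extract_audio` is a hypothetical helper, and it assumes the field that is not in use is absent, null, or empty.

```python
import base64


def extract_audio(response):
    """Return a list of ("url", str) or ("bytes", bytes) entries, one per data item."""
    results = []
    for item in response.get("data", []):
        if item.get("url"):
            # Normal case: signed URL to the stored audio (expires after 7 days).
            results.append(("url", item["url"]))
        elif item.get("b64_audio"):
            # Fallback case: the storage upload failed and the audio is inline.
            results.append(("bytes", base64.b64decode(item["b64_audio"])))
    return results
```

Since the signed URL expires, a caller that needs the audio later may want to download it promptly instead of storing the URL.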

created
integer

Unix timestamp (seconds) when the generation completed

Example:

1713204900

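data
object[]

Generated audio items. Each item can include url, b64_audio, content_type, duration_seconds, and lyrics, as in the example response above.
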
credits_used
integer

Credits deducted from the workspace balance for this request

Example:

180