Music

curl --request POST \
  --url https://api.infery.ai/v1/music/generations \
  --header 'Authorization: <api-key>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "model": "<string>",
  "prompt": "<string>",
  "operation": "generate",
  "images": [
    {
      "data": "<string>",
      "mime_type": "image/jpeg"
    }
  ],
  "response_format": "mp3",
  "custom_mode": true,
  "instrumental": true,
  "title": "<string>",
  "style": "<string>",
  "lyrics": "<string>",
  "negative_tags": "<string>",
  "vocal_gender": "m",
  "style_weight": 0.5,
  "weirdness_constraint": 0.5,
  "audio_weight": 0.5,
  "persona_id": "<string>",
  "persona_model": "style_persona",
  "audio_id": "<string>",
  "task_id": "<string>",
  "upload_url": "<string>",
  "continue_at": 123,
  "default_param_flag": true,
  "separation_type": "separate_vocal",
  "tags": "<string>",
  "sound_loop": true,
  "sound_tempo": 150,
  "sound_key": "<string>"
}
'

{
  "created": 1713204900,
  "data": [
    {
      "url": "https://storage.googleapis.com/.../music/...mp3?X-Goog-Signature=...",
      "b64_audio": "<string>",
      "content_type": "audio/mpeg",
      "duration_seconds": 30,
      "lyrics": "<string>"
    }
  ],
  "credits_used": 180
}

curl https://api.infery.ai/v1/music/generations \
  -H "Authorization: Bearer $INFERY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "suno-v4",
    "prompt": "Upbeat indie folk about a sunrise coastal drive",
    "duration_seconds": 60
  }'

Supported models

Suno is the primary provider. Google Lyria is available while it is in beta. See Models → Music.

Authorizations

Authorization

string

header

required

API key in format: Bearer inf_***
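As a rough illustration of the header format above, the following Python sketch prepares (but does not send) a request to the endpoint. The helper name build_request is hypothetical, and the model ID suno-v4 is taken from the curl example on this page; treat both as assumptions rather than an official client.

```python
import json
import urllib.request

API_URL = "https://api.infery.ai/v1/music/generations"


def build_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Prepare a POST request with the Bearer header; nothing is sent here."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # The key is sent as "Bearer inf_***", as described above.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Hypothetical usage: "inf_example" is a placeholder, not a real key.
req = build_request("inf_example", {"model": "suno-v4", "prompt": "Upbeat indie folk"})
print(req.get_method(), req.full_url)
```

Passing the prepared request to urllib.request.urlopen(req) would perform the call; error handling and retries are left out of this sketch.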

Body

application/json

model

string

required

Model ID to use for music generation

prompt

string

required

Text prompt describing the music. Lyria 3 Clip always produces a 30-second clip. For Lyria 3 Pro, control the duration through the prompt or timestamps. For Suno, the limit is 500 characters in non-custom mode and 5000 characters in custom mode.
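The Suno prompt limits can be checked on the client before a request is sent. This is a minimal sketch; the function name check_suno_prompt is hypothetical, and counting characters with len() is an assumption about how the service measures length.

```python
def check_suno_prompt(prompt: str, custom_mode: bool = False) -> None:
    """Raise ValueError if the prompt exceeds the documented Suno limit."""
    # 500 characters in non-custom mode, 5000 in custom mode.
    limit = 5000 if custom_mode else 500
    if len(prompt) > limit:
        raise ValueError(
            f"prompt has {len(prompt)} characters; limit is {limit} "
            f"(custom_mode={custom_mode})"
        )


check_suno_prompt("Upbeat indie folk about a sunrise coastal drive")
```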

operation

enum<string>

default:generate

Suno operation type. Operations other than generate require additional fields (audio_id, upload_url, etc.).

Available options: generate, extend, upload_cover, upload_extend, add_instrumental, add_vocals, sounds, vocal_removal, lyrics
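Since several operations need extra fields, a client may want to check a payload before sending it. The mapping in this sketch is inferred from the field descriptions on this page (audio_id, task_id, upload_url); whether the server enforces exactly these requirements is an assumption, and missing_fields is a hypothetical helper name.

```python
# Extra fields per operation, inferred from the field descriptions
# on this page. Operations not listed are assumed to need none.
REQUIRED_FIELDS = {
    "extend": ["audio_id"],
    "vocal_removal": ["audio_id", "task_id"],
    "upload_cover": ["upload_url"],
    "upload_extend": ["upload_url"],
    "add_instrumental": ["upload_url"],
    "add_vocals": ["upload_url"],
}


def missing_fields(payload: dict) -> list:
    """Return the extra fields the chosen operation appears to need but lacks."""
    operation = payload.get("operation", "generate")  # documented default
    return [
        name
        for name in REQUIRED_FIELDS.get(operation, [])
        if not payload.get(name)
    ]


print(missing_fields({"model": "suno-v4", "prompt": "x", "operation": "extend"}))
```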

images

object[]

Up to 10 base64-encoded images to inspire the music (Lyria 3 only)
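A sketch of how the images array might be assembled, assuming each entry carries the data and mime_type keys shown in the request sample. The helper name encode_images is hypothetical.

```python
import base64

MAX_IMAGES = 10  # documented upper bound


def encode_images(images: list) -> list:
    """Turn (raw_bytes, mime_type) pairs into entries for the images field."""
    if len(images) > MAX_IMAGES:
        raise ValueError(f"at most {MAX_IMAGES} images are allowed, got {len(images)}")
    return [
        {"data": base64.b64encode(raw).decode("ascii"), "mime_type": mime}
        for raw, mime in images
    ]


# In practice raw_bytes would come from reading an image file in binary mode.
entries = encode_images([(b"abc", "image/jpeg")])
```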

response_format

enum<string>

default:mp3

Output format. WAV is supported only by Lyria 3 Pro and Suno.

Available options: mp3, wav

custom_mode

boolean

Suno: Enable custom mode (full control over style/title/lyrics)

instrumental

boolean

Suno: Generate instrumental track only (no vocals)

title

string

Suno: Track title (custom mode, max 100 chars)

style

string

Suno: Music genre/style (custom mode, max 1000 chars)

lyrics

string

Suno: Lyrics text (custom mode, when not instrumental)
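The custom-mode fields can be combined into one payload. The sketch below applies the documented length limits for title and style and omits lyrics for instrumental tracks; the function name build_custom_payload is hypothetical, and leaving lyrics out when instrumental is true is an interpretation of the descriptions above, not confirmed behavior.

```python
from typing import Optional


def build_custom_payload(model: str, prompt: str, title: str, style: str,
                         lyrics: Optional[str] = None,
                         instrumental: bool = False) -> dict:
    """Assemble a Suno custom-mode request body with basic length checks."""
    if len(title) > 100:
        raise ValueError("title exceeds 100 characters")
    if len(style) > 1000:
        raise ValueError("style exceeds 1000 characters")
    payload = {
        "model": model,
        "prompt": prompt,
        "custom_mode": True,
        "instrumental": instrumental,
        "title": title,
        "style": style,
    }
    # Lyrics apply only when the track is not instrumental.
    if lyrics and not instrumental:
        payload["lyrics"] = lyrics
    return payload


body = build_custom_payload("suno-v4", "Coastal drive at sunrise",
                            "Sunrise Road", "indie folk", lyrics="Windows down...")
```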

negative_tags

string

Suno: Styles to exclude (e.g. "Heavy Metal, Upbeat Drums")

vocal_gender

enum<string>

Suno: Preferred vocal gender

Available options: m, f

style_weight

number

Suno: Style adherence weight (0.0-1.0)

Required range: 0 <= x <= 1

weirdness_constraint

number

Suno: Creativity/novelty constraint (0.0-1.0)

Required range: 0 <= x <= 1

audio_weight

number

Suno: Input audio influence weight (0.0-1.0)

Required range: 0 <= x <= 1
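The three weight fields share the same range, so one check covers them all. This is a minimal client-side sketch with a hypothetical helper name; the server presumably performs its own validation.

```python
WEIGHT_FIELDS = ("style_weight", "weirdness_constraint", "audio_weight")


def out_of_range_weights(payload: dict) -> list:
    """Return the names of weight fields present in payload but outside 0..1."""
    return [
        name
        for name in WEIGHT_FIELDS
        if name in payload and not 0 <= payload[name] <= 1
    ]


print(out_of_range_weights({"style_weight": 0.5, "audio_weight": 1.2}))
```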

persona_id

string

Suno: Persona ID to apply (custom mode)

persona_model

enum<string>

Suno: Persona model type

Available options: style_persona, voice_persona

audio_id

string

Suno: Source audio ID (for extend, vocal_removal)

task_id

string

Suno: Task ID of the original generation task (for vocal_removal)

upload_url

string

Suno: Audio file URL (for upload_cover, upload_extend, add_instrumental, add_vocals)

continue_at

number

Suno: Continue from this second mark (extend operations)

default_param_flag

boolean

Suno: Use default params (extend operations)

separation_type

enum<string>

Suno: Vocal removal type (separate_vocal produces 2 stems; split_stem produces up to 12 stems)

Available options: separate_vocal, split_stem

Response

Music generation result. Audio is uploaded to private storage and returned as a signed URL that expires after 7 days. If the storage upload fails, the response falls back to inline b64_audio.

created

integer

Unix timestamp (seconds) when the generation completed

Example:

1713204900

data

object[]

credits_used

integer

Credits deducted from the workspace balance for this request

Example:

180
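Because a result item may carry either a signed URL or inline base64 audio, client code needs to handle both. The sketch below assumes the item keys shown in the sample output (url, b64_audio) and prefers the URL when both are present; the function name extract_audio and the tuple return shape are illustrative choices.

```python
import base64


def extract_audio(response: dict) -> list:
    """Return ("url", str) or ("bytes", bytes) for each item in response["data"]."""
    results = []
    for item in response.get("data", []):
        if item.get("url"):
            # Normal case: signed URL pointing at private storage.
            results.append(("url", item["url"]))
        elif item.get("b64_audio"):
            # Fallback case: audio returned inline as base64.
            results.append(("bytes", base64.b64decode(item["b64_audio"])))
    return results


sample = {"created": 1713204900, "data": [{"b64_audio": "YWJj"}], "credits_used": 180}
print(extract_audio(sample))
```

Since the signed URL expires, a caller that needs the audio long term would likely download and store it rather than keep the link.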
