The shape
The loop
Tool calling is a loop: model proposes call → you execute → you append the result → model continues.
Handle every entry in tool_calls, since a single assistant turn may request several calls, and append one tool message per tool_call_id.
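A minimal sketch of that loop, assuming OpenAI-style message dicts. Here call_model is a hypothetical stand-in for whatever client call you make, and handlers is a hypothetical map from tool names to your own functions; neither name comes from this API.

```python
import json
from typing import Any, Callable

Message = dict[str, Any]


def run_tool_loop(
    call_model: Callable[[list[Message]], Message],
    handlers: dict[str, Callable[..., Any]],
    messages: list[Message],
    max_turns: int = 8,
) -> Message:
    """Drive the propose -> execute -> append -> continue cycle."""
    for _ in range(max_turns):
        assistant = call_model(messages)
        messages.append(assistant)  # keep the assistant turn in the history
        tool_calls = assistant.get("tool_calls") or []
        if not tool_calls:
            return assistant  # plain answer: the loop is done
        for call in tool_calls:
            fn = call["function"]
            args = json.loads(fn["arguments"] or "{}")
            result = handlers[fn["name"]](**args)
            # One tool message per tool_call_id, in the order requested.
            messages.append({
                "role": "tool",
                "tool_call_id": call["id"],
                "content": json.dumps(result),
            })
    raise RuntimeError("model kept calling tools past max_turns")
```

The max_turns cap is a design choice of this sketch, not a requirement: it stops a model that never settles on a final answer from looping forever.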
Forcing or banning tools
tool_choice controls the model’s freedom. Use tool_choice: "required" for deterministic agents (e.g. “always plan before answering”); use tool_choice: "none" when you want a plain text reply mid-conversation.
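As an illustration, the helper below builds a tool_choice value for a request body. It is a sketch under assumptions: build_tool_choice is a hypothetical name, and the "auto" value and the object form that pins one specific function follow the common OpenAI-style convention rather than anything stated on this page.

```python
from typing import Any, Optional


def build_tool_choice(mode: str, function_name: Optional[str] = None) -> Any:
    """Return a tool_choice value for the request body."""
    if mode in ("auto", "none", "required"):
        return mode
    if mode == "function":
        if not function_name:
            raise ValueError("function mode needs a function_name")
        # Assumed OpenAI-style shape for pinning one specific tool.
        return {"type": "function", "function": {"name": function_name}}
    raise ValueError(f"unknown tool_choice mode: {mode!r}")


# A planning turn that must call some tool, then a turn that must reply in text.
plan_request = {"tool_choice": build_tool_choice("required")}
text_request = {"tool_choice": build_tool_choice("none")}
```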
Streaming tool calls
When stream: true, tool call arguments arrive as incremental string fragments keyed by tool_calls[i].index. Concatenate the fragments per index, and parse the JSON only once the stream has finished.
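The per-index concatenation can be sketched as follows. The function takes the delta objects from each streamed chunk, assuming the OpenAI-style layout in which id and function.name appear on the first fragment of a call and function.arguments is split across fragments. accumulate_tool_calls is a hypothetical helper name.

```python
from typing import Any, Iterable


def accumulate_tool_calls(deltas: Iterable[dict[str, Any]]) -> list[dict[str, Any]]:
    """Merge streamed tool_calls fragments into complete calls, ordered by index."""
    calls: dict[int, dict[str, Any]] = {}
    for delta in deltas:
        for frag in delta.get("tool_calls") or []:
            slot = calls.setdefault(
                frag["index"], {"id": None, "name": "", "arguments": ""}
            )
            if frag.get("id"):
                slot["id"] = frag["id"]
            fn = frag.get("function") or {}
            if fn.get("name"):
                slot["name"] = fn["name"]
            # Arguments are raw string pieces; they are not valid JSON until joined.
            slot["arguments"] += fn.get("arguments") or ""
    return [calls[i] for i in sorted(calls)]
```

Fragments for different calls may interleave, which is why the accumulator is keyed by index rather than by arrival order.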
Schema tips
- Mark every truly-required arg in required. Models honour it.
- Keep description short and action-oriented (“Look up customer by email”): it is sent with every request and counts against the model’s context.
- Prefer enum over free strings for finite domains. It cuts hallucinated values.
- Nested objects work, but flatter is faster: less context, fewer formatting errors.
- Use additionalProperties: false if you want strict mode (with "strict": true on response_format) to refuse extra keys.
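A tool definition that follows these tips might look like the sketch below. The tool name, the region field and its enum values are invented for illustration, and the outer "type": "function" wrapper assumes the OpenAI-style tools format.

```python
# Hypothetical tool definition: flat parameters, a short description,
# an enum for the finite domain, and no extra keys allowed.
lookup_customer_tool = {
    "type": "function",
    "function": {
        "name": "lookup_customer",
        "description": "Look up customer by email",
        "parameters": {
            "type": "object",
            "properties": {
                "email": {"type": "string", "description": "Customer email address"},
                "region": {"type": "string", "enum": ["eu", "us", "apac"]},
            },
            "required": ["email", "region"],
            "additionalProperties": False,
        },
    },
}
```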
Cross-provider quirks
We normalise the surface, but a few sharp edges leak through:
- Anthropic models charge for the whole tool schema in input tokens, so keep schemas small.
- Google Gemini sometimes returns arguments as an already-parsed object instead of a JSON string; parse defensively (typeof x === 'string' ? JSON.parse(x) : x).
- OSS / smaller models may invent tool names. Validate function.name against your registry before invoking.
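The last two points can be handled with a pair of small guards. This is a sketch, not part of the API: parse_arguments and resolve_handler are hypothetical helpers, and registry stands for whatever mapping of tool names to handlers your application keeps.

```python
import json
from typing import Any, Callable


def parse_arguments(raw: Any) -> dict[str, Any]:
    """Accept arguments as a JSON string or as an already-parsed object."""
    if isinstance(raw, str):
        raw = json.loads(raw) if raw.strip() else {}
    if not isinstance(raw, dict):
        raise ValueError("tool arguments must decode to a JSON object")
    return raw


def resolve_handler(
    name: str, registry: dict[str, Callable[..., Any]]
) -> Callable[..., Any]:
    """Look up a tool by name, rejecting names the model made up."""
    try:
        return registry[name]
    except KeyError:
        raise ValueError(
            f"unknown tool {name!r}; known tools: {sorted(registry)}"
        ) from None
```

Raising ValueError here is one option; the message lists the valid names so it can be passed back to the model as a tool result if you want it to retry.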
Errors
If your handler throws, return the error as the tool result instead of letting it crash the loop.
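One way to do this is to wrap each handler call so that it always yields a string suitable for the tool message content. The wrapper name run_handler_safely and the {"ok": ..., "error": ...} envelope are assumptions of this sketch; any shape the model can read will do.

```python
import json
from typing import Any, Callable


def run_handler_safely(handler: Callable[..., Any], args: dict[str, Any]) -> str:
    """Run one tool handler and always return a string usable as tool content."""
    try:
        return json.dumps({"ok": True, "result": handler(**args)})
    except Exception as exc:  # deliberately broad: any failure becomes a tool result
        return json.dumps({"ok": False, "error": f"{type(exc).__name__}: {exc}"})
```

Because the error text goes back to the model, it can often correct its arguments and call the tool again on the next turn.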

