From: ExtReMLapin
Date: Fri, 29 Aug 2025 17:25:40 +0000 (+0200)
Subject: server : add documentation for `parallel_tool_calls` param (#15647)
X-Git-Tag: upstream/0.0.6527~208
X-Git-Url: https://git.djapps.eu/?a=commitdiff_plain;h=792b44f2ed9668cce7f267ff0ae4950ed9b4a5de;p=pkg%2Fggml%2Fsources%2Fllama.cpp

server : add documentation for `parallel_tool_calls` param (#15647)

Co-authored-by: Pierre F
---

diff --git a/docs/function-calling.md b/docs/function-calling.md
index 37eacaf3..67cf785c 100644
--- a/docs/function-calling.md
+++ b/docs/function-calling.md
@@ -21,6 +21,8 @@ Function calling is supported for all models (see https://github.com/ggml-org/ll
 - Use `--chat-template-file` to override the template when appropriate (see examples below)
 - Generic support may consume more tokens and be less efficient than a model's native format.
 
+- Multiple/parallel tool calling is supported on some models but disabled by default; enable it by passing `"parallel_tool_calls": true` in the completion endpoint payload.
+
 <summary>Show some common templates and which format handler they use</summary>

diff --git a/tools/server/README.md b/tools/server/README.md
index baf3730a..6962b0d3 100644
--- a/tools/server/README.md
+++ b/tools/server/README.md
@@ -1143,6 +1143,8 @@ The `response_format` parameter supports both plain JSON output (e.g. `{"type":
 
 `parse_tool_calls`: Whether to parse the generated tool call.
 
+`parallel_tool_calls`: Whether to enable parallel/multiple tool calls (only supported by some models; support is determined from the model's Jinja chat template).
+
 *Examples:*
 
 You can use either Python `openai` library with appropriate checkpoints:
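
For illustration only (not part of this patch), here is a minimal sketch of opting in to parallel tool calls through the server's OpenAI-compatible endpoint. It assumes a local `llama-server` listening on `http://localhost:8080` and a recent `openai` Python client that forwards `parallel_tool_calls`; the tool schema, model name, and API key below are placeholders.

```python
# Sketch: request parallel/multiple tool calls from a local llama-server
# (assumes the OpenAI-compatible endpoint at http://localhost:8080/v1).
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="sk-no-key-required")

# Placeholder tool definition, for illustration only.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

response = client.chat.completions.create(
    model="local-model",  # llama-server typically serves its loaded model regardless of this field
    messages=[{"role": "user", "content": "What is the weather in Paris and in Tokyo?"}],
    tools=tools,
    parallel_tool_calls=True,  # disabled by default; opt in per request
)

# With a model/template that supports it, the reply may contain several tool calls.
for call in response.choices[0].message.tool_calls or []:
    print(call.function.name, call.function.arguments)
```

The same field can be set directly in a raw JSON payload to the completion endpoint as `"parallel_tool_calls": true`.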