Text generation (OpenAI-compatible API)
Call the model through an OpenAI-compatible Chat API.
Authorization
- Security Scheme Type: http
- HTTP Authorization Scheme: Bearer API_key, used to verify account information and can be viewed in Project Management > API Key.
Request Header
The media type of the request body. Set it to application/json to ensure the request data is JSON.
Available option: application/json
Request Bodyapplication/json
modelstringRequired
Model code. Available values: u2, u1-insuremed.
messagesarrayRequired
Context passed to the model, ordered by conversation sequence.
System MessageobjectOptional
System message used to set the model role, tone, task objective, or constraint. It is usually placed at the first position of the messages array.
contentstring | arrayRequired
System instruction used to define the model role, behavior rules, response style, and task constraints.
typestringRequired
Content type. Only the fixed value text is supported.
textstringRequired
Specific text content.
rolestringRequired
Role of the system message, fixed to system.
User MessageobjectRequired
User message used to pass questions, instructions, or context to the model.
contentstring | arrayRequired
Message content.
typestringRequired
Content type. Only the fixed value text is supported.
textstringRequired
Specific text content.
rolestringRequired
Role of the user message, fixed to user.
Assistant MessageobjectOptional
Reply from the model. It is usually sent back to the model as context in multi-turn conversations.
contentstring | arrayOptional
Text content of the model reply. When tool_calls is present, content can be empty; otherwise content is required.
typestringRequired
Content type. Only the fixed value text is supported.
textstringRequired
Specific text content.
reasoning_contentstringOptional
Reasoning chain content of the model.
rolestringRequired
Role of the assistant message, fixed to assistant.
tool_callsarrayOptional
Tool and argument information returned after Function Calling, containing one or more objects. It comes from the tool_calls field of the previous model response.
idstringRequired
ID of the tool response.
typestringRequired
Tool type. Currently only function is supported.
functionobjectRequired
Tool and argument information.
namestringRequired
Tool name.
argumentsstringRequired
Arguments in JSON string format.
indexintegerRequired
Index of the current tool information in the tool_calls array.
Tool MessageobjectOptional
Output information from the tool.
contentstring | arrayRequired
Output content of the tool function. If it is structured data, serialize it to a string.
typestringRequired
Content type. Only the fixed value text is supported.
textstringRequired
Specific text content.
rolestringRequired
Fixed to tool.
tool_call_idstringRequired
ID returned after Function Calling, obtained through completion.choices[0].message.tool_calls[$index].id, used to identify which tool the Tool Message corresponds to.
streambooleanOptionalDefault: false
Whether to reply in streaming mode.
false: return once after the model generates all content;true: return incrementally while generating, and each partial result is returned as a data chunk. These chunks need to be read one by one in real time and concatenated into the full reply.
Setting it to true is recommended to improve reading experience and reduce timeout risk.
stream_optionsobjectOptional
Configuration items for streaming output. Effective only when stream is true.
include_usagebooleanOptionalDefault: false
Whether to include token usage information in the final data chunk of the response.
true: include it;false: do not include it.
temperaturefloatOptional
Sampling temperature, used to control output diversity. A higher temperature makes the generated text more diverse; a lower temperature makes it more deterministic.
Value range: [0,2).
Default value of temperature: 1.0.
top_kintegerOptional
Specifies the number of candidate tokens used during sampling. A larger value makes the output more random; a smaller value makes it more deterministic. The value must be an integer greater than or equal to 1.
Default value of top_k: 40.
extra_body object. Configuration example: extra_body={"top_k":xxx}.max_tokensintegerOptional
Used to limit the maximum number of output tokens. If the generated content exceeds this value, generation stops early and the returned finish_reason becomes length.
Useful when output length must be controlled, such as generating summaries or keywords, or for reducing cost and response time.
When max_tokens is triggered, the finish_reason field in the response becomes length.
max_tokens does not limit the length of the reasoning chain of thinking models.thinkingobjectOptional
Whether to enable thinking mode. u2 enables thinking by default and does not support disabling it.
typestringRequired
Available values: enabled (thinking mode on), disabled (thinking mode off).
extra_body object. Configuration example: extra_body={"thinking": "{"type":xxx}"}.toolsarrayOptional
Array containing one or more tool objects for the model to call in Function Calling.
If tools is set and the model decides a tool call is needed, the response returns tool information through tool_calls.
typestringRequired
Tool type. Currently only function is supported.
functionobjectRequired
namestringRequired
Tool name. Only letters, digits, underscores (_), and hyphens (-) are allowed, with a maximum of 64 tokens.
descriptionstringRequired
Tool description that helps the model decide when and how to call the tool.
parametersobjectOptionalDefault: {}
Parameter description of the tool. It must be valid JSON Schema.
parameters field is empty, the tool has no arguments, such as a time query tool. For better tool-calling accuracy, passing parameters is recommended.Chat response object (non-streaming)
idstring
Unique identifier for this request.
choicesarray
Array of model-generated content.
finish_reasonstring
Reason why the model stopped generating. There are three cases:
stopwhen the inputstopparameter is triggered or the model stops naturally;lengthwhen generation ends because the output is too long;tool_callswhen the model needs to call a tool.
indexinteger
Index of the current object in the choices array.
messageobject
Message output by the model.
contentstring
Reply content from the model.
reasoning_contentstring
Reasoning chain content of the model.
rolestring
Role of the message, fixed to assistant.
tool_callsarray
Tool and argument information generated by the model after Function Calling.
idstring
Unique identifier of this tool response.
typestring
Tool type. Currently only function is supported.
functionobject
Tool information.
namestring
Tool name.
argumentsstring
Arguments in JSON string format.
createdinteger
Unix timestamp in seconds when the request was created.
modelstring
Model used for this request.
objectstring
Always chat.completion.
service_tierstring
This field is currently fixed to null.
system_fingerprintstring
This field is currently fixed to null.
usageobject
Token usage information for this request.
completion_tokensinteger
Number of output tokens.
prompt_tokensinteger
Number of input tokens.
total_tokensinteger
Total number of consumed tokens, equal to prompt_tokens plus completion_tokens.
Chat response chunk object (streaming)
idstring
Unique identifier of this request. Every chunk object has the same id.
choicesarray
Array of model-generated content, which may contain one or more objects. If include_usage is set to true, choices is an empty array in the last chunk.
deltaobject
Incremental object of the request.
contentstring
Incremental message content.
reasoning_contentstring
Incremental reasoning chain content.
rolestring
Role of the incremental message object. It only has a value in the first chunk.
tool_callsarray
Tool and argument information generated by the model after Function Calling.
indexinteger
Index of the current tool in the tool_calls array.
idstring
Unique identifier of this tool response.
functionobject
Information about the called tool.
argumentsstring
Incremental argument information. After concatenating arguments across all chunks, you get the complete arguments.
namestring
Tool name, only present in the first chunk.
typestring
Tool type. Currently only function is supported.
finish_reasonstring
Reason why the model stopped generating. There are four cases:
stopwhen the inputstopparameter is triggered or the model stops naturally;nullwhile generation has not finished;lengthwhen generation ends because the output is too long;tool_callswhen the model needs to call a tool.
indexinteger
Index of the current response in the choices array. When the input parameter n is greater than 1, this value is required to merge the complete content of each different response.
createdinteger
Timestamp when this request was created. Each chunk has the same timestamp.
modelstring
Model used for this request.
objectstring
Always chat.completion.chunk.
service_tierstring
This field is currently fixed to null.
system_fingerprintstring
This field is currently fixed to null.
usageobject
Token usage of this request. It is shown only in the last chunk when include_usage is true.
completion_tokensinteger
Number of output tokens.
prompt_tokensinteger
Number of input tokens.
total_tokensinteger
Total number of tokens, equal to prompt_tokens plus completion_tokens.
