Perform chat completion inference | Elasticsearch API documentation (v8)

Path parameters

inference_id string Required
The inference ID.

Query parameters

timeout string
Specifies the amount of time to wait for the inference request to complete.
Values are -1 or 0.

application/json

Body Required

messages array[object] Required
A list of objects representing the conversation. Requests should generally add only new messages from the user (role `user`). Messages with the other roles (`assistant`, `system`, or `tool`) should generally be copied from the response to a previous completion request, so that the messages array is built up over the course of a conversation.
- content string | array[object]
  Either a string or an array of objects with the following attributes:
  - text string Required
    The text content.
  - type string Required
    The type of content.
- role string Required
  The role of the message author.
- tool_call_id string
- tool_calls array[object]
  The tool calls generated by the model.
  - id string Required
  - function object Required
    - arguments string Required
      The arguments to call the function with, in JSON format.
    - name string Required
      The name of the function to call.
  - type string Required
    The type of the tool call.
model string
The ID of the model to use.
max_completion_tokens number
The upper bound on the number of tokens that can be generated for a completion request.
stop array[string]
A sequence of strings to control when the model should stop generating additional tokens.
temperature number
The sampling temperature to use.
tool_choice string | object
Either a string or an object with the following attributes:
- type string Required
  The type of the tool.
- function object Required
  - name string Required
    The name of the function to call.
tools array[object]
A list of tools that the model can call.
- type string Required
  The type of tool.
- function object Required
  - description string
    A description of what the function does. The model uses it to choose when and how to call the function.
  - name string Required
    The name of the function.
  - parameters object
    The parameters the function accepts. This should be formatted as a JSON object.
  - strict boolean
    Whether to enable schema adherence when generating the function call.
top_p number
Nucleus sampling, an alternative to sampling with temperature.
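
The description of `messages` says the array is built up over a conversation: new user messages are added by the client, and assistant messages are copied from earlier responses. A minimal Python sketch of that bookkeeping follows. The helper names (`start_conversation`, `extend_conversation`, `build_request_body`) are hypothetical and not part of any Elasticsearch client; only the field names come from the parameter list above.

```python
from typing import Any, Optional

Message = dict[str, Any]


def start_conversation(user_text: str) -> list[Message]:
    """Begin a conversation with a single user message."""
    return [{"role": "user", "content": user_text}]


def extend_conversation(
    messages: list[Message], assistant_message: Message, user_text: str
) -> list[Message]:
    """Return a new history: the old messages, the assistant message copied
    from the previous response, then the next user message."""
    return [
        *messages,
        dict(assistant_message),  # copied as returned, not rewritten
        {"role": "user", "content": user_text},
    ]


def build_request_body(
    messages: list[Message],
    model: Optional[str] = None,
    max_completion_tokens: Optional[int] = None,
    temperature: Optional[float] = None,
) -> dict[str, Any]:
    """Assemble a request body; optional fields are left out when unset."""
    body: dict[str, Any] = {"messages": messages}
    optional = {
        "model": model,
        "max_completion_tokens": max_completion_tokens,
        "temperature": temperature,
    }
    body.update({key: value for key, value in optional.items() if value is not None})
    return body
```

Returning a new list rather than mutating the old one is a design choice of this sketch; it keeps earlier request bodies unchanged if they are logged or retried.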

Responses

200 application/json

POST /_inference/chat_completion/{inference_id}/_stream

curl \
 --request POST 'http://api.example.com/_inference/chat_completion/{inference_id}/_stream' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '{"model": "gpt-4o", "messages": [{"role": "user", "content": "What is Elastic?"}]}'

Request examples

Run `POST _inference/chat_completion/openai-completion/_stream` to perform a streaming chat completion on the example question.

{
  "model": "gpt-4o",
  "messages": [
      {
          "role": "user",
          "content": "What is Elastic?"
      }
  ]
}

Run `POST _inference/chat_completion/openai-completion/_stream` to perform a chat completion using an assistant message with `tool_calls`.

{
  "messages": [
      {
          "role": "assistant",
          "content": "Let's find out what the weather is",
          "tool_calls": [ 
              {
                  "id": "call_KcAjWtAww20AihPHphUh46Gd",
                  "type": "function",
                  "function": {
                      "name": "get_current_weather",
                      "arguments": "{\"location\":\"Boston, MA\"}"
                  }
              }
          ]
      },
      { 
          "role": "tool",
          "content": "The weather is cold",
          "tool_call_id": "call_KcAjWtAww20AihPHphUh46Gd"
      }
  ]
}
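
In the request above, the `tool` message answers the assistant's tool call: `arguments` is a JSON-encoded string, and `tool_call_id` repeats the `id` of the call being answered. The Python sketch below shows one way a client might produce such `tool` messages. The registry and the `get_current_weather` stand-in are hypothetical; a real implementation would call an actual weather service.

```python
import json
from typing import Any, Callable


def get_current_weather(location: str) -> str:
    """Hypothetical stand-in for a real weather lookup."""
    return "The weather is cold"


# Hypothetical mapping from function names to local callables.
TOOL_REGISTRY: dict[str, Callable[..., Any]] = {
    "get_current_weather": get_current_weather,
}


def run_tool_calls(
    assistant_message: dict[str, Any],
    registry: dict[str, Callable[..., Any]],
) -> list[dict[str, Any]]:
    """Execute each tool call and return the matching `tool` messages."""
    tool_messages = []
    for call in assistant_message.get("tool_calls", []):
        function = call["function"]
        # `arguments` arrives as a JSON string, so decode it first.
        arguments = json.loads(function["arguments"])
        result = registry[function["name"]](**arguments)
        tool_messages.append(
            {
                "role": "tool",
                "content": str(result),
                "tool_call_id": call["id"],  # ties the result to its call
            }
        )
    return tool_messages
```

Appending the returned messages after the assistant message yields a `messages` array shaped like the request shown above.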

Run `POST _inference/chat_completion/openai-completion/_stream` to perform a chat completion using a user message with `tools` and `tool_choice`.

{
  "messages": [
      {
          "role": "user",
          "content": [
              {
                  "type": "text",
                  "text": "What's the price of a scarf?"
              }
          ]
      }
  ],
  "tools": [
      {
          "type": "function",
          "function": {
              "name": "get_current_price",
              "description": "Get the current price of an item",
              "parameters": {
                  "type": "object",
                  "properties": {
                      "item": {
                          "id": "123"
                      }
                  }
              }
          }
      }
  ],
  "tool_choice": {
      "type": "function",
      "function": {
          "name": "get_current_price"
      }
  }
}
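
In the request above, `tool_choice` names the same function that `tools` declares. A client may want to verify that before sending. The Python sketch below is a client-side sanity check only: it assumes nothing about how the API itself validates the body, and the helper names are hypothetical.

```python
from typing import Any


def declared_tool_names(body: dict[str, Any]) -> set[str]:
    """Collect the function names declared under `tools`."""
    return {tool["function"]["name"] for tool in body.get("tools", [])}


def tool_choice_is_consistent(body: dict[str, Any]) -> bool:
    """Return True unless an object-form `tool_choice` names an undeclared function."""
    choice = body.get("tool_choice")
    if not isinstance(choice, dict):
        # Absent or string-form tool_choice: nothing to cross-check here.
        return True
    return choice["function"]["name"] in declared_tool_names(body)
```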

Response examples (200)

A successful response when performing a chat completion task using a User message with `tools` and `tool_choice`.

event: message
data: {"chat_completion":{"id":"chatcmpl-Ae0TWsy2VPnSfBbv5UztnSdYUMFP3","choices":[{"delta":{"content":"","role":"assistant"},"index":0}],"model":"gpt-4o-2024-08-06","object":"chat.completion.chunk"}}

event: message
data: {"chat_completion":{"id":"chatcmpl-Ae0TWsy2VPnSfBbv5UztnSdYUMFP3","choices":[{"delta":{"content":"Elastic"},"index":0}],"model":"gpt-4o-2024-08-06","object":"chat.completion.chunk"}}

event: message
data: {"chat_completion":{"id":"chatcmpl-Ae0TWsy2VPnSfBbv5UztnSdYUMFP3","choices":[{"delta":{"content":" is"},"index":0}],"model":"gpt-4o-2024-08-06","object":"chat.completion.chunk"}}

(...)

event: message
data: {"chat_completion":{"id":"chatcmpl-Ae0TWsy2VPnSfBbv5UztnSdYUMFP3","choices":[],"model":"gpt-4o-2024-08-06","object":"chat.completion.chunk","usage":{"completion_tokens":28,"prompt_tokens":16,"total_tokens":44}}} 

event: message
data: [DONE]
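
The response above is a stream of events: each `data:` line carries a JSON chunk under `chat_completion`, content arrives piecewise in `choices[].delta.content`, a late chunk reports `usage`, and `[DONE]` marks the end. The Python sketch below reassembles the text from such lines. It assumes the lines have already been read from the HTTP response and handles only the shape shown in this example, not the full event stream format; the function names are hypothetical.

```python
import json
from typing import Any, Iterable, Iterator, Optional


def iter_chunks(lines: Iterable[str]) -> Iterator[dict[str, Any]]:
    """Yield the `chat_completion` object from each `data:` line until [DONE]."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skips `event:` lines and blank separators
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        yield json.loads(payload)["chat_completion"]


def collect_text(lines: Iterable[str]) -> tuple[str, Optional[dict[str, Any]]]:
    """Concatenate streamed delta content and return it with the usage block, if any."""
    parts: list[str] = []
    usage: Optional[dict[str, Any]] = None
    for chunk in iter_chunks(lines):
        for choice in chunk.get("choices", []):
            parts.append(choice.get("delta", {}).get("content") or "")
        usage = chunk.get("usage", usage)  # keep the last usage seen
    return "".join(parts), usage
```

Splitting parsing (`iter_chunks`) from accumulation (`collect_text`) lets a caller print tokens as they arrive by iterating over the chunks directly.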
