Path parameters

  • task_type string Required

    The type of inference task that the model performs.

    Values are sparse_embedding, text_embedding, rerank, completion, or chat_completion.

  • inference_id string Required

    The unique identifier of the inference endpoint.

application/json

Body Required

  • chunking_settings object

    Chunking configuration object.

    • max_chunk_size number

      The maximum size of a chunk in words. This value cannot be higher than 300 or lower than 20 (for the sentence strategy) or 10 (for the word strategy).

    • overlap number

      The number of overlapping words for chunks. It is applicable only to a word chunking strategy. This value cannot be higher than half the max_chunk_size value.

    • sentence_overlap number

      The number of overlapping sentences for chunks. It is applicable only to a sentence chunking strategy. It can be either 1 or 0.

    • strategy string

      The chunking strategy: sentence or word.

  • service string Required

    The service type.

  • service_settings object Required
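
The chunking limits listed above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the API: `validate_chunking_settings` is a hypothetical helper, `sentence_overlap` is the key assumed here for the sentence-overlap attribute, and requiring an explicit `strategy` is a simplification of this sketch. The server remains the authority on what it accepts.

```python
def validate_chunking_settings(settings: dict) -> list:
    """Return a list of problems; an empty list means none of the documented limits is violated."""
    strategy = settings.get("strategy")
    # Assumption of this sketch: the strategy is always given explicitly.
    if strategy not in ("sentence", "word"):
        return [f"strategy must be 'sentence' or 'word', got {strategy!r}"]

    errors = []

    max_size = settings.get("max_chunk_size")
    if max_size is not None:
        # Lower bound depends on the strategy: 20 for sentence, 10 for word.
        lower = 20 if strategy == "sentence" else 10
        if not lower <= max_size <= 300:
            errors.append(
                f"max_chunk_size must be between {lower} and 300 for the {strategy} strategy"
            )

    overlap = settings.get("overlap")
    if overlap is not None:
        if strategy != "word":
            errors.append("overlap applies only to the word strategy")
        elif max_size is not None and overlap > max_size / 2:
            errors.append("overlap cannot be higher than half of max_chunk_size")

    sentence_overlap = settings.get("sentence_overlap")
    if sentence_overlap is not None:
        if strategy != "sentence":
            errors.append("sentence_overlap applies only to the sentence strategy")
        elif sentence_overlap not in (0, 1):
            errors.append("sentence_overlap must be 0 or 1")

    return errors
```

For example, `validate_chunking_settings({"strategy": "word", "max_chunk_size": 100, "overlap": 51})` reports one problem, because the overlap exceeds half of the chunk size.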

Responses

  • 200 application/json

    Represents an inference endpoint as returned by the GET API

    • chunking_settings object

      Chunking configuration object.

      • max_chunk_size number

        The maximum size of a chunk in words. This value cannot be higher than 300 or lower than 20 (for the sentence strategy) or 10 (for the word strategy).

      • overlap number

        The number of overlapping words for chunks. It is applicable only to a word chunking strategy. This value cannot be higher than half the max_chunk_size value.

      • sentence_overlap number

        The number of overlapping sentences for chunks. It is applicable only to a sentence chunking strategy. It can be either 1 or 0.

      • strategy string

        The chunking strategy: sentence or word.

    • service string Required

      The service type.

    • service_settings object Required

    • inference_id string Required

      The inference ID.

    • task_type string Required

      Values are sparse_embedding, text_embedding, rerank, completion, or chat_completion.

PUT /_inference/{task_type}/{inference_id}/_update
curl \
 --request PUT 'http://api.example.com/_inference/{task_type}/{inference_id}/_update' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '{"service_settings": {"api_key": "<API_KEY>"}}'
Request example
An example body for a `PUT _inference/my-inference-endpoint/_update` request.
{
 "service_settings": {
   "api_key": "<API_KEY>"
 }
}
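
The same request can be assembled in Python. This is a sketch under stated assumptions rather than an official client: `build_update_request` is a hypothetical helper, the base URL is the placeholder host from the curl example, and the `Authorization` header value is passed through unchanged, as in that example. The function only builds the request object and does not send it.

```python
import json
import urllib.parse
import urllib.request

# Task types listed under "Path parameters".
TASK_TYPES = {"sparse_embedding", "text_embedding", "rerank", "completion", "chat_completion"}


def build_update_request(base_url, task_type, inference_id, body, api_key):
    """Build (but do not send) a PUT request for the _update endpoint."""
    if task_type not in TASK_TYPES:
        raise ValueError(f"unsupported task_type: {task_type!r}")

    # Percent-encode the identifier so it is safe as a single path segment.
    path = f"/_inference/{task_type}/{urllib.parse.quote(inference_id, safe='')}/_update"
    return urllib.request.Request(
        url=base_url.rstrip("/") + path,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": api_key,  # mirrors the curl example; adjust to your auth scheme
            "Content-Type": "application/json",
        },
        method="PUT",
    )


request = build_update_request(
    "http://api.example.com",          # placeholder host from the curl example
    "text_embedding",                  # hypothetical task type for illustration
    "my-inference-endpoint",
    {"service_settings": {"api_key": "<API_KEY>"}},
    "<AUTHORIZATION_VALUE>",           # placeholder credential
)
```

Passing `request` to `urllib.request.urlopen` would send it. Serializing the body with `json.dumps` ensures the payload is a JSON object rather than a JSON-encoded string.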