Path parameters

  • inference_idstring Required

    The inference Id

application/json

BodyRequired

  • Chunking configuration object

    Hide chunking_settings attributes Show chunking_settings attributes object
    • The maximum size of a chunk in words. This value cannot be higher than 300 or lower than 20 (for sentence strategy) or 10 (for word strategy).

    • overlapnumber

      The number of overlapping words for chunks. It is applicable only to a word chunking strategy. This value cannot be higher than half the max_chunk_size value.

    • The number of overlapping sentences for chunks. It is applicable only for a sentence chunking strategy. It can be either 1 or 0.

    • strategystring

      The chunking strategy: sentence or word.

  • servicestring Required

    The service type

  • service_settingsobject Required

Responses

  • 200 application/json
    Hide response attributes Show response attributes object

    Represents an inference endpoint as returned by the GET API

    • Chunking configuration object

      Hide chunking_settings attributes Show chunking_settings attributes object
      • The maximum size of a chunk in words. This value cannot be higher than 300 or lower than 20 (for sentence strategy) or 10 (for word strategy).

      • overlapnumber

        The number of overlapping words for chunks. It is applicable only to a word chunking strategy. This value cannot be higher than half the max_chunk_size value.

      • The number of overlapping sentences for chunks. It is applicable only for a sentence chunking strategy. It can be either 1 or 0.

      • strategystring

        The chunking strategy: sentence or word.

    • servicestring Required

      The service type

    • service_settingsobject Required
    • inference_idstring Required

      The inference Id

    • task_typestring Required

      Values are sparse_embedding, text_embedding, rerank, completion, or chat_completion.

PUT /_inference/{inference_id}
PUT _inference/rerank/my-rerank-model
{
 "service": "cohere",
 "service_settings": {
   "model_id": "rerank-english-v3.0",
   "api_key": "{{COHERE_API_KEY}}"
 }
}
resp = client.inference.put(
    task_type="rerank",
    inference_id="my-rerank-model",
    inference_config={
        "service": "cohere",
        "service_settings": {
            "model_id": "rerank-english-v3.0",
            "api_key": "{{COHERE_API_KEY}}"
        }
    },
)
const response = await client.inference.put({
  task_type: "rerank",
  inference_id: "my-rerank-model",
  inference_config: {
    service: "cohere",
    service_settings: {
      model_id: "rerank-english-v3.0",
      api_key: "{{COHERE_API_KEY}}",
    },
  },
});
response = client.inference.put(
  task_type: "rerank",
  inference_id: "my-rerank-model",
  body: {
    "service": "cohere",
    "service_settings": {
      "model_id": "rerank-english-v3.0",
      "api_key": "{{COHERE_API_KEY}}"
    }
  }
)
$resp = $client->inference()->put([
    "task_type" => "rerank",
    "inference_id" => "my-rerank-model",
    "body" => [
        "service" => "cohere",
        "service_settings" => [
            "model_id" => "rerank-english-v3.0",
            "api_key" => "{{COHERE_API_KEY}}",
        ],
    ],
]);
curl -X PUT -H "Authorization: ApiKey $ELASTIC_API_KEY" -H "Content-Type: application/json" -d '{"service":"cohere","service_settings":{"model_id":"rerank-english-v3.0","api_key":"{{COHERE_API_KEY}}"}}' "$ELASTICSEARCH_URL/_inference/rerank/my-rerank-model"
Request example
An example body for a `PUT _inference/rerank/my-rerank-model` request.
{
 "service": "cohere",
 "service_settings": {
   "model_id": "rerank-english-v3.0",
   "api_key": "{{COHERE_API_KEY}}"
 }
}