Path parameters

  • inference_id string Required

    The unique identifier for the inference endpoint.

Query parameters

  • timeout string

    The amount of time to wait for the inference request to complete.

    Values are -1 or 0.
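
As a rough illustration of how the path and query parameters combine into a request URL, the sketch below uses only the Python standard library. The helper name `build_inference_url` is hypothetical, and the base URL is the placeholder host from the curl example on this page; neither is part of the API itself.

```python
from typing import Optional
from urllib.parse import quote, urlencode


def build_inference_url(base_url: str, inference_id: str,
                        timeout: Optional[str] = None) -> str:
    """Return the POST URL for an inference endpoint (hypothetical helper)."""
    # Percent-encode the path parameter so unusual IDs cannot break the path.
    path = "/_inference/" + quote(inference_id, safe="")
    url = base_url.rstrip("/") + path
    # timeout is optional; append it only when the caller supplies a value.
    if timeout is not None:
        url += "?" + urlencode({"timeout": timeout})
    return url


print(build_inference_url("http://api.example.com", "my-endpoint"))
print(build_inference_url("http://api.example.com", "my-endpoint", timeout="-1"))
```

Omitting `timeout` leaves the query string off entirely, so the server-side default applies.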

application/json

Body

  • query string

    The query input. It is required only for the rerank task.

  • input string | array[string] Required

    The text on which you want to perform the inference task. It can be a single string or an array.


    Inference endpoints for the completion task type currently only support a single string as input.
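
The body rules above can be sketched as a small client-side check before sending a request. The helper `build_inference_body` and its `task_type` argument are hypothetical conveniences for this example: the task type belongs to the endpoint, is not sent in the body, and is used here only to apply the two constraints described above.

```python
import json
from typing import List, Optional, Union


def build_inference_body(input_text: Union[str, List[str]],
                         task_type: str,
                         query: Optional[str] = None) -> str:
    """Serialize a request body for the inference API (hypothetical helper)."""
    # query is required only for the rerank task.
    if task_type == "rerank" and query is None:
        raise ValueError("query is required for the rerank task")
    # completion endpoints currently accept a single string, not an array.
    if task_type == "completion" and not isinstance(input_text, str):
        raise ValueError("completion endpoints accept a single string as input")
    body = {"input": input_text}
    if query is not None:
        body["query"] = query
    return json.dumps(body)


# A rerank request carries both the query and the candidate texts.
print(build_inference_body(["first passage", "second passage"],
                           task_type="rerank", query="example query"))
```

Validating locally is optional; the server remains the authority on what each endpoint accepts.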

Responses

  • 200 application/json
    • text_embedding_bytes array[object]
      • embedding array[number] Required

        Text Embedding results containing bytes are represented as Dense Vectors of bytes.

    • text_embedding_bits array[object]
      • embedding array[number] Required

        Text Embedding results containing bytes are represented as Dense Vectors of bytes.

    • text_embedding array[object]
      • embedding array[number] Required

        Text Embedding results are represented as Dense Vectors of floats.

    • sparse_embedding array[object]
      • embedding object Required

        Sparse Embedding tokens are represented as a dictionary of string to double.

        • * number Additional properties
    • completion array[object]
    • rerank array[object]
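
Because the 200 response holds its results under a key that depends on the task, a client typically has to check which key is present. The following is a minimal sketch of that dispatch, assuming the response has already been parsed from JSON into a dictionary. The function names and the sample response are made up for illustration; only the key names and the `embedding` attribute come from the response schema above.

```python
from typing import Any, Dict, List, Tuple

# Result keys listed in the response schema above.
RESULT_KEYS = (
    "text_embedding_bytes",
    "text_embedding_bits",
    "text_embedding",
    "sparse_embedding",
    "completion",
    "rerank",
)


def extract_results(response: Dict[str, Any]) -> Tuple[str, List[Dict[str, Any]]]:
    """Return (result_key, items) for the first known result key found."""
    for key in RESULT_KEYS:
        if key in response:
            return key, response[key]
    raise KeyError("response contains no known result key")


def extract_embeddings(response: Dict[str, Any]) -> List[Any]:
    """Collect the `embedding` attribute from each embedding result item."""
    key, items = extract_results(response)
    if key in ("completion", "rerank"):
        raise ValueError(f"'{key}' results do not carry embeddings")
    return [item["embedding"] for item in items]


# Hand-written sample shaped like a dense float embedding response.
sample = {"text_embedding": [{"embedding": [0.1, 0.2, 0.3]}]}
print(extract_results(sample)[0])
print(extract_embeddings(sample))
```

For `sparse_embedding` results, each returned `embedding` is a dictionary of token to weight rather than a list of numbers.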
POST /_inference/{inference_id}
curl \
 --request POST 'http://api.example.com/_inference/{inference_id}' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '{"query":"string","input":"string","task_settings":{}}'