For each HTTP request, the API returns its results formatted as JSON.
The response is made of three parts:
- request_id, uniquely identifying the request.
- outputs, a list containing the model's answer to your request, along with useful metadata.
- execution_metadata, a dictionary with the detailed cost of the request.
Let's take a detailed look at the example response generated on the Requests page.
Example response (JSON)

{
  "input_text": "Il était une fois",
  "output_text": ", un pays où il faisait toujours beau.\nLes gens y vivaient heureux et en sécurité. Ils avaient de l'argent"
}
The execution_metadata dictionary collects information relevant to the cost and execution of the request. It is
available at the top level, as well as for each individual element of a batch.
It contains a cost entry, which is a dictionary containing the detailed total cost of the request:
- tokens_used, the number of tokens used (the sum of the next two fields).
- tokens_input, the number of tokens sent as input to the model.
- tokens_generated, the number of tokens generated by the model.
- cost_type, in the form model_name@skill, indicating the nature of the tokens used (if no skill is used, the model name appears alone).
- batch_size, the number of requests made in a single batch.
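The relationship between the token fields can be checked directly. This is a hypothetical cost entry mirroring the fields above; the numbers and the cost_type value are made up for illustration.

```python
# Hypothetical cost entry; all values are illustrative.
cost = {
    "tokens_input": 5,        # tokens sent to the model
    "tokens_generated": 25,   # tokens produced by the model
    "tokens_used": 30,        # sum of the two fields above
    "cost_type": "model_name@skill",
    "batch_size": 1,
}

# tokens_used is defined as tokens_input + tokens_generated.
assert cost["tokens_used"] == cost["tokens_input"] + cost["tokens_generated"]

# cost_type splits into the model name and the skill (if any).
model, _, skill = cost["cost_type"].partition("@")
print(model, skill)
```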
It also contains a finish_reason entry, explaining why the model stopped processing further tokens: length if it was stopped by n_tokens or by reaching the end of the text to process, or stop_word if it reached one of the specified stop words.
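Client code often branches on finish_reason. The helper below is hypothetical, not part of the API; only the two values length and stop_word come from the description above.

```python
def describe_finish(finish_reason):
    """Map a finish_reason value to a human-readable explanation.

    Hypothetical helper; the two known values are taken from the docs.
    """
    reasons = {
        "length": "stopped by n_tokens or by reaching the end of the text",
        "stop_word": "stopped after reaching one of the stop words",
    }
    return reasons.get(finish_reason, "unknown reason")

print(describe_finish("length"))
print(describe_finish("stop_word"))
```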
The score dictionary provides information regarding the log-probabilities of the tokens processed:
- logprob, the overall log-probability of the entire text processed.
- normalized_logprob, the same as above, but normalized by text length (number of tokens).
- token_logprobs, a dictionary giving the specific log-probability of each token.
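The three fields are related: logprob is the sum of the per-token values, and normalized_logprob divides it by the token count. A sketch with made-up token_logprobs values:

```python
# Hypothetical token_logprobs for a three-token completion; values are made up.
token_logprobs = {"un": -1.2, "pays": -0.8, "où": -1.0}

logprob = sum(token_logprobs.values())              # overall log-probability
normalized_logprob = logprob / len(token_logprobs)  # averaged per token

print(logprob, normalized_logprob)
```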
Response to batched requests
The outputs list will be structured according to how you batched your request:
- It will contain one separate list for each set of parameters you submitted.
- Each inner list will contain one entry per entry in your batch.
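Iterating over a batched response therefore means two nested loops: one over parameter sets, one over batch entries. The outputs below are a hypothetical batched response (two parameter sets, two inputs each); the data is made up and only the shape follows the description above.

```python
# Hypothetical batched response: two parameter sets applied to two inputs.
outputs = [
    [  # results for the first parameter set
        {"input_text": "a", "output_text": "A"},
        {"input_text": "b", "output_text": "B"},
    ],
    [  # results for the second parameter set
        {"input_text": "a", "output_text": "AA"},
        {"input_text": "b", "output_text": "BB"},
    ],
]

# One inner list per parameter set, one entry per batch element.
for params_index, results in enumerate(outputs):
    for entry in results:
        print(params_index, entry["input_text"], "->", entry["output_text"])
```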