Invoke Assistant
Get a completion from a stored assistant. This endpoint invokes the assistant and returns its response in an OpenAI-compatible format, using chat completions on the backend. Assistants are stateful AI agents that can leverage various model providers and tools, access persistent conversation threads, and execute multiple tools in parallel.
Note: The response includes an x-langtail-thread-id header containing the unique identifier for the thread associated with this response.
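For orientation, here is a minimal sketch of calling this endpoint. The base URL and the X-API-Key header name are assumptions based on the path and header parameters below, not confirmed values; adjust them to match your actual endpoint and authentication header.

```typescript
// Sketch: invoke a deployed assistant and read the thread id from the response headers.
// The base URL shape and "X-API-Key" header name are assumptions for illustration.
const workspace = "my-workspace";
const project = "my-project";
const assistant = "my-assistant";
const environment = "production";

const response = await fetch(
  `https://api.langtail.com/${workspace}/${project}/${assistant}/${environment}`,
  {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      "X-API-Key": process.env.LANGTAIL_API_KEY ?? "", // assumed header name
    },
    body: JSON.stringify({
      messages: [{ role: "user", content: "Hello!" }],
    }),
  }
);

// The thread identifier is returned in a response header (see the note above).
const threadId = response.headers.get("x-langtail-thread-id");
const completion = await response.json();
console.log(threadId, completion.choices[0].message.content);
```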
Headers
Your Langtail API Key
Path Parameters
Your workspace URL slug
Your project URL slug
Your assistant URL slug
Your environment URL slug
Body
A unique identifier for the thread. If not provided, a new thread will be created.
A mapping of variable names to their values. These will be injected into your saved assistant template.
Additional messages. These will be appended to the thread.
Overrides the model of the deployed assistant.
Overrides the max tokens of the deployed assistant. The maximum number of tokens that can be generated in the completion.
Overrides the temperature of the deployed assistant. What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.
Overrides the top_p of the deployed assistant. An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature, but not both.
Overrides the presence_penalty of the deployed assistant. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
Overrides the frequency_penalty of the deployed assistant. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
Overrides the stored template messages with custom template messages.
Overrides the tool choice of the deployed assistant. Allowed values: auto, required, none.
Overrides the response format of the deployed assistant.
A unique identifier representing your end-user.
If true, potentially sensitive data such as the assistant and its response will not be recorded in the logs.
Additional custom data that will be stored with this request.
A seed used to generate reproducible results.
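As a sketch of how these fields fit together, the request body below exercises the overrides described above. Names that overlap with OpenAI's chat completion API (model, max_tokens, temperature, top_p, presence_penalty, frequency_penalty, tool_choice, response_format, user, seed) are used as-is; the thread, variables, logging, and metadata field names are assumptions for illustration only.

```typescript
// Sketch of a request body for the invoke endpoint.
// OpenAI-compatible names should map directly; threadId, variables,
// doNotRecord, and metadata are assumed names, not confirmed ones.
const body = {
  threadId: "existing-thread-id",       // omit to create a new thread
  variables: { customerName: "Ada" },   // injected into the saved assistant template
  messages: [{ role: "user", content: "Where is my order?" }],
  model: "gpt-4o-mini",                 // override the deployed model
  max_tokens: 256,
  temperature: 0.2,                     // between 0 and 2; lower is more deterministic
  top_p: 1,                             // alter this or temperature, not both
  presence_penalty: 0,
  frequency_penalty: 0,
  tool_choice: "auto",                  // "auto" | "required" | "none"
  response_format: { type: "json_object" },
  user: "user-1234",                    // identifier for your end-user
  doNotRecord: false,                   // true to keep payloads out of the logs
  metadata: { orderId: "A-1001" },      // custom data stored with this request
  seed: 42,                             // for reproducible results
};
```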
Response
A unique identifier for the chat completion.
The object type, which is always chat.completion.
The Unix timestamp (in seconds) of when the chat completion was created.
The model used for the chat completion.
A list of chat completion choices. Can be more than one if n is greater than 1.
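A rough TypeScript shape for the response, based on the fields listed above and the OpenAI-compatible chat completion format; treat it as a sketch for orientation rather than a complete schema.

```typescript
// Approximate response shape; the choices structure follows the standard
// OpenAI chat completion format and is not an exhaustive schema.
interface AssistantCompletion {
  id: string;                // unique identifier for the chat completion
  object: "chat.completion"; // object type, always "chat.completion"
  created: number;           // Unix timestamp (in seconds) of creation
  model: string;             // model used for the chat completion
  choices: Array<{
    index: number;
    message: { role: "assistant"; content: string | null };
    finish_reason: string;
  }>;
}
```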