Invoke Assistant
Get completion for a stored assistant. This endpoint invokes the assistant and returns its response. It uses chat completions on the backend and maintains OpenAI-compatible format. Assistants are stateful AI agents that can leverage various model providers and tools, access persistent conversation threads, and execute multiple tools in parallel.
Note: The response includes an x-langtail-thread-id
header containing the
unique identifier for the thread associated with this response.
Headers
Your Langtail API Key
Path Parameters
Your workspace URL slug
Your project URL slug
Your assistant URL slug
Your environment URL slug
Body
If true, potentially sensitive data like the assistant and response will not be recorded in the logs
Overrides the frequency_penalty of deployed assistant.
Overrides the max tokens of deployed assistant. The maximum number of tokens that can be generated in the completion.
Additional messages. These will be appended to the Thread.
Additional custom data that will be stored for this request
Overrides the model of deployed assistant.
Overrides the presence_penalty of deployed assistant.
Overrides the response format of deployed assistant.
A seed is used to generate reproducible results
Overrides the temperature of deployed assistant. What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.
Overrides the stored template messages with custom template messages.
A unique identifier for the thread. If not provided, a new thread will be created.
Overrides the tool choice of deployed assistant.
auto
, required
, none
Overrides the top_p of deployed assistant. An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature
but not both.
A unique identifier representing your end-user
A mapping of variable names to their values. Will be injected in your saved assistant template.
Response
A list of chat completion choices. Can be more than one if n
is greater than 1.
The Unix timestamp (in seconds) of when the chat completion was created.
A unique identifier for the chat completion.
The model used for the chat completion.
The object type, which is always chat.completion
.
Was this page helpful?