POST /{workspace}/{project}/{assistant}/{environment}

Note: The response includes an x-langtail-thread-id header containing the unique identifier for the thread associated with this response.

Headers

X-API-Key
string
required

Your Langtail API Key

Path Parameters

workspace
string
required

Your workspace URL slug

project
string
required

Your project URL slug

assistant
string
required

Your assistant URL slug

environment
string
required

Your environment URL slug
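As a sketch of how the four path parameters combine into the request URL (the base URL `https://api.langtail.com` and all slug values below are illustrative assumptions; substitute your own):

```python
# Base URL is an assumption for illustration; substitute your actual host.
BASE_URL = "https://api.langtail.com"

def build_invoke_url(workspace: str, project: str, assistant: str, environment: str) -> str:
    """Compose the POST endpoint path from the four required URL slugs."""
    return f"{BASE_URL}/{workspace}/{project}/{assistant}/{environment}"

# Example slugs (hypothetical); each corresponds to one path parameter above.
url = build_invoke_url("acme", "support-bot", "faq-assistant", "production")
```

The `X-API-Key` header is then sent alongside the POST request to this URL.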

Body

application/json
doNotRecord
boolean

If true, potentially sensitive data, such as the assistant input and response, will not be recorded in the logs.

frequency_penalty
number

Overrides the frequency_penalty of the deployed assistant.

max_tokens
number

Overrides the max_tokens of the deployed assistant. The maximum number of tokens that can be generated in the completion.

messages
object[]

Additional messages. These will be appended to the Thread.

metadata
object

Additional custom data that will be stored for this request.

model
string

Overrides the model of the deployed assistant.

presence_penalty
number

Overrides the presence_penalty of the deployed assistant.

response_format
object

Overrides the response_format of the deployed assistant.

seed
number

A seed used to generate reproducible results.

stream
boolean

If true, the response is streamed back incrementally as it is generated.

temperature
number

Overrides the temperature of the deployed assistant. What sampling temperature to use, between 0 and 2. Higher values like 0.8 make the output more random, while lower values like 0.2 make it more focused and deterministic.

template
object[]

Overrides the stored template messages with custom template messages.

threadId
string

A unique identifier for the thread. If not provided, a new thread will be created.

tool_choice

Overrides the tool_choice of the deployed assistant.

Available options:
auto,
required,
none
top_p
number

Overrides the top_p of the deployed assistant. An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature, but not both.

user
string

A unique identifier representing your end-user.

variables
object

A mapping of variable names to their values. Will be injected into your saved assistant template.
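The body fields above can be assembled into a request payload as follows. This is a sketch only: the field names come from the parameter list, but every concrete value (message content, variable names, metadata) is an illustrative assumption.

```python
import json

# Illustrative request body for this endpoint. All keys are documented
# body parameters; all values are made up for the example.
payload = {
    "messages": [
        # Additional messages; these are appended to the Thread.
        {"role": "user", "content": "What is the refund policy?"}
    ],
    "variables": {"customer_name": "Ada"},  # injected into the saved template
    "temperature": 0.2,   # override: more focused, deterministic output
    "max_tokens": 256,    # override: cap on generated tokens
    "stream": False,
    "metadata": {"request_source": "docs-example"},
    # Omitting "threadId" creates a new thread; pass an existing id to
    # append these messages to that thread instead.
}

body = json.dumps(payload)
```

The serialized `body` would be sent as the `application/json` POST body, together with the `X-API-Key` header.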

Response

200 - application/json
choices
object[]
required

A list of chat completion choices. Can be more than one if n is greater than 1.

created
number
required

The Unix timestamp (in seconds) of when the chat completion was created.

id
string
required

A unique identifier for the chat completion.

model
string
required

The model used for the chat completion.

object
string
required

The object type, which is always chat.completion.
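Handling the response can be sketched as below, assuming the `chat.completion` shape described by the fields above. The `response_json` and `headers` dicts stand in for the parsed HTTP response body and headers from your HTTP client; all concrete values are illustrative.

```python
# Illustrative parsed response, matching the documented response fields.
response_json = {
    "id": "chatcmpl-123",              # unique identifier for the completion
    "object": "chat.completion",       # always "chat.completion"
    "created": 1700000000,             # Unix timestamp in seconds
    "model": "gpt-4o",                 # model used for the completion
    "choices": [
        {"index": 0, "message": {"role": "assistant", "content": "Hello!"}}
    ],
}
# The x-langtail-thread-id header identifies the thread for this response;
# the value here is made up.
headers = {"x-langtail-thread-id": "thread_abc123"}

# Keep the thread id to continue the conversation via the threadId body field.
thread_id = headers.get("x-langtail-thread-id")
first_reply = response_json["choices"][0]["message"]["content"]
```

Passing `thread_id` back as `threadId` in a follow-up request appends the new messages to the same thread.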