Generate a chat completion

curl --request POST \
  --url https://paradigm.lighton.ai/api/v3/chat/completions \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "model": "<string>",
  "messages": [
    {
      "content": "<string>",
      "name": "<string>",
      "function_call": {}
    }
  ]
}
'

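import requests

url = "https://paradigm.lighton.ai/api/v3/chat/completions"

# Replace the <string> and <token> placeholders with real values before sending.
payload = {
    "model": "<string>",
    "messages": [
        {
            "content": "<string>",
            "name": "<string>",
            "function_call": {}
        }
    ]
}
headers = {
    "Authorization": "Bearer <token>",
    "Content-Type": "application/json"
}

# A timeout keeps the call from hanging indefinitely on network problems.
response = requests.post(url, json=payload, headers=headers, timeout=30)

# Raise an exception on 4xx/5xx responses instead of printing an error body.
response.raise_for_status()

print(response.text)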

const options = {
  method: 'POST',
  headers: {Authorization: 'Bearer <token>', 'Content-Type': 'application/json'},
  body: JSON.stringify({
    model: '<string>',
    messages: [{content: '<string>', name: '<string>', function_call: {}}]
  })
};

fetch('https://paradigm.lighton.ai/api/v3/chat/completions', options)
  .then(res => res.json())
  .then(res => console.log(res))
  .catch(err => console.error(err));

{
  "model": "alfred-4.2",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant."
    },
    {
      "role": "user",
      "content": "Hello!"
    }
  ]
}

Use this endpoint to generate chat completions from a large language model.

It acts as a simple proxy that forwards your request to the model named in the request.

All LightOn models are deployed on a vLLM-based image.

Response Types:

When stream=false (default): returns a complete JSON response containing all completion choices.
When stream=true: returns Server-Sent Events (SSE) carrying incremental completion chunks.

Streaming Format:

Each SSE event contains a JSON object with incremental text. The stream ends with data: [DONE].
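
As a rough sketch of how a client might consume the stream described above, the helper below reads SSE lines, decodes the JSON carried by each data: line, and stops at the data: [DONE] sentinel. The function name iter_sse_events and the sample chunk contents are assumptions made for illustration; the exact fields of each chunk are not specified on this page.

```python
import json
from typing import Iterable, Iterator


def iter_sse_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield the JSON object carried by each SSE `data:` line.

    Stops at the `data: [DONE]` sentinel. Blank lines and lines that
    do not start with `data:` are skipped.
    """
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break
        yield json.loads(data)


# Hypothetical sample stream; real chunks may carry different fields.
sample_lines = [
    'data: {"text": "Hel"}',
    "",
    'data: {"text": "lo"}',
    "",
    "data: [DONE]",
    'data: {"text": "never reached"}',
]
events = list(iter_sse_events(sample_lines))
print(events)
```

With the requests library, one way to feed this helper is to send the request with stream=True and pass it (line.decode("utf-8") for line in response.iter_lines()), since iter_lines yields bytes by default.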

POST /api/v3/chat/completions

Authorizations

Authorization

string

header

required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
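
A minimal sketch of building the header described above, assuming the token is kept in an environment variable rather than hard-coded. The variable name PARADIGM_API_KEY and the helper build_headers are hypothetical names chosen for this example, not part of the API.

```python
import os


def build_headers(token: str) -> dict:
    """Return the headers for an authenticated JSON request."""
    if not token:
        raise ValueError("an auth token is required")
    return {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }


# PARADIGM_API_KEY is a hypothetical variable name; the fallback value
# only keeps this example runnable without any configuration.
token = os.environ.get("PARADIGM_API_KEY") or "example-token"
headers = build_headers(token)
print(sorted(headers))
```

Reading the token from the environment keeps it out of source control; any other secret store would serve the same purpose.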

Body

Request serializer for chat completions endpoint (OpenAI-compatible).

model

string

required

Model to use for generating chat completions. It must exist and be configured from the admin.

messages

object[]

required

List of messages comprising the conversation so far

max_tokens

integer

Maximum number of tokens to generate

temperature

number<double>

Sampling temperature between 0 and 2

top_p

number<double>

Nucleus sampling parameter

integer

Number of chat completion choices to generate

stream

boolean

Whether to stream back partial progress

stop

string[]

Up to 4 sequences where the API will stop generating further tokens

presence_penalty

number<double>

Penalty for new tokens based on whether they appear in the text so far

frequency_penalty

number<double>

Penalty for new tokens based on their existing frequency in the text

logit_bias

object

Modify the likelihood of specified tokens appearing in the completion

user

string

A unique identifier representing your end-user

functions

object[]

List of functions the model may call

function_call

string

Controls how the model responds to function calls
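
The body parameters above can be assembled with a small helper such as the sketch below. It sends only the options that were explicitly set and applies two of the documented limits (temperature between 0 and 2, up to 4 stop sequences) before the request leaves the client. The helper name build_payload and the decision to validate client-side are assumptions of this example; the server remains the authority on what is accepted.

```python
from typing import Optional


def build_payload(
    model: str,
    messages: list,
    *,
    max_tokens: Optional[int] = None,
    temperature: Optional[float] = None,
    top_p: Optional[float] = None,
    stop: Optional[list] = None,
    stream: bool = False,
) -> dict:
    """Build a request body from the documented parameters."""
    if not model:
        raise ValueError("model is required")
    if not messages:
        raise ValueError("messages is required")
    if temperature is not None and not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")
    if stop is not None and len(stop) > 4:
        raise ValueError("stop accepts up to 4 sequences")

    payload = {"model": model, "messages": messages, "stream": stream}
    # Include optional parameters only when the caller set them.
    optional = {
        "max_tokens": max_tokens,
        "temperature": temperature,
        "top_p": top_p,
        "stop": stop,
    }
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload


payload = build_payload(
    "alfred-4.2",
    [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ],
    temperature=0.2,
    max_tokens=256,
)
print(payload)
```

The resulting dictionary can be passed as the json argument of requests.post, as in the Python sample for this endpoint.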

Response

200 - application/json

Response serializer for chat completions endpoint results.

string

required

Unique identifier for the chat completion

object

string

required

Object type, always 'chat.completion'

created

integer

required

Unix timestamp of when the chat completion was created

model

string

required

The model used for generating the chat completion

choices

object[]

required

List of chat completion choices generated by the model

usage

object

Usage statistics for the chat completion request
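
As an illustration of reading the response fields listed above, the sketch below checks the object type, converts the created timestamp to a timezone-aware datetime, and counts the returned choices. The helper name summarize_completion and the sample response are hypothetical. The contents of each choice are not detailed on this page, so the message shown in the sample is an assumption based on the OpenAI-compatible format.

```python
from datetime import datetime, timezone


def summarize_completion(response: dict) -> dict:
    """Extract the documented top-level fields from a completion response."""
    if response.get("object") != "chat.completion":
        raise ValueError("unexpected object type")
    return {
        "model": response["model"],
        # `created` is a Unix timestamp; convert it to an aware UTC datetime.
        "created_at": datetime.fromtimestamp(response["created"], tz=timezone.utc),
        "choice_count": len(response["choices"]),
        # `usage` is not marked as required, so it may be absent.
        "usage": response.get("usage"),
    }


# Hypothetical response; the shape of each choice is an assumption.
sample_response = {
    "object": "chat.completion",
    "created": 1700000000,
    "model": "alfred-4.2",
    "choices": [{"message": {"role": "assistant", "content": "Hi there!"}}],
}
summary = summarize_completion(sample_response)
print(summary["model"], summary["choice_count"], summary["created_at"].isoformat())
```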
