POST /api/v3/ocr
Parse a document to Markdown with a vision-language model (VLM)
curl --request POST \
  --url https://paradigm-preprod.lighton.ai/api/v3/ocr \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: multipart/form-data' \
  --form 'file=<string>'
{
  "model": "LightOnOCR",
  "total_pages": 3,
  "pages_parsed": [
    1,
    2,
    3
  ],
  "processing_time_ms": 4520,
  "enable_antilooping": true,
  "sampling_params": {
    "temperature": 0.2,
    "max_tokens": 5888,
    "repetition_penalty": null
  },
  "pages": [
    {
      "page_number": 1,
      "markdown": "# Invoice\n\n| Item | Qty | Price |\n|---|---|---|\n| Widget A | 10 | $5.00 |"
    },
    {
      "page_number": 2,
      "markdown": "## Terms and Conditions\n\nPayment is due within 30 days..."
    },
    {
      "page_number": 3,
      "markdown": "## Appendix\n\n![Figure 1: Sales chart summary]"
    }
  ]
}

Authorizations

Authorization
string
header
required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.

Body

file
string<uri>
required

The document to parse.
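
As a rough Python counterpart to the curl call above, the sketch below builds the same multipart request with the standard library only. It assumes the document is sent as a binary `file` part, as the `--form` flag in the curl example suggests; the helper name `build_ocr_request` and the `application/octet-stream` part type are illustrative choices, not part of the API.

```python
import uuid
import urllib.request
from typing import Optional

OCR_URL = "https://paradigm-preprod.lighton.ai/api/v3/ocr"


def build_ocr_request(token: str, filename: str, content: bytes,
                      fields: Optional[dict] = None) -> urllib.request.Request:
    """Build (but do not send) a multipart/form-data POST for the OCR endpoint."""
    boundary = uuid.uuid4().hex
    parts = []

    # Plain text form fields such as pages or temperature.
    for name, value in (fields or {}).items():
        text_part = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f"{value}\r\n"
        )
        parts.append(text_part.encode("utf-8"))

    # The document itself, sent as a binary part named "file" (assumption).
    file_header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    )
    parts.append(file_header.encode("utf-8") + content + b"\r\n")

    # Closing boundary marks the end of the multipart body.
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))

    return urllib.request.Request(
        OCR_URL,
        data=b"".join(parts),
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
```

Passing the returned object to `urllib.request.urlopen` would send it; building the request separately keeps the encoding step easy to inspect.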

model
string | null

The technical_name of an enabled parser model. If omitted or null, the platform default is used.

pages
string
default:all

Pages to parse. Accepted formats: "all", a single range ("1-5"), a comma-separated list ("1,3,7"), or a mix of both ("2-4,8").
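
To make the accepted formats concrete, here is a small client-side sketch that expands a `pages` value into explicit page numbers. It assumes pages are numbered from 1 and that ranges are inclusive, which matches the `page_number` values in the example response but is not stated explicitly; `expand_pages` is a hypothetical helper, and the server may treat out-of-range values differently than this check does.

```python
def expand_pages(spec: str, total_pages: int) -> list:
    """Expand a spec such as "2-4,8" into a sorted list of page numbers."""
    if spec.strip().lower() == "all":
        return list(range(1, total_pages + 1))

    selected = set()
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            # "2-4" means pages 2, 3 and 4 (inclusive range, assumed).
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
        else:
            start = end = int(part)
        if start < 1 or end < start or end > total_pages:
            raise ValueError(f"invalid page range: {part!r}")
        selected.update(range(start, end + 1))
    return sorted(selected)
```

Running such a check before the upload can catch a malformed range without spending a request on it.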

enable_antilooping
boolean
default:true

When enabled, detects repetitive generation loops and gradually increases the sampling temperature to break out of them.

temperature
number<double> | null

Controls the randomness of the model output. Lower values (e.g. 0.1) produce more deterministic results, higher values (e.g. 1.0) increase variety. Range: 0.0–2.0.

Required range: 0 <= x <= 2

max_tokens
integer | null

Maximum number of tokens the model can generate per page. Higher values allow longer outputs but increase processing time. Range: 1–16384.

Required range: 1 <= x <= 16384

repetition_penalty
number<double> | null

Penalizes repeated tokens to reduce redundant output. A value of 1.0 applies no penalty; higher values (e.g. 1.2) discourage repetition more strongly. Range: 1.0–2.0.

Required range: 1 <= x <= 2
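
The three sampling parameters and their documented ranges can be checked client-side before sending. The sketch below is one possible way to do that; `sampling_fields` and `LIMITS` are hypothetical names, and it assumes that leaving a field out of the form behaves the same as sending null.

```python
from typing import Optional

# Inclusive ranges copied from the parameter descriptions above.
LIMITS = {
    "temperature": (0.0, 2.0),
    "max_tokens": (1, 16384),
    "repetition_penalty": (1.0, 2.0),
}


def sampling_fields(temperature: Optional[float] = None,
                    max_tokens: Optional[int] = None,
                    repetition_penalty: Optional[float] = None) -> dict:
    """Validate the sampling parameters and return them as string form fields."""
    given = {
        "temperature": temperature,
        "max_tokens": max_tokens,
        "repetition_penalty": repetition_penalty,
    }
    fields = {}
    for name, value in given.items():
        if value is None:
            continue  # omitted on purpose (assumed equivalent to null)
        low, high = LIMITS[name]
        if not low <= value <= high:
            raise ValueError(f"{name} must be between {low} and {high}, got {value}")
        fields[name] = str(value)
    return fields
```

For example, `sampling_fields(temperature=0.2, max_tokens=5888)` yields the two string fields matching the `sampling_params` in the example response, while a temperature of 2.5 raises `ValueError` locally.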

Response

Document parsed successfully.

model
string
required

total_pages
integer
required

pages_parsed
integer[]
required

processing_time_ms
integer
required

enable_antilooping
boolean
required

sampling_params
object
required

pages
object[]
required
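
A short sketch of consuming the response follows. It relies on the `page_number` and `markdown` keys shown in the example response above; the helper names `join_markdown` and `missing_pages` are illustrative only.

```python
def join_markdown(response: dict) -> str:
    """Concatenate the per-page Markdown in page order, separated by blank lines."""
    pages = sorted(response["pages"], key=lambda page: page["page_number"])
    return "\n\n".join(page["markdown"] for page in pages)


def missing_pages(response: dict) -> list:
    """Return page numbers listed in pages_parsed that have no entry in pages."""
    returned = {page["page_number"] for page in response["pages"]}
    return [number for number in response["pages_parsed"] if number not in returned]
```

Sorting by `page_number` avoids depending on the order in which the `pages` array arrives, which the reference does not specify.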