Overview
The V3 Files API provides a simple, one-step upload process for adding documents to your Paradigm workspace. Upload a file with a single API call, and Paradigm handles the rest: parsing, indexing, and making your documents searchable.
Key features:
Asynchronous processing - Files are queued and processed in the background, so uploads return immediately
Automatic upload sessions - Files are automatically grouped into sessions for efficient batch processing
Direct tag assignment - Organize documents by applying tags during upload
Flexible configuration - Override parser selection or use automatic detection (default)
Progress tracking - Monitor processing status via GET endpoints
Prerequisites
Required
Paradigm API key: Generate one at /settings/api-key in your Paradigm instance
Workspace ID: The ID of the workspace where documents will be stored
How to Get Your Workspace ID
You can find your workspace ID in several ways:
From the admin panel: Navigate to your workspace in the admin interface and check the URL or workspace details
From the API: List the workspaces you have access to with GET /api/v3/workspaces
curl "$PARADIGM_BASE_URL/api/v3/workspaces" \
  -H "Authorization: Bearer $PARADIGM_API_KEY"
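Once you have the JSON listing, picking the right ID out in Python takes only a small helper. This is a sketch: `find_workspace_id` is a hypothetical function, and it assumes each workspace object carries `id` and `name` fields (as seen in the upload response shown later on this page).

```python
def find_workspace_id(workspaces, name):
    """Return the ID of the first workspace whose name matches, or None.

    `workspaces` is the decoded list of workspace objects from
    GET /api/v3/workspaces, each assumed to carry "id" and "name".
    """
    for ws in workspaces:
        if ws.get("name") == name:
            return ws["id"]
    return None
```

For example, if the listing response wraps its items in a `results` array (an assumption mirroring the file listing endpoint), you could call `find_workspace_id(response.json()["results"], "My Workspace")`.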
File Requirements
Maximum file size: 25MB per file by default (configurable via the MAX_DOCUMENT_SIZE setting on your instance)
Supported formats: PDF, DOCX, DOC, PPTX, PPT, TXT, MD, Markdown, HTML, XLSX, XLS, CSV, RTF, ODT, ODS, ODP and more
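It can be worth checking files against the size limit client-side before spending an upload on them. A minimal sketch, assuming the default 25MB cap; `exceeds_size_limit` is a hypothetical helper, and the limit should be adjusted if your instance overrides MAX_DOCUMENT_SIZE:

```python
from pathlib import Path

# Default per-file cap documented above; instances may configure a different one.
DEFAULT_MAX_BYTES = 25 * 1024 * 1024


def exceeds_size_limit(path, max_bytes=DEFAULT_MAX_BYTES):
    """Return True if the file at `path` is larger than the allowed size."""
    return Path(path).stat().st_size > max_bytes
```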
Quick Start
The simplest upload requires just a file and workspace ID:
Python
import os

import requests

api_key = os.getenv("PARADIGM_API_KEY")
base_url = os.getenv("PARADIGM_BASE_URL", "https://paradigm.lighton.ai")

with open("document.pdf", "rb") as f:
    response = requests.post(
        f"{base_url}/api/v3/files",
        headers={"Authorization": f"Bearer {api_key}"},
        files={"file": f},
        data={"workspace_id": 42},
    )
print(response.json())
cURL
curl "$PARADIGM_BASE_URL/api/v3/files" \
  -H "Authorization: Bearer $PARADIGM_API_KEY" \
  -F "file=@document.pdf" \
  -F "workspace_id=42"
Response
{
  "id": 12345,
  "filename": "document.pdf",
  "workspace": { "id": 42, "name": "My Workspace", "workspace_type": "custom" },
  "summaries": [],
  "title": "document",
  "extension": "pdf",
  "status": "pending",
  "status_vision": null,
  "created_at": "2025-03-01T10:30:00Z",
  "updated_at": "2025-03-01T10:30:00Z",
  "total_pages": 0,
  "tags": [],
  "created_by": { "id": 1, "first_name": "Jane", "last_name": "Doe", "username": "jdoe" },
  "upload_session_uuid": "550e8400-e29b-41d4-a716-446655440000",
  "external_metadata": null,
  "message": "File queued for processing"
}
Upload Parameters
Required Parameters
| Parameter | Type | Description |
| --- | --- | --- |
| file | binary | The file to upload |
| workspace_id | integer | Workspace where the document will be stored |
Optional Parameters
| Parameter | Type | Description |
| --- | --- | --- |
| title | string | Custom title for the document (defaults to filename without extension) |
| filename | string | Override the uploaded filename |
| parser | string | Specify ingestion pipeline (e.g., "v2.2.1", "v2.1"); defaults to automatic selection |
| tags | array of integers | Tag IDs to assign to the document on upload (tags must belong to your company and you must have permission to use them) |
Examples with Optional Parameters
Custom title and tags:
response = requests.post(
    f"{base_url}/api/v3/files",
    headers={"Authorization": f"Bearer {api_key}"},
    files={"file": open("Q4_report.pdf", "rb")},
    data={
        "workspace_id": 42,
        "title": "Q4 Financial Report 2025",
        "tags": [1, 2],  # Tag IDs
    },
)
Custom parser:
response = requests.post(
    f"{base_url}/api/v3/files",
    headers={"Authorization": f"Bearer {api_key}"},
    files={"file": open("technical_doc.pdf", "rb")},
    data={
        "workspace_id": 42,
        "parser": "v2.2.1",
    },
)
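The examples above assume the request succeeds. A small hedged check can make failures visible early; `extract_file_id` is a hypothetical helper, and treating 200/201 as success is an assumption about the endpoint's status codes (the exact error body format is not documented here, so it is simply echoed back):

```python
def extract_file_id(status_code, body):
    """Return the new file's ID from an upload response, or raise.

    `body` is the decoded JSON response. 200/201 are treated as success,
    which is an assumption; anything else raises with the raw body attached.
    """
    if status_code in (200, 201) and "id" in body:
        return body["id"]
    raise RuntimeError(f"upload failed ({status_code}): {body}")
```

Typical usage would be `file_id = extract_file_id(response.status_code, response.json())`.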
Tracking Upload Status
After uploading, files are processed asynchronously. Track their progress using the file ID.
Check Individual File Status
file_id = 12345
response = requests.get(
    f"{base_url}/api/v3/files/{file_id}",
    headers={"Authorization": f"Bearer {api_key}"},
)
status = response.json()["status"]
print(f"Processing status: {status}")
Understanding Status Values
| Status | Description |
| --- | --- |
| pending | File uploaded, waiting to be processed |
| parsing | Currently being parsed and processed |
| embedded | Successfully processed and available for search |
| failed | Processing failed (check the status_detail field for the error) |
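Since only embedded and failed are terminal, status checks are typically wrapped in a polling loop. A minimal sketch; `wait_for_status` is a hypothetical helper that takes any zero-argument callable returning the current status string, so it can be wired to the GET request shown above:

```python
import time

# Terminal states from the table above; all others mean "still processing".
TERMINAL_STATUSES = {"embedded", "failed"}


def wait_for_status(fetch_status, poll_interval=5.0, timeout=600.0):
    """Poll fetch_status() until it returns 'embedded' or 'failed'.

    fetch_status is any callable returning the current status string
    (e.g. wrapping GET /api/v3/files/{file_id} as shown above).
    Raises TimeoutError if no terminal status appears within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        time.sleep(poll_interval)
    raise TimeoutError("file did not reach a terminal status in time")
```

For example: `wait_for_status(lambda: requests.get(url, headers=headers).json()["status"])`.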
Batch Upload: Multiple Files
When uploading multiple files, Paradigm automatically creates an upload session to group your files together. Files are queued and processed asynchronously in the background, allowing you to upload large batches without waiting for processing to complete.
Each file upload returns an upload_session_uuid that you can use to track all files in the batch. The upload session handles rate limiting and ensures efficient processing of your documents.
For uploading many files efficiently, use the provided batch upload script or implement your own concurrent upload logic.
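Your own concurrent upload logic can be as simple as a thread pool. A sketch, not the batch script itself: `upload_one` is a hypothetical wrapper around the single-file POST from Quick Start, and the default worker count of 10 mirrors the batch script's default described below:

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def upload_all(paths, upload_one, max_workers=10):
    """Upload files concurrently; returns {path: result-or-exception}.

    `upload_one` is any callable taking a file path and returning the
    parsed upload response (e.g. the requests.post call from Quick Start).
    Failures are recorded per file instead of aborting the whole batch.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(upload_one, p): p for p in paths}
        for future in as_completed(futures):
            path = futures[future]
            try:
                results[path] = future.result()
            except Exception as exc:  # keep going; record the failure
                results[path] = exc
    return results
```

Each successful result would carry the `upload_session_uuid` field shown in the response above, which is what ties the batch together.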
Track All Files in a Batch Upload
After uploading multiple files, you can filter by the upload_session_uuid returned in each upload response:
upload_session_uuid = "550e8400-e29b-41d4-a716-446655440000"
response = requests.get(
    f"{base_url}/api/v3/files",
    headers={"Authorization": f"Bearer {api_key}"},
    params={"upload_session_uuid": upload_session_uuid},
)
files = response.json()["results"]
for file in files:
    print(f"{file['filename']}: {file['status']}")
This is particularly useful when monitoring the progress of batch uploads.
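The filtered listing also makes it easy to get a batch-level view. A sketch with two hypothetical helpers, assuming each entry carries the `status` field described above:

```python
from collections import Counter


def summarize_statuses(files):
    """Count files per processing status, e.g. Counter({'embedded': 8, 'pending': 2})."""
    return Counter(f["status"] for f in files)


def batch_done(files):
    """True once every file is in a terminal state (embedded or failed)."""
    return all(f["status"] in ("embedded", "failed") for f in files)
```

Combined with polling, `batch_done` gives a simple condition for "the whole batch has finished processing".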
Using the Batch Upload Script
batch_upload_v3.py Production-ready async batch upload script with concurrent uploads, progress tracking, and resume capability.
Requirements:
Python 3.10 or higher
uv — dependencies are installed automatically when running with uv run
Basic usage:
uv run batch_upload_v3.py \
  --api-key="your_api_key" \
  --base-url="https://paradigm.lighton.ai" \
  --files-dir="/path/to/documents" \
  --workspace-id=42
With options:
uv run batch_upload_v3.py \
  --files-dir="/path/to/documents" \
  --workspace-id=42 \
  --batch-size=20 \
  --max-fails=5 \
  --tags=1,2,3 \
  --state-file="upload_state.json"
Upload only specific file types:
# Only upload PDFs and Word documents
uv run batch_upload_v3.py \
  --files-dir="/path/to/documents" \
  --workspace-id=42 \
  --include-extensions=pdf,docx,doc

# Upload all files except temporary files and system files
uv run batch_upload_v3.py \
  --files-dir="/path/to/documents" \
  --workspace-id=42 \
  --exclude-extensions=tmp,log,.DS_Store,.gitkeep
Script Arguments
| Argument | Description | Default |
| --- | --- | --- |
| --api-key | Paradigm API key (or set PARADIGM_API_KEY env var) | Required |
| --base-url | Paradigm instance URL (or set PARADIGM_BASE_URL env var) | https://paradigm.lighton.ai |
| --files-dir | Directory containing files to upload (scans recursively) | Required |
| --workspace-id | Workspace ID where files will be stored | Required |
| --batch-size | Number of concurrent uploads (max: 50) | 10 |
| --max-fails | Stop after N failures (must be >= 1) | 1 |
| --tags | Comma-separated tag IDs to apply to all files | None |
| --state-file | JSON file to track progress and enable resume | None |
| --include-extensions | Only upload files with these extensions or filenames (e.g., pdf,docx,txt) | None (all files) |
| --exclude-extensions | Skip files with these extensions or filenames (e.g., tmp,log,.DS_Store) | None |
Script Features
High throughput - concurrent uploads optimized for speed (default: 10 concurrent, max: 50)
Recursive scanning - automatically finds all files in subdirectories
Progress tracking - real-time progress bar with upload statistics
Error resilience - stops after first failure by default (configurable with --max-fails)
Smart error handling - automatically skips files with unsupported extensions and files >100MB (these don't count as failures)
Resume capability - use --state-file to resume interrupted uploads
Bulk tagging - apply tags to all uploaded files automatically
Example: Resume After Interruption
If your upload was interrupted, resume using the state file:
# First attempt (interrupted)
uv run batch_upload_v3.py \
  --files-dir="/path/to/documents" \
  --workspace-id=42 \
  --state-file="upload_state.json"

# Resume (skips already uploaded files)
uv run batch_upload_v3.py \
  --files-dir="/path/to/documents" \
  --workspace-id=42 \
  --state-file="upload_state.json"
Migration from API V2
If you’re currently using the V2 upload API, here’s what changed:
What’s Different
V2 required two steps:
Create an upload session: POST /api/v2/upload-session
Upload files to the session: POST /api/v2/upload-session/{uuid}
V3 is a single step:
Just upload directly: POST /api/v3/files
What Was Removed
The following concepts from V2 are no longer needed:
Upload session management - Sessions are created automatically in the background
Collection types - Simply use workspace_id instead of collection_type and collection
OCR configuration - Processing settings are applied automatically (or override with parser parameter)
Session activation/deactivation - Handled automatically
purpose field - No longer needed
What’s New
V3 adds new capabilities not available in V2:
Direct tag assignment - Use the tags parameter to tag documents on upload
Simplified status tracking - Filter files by upload_session_uuid to track batch uploads
Tracking Progress
V2: Track session status with GET /api/v2/upload-session/{uuid}
V3: Filter files by upload session UUID:
# Get upload session UUID from upload response
upload_session_uuid = response.json()[ "upload_session_uuid" ]
# Track all files in this batch
files = requests.get(
f " { base_url } /api/v3/files" ,
headers = { "Authorization" : f "Bearer { api_key } " },
params = { "upload_session_uuid" : upload_session_uuid}
).json()[ "results" ]
The V3 API is more intuitive and requires less code while maintaining the same reliability and performance as V2.