Overview

The V3 Files API provides a simple, one-step upload process for adding documents to your Paradigm workspace. Upload a file with a single API call, and Paradigm handles the rest - parsing, indexing, and making your documents searchable.

Key features:
  • Asynchronous processing - Files are queued and processed in the background, so uploads return immediately
  • Automatic upload sessions - Files are automatically grouped into sessions for efficient batch processing
  • Direct tag assignment - Organize documents by applying tags during upload
  • Flexible configuration - Override parser selection or use automatic detection (default)
  • Progress tracking - Monitor processing status via GET endpoints

Prerequisites

Required

  • Paradigm API key: Generate one at /settings/api-key in your Paradigm instance
  • Workspace ID: The ID of the workspace where documents will be stored

How to Get Your Workspace ID

You can find your workspace ID in several ways:
  1. From the admin panel: Navigate to your workspace in the admin interface and check the URL or workspace details
  2. From the API: List the workspaces you have access to with GET /api/v3/workspaces
curl $PARADIGM_BASE_URL/api/v3/workspaces \
  -H "Authorization: Bearer $PARADIGM_API_KEY"
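
The same listing can be done in Python. The sketch below makes one assumption: the exact shape of the workspaces response is not documented here, so the helper accepts either a bare JSON list of workspace objects or a `results` wrapper like the one the files endpoint uses.

```python
import requests

def extract_workspace_ids(payload):
    # Response shape is an assumption: accept either a bare list of
    # workspace objects or a {"results": [...]} wrapper.
    workspaces = payload.get("results", payload) if isinstance(payload, dict) else payload
    return [w["id"] for w in workspaces]

def list_workspace_ids(base_url, api_key):
    # GET /api/v3/workspaces with the same bearer auth as the cURL example.
    response = requests.get(
        f"{base_url}/api/v3/workspaces",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    response.raise_for_status()
    return extract_workspace_ids(response.json())
```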

File Requirements

  • Maximum file size: 25 MB per file by default (configurable via the MAX_DOCUMENT_SIZE configuration key of your instance)
  • Supported formats: PDF, DOCX, DOC, PPTX, PPT, TXT, MD, Markdown, HTML, XLSX, XLS, CSV, RTF, ODT, ODS, ODP and more

Quick Start

The simplest upload requires just a file and workspace ID:

Python

import requests
import os

api_key = os.getenv("PARADIGM_API_KEY")
base_url = os.getenv("PARADIGM_BASE_URL", "https://paradigm.lighton.ai")

with open("document.pdf", "rb") as f:
    response = requests.post(
        f"{base_url}/api/v3/files",
        headers={"Authorization": f"Bearer {api_key}"},
        files={"file": f},
        data={"workspace_id": 42},
    )

print(response.json())

cURL

curl $PARADIGM_BASE_URL/api/v3/files \
  -H "Authorization: Bearer $PARADIGM_API_KEY" \
  -F "file=@document.pdf" \
  -F "workspace_id=42"

Response

{
  "id": 12345,
  "filename": "document.pdf",
  "workspace_id": 42,
  "title": "document",
  "extension": "pdf",
  "status": "pending",
  "status_vision": null,
  "uploaded_at": "2025-03-01T10:30:00Z",
  "updated_at": "2025-03-01T10:30:00Z",
  "total_pages": 0,
  "tags": [],
  "upload_session_uuid": "550e8400-e29b-41d4-a716-446655440000",
  "message": "File queued for processing"
}
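
The fields you will typically keep from this response are `id` (for per-file status checks) and `upload_session_uuid` (for batch tracking). Using the example body above, trimmed to the relevant fields:

```python
# Example response body from POST /api/v3/files (abridged from above).
upload_response = {
    "id": 12345,
    "filename": "document.pdf",
    "status": "pending",
    "upload_session_uuid": "550e8400-e29b-41d4-a716-446655440000",
}

file_id = upload_response["id"]  # for GET /api/v3/files/{id}
session_uuid = upload_response["upload_session_uuid"]  # for batch filtering

print(f"File {file_id} queued in session {session_uuid}")
```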

Upload Parameters

Required Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| file | binary | The file to upload |
| workspace_id | integer | Workspace where the document will be stored |

Optional Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| title | string | Custom title for the document (defaults to filename without extension) |
| filename | string | Override the uploaded filename |
| parser | string | Specify ingestion pipeline (e.g., "v2.2.1", "v2.1") - defaults to automatic selection |
| tags | array of integers | Tag IDs to assign to the document on upload (tags must belong to your company and you must have permission to use them) |

Examples with Optional Parameters

Custom title and tags:
with open("Q4_report.pdf", "rb") as f:
    response = requests.post(
        f"{base_url}/api/v3/files",
        headers={"Authorization": f"Bearer {api_key}"},
        files={"file": f},
        data={
            "workspace_id": 42,
            "title": "Q4 Financial Report 2025",
            "tags": [1, 2],  # Tag IDs
        },
    )
Custom parser:
with open("technical_doc.pdf", "rb") as f:
    response = requests.post(
        f"{base_url}/api/v3/files",
        headers={"Authorization": f"Bearer {api_key}"},
        files={"file": f},
        data={
            "workspace_id": 42,
            "parser": "v2.2.1",
        },
    )

Tracking Upload Status

After uploading, files are processed asynchronously. Track their progress using the file ID.

Check Individual File Status

file_id = 12345
response = requests.get(
    f"{base_url}/api/v3/files/{file_id}",
    headers={"Authorization": f"Bearer {api_key}"}
)

status = response.json()["status"]
print(f"Processing status: {status}")

Understanding Status Values

| Status | Description |
| --- | --- |
| pending | File uploaded, waiting to be processed |
| parsing | Currently being parsed and processed |
| embedded | Successfully processed and available for search |
| failed | Processing failed (check status_detail field for error) |
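
A simple way to wait for a file to finish is to poll the status endpoint until it reaches a terminal state. The sketch below does this; the 5-second interval and 10-minute timeout are arbitrary choices, not API requirements:

```python
import time
import requests

# "embedded" and "failed" are the two terminal states from the table above.
TERMINAL_STATUSES = {"embedded", "failed"}

def is_terminal(status):
    return status in TERMINAL_STATUSES

def wait_for_processing(base_url, api_key, file_id, interval=5, timeout=600):
    # Poll GET /api/v3/files/{file_id} until the file is processed or fails.
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        response = requests.get(
            f"{base_url}/api/v3/files/{file_id}",
            headers={"Authorization": f"Bearer {api_key}"},
        )
        response.raise_for_status()
        status = response.json()["status"]
        if is_terminal(status):
            return status
        time.sleep(interval)
    raise TimeoutError(f"File {file_id} still processing after {timeout}s")
```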

Batch Upload: Multiple Files

When uploading multiple files, Paradigm automatically creates an upload session to group them together. Files are queued and processed asynchronously in the background, so you can upload large batches without waiting for processing to complete. Each file upload returns an upload_session_uuid that identifies the batch, and the session handles rate limiting to ensure efficient processing of your documents. To upload many files efficiently, use the provided batch upload script or implement your own concurrent upload logic.
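
If you roll your own concurrent upload logic instead of using the script described below, a thread pool is usually enough, since each upload is a single blocking HTTP call. A minimal sketch; the concurrency of 10 mirrors the script's default and is otherwise arbitrary:

```python
import requests
from concurrent.futures import ThreadPoolExecutor, as_completed

def upload_one(base_url, api_key, path, workspace_id):
    # Single-file upload, identical to the Quick Start call.
    with open(path, "rb") as f:
        response = requests.post(
            f"{base_url}/api/v3/files",
            headers={"Authorization": f"Bearer {api_key}"},
            files={"file": f},
            data={"workspace_id": workspace_id},
        )
    response.raise_for_status()
    return response.json()

def upload_many(base_url, api_key, paths, workspace_id, concurrency=10):
    # Upload files concurrently; returns upload responses in completion order.
    results = []
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        futures = [
            pool.submit(upload_one, base_url, api_key, p, workspace_id)
            for p in paths
        ]
        for future in as_completed(futures):
            results.append(future.result())
    return results
```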

Track All Files in a Batch Upload

After uploading multiple files, you can filter by the upload_session_uuid returned in each upload response:
upload_session_uuid = "550e8400-e29b-41d4-a716-446655440000"

response = requests.get(
    f"{base_url}/api/v3/files",
    headers={"Authorization": f"Bearer {api_key}"},
    params={"upload_session_uuid": upload_session_uuid}
)

files = response.json()["results"]
for file in files:
    print(f"{file['filename']}: {file['status']}")
This is particularly useful when monitoring the progress of batch uploads.
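
To turn that per-file listing into an overall progress summary, you can count files per status. A small sketch:

```python
from collections import Counter

def batch_progress(files):
    # `files` is the "results" list from GET /api/v3/files filtered by
    # upload_session_uuid. Returns per-status counts and whether the whole
    # batch has reached a terminal state ("embedded" or "failed").
    counts = Counter(f["status"] for f in files)
    finished = counts.get("embedded", 0) + counts.get("failed", 0)
    return dict(counts), finished == len(files)
```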

Using the Batch Upload Script

batch_upload_v3.py

Production-ready async batch upload script with concurrent uploads, progress tracking, and resume capability.
Requirements:
  • Python 3.10 or higher
  • uv — dependencies are installed automatically when running with uv run
Basic usage:
uv run batch_upload_v3.py \
  --api-key="your_api_key" \
  --base-url="https://paradigm.lighton.ai" \
  --files-dir="/path/to/documents" \
  --workspace-id=42
With options:
uv run batch_upload_v3.py \
  --files-dir="/path/to/documents" \
  --workspace-id=42 \
  --batch-size=20 \
  --max-fails=5 \
  --tags=1,2,3 \
  --state-file="upload_state.json"
Upload only specific file types:
# Only upload PDFs and Word documents
uv run batch_upload_v3.py \
  --files-dir="/path/to/documents" \
  --workspace-id=42 \
  --include-extensions=pdf,docx,doc

# Upload all files except temporary files and system files
uv run batch_upload_v3.py \
  --files-dir="/path/to/documents" \
  --workspace-id=42 \
  --exclude-extensions=tmp,log,.DS_Store,.gitkeep

Script Arguments

| Argument | Description | Default |
| --- | --- | --- |
| --api-key | Paradigm API key (or set PARADIGM_API_KEY env var) | Required |
| --base-url | Paradigm instance URL (or set PARADIGM_BASE_URL env var) | https://paradigm.lighton.ai |
| --files-dir | Directory containing files to upload (scans recursively) | Required |
| --workspace-id | Workspace ID where files will be stored | Required |
| --batch-size | Number of concurrent uploads (max: 50) | 10 |
| --max-fails | Stop after N failures (must be >= 1) | 1 |
| --tags | Comma-separated tag IDs to apply to all files | None |
| --state-file | JSON file to track progress and enable resume | None |
| --include-extensions | Only upload files with these extensions or filenames (e.g., pdf,docx,txt) | None (all files) |
| --exclude-extensions | Skip files with these extensions or filenames (e.g., tmp,log,.DS_Store) | None |

Script Features

  • High throughput - concurrent uploads optimized for speed (default: 10 concurrent, max: 50)
  • Recursive scanning - automatically finds all files in subdirectories
  • Progress tracking - real-time progress bar with upload statistics
  • Error resilience - stops after first failure by default (configurable with --max-fails)
  • Smart error handling - automatically skips files with unsupported extensions and files >100MB (these don't count as failures)
  • Resume capability - use --state-file to resume interrupted uploads
  • Bulk tagging - apply tags to all uploaded files automatically

Example: Resume After Interruption

If your upload was interrupted, resume using the state file:
# First attempt (interrupted)
uv run batch_upload_v3.py \
  --files-dir="/path/to/documents" \
  --workspace-id=42 \
  --state-file="upload_state.json"

# Resume (skips already uploaded files)
uv run batch_upload_v3.py \
  --files-dir="/path/to/documents" \
  --workspace-id=42 \
  --state-file="upload_state.json"

Migration from API V2

If you’re currently using the V2 upload API, here’s what changed:

What’s Different

V2 required two steps:
  1. Create an upload session: POST /api/v2/upload-session
  2. Upload files to the session: POST /api/v2/upload-session/{uuid}
V3 is a single step:
  • Just upload directly: POST /api/v3/files

What Was Removed

The following concepts from V2 are no longer needed:
  • Upload session management - Sessions are created automatically in the background
  • Collection types - Simply use workspace_id instead of collection_type and collection
  • OCR configuration - Processing settings are applied automatically (or override with parser parameter)
  • Session activation/deactivation - Handled automatically
  • purpose field - No longer needed

What’s New

V3 adds new capabilities not available in V2:
  • Direct tag assignment - Use the tags parameter to tag documents on upload
  • Simplified status tracking - Filter files by upload_session_uuid to track batch uploads

Tracking Progress

V2: Track session status with GET /api/v2/upload-session/{uuid}

V3: Filter files by upload session UUID:
# Get upload session UUID from upload response
upload_session_uuid = response.json()["upload_session_uuid"]

# Track all files in this batch
files = requests.get(
    f"{base_url}/api/v3/files",
    headers={"Authorization": f"Bearer {api_key}"},
    params={"upload_session_uuid": upload_session_uuid}
).json()["results"]
The V3 API is more intuitive and requires less code while maintaining the same reliability and performance as V2.