Overview

The V3 Files API provides a simple, one-step upload process for adding documents to your Paradigm workspace. Upload a file with a single API call, and Paradigm handles the rest - parsing, indexing, and making your documents searchable.

Key features:
  • Asynchronous processing - Files are queued and processed in the background, so uploads return immediately
  • Automatic upload sessions - Files are automatically grouped into sessions for efficient batch processing
  • Direct tag assignment - Organize documents by applying tags during upload
  • Flexible configuration - Override parser selection or use automatic detection (default)
  • Progress tracking - Monitor processing status via GET endpoints

Prerequisites

Required

  • Paradigm API key: Generate one at /settings/api-key in your Paradigm instance
  • Workspace ID: The ID of the workspace where documents will be stored

How to Get Your Workspace ID

You can find your workspace ID in several ways:
  1. From the admin panel: Navigate to your workspace in the admin interface and check the URL or workspace details
  2. From the API: List the workspaces you have access to with GET /api/v3/workspaces
curl $PARADIGM_BASE_URL/api/v3/workspaces \
  -H "Authorization: Bearer $PARADIGM_API_KEY"
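
The same listing can be done in Python. The sketch below makes one assumption: the exact shape of the workspaces response is not documented here, so the helper accepts either a bare JSON list of workspace objects or a `results` wrapper like the one the files endpoint uses.

```python
import requests

def extract_workspace_ids(payload):
    # Response shape is an assumption: accept either a bare list of
    # workspace objects or a {"results": [...]} wrapper.
    workspaces = payload.get("results", payload) if isinstance(payload, dict) else payload
    return [w["id"] for w in workspaces]

def list_workspace_ids(base_url, api_key):
    # GET /api/v3/workspaces with the same bearer auth as the cURL example.
    response = requests.get(
        f"{base_url}/api/v3/workspaces",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    response.raise_for_status()
    return extract_workspace_ids(response.json())
```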

File Requirements

  • Maximum file size: 25 MB per file by default (configurable via the MAX_DOCUMENT_SIZE configuration key of your instance)
  • Supported formats: PDF, DOCX, DOC, PPTX, PPT, TXT, MD, Markdown, HTML, XLSX, XLS, CSV, RTF, ODT, ODS, ODP and more

Quick Start

The simplest upload requires just a file and workspace ID:

Python

import requests
import os

api_key = os.getenv("PARADIGM_API_KEY")
base_url = os.getenv("PARADIGM_BASE_URL", "https://paradigm.lighton.ai")

with open("document.pdf", "rb") as f:
    response = requests.post(
        f"{base_url}/api/v3/files",
        headers={"Authorization": f"Bearer {api_key}"},
        files={"file": f},
        data={"workspace_id": 42},
    )

print(response.json())

cURL

curl $PARADIGM_BASE_URL/api/v3/files \
  -H "Authorization: Bearer $PARADIGM_API_KEY" \
  -F "file=@document.pdf" \
  -F "workspace_id=42"

Response

{
  "id": 12345,
  "filename": "document.pdf",
  "workspace_id": 42,
  "title": "document",
  "extension": "pdf",
  "status": "pending",
  "status_vision": null,
  "uploaded_at": "2025-03-01T10:30:00Z",
  "updated_at": "2025-03-01T10:30:00Z",
  "total_pages": 0,
  "tags": [],
  "upload_session_uuid": "550e8400-e29b-41d4-a716-446655440000",
  "message": "File queued for processing"
}
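
The fields you will typically keep from this response are `id` (for per-file status checks) and `upload_session_uuid` (for batch tracking). Using the example body above, trimmed to the relevant fields:

```python
# Example response body from POST /api/v3/files (abridged from above).
upload_response = {
    "id": 12345,
    "filename": "document.pdf",
    "status": "pending",
    "upload_session_uuid": "550e8400-e29b-41d4-a716-446655440000",
}

file_id = upload_response["id"]  # for GET /api/v3/files/{id}
session_uuid = upload_response["upload_session_uuid"]  # for batch filtering

print(f"File {file_id} queued in session {session_uuid}")
```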

Upload Parameters

Required Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| file | binary | The file to upload |
| workspace_id | integer | Workspace where the document will be stored |

Optional Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| title | string | Custom title for the document (defaults to filename without extension) |
| filename | string | Override the uploaded filename |
| parser | string | Specify ingestion pipeline (e.g., "v2.2.1", "v2.1") - defaults to automatic selection |
| tags | array of integers | Tag IDs to assign to the document on upload (tags must belong to your company and you must have permission to use them) |

Examples with Optional Parameters

Custom title and tags:
with open("Q4_report.pdf", "rb") as f:
    response = requests.post(
        f"{base_url}/api/v3/files",
        headers={"Authorization": f"Bearer {api_key}"},
        files={"file": f},
        data={
            "workspace_id": 42,
            "title": "Q4 Financial Report 2025",
            "tags": [1, 2],  # Tag IDs
        },
    )
Custom parser:
with open("technical_doc.pdf", "rb") as f:
    response = requests.post(
        f"{base_url}/api/v3/files",
        headers={"Authorization": f"Bearer {api_key}"},
        files={"file": f},
        data={
            "workspace_id": 42,
            "parser": "v2.2.1",
        },
    )

Tracking Upload Status

After uploading, files are processed asynchronously. Track their progress using the file ID.

Check Individual File Status

file_id = 12345
response = requests.get(
    f"{base_url}/api/v3/files/{file_id}",
    headers={"Authorization": f"Bearer {api_key}"}
)

status = response.json()["status"]
print(f"Processing status: {status}")

Understanding Status Values

| Status | Description |
| --- | --- |
| pending | File uploaded, waiting to be processed |
| parsing | Currently being parsed and processed |
| embedded | Successfully processed and available for search |
| failed | Processing failed (check status_detail field for error) |
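
A simple way to wait for a file to finish is to poll the status endpoint until it reaches a terminal state. The sketch below does this; the 5-second interval and 10-minute timeout are arbitrary choices, not API requirements:

```python
import time
import requests

# "embedded" and "failed" are the two terminal states from the table above.
TERMINAL_STATUSES = {"embedded", "failed"}

def is_terminal(status):
    return status in TERMINAL_STATUSES

def wait_for_processing(base_url, api_key, file_id, interval=5, timeout=600):
    # Poll GET /api/v3/files/{file_id} until the file is processed or fails.
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        response = requests.get(
            f"{base_url}/api/v3/files/{file_id}",
            headers={"Authorization": f"Bearer {api_key}"},
        )
        response.raise_for_status()
        status = response.json()["status"]
        if is_terminal(status):
            return status
        time.sleep(interval)
    raise TimeoutError(f"File {file_id} still processing after {timeout}s")
```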

Batch Upload: Multiple Files

When uploading multiple files, Paradigm automatically creates an upload session to group them together. Files are queued and processed asynchronously in the background, so you can upload large batches without waiting for processing to complete. Each file upload returns an upload_session_uuid that identifies the batch, and the session handles rate limiting to ensure efficient processing of your documents. To upload many files efficiently, use the provided batch upload script or implement your own concurrent upload logic.
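
If you roll your own concurrent upload logic instead of using the script described below, a thread pool is usually enough, since each upload is a single blocking HTTP call. A minimal sketch; the concurrency of 10 mirrors the script's default and is otherwise arbitrary:

```python
import requests
from concurrent.futures import ThreadPoolExecutor, as_completed

def upload_one(base_url, api_key, path, workspace_id):
    # Single-file upload, identical to the Quick Start call.
    with open(path, "rb") as f:
        response = requests.post(
            f"{base_url}/api/v3/files",
            headers={"Authorization": f"Bearer {api_key}"},
            files={"file": f},
            data={"workspace_id": workspace_id},
        )
    response.raise_for_status()
    return response.json()

def upload_many(base_url, api_key, paths, workspace_id, concurrency=10):
    # Upload files concurrently; returns upload responses in completion order.
    results = []
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        futures = [
            pool.submit(upload_one, base_url, api_key, p, workspace_id)
            for p in paths
        ]
        for future in as_completed(futures):
            results.append(future.result())
    return results
```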

Track All Files in a Batch Upload

After uploading multiple files, you can filter by the upload_session_uuid returned in each upload response:
upload_session_uuid = "550e8400-e29b-41d4-a716-446655440000"

response = requests.get(
    f"{base_url}/api/v3/files",
    headers={"Authorization": f"Bearer {api_key}"},
    params={"upload_session_uuid": upload_session_uuid}
)

files = response.json()["results"]
for file in files:
    print(f"{file['filename']}: {file['status']}")
This is particularly useful when monitoring the progress of batch uploads.
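
To turn that per-file listing into an overall progress summary, you can count files per status. A small sketch:

```python
from collections import Counter

def batch_progress(files):
    # `files` is the "results" list from GET /api/v3/files filtered by
    # upload_session_uuid. Returns per-status counts and whether the whole
    # batch has reached a terminal state ("embedded" or "failed").
    counts = Counter(f["status"] for f in files)
    finished = counts.get("embedded", 0) + counts.get("failed", 0)
    return dict(counts), finished == len(files)
```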

Using the Batch Upload Script

batch_upload_v3.py

Production-ready async batch upload script with concurrent uploads, progress tracking, and resume capability.
Requirements:
  • Python 3.10 or higher
  • uv — dependencies are installed automatically when running with uv run
Basic usage:
uv run batch_upload_v3.py \
  --api-key="your_api_key" \
  --base-url="https://paradigm.lighton.ai" \
  --files-dir="/path/to/documents" \
  --workspace-id=42
With options:
uv run batch_upload_v3.py \
  --files-dir="/path/to/documents" \
  --workspace-id=42 \
  --batch-size=20 \
  --max-fails=5 \
  --tags=1,2,3 \
  --state-file="upload_state.json"
Upload only specific file types:
# Only upload PDFs and Word documents
uv run batch_upload_v3.py \
  --files-dir="/path/to/documents" \
  --workspace-id=42 \
  --include-extensions=pdf,docx,doc

# Upload all files except temporary files and system files
uv run batch_upload_v3.py \
  --files-dir="/path/to/documents" \
  --workspace-id=42 \
  --exclude-extensions=tmp,log,.DS_Store,.gitkeep

Script Arguments

| Argument | Description | Default |
| --- | --- | --- |
| --api-key | Paradigm API key (or set PARADIGM_API_KEY env var) | Required |
| --base-url | Paradigm instance URL (or set PARADIGM_BASE_URL env var) | https://paradigm.lighton.ai |
| --files-dir | Directory containing files to upload (scans recursively) | Required |
| --workspace-id | Workspace ID where files will be stored | Required |
| --batch-size | Number of concurrent uploads (max: 50) | 10 |
| --max-fails | Stop after N failures (must be >= 1) | 1 |
| --tags | Comma-separated tag IDs to apply to all files | None |
| --state-file | JSON file to track progress and enable resume | None |
| --include-extensions | Only upload files with these extensions or filenames (e.g., pdf,docx,txt) | None (all files) |
| --exclude-extensions | Skip files with these extensions or filenames (e.g., tmp,log,.DS_Store) | None |

Script Features

  • High throughput - concurrent uploads optimized for speed (default: 10 concurrent, max: 50)
  • Recursive scanning - automatically finds all files in subdirectories
  • Progress tracking - real-time progress bar with upload statistics
  • Error resilience - stops after first failure by default (configurable with --max-fails)
  • Smart error handling - automatically skips files with unsupported extensions and files >100MB (these don't count as failures)
  • Resume capability - use --state-file to resume interrupted uploads
  • Bulk tagging - apply tags to all uploaded files automatically

Example: Resume After Interruption

If your upload was interrupted, resume using the state file:
# First attempt (interrupted)
uv run batch_upload_v3.py \
  --files-dir="/path/to/documents" \
  --workspace-id=42 \
  --state-file="upload_state.json"

# Resume (skips already uploaded files)
uv run batch_upload_v3.py \
  --files-dir="/path/to/documents" \
  --workspace-id=42 \
  --state-file="upload_state.json"

Migration from API V2

If you’re currently using the V2 upload API, here’s what changed:

What’s Different

V2 required two steps:
  1. Create an upload session: POST /api/v2/upload-session
  2. Upload files to the session: POST /api/v2/upload-session/{uuid}
V3 is a single step:
  • Just upload directly: POST /api/v3/files

What Was Removed

The following concepts from V2 are no longer needed:
  • Upload session management - Sessions are created automatically in the background
  • Collection types - Simply use workspace_id instead of collection_type and collection
  • OCR configuration - Processing settings are applied automatically (or override with parser parameter)
  • Session activation/deactivation - Handled automatically
  • purpose field - No longer needed

What’s New

V3 adds new capabilities not available in V2:
  • Direct tag assignment - Use the tags parameter to tag documents on upload
  • Simplified status tracking - Filter files by upload_session_uuid to track batch uploads

Tracking Progress

V2: Track session status with GET /api/v2/upload-session/{uuid}

V3: Filter files by upload session UUID:
# Get upload session UUID from upload response
upload_session_uuid = response.json()["upload_session_uuid"]

# Track all files in this batch
files = requests.get(
    f"{base_url}/api/v3/files",
    headers={"Authorization": f"Bearer {api_key}"},
    params={"upload_session_uuid": upload_session_uuid}
).json()["results"]
The V3 API is more intuitive and requires less code while maintaining the same reliability and performance as V2.