Document intake · 5 min

Import and extract source documents

How to upload source files, move them through scan and extraction, and understand the honest status at each step.

Help baseline: 2026-06-15

document import · file upload · extraction pipeline · scan state · source register

Upload a source file

Go to the document import page and use the upload panel to select one or more source files. Supported formats are PDF, DOCX, XLSX, CSV, and plain text.

  • Each file must be under 50 MB and in a supported format.
  • Files are uploaded from your browser directly to object storage rather than being relayed through an intermediate server.
  • The object key and SHA-256 hash are recorded immediately so lineage is traceable from upload.
  • After upload the file enters the scan queue automatically.
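The upload rules above can be checked locally before you select a file. The following is a minimal sketch, not DefenceFile's implementation: the function names, the ".txt" extension for plain text, and the reading of "50 MB" as 50 × 1024 × 1024 bytes are all assumptions for illustration. The hash it produces is a standard SHA-256 digest, which could be compared against the value recorded at upload.

```python
import hashlib
from pathlib import Path

# Assumption: "50 MB" is read as 50 * 1024 * 1024 bytes; the help page does not specify.
MAX_BYTES = 50 * 1024 * 1024
# Assumption: plain text is identified by ".txt"; the other extensions follow the listed formats.
SUPPORTED_EXTENSIONS = {".pdf", ".docx", ".xlsx", ".csv", ".txt"}


def precheck(path: Path) -> list[str]:
    """Return a list of problems; an empty list means the file looks uploadable."""
    problems = []
    if path.suffix.lower() not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported format: {path.suffix or '(none)'}")
    if path.stat().st_size >= MAX_BYTES:
        problems.append("file is not under 50 MB")
    return problems


def sha256_of(path: Path) -> str:
    """Hash the file in 1 MiB chunks so large files are not loaded into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Calling precheck(Path("statement.pdf")) on a small PDF would return an empty list, and sha256_of on the same path returns the hex digest as a 64-character string.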

Scan and extraction pipeline

Each uploaded file moves through a two-stage pipeline: virus scan followed by text extraction. The status of each stage is visible per file in the import queue and source register.

  • Clean scan means the file passed the virus scanner and is ready for extraction.
  • Quarantined means the scan blocked the file — it cannot proceed to extraction until resolved.
  • Queued for extraction means the file cleared scanning and is waiting for the extraction worker.
  • Extracted into evidence means text was successfully extracted and is ready for human review.
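One way to picture the statuses above is as a small state machine. This sketch is illustrative only: the Stage enum, the ALLOWED table, and the "Queued for scan" label are hypothetical names, not part of the product. It covers the successful path and quarantine; failed-extraction outcomes are left out here.

```python
from enum import Enum


class Stage(Enum):
    SCAN_QUEUED = "Queued for scan"  # hypothetical label; the page only mentions a scan queue
    CLEAN_SCAN = "Clean scan"
    QUARANTINED = "Quarantined"
    QUEUED_FOR_EXTRACTION = "Queued for extraction"
    EXTRACTED = "Extracted into evidence"


# Which status may follow which, as described in the list above.
ALLOWED = {
    Stage.SCAN_QUEUED: {Stage.CLEAN_SCAN, Stage.QUARANTINED},
    Stage.CLEAN_SCAN: {Stage.QUEUED_FOR_EXTRACTION},
    Stage.QUARANTINED: set(),  # blocked until resolved; the resolution path is not described
    Stage.QUEUED_FOR_EXTRACTION: {Stage.EXTRACTED},
    Stage.EXTRACTED: set(),  # end of the pipeline; human review follows
}


def can_move(current: Stage, target: Stage) -> bool:
    """Return True if a file in `current` may move straight to `target`."""
    return target in ALLOWED[current]
```

Under this model a quarantined file has no onward transition, which matches the rule that it cannot proceed to extraction until resolved.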

Honest unable-to-extract states

When extraction cannot produce reviewable text, the source row shows an honest Unable to extract state rather than silently creating empty evidence.

  • Unable to extract means the extraction worker ran but found no text — no evidence was created from this file.
  • Unable to extract — scanned PDF means the PDF appears image-only and OCR did not produce text above the quality bar.
  • Both states block the source from entering the review queue until resolved by replacement or re-extraction.
  • A failed extraction implies nothing about the document's content; the source row always shows the honest state.
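The rules above amount to a simple decision: no usable text means no evidence and no entry to the review queue. The sketch below is an assumption-laden illustration, not the extraction worker's logic. The function names are hypothetical, and usable_text stands for whatever text survived extraction and, for image-only PDFs, an OCR quality bar whose threshold this page does not define.

```python
# Status labels copied from the states described above.
EXTRACTED = "Extracted into evidence"
UNABLE = "Unable to extract"
UNABLE_SCANNED = "Unable to extract — scanned PDF"


def classify(usable_text: str, looks_image_only: bool) -> str:
    """Pick the status for a source row after the extraction worker has run.

    usable_text: hypothetical input, the text that passed any quality check.
    looks_image_only: hypothetical flag, True when the PDF appears to hold only images.
    """
    if usable_text.strip():
        return EXTRACTED
    if looks_image_only:
        return UNABLE_SCANNED
    return UNABLE


def can_enter_review(status: str) -> bool:
    """Only successfully extracted sources reach the review queue."""
    return status == EXTRACTED
```

Note that an empty result never falls through to EXTRACTED, so an empty evidence record cannot be created silently in this model.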

Boundary

DefenceFile help explains workflow operation. It does not provide legal advice, create privilege, certify scope, certify reasonable procedures, or guarantee that a statutory defence will succeed.
