How does the document processing pipeline work?

Written by amaise Support

At amaise, every document passes through a fixed sequence of processing stages:

CREATED → OCR → SEGMENTATION → SPLITTING → INDEXING → EXTRACTION → ANALYSIS → ANSWERING → READY
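
The stage order above can be modeled as a simple ordered enumeration. The following is only an illustrative sketch: the names `Stage` and `next_stage` are hypothetical and do not come from amaise's actual code base.

```python
from enum import Enum
from typing import Optional


class Stage(Enum):
    """Pipeline stages, numbered in the order listed above (hypothetical model)."""
    CREATED = 0
    OCR = 1
    SEGMENTATION = 2
    SPLITTING = 3
    INDEXING = 4
    EXTRACTION = 5
    ANALYSIS = 6
    ANSWERING = 7
    READY = 8


def next_stage(stage: Stage) -> Optional[Stage]:
    """Return the stage that follows `stage`, or None once READY is reached."""
    if stage is Stage.READY:
        return None
    return Stage(stage.value + 1)
```

For example, `next_stage(Stage.OCR)` returns `Stage.SEGMENTATION`, and `next_stage(Stage.READY)` returns `None` because READY is the final stage.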

Key features:

  • Idempotent workers: Each stage is handled by a standalone, stateless worker. Processing can be safely repeated in case of errors.

  • Asynchronous communication: Workers communicate via message queues (SQS). Each worker processes one task at a time.

  • Tenant separation: Each task is assigned to a specific tenant. The same tenant isolation controls apply as in the rest of the application.

  • Encrypted storage: Documents are stored in S3 with tenant-specific encryption keys.
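
To illustrate how idempotent, tenant-scoped workers behave, here is a minimal sketch. It is not amaise's implementation: an in-memory `queue.Queue` stands in for SQS, a dictionary stands in for durable storage, and `Task`, `ResultStore`, `handle_task`, and `run_worker` are hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from queue import Queue
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Task:
    """One unit of work; every task carries the tenant it belongs to."""
    tenant_id: str
    document_id: str
    stage: str


class ResultStore:
    """In-memory stand-in for durable storage (assumption for this sketch)."""

    def __init__(self) -> None:
        self._results: Dict[Tuple[str, str, str], str] = {}
        self.writes = 0  # counts how often real work was performed

    def get(self, task: Task) -> Optional[str]:
        return self._results.get((task.tenant_id, task.document_id, task.stage))

    def put(self, task: Task, result: str) -> None:
        # The key includes tenant_id, so tenants never share results.
        self._results[(task.tenant_id, task.document_id, task.stage)] = result
        self.writes += 1


def handle_task(task: Task, store: ResultStore) -> str:
    """Process a task idempotently: a repeated delivery reuses the stored result."""
    existing = store.get(task)
    if existing is not None:
        return existing
    result = f"{task.stage} complete for {task.document_id}"  # placeholder work
    store.put(task, result)
    return result


def run_worker(tasks: "Queue[Task]", store: ResultStore) -> None:
    """Drain the queue, handling exactly one task at a time."""
    while not tasks.empty():
        handle_task(tasks.get(), store)
```

In this sketch, delivering the same task twice (for example, after a retry) performs the work only once, and two tenants that happen to use the same document ID get separate results because the tenant ID is part of the storage key.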