Human in the Loop Document Review UI (Architecture on AWS and MongoDB)

Shawn Wilborne
August 27, 2025
4 min read

Key Takeaways

  • Human in the loop review is the fastest way to reach production reliability.
  • Keep the UI focused on exceptions, not full document editing.
  • Store provenance and audit logs so you can defend every value.
  • Use validation gates to control review workload.

Fully automated document extraction is rare in real operations. The difference between a pilot and a production system is usually a clean human in the loop layer that handles exceptions quickly.

This post describes a pragmatic review UI architecture using AWS for storage and compute, and MongoDB for workflow state.

Why human in the loop is a feature, not a failure

Human review is what lets you:

  • Ship sooner, before you have perfect training data
  • Control risk on sensitive fields
  • Improve continuously through feedback loops

What the review UI should actually do

A useful review UI stays narrowly focused (see the payload sketch after this list):

  • Show the original page and the extracted fields side by side
  • Highlight citations and bounding boxes
  • Let reviewers correct fields with minimal clicks
  • Capture reason codes for failures
  • Support queues and assignments
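
To make that concrete, here is a minimal sketch of the task payload such a UI might render. The interface and field names are assumptions for illustration, not a published schema:

```ts
// Hypothetical shape of a review task as the UI receives it.
interface BoundingBox {
  page: number;
  x: number; // normalized 0..1 page coordinates
  y: number;
  width: number;
  height: number;
}

interface ExtractedField {
  name: string;            // e.g. "invoiceTotal"
  value: string | null;
  confidence: number;      // 0..1 score from the extraction model
  citation?: BoundingBox;  // where on the page the value was read
}

interface ReviewTask {
  taskId: string;
  documentId: string;
  pageImageUrls: string[]; // signed URLs to normalized page images in S3
  fields: ExtractedField[];
  reasonCodes: string[];   // why this task was routed to review
  assignedTo?: string;
  dueAt?: Date;
}
```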

Reference architecture

Storage

  • Raw docs in S3
  • Normalized page images in S3
  • Evidence and redlines in S3

S3 reference: https://docs.aws.amazon.com/AmazonS3/latest/userguide/Welcome.html
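
One way to keep the three artifact types separate is a simple key convention. A sketch using the AWS SDK v3 S3 client; the bucket name and prefixes are assumptions:

```ts
import { S3Client, PutObjectCommand } from "@aws-sdk/client-s3";

const s3 = new S3Client({});
const BUCKET = "doc-review-artifacts"; // hypothetical bucket name

// Hypothetical key convention: one prefix per artifact type.
const keys = {
  raw: (docId: string) => `raw/${docId}.pdf`,
  page: (docId: string, n: number) => `pages/${docId}/page-${n}.png`,
  evidence: (docId: string, field: string) => `evidence/${docId}/${field}.json`,
};

// Store a normalized page image alongside its source document.
async function putPageImage(docId: string, n: number, body: Uint8Array) {
  await s3.send(
    new PutObjectCommand({
      Bucket: BUCKET,
      Key: keys.page(docId, n),
      Body: body,
      ContentType: "image/png",
    })
  );
}
```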

API and workflow

  • Step Functions for orchestration
  • API service for task assignment and updates

Step Functions reference: https://docs.aws.amazon.com/step-functions/latest/dg/welcome.html
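
For orchestration, the API service can start one workflow execution per flagged document. A sketch with the AWS SDK v3 Step Functions client; the state machine ARN and input shape are assumptions:

```ts
import { SFNClient, StartExecutionCommand } from "@aws-sdk/client-sfn";

const sfn = new SFNClient({});

// Hypothetical ARN; in practice this comes from configuration.
const STATE_MACHINE_ARN =
  "arn:aws:states:us-east-1:123456789012:stateMachine:doc-review";

// Start one review workflow execution per document that needs review.
async function startReviewWorkflow(documentId: string, reasonCodes: string[]) {
  await sfn.send(
    new StartExecutionCommand({
      stateMachineArn: STATE_MACHINE_ARN,
      name: `review-${documentId}`, // execution names must be unique per state machine
      input: JSON.stringify({ documentId, reasonCodes }),
    })
  );
}
```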

MongoDB collections

  • review_tasks: status, assignedTo, dueAt
  • extracted_fields: field values, confidence, citations
  • review_actions: diffs, timestamps, reviewer ID
  • audit_log: append only
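
A sketch of indexes that keep queue queries fast. Collection names mirror the list above; the index choices are assumptions to adapt to your query patterns:

```ts
import { MongoClient } from "mongodb";

async function ensureIndexes(client: MongoClient) {
  const db = client.db("doc_review"); // hypothetical database name

  // Queue query: "open tasks for this reviewer, earliest deadline first".
  await db
    .collection("review_tasks")
    .createIndex({ status: 1, assignedTo: 1, dueAt: 1 });

  // Fetch all extracted fields for a document in one query.
  await db.collection("extracted_fields").createIndex({ documentId: 1 });

  // Per-task history of reviewer diffs, newest first.
  await db.collection("review_actions").createIndex({ taskId: 1, timestamp: -1 });

  // Append-only audit log, read back in time order per document.
  await db.collection("audit_log").createIndex({ documentId: 1, at: 1 });
}
```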

MongoDB change streams are useful for pushing task updates to the UI without polling. A minimal sketch with the Node.js driver, where the database and collection names and the notify hook are assumptions:
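
```ts
import { MongoClient } from "mongodb";

async function watchReviewTasks(
  client: MongoClient,
  notify: (taskId: string) => void // e.g. fan out over WebSockets
) {
  const tasks = client.db("doc_review").collection("review_tasks");

  // Watch only updates that touch the status field; fullDocument
  // gives the post-update state of the task.
  const stream = tasks.watch(
    [{ $match: { "updateDescription.updatedFields.status": { $exists: true } } }],
    { fullDocument: "updateLookup" }
  );

  for await (const change of stream) {
    if (change.operationType === "update" && change.fullDocument) {
      notify(change.fullDocument.taskId);
    }
  }
}
```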

Quality gates and confidence thresholds

Two simple gates reduce review volume (see the routing sketch after this list):

  • Only send to review when a required field is missing or below confidence threshold
  • Auto approve when validation rules pass
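
A sketch of the two gates as a single routing function; the threshold value and field metadata are assumptions:

```ts
interface FieldResult {
  name: string;
  value: string | null;
  confidence: number; // 0..1 from the extraction model
  required: boolean;
}

const CONFIDENCE_THRESHOLD = 0.9; // assumed; tune per field in practice

type Route =
  | { decision: "auto_approve" }
  | { decision: "review"; reasonCodes: string[] };

// Gate 1: route to review only when a required field is missing or
// below the confidence threshold. Gate 2: auto-approve when validation passes.
function routeDocument(fields: FieldResult[], validationsPass: boolean): Route {
  const reasonCodes: string[] = [];
  for (const f of fields) {
    if (f.required && f.value === null) {
      reasonCodes.push(`missing:${f.name}`);
    } else if (f.required && f.confidence < CONFIDENCE_THRESHOLD) {
      reasonCodes.push(`low_confidence:${f.name}`);
    }
  }
  if (reasonCodes.length === 0 && validationsPass) {
    return { decision: "auto_approve" };
  }
  if (reasonCodes.length === 0) reasonCodes.push("validation_failed");
  return { decision: "review", reasonCodes };
}
```

The reason codes emitted here can double as the reason codes the UI captures, tying routing decisions directly to the review queue.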

Security and access control

Review systems often contain PII.

Baseline controls:

  • Strong authentication and RBAC
  • Per tenant authorization checks
  • Encrypted storage and short lived signed URLs
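
A sketch of the last two controls together: enforce tenant ownership in the API, then issue a narrowly scoped, short-lived signed URL. The helper names and the ownership check are assumptions:

```ts
import { S3Client, GetObjectCommand } from "@aws-sdk/client-s3";
import { getSignedUrl } from "@aws-sdk/s3-request-presigner";

const s3 = new S3Client({});

// Hypothetical ownership check backed by workflow state in MongoDB.
declare function documentBelongsToTenant(
  docId: string,
  tenantId: string
): Promise<boolean>;

async function pageImageUrl(tenantId: string, docId: string, page: number) {
  // Per-tenant authorization happens in the API, never in the browser.
  if (!(await documentBelongsToTenant(docId, tenantId))) {
    throw new Error("forbidden");
  }
  // One object, one verb, short lifetime.
  return getSignedUrl(
    s3,
    new GetObjectCommand({
      Bucket: "doc-review-artifacts", // hypothetical bucket name
      Key: `pages/${docId}/page-${page}.png`,
    }),
    { expiresIn: 300 } // seconds
  );
}
```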

References:

Lid Vizion OCR demo: https://ocr.lidvizion.ai/

FAQs

Q: How many reviewers do we need? A: Start by measuring the exception rate. A good system reduces review volume over time through validation and model improvements.

Q: Can we use reviewers to improve models? A: Yes. Store corrected values with citations and use them as labeled data for retraining or rule refinement.

Q: How do we prevent reviewers from seeing data they should not? A: Enforce tenant checks in the API, and limit S3 signed URL scope and lifetime.


Written By
Shawn Wilborne
AI Builder