Day 116 of 133

Design document intelligence (OCR + LLM extraction) + DSA review

Hybrid OCR + vision-LM extraction; confidence scoring; human-in-the-loop (HITL) review.

DSA · NeetCode 1-D DP

  • Interview questions to prep

    1. State the DP: define the state, the transition, and the base case explicitly.
    2. Top-down (memoized recursion) vs bottom-up (tabulation) — which is more natural here, and why?
    3. Can you space-optimize from O(n) to O(1)? Show the rolling-window trick.
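
The three questions above can be rehearsed on one small problem. The sketch below uses Climbing Stairs (count the ways to climb n steps taking 1 or 2 at a time) as a stand-in, since the plan does not name a specific problem; the function names are illustrative. State: ways(i) is the number of ways to reach step i. Transition: ways(i) = ways(i-1) + ways(i-2). Base case: ways(0) = ways(1) = 1.

```python
from functools import lru_cache


def climb_top_down(n: int) -> int:
    """Top-down: memoized recursion over the state ways(i)."""
    @lru_cache(maxsize=None)
    def ways(i: int) -> int:
        if i <= 1:  # base cases: ways(0) = ways(1) = 1
            return 1
        return ways(i - 1) + ways(i - 2)  # transition

    return ways(n)


def climb_bottom_up(n: int) -> int:
    """Bottom-up: tabulation, O(n) time and O(n) space."""
    dp = [1] * (n + 1)  # dp[0] and dp[1] already hold the base cases
    for i in range(2, n + 1):
        dp[i] = dp[i - 1] + dp[i - 2]
    return dp[n]


def climb_rolling(n: int) -> int:
    """Rolling window: keep only the last two states, O(1) space."""
    prev, curr = 1, 1  # ways(0), ways(1)
    for _ in range(2, n + 1):
        prev, curr = curr, prev + curr
    return curr


if __name__ == "__main__":
    for n in (0, 1, 2, 5, 10):
        print(n, climb_top_down(n), climb_bottom_up(n), climb_rolling(n))
```

The rolling-window version works because the transition reads only the two most recent states. The top-down version mirrors the recurrence most directly but is bounded by Python's recursion limit for large n, which is one common argument for tabulation here.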

ML System Design · Document intelligence

  • Interview questions to prep

    1. Walk me through designing a system that ingests invoices/contracts and extracts structured fields.
    2. How would you handle handwritten or low-quality scans?
  • Eval & confidence scoring · ML System Design · Anthropic

    Interview questions to prep

    1. How would you flag low-confidence extractions for human review?
    2. What's the unit cost of a human-review pass, and how do you optimize the routing threshold?
  • Interview questions to prep

    1. How would you process a document set that mixes PDFs, scanned images, tables, email attachments, and OCR errors?
    2. How do you validate extracted fields when the source document has conflicting values?
    3. When do you use vision-language models directly vs OCR + layout parsing + LLM extraction?
    4. When would an OCR-free document model beat a hybrid OCR + layout + LLM pipeline?
    5. How would you compare DocOwl-style OCR-free extraction against a conventional OCR pipeline on accuracy, latency, and cost?
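
For the confidence-scoring questions above (flagging low-confidence extractions and optimizing the routing threshold), one way to frame the answer is as a cost minimization over a labeled validation set. The sketch below is a minimal illustration under stated assumptions: `Extraction`, `expected_cost`, and `best_threshold` are hypothetical names, the per-item review and error costs are made-up inputs, and a human review is assumed to always correct the field.

```python
from dataclasses import dataclass


@dataclass
class Extraction:
    confidence: float  # model-reported score in [0, 1] (assumed calibrated enough to rank)
    correct: bool      # ground truth from a labeled validation set


def expected_cost(items, threshold, review_cost, error_cost):
    """Total cost of routing at `threshold`.

    Items below the threshold go to human review (pay review_cost, error assumed fixed).
    Items at or above it are auto-accepted (pay error_cost if the extraction is wrong).
    """
    total = 0.0
    for item in items:
        if item.confidence < threshold:
            total += review_cost
        elif not item.correct:
            total += error_cost
    return total


def best_threshold(items, review_cost, error_cost):
    """Try every observed confidence as a cut point, plus 'review nothing' and 'review all'."""
    candidates = sorted({0.0, *(item.confidence for item in items), 1.01})
    return min(
        candidates,
        key=lambda t: expected_cost(items, t, review_cost, error_cost),
    )


if __name__ == "__main__":
    # Toy validation set; the numbers are illustrative, not measured.
    validation = [
        Extraction(0.95, True),
        Extraction(0.90, True),
        Extraction(0.70, False),
        Extraction(0.60, True),
        Extraction(0.40, False),
    ]
    t = best_threshold(validation, review_cost=1.0, error_cost=10.0)
    print(t, expected_cost(validation, t, 1.0, 10.0))
```

On this toy set, a missed error costing ten times a review pushes the threshold up to 0.90, so three items are reviewed and no errors slip through. If errors are cheaper than a review, the same search selects 0.0 and nothing is routed to a human. In an interview, the point to draw out is that the threshold follows from the cost ratio and the confidence calibration, not from a fixed rule of thumb.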

References & further reading