Day 116 of 133
Design document intelligence (OCR + LLM extraction) + DSA review
Hybrid OCR + vision-LM extraction; confidence scoring; human-in-the-loop (HITL) review.
DSA · NeetCode 1-D DP
- Min Cost Climbing Stairs
Interview questions to prep
- State the DP: define the state, the transition, and the base case explicitly.
- Top-down (memoized recursion) vs bottom-up (tabulation) — which is more natural here, and why?
- Can you space-optimize from O(n) to O(1)? Show the rolling-window trick.
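
One way to answer the three prompts above for Min Cost Climbing Stairs, as a minimal sketch. The state definition used here is one common choice, not the only valid one: dp[i] is the minimum cost to stand on step i before paying cost[i], with the top of the staircase at i = n. The function names are illustrative.

```python
from functools import lru_cache
from typing import List

# State:      dp[i] = minimum cost to stand on step i (cost[i] not yet paid).
# Transition: dp[i] = min(dp[i-1] + cost[i-1], dp[i-2] + cost[i-2])
# Base case:  dp[0] = dp[1] = 0, because you may start on step 0 or step 1.
# Answer:     dp[n], where n = len(cost) is the "top" just past the last step.


def min_cost_top_down(cost: List[int]) -> int:
    """Memoized recursion: mirrors the recurrence directly."""

    @lru_cache(maxsize=None)
    def reach(i: int) -> int:
        if i <= 1:
            return 0
        return min(reach(i - 1) + cost[i - 1], reach(i - 2) + cost[i - 2])

    # Note: very long inputs can hit Python's recursion limit.
    return reach(len(cost))


def min_cost_bottom_up(cost: List[int]) -> int:
    """Tabulation: O(n) time, O(n) space."""
    n = len(cost)
    dp = [0] * (n + 1)
    for i in range(2, n + 1):
        dp[i] = min(dp[i - 1] + cost[i - 1], dp[i - 2] + cost[i - 2])
    return dp[n]


def min_cost_rolling(cost: List[int]) -> int:
    """Rolling window: dp[i] only needs dp[i-1] and dp[i-2], so O(1) space."""
    prev2, prev1 = 0, 0  # dp[i-2], dp[i-1]
    for i in range(2, len(cost) + 1):
        prev2, prev1 = prev1, min(prev1 + cost[i - 1], prev2 + cost[i - 2])
    return prev1
```

Bottom-up tends to be the more natural fit here because the dependency order is simply left to right, and it leads straight to the rolling-window version: the table is replaced by two variables.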
ML System Design · Document intelligence
Interview questions to prep
- Walk me through designing a system that ingests invoices/contracts and extracts structured fields.
- How would you handle handwritten or low-quality scans?
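
For the low-quality-scan question, one possible answer is a confidence-gated fallback: run conventional OCR first and escalate to a vision-language model only when the OCR output looks unreliable. The sketch below is an assumption-laden illustration, not a reference design: `OcrWord`, `extract_text`, the injected `run_ocr` and `run_vision_lm` callables, and the 0.80 cutoff are all hypothetical stand-ins for whatever engine and model a real system would use.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class OcrWord:
    text: str
    confidence: float  # assumed to be in [0.0, 1.0], as reported by the OCR engine


def mean_confidence(words: List[OcrWord]) -> float:
    """Average word confidence; an empty page counts as zero confidence."""
    if not words:
        return 0.0
    return sum(w.confidence for w in words) / len(words)


def extract_text(
    page_image: bytes,
    run_ocr: Callable[[bytes], List[OcrWord]],   # hypothetical OCR backend
    run_vision_lm: Callable[[bytes], str],       # hypothetical vision-LM backend
    min_ocr_confidence: float = 0.80,            # arbitrary cutoff, tune on real data
) -> Tuple[str, str]:
    """Return (text, path_taken), where path_taken is 'ocr' or 'vision_lm'."""
    words = run_ocr(page_image)
    if mean_confidence(words) >= min_ocr_confidence:
        return " ".join(w.text for w in words), "ocr"
    # Handwriting and poor scans usually show up as low word confidence,
    # so those pages take the slower, costlier vision-LM path.
    return run_vision_lm(page_image), "vision_lm"
```

Returning the path taken alongside the text makes it possible to track what fraction of pages escalate, which feeds directly into latency and cost estimates.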
Interview questions to prep
- How would you flag low-confidence extractions for human review?
- What's the unit cost of a human-review pass, and how do you optimize the routing threshold?
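
The routing-threshold question can be framed as a cost minimization over a labeled validation set. The sketch below assumes a simplified cost model: every extraction whose confidence falls below the threshold goes to a reviewer at a fixed `review_cost`, reviewers always fix the field, and every wrong extraction that skips review costs a fixed `error_cost`. The function names and the numbers in the usage example are illustrative, not measured.

```python
import math
from typing import List, Tuple

Sample = Tuple[float, bool]  # (model confidence, extraction was correct)


def routing_cost(samples: List[Sample], threshold: float,
                 review_cost: float, error_cost: float) -> float:
    """Average cost per document when items below `threshold` go to review."""
    total = 0.0
    for confidence, is_correct in samples:
        if confidence < threshold:
            total += review_cost   # routed to a human (assumed to fix it)
        elif not is_correct:
            total += error_cost    # auto-accepted but wrong
    return total / len(samples)


def best_threshold(samples: List[Sample], review_cost: float,
                   error_cost: float) -> Tuple[float, float]:
    """Scan candidate thresholds and return (threshold, average cost)."""
    # 0.0 routes nothing; math.inf routes everything; in between, only the
    # observed confidence values can change which items get routed.
    candidates = [0.0] + sorted({c for c, _ in samples}) + [math.inf]
    scored = [(t, routing_cost(samples, t, review_cost, error_cost))
              for t in candidates]
    return min(scored, key=lambda pair: pair[1])


validation = [(0.95, True), (0.90, True), (0.60, False), (0.55, True), (0.30, False)]
threshold, cost = best_threshold(validation, review_cost=1.0, error_cost=10.0)
```

With these made-up numbers the scan settles on a threshold of 0.90: reviewing the three lowest-confidence items is cheaper than letting either wrong extraction through. As `review_cost` rises relative to `error_cost`, the chosen threshold drops.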
Interview questions to prep
- How would you process a document set that mixes PDFs, scanned images, tables, email attachments, and OCR errors?
- How do you validate extracted fields when the source document has conflicting values?
- When do you use vision-language models directly vs OCR + layout parsing + LLM extraction?
- When would an OCR-free document model beat a hybrid OCR + layout + LLM pipeline?
- How would you compare DocOwl-style OCR-free extraction against a conventional OCR pipeline on accuracy, latency, and cost?
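
For the conflicting-values question, one common approach is to check extracted fields against each other using arithmetic the document itself must satisfy, and to send any mismatch to review instead of silently picking a value. A minimal sketch for an invoice follows; the field names (`line_amounts`, `subtotal`, `tax`, `total`) and the one-cent tolerance are assumptions for illustration.

```python
from decimal import Decimal
from typing import List


def validate_invoice(line_amounts: List[Decimal], subtotal: Decimal,
                     tax: Decimal, total: Decimal,
                     tolerance: Decimal = Decimal("0.01")) -> List[str]:
    """Return a list of consistency issues; an empty list means the fields agree."""
    issues: List[str] = []
    # Decimal avoids binary floating-point rounding noise on currency values.
    if abs(sum(line_amounts, Decimal("0")) - subtotal) > tolerance:
        issues.append("line items do not sum to subtotal")
    if abs(subtotal + tax - total) > tolerance:
        issues.append("subtotal + tax does not equal total")
    return issues


lines = [Decimal("40.00"), Decimal("60.00")]
ok = validate_invoice(lines, Decimal("100.00"), Decimal("8.25"), Decimal("108.25"))
bad = validate_invoice(lines, Decimal("100.00"), Decimal("8.25"), Decimal("180.25"))
```

Here `ok` is empty, while `bad` flags the total, which is the kind of digit-transposition error OCR tends to produce. The same pattern extends to other redundant fields, such as a date that appears in both the header and the footer.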
References & further reading
- Anthropic: Building Effective Agents
- Eugene Yan: applied ML writing