Day 113 of 133

Design enterprise RAG (Glean-style) + DSA review

Multi-source ingestion; permission-aware retrieval; offline+online eval.

DSA · NeetCode Trees

Maximum Depth of Binary Tree · DSA · Trees
Interview questions to prep
1. Compare BFS vs DFS for this problem — which fits, and what's the iterative version?
2. What's the recursion's space cost on the call stack (O(h), which is O(log n) only for a balanced tree), and how would you go iterative to avoid deep recursion on a skewed tree?
3. What's the relationship between this problem's invariant and the BST property (if any)?
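
For reference while prepping questions 1 and 2, here is a minimal sketch of both approaches. The `TreeNode` class and the function names are assumptions made for illustration, not taken from the problem statement. The recursive DFS uses O(h) call-stack space; the iterative BFS keeps an explicit queue bounded by the tree's maximum width.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class TreeNode:
    # Hypothetical node type; field names mirror the usual interview setup.
    val: int
    left: Optional["TreeNode"] = None
    right: Optional["TreeNode"] = None


def max_depth_recursive(root: Optional[TreeNode]) -> int:
    """DFS: depth is 1 + the deeper subtree. Call stack grows to O(h)."""
    if root is None:
        return 0
    return 1 + max(max_depth_recursive(root.left), max_depth_recursive(root.right))


def max_depth_bfs(root: Optional[TreeNode]) -> int:
    """Level-order traversal: count levels. Queue size is bounded by tree width."""
    if root is None:
        return 0
    depth = 0
    queue = deque([root])
    while queue:
        depth += 1
        # Drain exactly one level before moving on to the next.
        for _ in range(len(queue)):
            node = queue.popleft()
            if node.left:
                queue.append(node.left)
            if node.right:
                queue.append(node.right)
    return depth
```

Neither version relies on node ordering, so the BST property plays no role here; depth depends only on the tree's shape.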

Design enterprise RAG over docs (Glean-style) · ML System Design · Glean
Interview questions to prep
1. Walk me through designing an enterprise RAG over Confluence + Slack + Drive.
2. How do you handle access control / permissions in retrieval?
3. How would you handle 50M docs and 10k QPS?
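
One way to make question 2 concrete is a sketch of permission-aware filtering, assuming each chunk carries the set of principals (user or group IDs) copied from its source system's ACL. The `Chunk` fields and `filter_by_acl` are hypothetical names for illustration, not any vendor's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    chunk_id: str
    text: str
    score: float                   # assumed relevance score from the retriever
    allowed_principals: frozenset  # user/group IDs copied from the source ACL


def filter_by_acl(candidates, user_principals, k=5):
    """Keep chunks that share at least one principal with the user, then take top-k."""
    visible = [c for c in candidates if c.allowed_principals & user_principals]
    return sorted(visible, key=lambda c: c.score, reverse=True)[:k]
```

This filters after retrieval, which is easy to reason about but can leave fewer than k results when most candidates are hidden from the user. A common alternative worth discussing is pushing the same predicate into the index query as a metadata filter.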

Eval & monitoring for enterprise RAG · ML System Design · Ragas
Interview questions to prep
1. How would you build an offline + online eval pipeline for an enterprise RAG?
2. What synthetic golden set would you generate for a domain where humans can't easily score answers?
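
For the offline half of question 1, a minimal sketch of retrieval metrics over a golden set, assuming the set is a list of `(query, relevant_doc_ids)` pairs and `retrieve` is any callable that returns ranked document IDs. All names are illustrative; this does not use the Ragas API and covers retrieval quality only, not answer faithfulness.

```python
def recall_at_k(retrieved, relevant, k):
    """Fraction of relevant doc IDs that appear in the top-k retrieved IDs."""
    if not relevant:
        return 0.0
    hits = len(set(retrieved[:k]) & set(relevant))
    return hits / len(relevant)


def reciprocal_rank(retrieved, relevant):
    """1 / rank of the first relevant hit, or 0.0 if none is retrieved."""
    relevant = set(relevant)
    for rank, doc_id in enumerate(retrieved, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def evaluate(golden_set, retrieve, k=5):
    """Average recall@k and MRR across (query, relevant_ids) pairs."""
    recalls, rrs = [], []
    for query, relevant in golden_set:
        ranked = retrieve(query)
        recalls.append(recall_at_k(ranked, relevant, k))
        rrs.append(reciprocal_rank(ranked, relevant))
    n = len(golden_set) or 1  # avoid division by zero on an empty set
    return {"recall@k": sum(recalls) / n, "mrr": sum(rrs) / n}
```

The same two numbers can be tracked per source (Slack, Drive, and so on) to see where retrieval regresses after an index or chunking change.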

Mixed enterprise docs: SharePoint, Jira, Slack, Drive · ML System Design · Glean
Interview questions to prep
1. How would you ingest SharePoint, Jira, Slack, and Drive while preserving permissions and freshness?
2. What metadata schema would you attach to chunks so retrieval can enforce ACLs and route by source?
3. How do you backfill 50M documents without breaking freshness for newly edited docs?
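
As a starting point for question 2, a sketch of one possible per-chunk metadata record. Every field name here is an assumption chosen for illustration; real connectors expose different ACL and timestamp shapes.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class ChunkMetadata:
    doc_id: str                  # stable ID of the parent document in its source system
    source: str                  # e.g. "sharepoint", "jira", "slack", "drive"
    chunk_index: int             # position of the chunk within the document
    source_updated_at: datetime  # last-modified time reported by the source
    indexed_at: datetime         # when this chunk was last (re)indexed
    allowed_principals: set = field(default_factory=set)  # users/groups allowed to read

    def is_stale(self) -> bool:
        """True when the source changed after this chunk was last indexed."""
        return self.source_updated_at > self.indexed_at


# Usage: a Jira chunk edited one day after it was indexed.
meta = ChunkMetadata(
    doc_id="PROJ-123",
    source="jira",
    chunk_index=0,
    source_updated_at=datetime(2024, 1, 2, tzinfo=timezone.utc),
    indexed_at=datetime(2024, 1, 1, tzinfo=timezone.utc),
    allowed_principals={"group:eng"},
)
```

Comparing `source_updated_at` with `indexed_at` also gives a backfill job a simple rule for question 3: skip any write that would replace a chunk indexed from a newer version of the document.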

Implement heading-aware markdown chunk retrieval · ML System Design · Interview coding
Interview questions to prep
1. Implement retrieve_relevant_chunks(markdown, query) that preserves H1/H2/H3 hierarchy in returned chunks.
2. How would you score headings plus body text so a section title can match even when the paragraph uses different wording?
3. What edge cases break naive markdown chunking: tables, code blocks, duplicate headings, or very long sections?
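
A minimal sketch for question 1, assuming plain term-overlap scoring with a heading boost. Only the name `retrieve_relevant_chunks` comes from the prompt; the `k` and `heading_weight` parameters, the helpers, and the scoring rule are illustrative choices. It skips `#` lines inside fenced code blocks but does not handle tables, duplicate headings, or very long sections.

```python
import re
from dataclasses import dataclass

HEADING_RE = re.compile(r"^(#{1,3})\s+(.*\S)\s*$")
TOKEN_RE = re.compile(r"[a-z0-9]+")


@dataclass
class Chunk:
    heading_path: list  # e.g. ["Guide", "Usage", "Linux"]
    body: str


def split_by_headings(markdown: str) -> list:
    """Split on H1-H3 and track the heading path; '#' inside code fences is ignored."""
    chunks, path, body, in_fence = [], [], [], False

    def flush():
        text = "\n".join(body).strip()
        if path or text:
            chunks.append(Chunk(list(path), text))
        body.clear()

    for line in markdown.splitlines():
        if line.lstrip().startswith("```"):
            in_fence = not in_fence
        match = None if in_fence else HEADING_RE.match(line)
        if match:
            flush()
            level = len(match.group(1))
            del path[level - 1:]  # drop this level and anything deeper
            path.append(match.group(2))
        else:
            body.append(line)
    flush()
    return chunks


def tokens(text: str) -> set:
    return set(TOKEN_RE.findall(text.lower()))


def retrieve_relevant_chunks(markdown: str, query: str, k: int = 3,
                             heading_weight: float = 2.0) -> list:
    """Rank chunks by query-term overlap, weighting heading hits above body hits."""
    q = tokens(query)
    scored = []
    for chunk in split_by_headings(markdown):
        heading_hits = len(q & tokens(" ".join(chunk.heading_path)))
        body_hits = len(q & tokens(chunk.body))
        score = heading_weight * heading_hits + body_hits
        if score > 0:
            scored.append((score, chunk))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk for _, chunk in scored[:k]]
```

Because each chunk keeps its full heading path, a query term that appears only in an ancestor heading still matches. Exact token overlap will not cover the "different wording" case in question 2; that is where embeddings or query expansion would come in.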

References & further reading