Day 115 of 133
Design LLM eval platform (Braintrust/LangSmith) + DSA review
Topics: slicing dimensions, PII redaction, continuous online evaluation.
DSA · NeetCode Advanced Graphs
- Min Cost to Connect All Points
Interview questions to prep
- Pick between Dijkstra, Bellman-Ford, Floyd-Warshall, MST (Prim/Kruskal), or topo sort — defend the choice.
- What does this problem assume about edge weights (non-negative? integer? bounded?) — and what breaks if those don't hold?
- Walk me through complexity in V and E, and the data-structure choice (heap vs Fibonacci heap vs array).
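For the listed problem (Min Cost to Connect All Points), a minimal Prim's MST sketch with a binary heap, assuming Manhattan-distance edge weights over an implicit complete graph; with E = V², the heap-based version runs in O(V² log V):

```python
import heapq

def min_cost_connect_points(points: list[list[int]]) -> int:
    """Prim's MST over an implicit complete graph with Manhattan-distance edges."""
    n = len(points)
    if n <= 1:
        return 0
    visited: set[int] = set()
    total = 0
    heap = [(0, 0)]  # (cost to reach this point, point index); start at point 0
    while len(visited) < n:
        cost, i = heapq.heappop(heap)
        if i in visited:
            continue  # stale entry; a cheaper edge already reached i
        visited.add(i)
        total += cost
        xi, yi = points[i]
        for j in range(n):
            if j not in visited:
                xj, yj = points[j]
                heapq.heappush(heap, (abs(xi - xj) + abs(yi - yj), j))
    return total
```

On a dense complete graph a simple O(V²) array-based Prim's (no heap) is also competitive, which is a good talking point for the data-structure question above.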
ML System Design · LLM eval platform
Interview questions to prep
- Walk me through designing an internal LLM eval platform.
- What dimensions would you slice eval results by?
- How would you continuously evaluate a production LLM agent on real traffic without leaking PII?
- How do you keep eval datasets fresh as user behavior shifts — what's the refresh cadence?
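Two of the themes above (slicing eval results and logging production traffic without leaking PII) can be sketched in a few lines. This is a hypothetical minimal shape, not any platform's API: the record schema (`meta` tags, boolean `passed`) and the regex PII patterns are illustrative assumptions; a real system would use a vetted redaction library.

```python
import re
from collections import defaultdict

# Assumed record shape: {"meta": {"model": ..., "segment": ...}, "passed": bool}

# Illustrative patterns only; production redaction needs a vetted PII library.
PII_PATTERNS = [
    (re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"), "<EMAIL>"),
    (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "<SSN>"),
    (re.compile(r"\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b"), "<PHONE>"),
]

def redact(text: str) -> str:
    """Replace common PII patterns before a transcript is stored for eval."""
    for pattern, token in PII_PATTERNS:
        text = pattern.sub(token, text)
    return text

def pass_rate_by(results: list[dict], dimension: str) -> dict[str, float]:
    """Aggregate pass rate per value of one slicing dimension (model, segment, ...)."""
    passed, total = defaultdict(int), defaultdict(int)
    for r in results:
        key = r["meta"].get(dimension, "unknown")
        total[key] += 1
        passed[key] += int(r["passed"])
    return {k: passed[k] / total[k] for k in total}
```

Slicing by dimension is what turns a single aggregate score into an actionable report: a flat 85% pass rate can hide a 60% rate on one user segment.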
References & further reading
- LangSmith (LangChain) — LLM tracing & evaluation
- Ragas — metrics catalog
- Anthropic — testing & evaluation guide