Day 18 of 133
Cross-validation & data leakage + DSA Binary Search
k-fold, stratified k-fold, time-series CV, nested CV, leakage prevention.
DSA · NeetCode Binary Search
- Binary Search
Interview questions to prep
- Which binary-search template do you use (left ≤ right or left < right), and why?
- Walk through the bug-prone parts: overflow when computing mid, and off-by-one errors on the bounds.
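To make the two templates concrete, here is a minimal sketch of both. The function names binary_search_closed and lower_bound are illustrative, not taken from any particular source. The closed-interval version answers "is the target present?", while the half-open version returns an insertion point.

```python
def binary_search_closed(nums, target):
    """Closed interval [left, right]: loop while left <= right."""
    left, right = 0, len(nums) - 1
    while left <= right:
        # left + (right - left) // 2 avoids overflow of left + right in
        # fixed-width integer languages. Python ints do not overflow,
        # but the habit transfers to Java, C++, and similar.
        mid = left + (right - left) // 2
        if nums[mid] == target:
            return mid
        if nums[mid] < target:
            left = mid + 1   # mid is ruled out, so skip past it
        else:
            right = mid - 1  # mid is ruled out, so stop before it
    return -1


def lower_bound(nums, target):
    """Half-open interval [left, right): loop while left < right.

    Returns the first index i with nums[i] >= target, or len(nums) if none.
    """
    left, right = 0, len(nums)
    while left < right:
        mid = left + (right - left) // 2
        if nums[mid] < target:
            left = mid + 1
        else:
            right = mid      # mid may still be the answer, so keep it
    return left
```

The off-by-one risk sits in how right is initialised and updated: len(nums) - 1 pairs with right = mid - 1, and len(nums) pairs with right = mid. Mixing the two styles is a common source of infinite loops or skipped elements.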
- Search a 2D Matrix
Interview questions to prep
- State your loop invariant precisely — what must be true on every iteration?
- Why does the loop terminate, and how do you avoid infinite loops on the search-space update?
- Walk through edge cases: empty array, target smaller than min, target larger than max, duplicates.
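One way to practise stating the invariant is to write it as a comment inside the loop. The sketch below assumes the usual form of the problem, which this page does not spell out: each row is sorted, and the first value of a row is greater than the last value of the previous row, so the matrix can be treated as one sorted flat array.

```python
def search_matrix(matrix, target):
    """Return True if target occurs in a row-major sorted matrix."""
    if not matrix or not matrix[0]:
        return False  # edge case: empty matrix or empty rows

    rows, cols = len(matrix), len(matrix[0])
    left, right = 0, rows * cols - 1

    while left <= right:
        # Invariant: if target is present, its flat index is in [left, right].
        mid = left + (right - left) // 2
        value = matrix[mid // cols][mid % cols]  # map flat index to (row, col)
        if value == target:
            return True
        if value < target:
            left = mid + 1
        else:
            right = mid - 1
        # Termination: every branch removes mid, so the interval
        # shrinks by at least one element per iteration.
    return False
```

A target smaller than the minimum drives right below left, and a target larger than the maximum drives left above right. Both exit the loop without any special-case code.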
- Koko Eating Bananas
Interview questions to prep
- State your loop invariant precisely — what must be true on every iteration?
- Why does the loop terminate, and how do you avoid infinite loops on the search-space update?
- Walk through edge cases: empty array, target smaller than min, target larger than max, duplicates.
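For this problem the search runs over candidate answers rather than over an array. The sketch below assumes the usual problem statement, which this page does not spell out: piles is a non-empty list of pile sizes, h is the number of hours available with h >= len(piles), and the goal is the smallest integer eating speed that finishes in time. The helper name hours_needed is illustrative.

```python
def min_eating_speed(piles, h):
    """Smallest integer speed that finishes all piles within h hours."""

    def hours_needed(speed):
        # Ceiling division per pile: one pile per hour at most.
        return sum((pile + speed - 1) // speed for pile in piles)

    left, right = 1, max(piles)
    while left < right:
        # Invariant: every speed below left is too slow, and right is
        # feasible (max(piles) is feasible because h >= len(piles)).
        mid = left + (right - left) // 2
        if hours_needed(mid) <= h:
            right = mid       # mid works, so keep it as a candidate
        else:
            left = mid + 1    # mid is too slow, so discard it
    return left
```

Because mid is always strictly less than right, both updates shrink the interval, which is the termination argument. Writing right = mid - 1 here would break the invariant by discarding a feasible speed.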
ML · Cross-validation & evaluation
Interview questions to prep
- When does k-fold leak data, and what does TimeSeriesSplit do differently?
- Why is stratified k-fold important for imbalanced classification?
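A small experiment can make both questions concrete. This sketch assumes scikit-learn and NumPy are installed, and uses 12 toy rows whose index stands in for time. It checks whether each splitter ever trains on rows that come after the test rows, and how StratifiedKFold spreads a minority class across folds.

```python
import numpy as np
from sklearn.model_selection import KFold, StratifiedKFold, TimeSeriesSplit

X = np.arange(12).reshape(-1, 1)   # toy data: row index doubles as time order
y = np.array([0] * 9 + [1] * 3)    # imbalanced labels: 9 negatives, 3 positives

# k-fold on ordered data: some fold trains on rows later than its test rows,
# which leaks future information when the rows are a time series.
kfold_trains_on_future = any(
    train.max() > test.min()
    for train, test in KFold(n_splits=3, shuffle=True, random_state=0).split(X)
)

# TimeSeriesSplit: each training window ends before its test window starts.
ts_respects_order = all(
    train.max() < test.min()
    for train, test in TimeSeriesSplit(n_splits=3).split(X)
)

# StratifiedKFold: each test fold keeps the overall class ratio,
# so every fold contains a positive example to evaluate on.
positives_per_fold = [
    int(y[test].sum())
    for _, test in StratifiedKFold(n_splits=3).split(X, y)
]

print(kfold_trains_on_future, ts_respects_order, positives_per_fold)
```

With plain KFold on imbalanced labels, a fold can end up with few or no positives, which makes metrics such as recall noisy or undefined for that fold.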
Interview questions to prep
- Why do you need both a validation and a test set for hyperparameter tuning?
- What is nested cross-validation and when is it worth the cost?
- How would your split strategy change for time-series forecasting vs random tabular rows?
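Nested cross-validation can be sketched in a few lines with scikit-learn, assuming it is installed. The dataset, model, and parameter grid below are arbitrary placeholders chosen for illustration. The inner loop picks hyperparameters, and the outer loop scores the whole tuning procedure on folds the inner loop never saw.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score

# Synthetic data, used only to make the sketch runnable.
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

inner_cv = StratifiedKFold(n_splits=3, shuffle=True, random_state=0)
outer_cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=1)

# Inner loop: choose C using only the outer-training portion of the data.
search = GridSearchCV(
    LogisticRegression(max_iter=1000),
    param_grid={"C": [0.01, 0.1, 1.0, 10.0]},
    cv=inner_cv,
)

# Outer loop: refit the whole search on each outer-training split
# and score it on the held-out outer fold.
outer_scores = cross_val_score(search, X, y, cv=outer_cv)
print(outer_scores.mean(), outer_scores.std())
```

The cost is roughly outer folds × inner folds × grid size model fits, which is why it tends to be worth it mainly on small datasets where a single held-out test set would be too noisy.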
Interview questions to prep
- Walk through three common ways data leakage sneaks into an ML pipeline.
- How would you build a pipeline that prevents leakage when scaling features?
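For the scaling question, one common answer is to put the scaler inside a scikit-learn Pipeline so it is fitted only on training rows. The sketch below assumes scikit-learn and NumPy are installed and uses synthetic data and an arbitrary model purely for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data, used only to make the sketch runnable.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Scaler and model travel together, so fit() never sees held-out rows.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

# cross_val_score clones and refits the whole pipeline per fold, so the
# scaler statistics come from that fold's training rows only.
cv_scores = cross_val_score(model, X_train, y_train, cv=5)

model.fit(X_train, y_train)
scaler = model.named_steps["standardscaler"]  # step name assigned by make_pipeline
test_accuracy = model.score(X_test, y_test)   # test rows are transformed, never fitted
print(cv_scores.mean(), test_accuracy)
```

The leaky alternative is calling StandardScaler().fit_transform(X) on the full dataset before splitting, which lets test-set means and variances influence the training features.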
References & further reading
- scikit-learn user guide (scikit-learn)
- Andrew Ng: Machine Learning Specialization (Coursera)
- NeetCode roadmap, full 250 (NeetCode)