Day 26 of 133
Feature engineering deep dive + DSA Linked List finish
Scaling, encoding, missing values, feature interactions.
DSA · NeetCode Linked List
- Reverse Nodes in k-Group
Interview questions to prep
- Walk through your pointer hazards — what breaks if you lose track of the head or a prev pointer?
- Can you do this in-place (O(1) extra space)? What's the trick?
- How would you detect / handle a cycle, and prove your method's correctness?
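The pointer-hazard and in-place questions above come together in Reverse Nodes in k-Group. A minimal sketch (not from the notes): the reversal itself touches only `prev`/`cur` pointers, so it is in-place; the recursion adds O(n/k) stack, which an iterative version with a dummy head would remove. `ListNode` is the usual stand-in node class.

```python
class ListNode:
    def __init__(self, val=0, nxt=None):
        self.val = val
        self.next = nxt

def reverse_k_group(head, k):
    # Count whether a full group of k nodes remains; if not, leave it untouched.
    node, count = head, 0
    while node and count < k:
        node, count = node.next, count + 1
    if count < k:
        return head
    # Reverse exactly k nodes, tracking prev/cur so neither pointer is lost.
    prev, cur = None, head
    for _ in range(k):
        cur.next, prev, cur = prev, cur, cur.next
    # The old head is now this group's tail; link it to the reversed remainder.
    head.next = reverse_k_group(cur, k)
    return prev  # prev is the new head of this group
```

The tuple assignment inside the loop is the whole "trick": `cur.next` is rewired before `cur` advances, so the rest of the list is never orphaned.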
ML · Feature engineering
Interview questions to prep · scaling & transforms
- Why do tree models generally not need feature scaling, and what edge cases exist (e.g. regularized gradient-boosting implementations)?
- When would you apply a log transform vs Box-Cox?
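The log vs Box-Cox trade-off can be sketched directly: `log1p` is a fixed transform for non-negative, right-skewed data, while Box-Cox fits a power parameter lambda (lambda = 0 recovers the plain log) but requires strictly positive inputs. A minimal sketch; the helper names and sample values are illustrative, not from the notes.

```python
import numpy as np

def log_transform(x):
    # log1p handles zeros gracefully; requires x >= 0.
    return np.log1p(np.asarray(x, dtype=float))

def box_cox(x, lam):
    # Box-Cox with a given lambda; requires strictly positive x.
    # lam == 0 recovers the plain log; other lambdas tune the strength.
    # (In practice scipy/sklearn fit lam by maximum likelihood.)
    x = np.asarray(x, dtype=float)
    return np.log(x) if lam == 0 else (x**lam - 1.0) / lam

# Right-skewed toy sample: one large value dominates the raw scale.
x = np.array([1.0, 2.0, 3.0, 5.0, 8.0, 13.0, 100.0])
```

After either transform the 100.0 no longer dwarfs the rest, which is exactly when linear models and distance-based methods benefit.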
Interview questions to prep · encoding
- Compare one-hot, target, frequency, and hashing encoders — trade-offs in cardinality and leakage.
- Why is target encoding leak-prone and how does k-fold target encoding fix it?
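A sketch of the k-fold fix, assuming a pandas DataFrame: each row's encoding uses category means fitted only on the other folds, so the row's own label never leaks into its feature value. Function and column names here are hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import KFold

def kfold_target_encode(df, col, target, n_splits=3, seed=0):
    # Out-of-fold target encoding: for each fold, fit category means on the
    # OTHER folds, then map them onto this fold's rows.
    enc = pd.Series(np.nan, index=df.index, dtype=float)
    global_mean = df[target].mean()
    kf = KFold(n_splits=n_splits, shuffle=True, random_state=seed)
    for fit_idx, enc_idx in kf.split(df):
        means = df.iloc[fit_idx].groupby(col)[target].mean()
        enc.iloc[enc_idx] = df.iloc[enc_idx][col].map(means).to_numpy()
    # Categories unseen in a fit fold fall back to the global mean.
    return enc.fillna(global_mean)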
Interview questions to prep · missing values
- When is mean/median imputation harmful?
- Why do tree models often handle missing values natively while linear models cannot?
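One way the harm shows up: plain mean/median imputation erases the "was missing" signal, which matters whenever missingness is informative. A sketch (names illustrative) that imputes the median but keeps an indicator column so a linear model can still use that signal:

```python
import numpy as np

def impute_with_flag(x):
    # Median imputation plus a missingness-indicator column.
    # The flag preserves the "was missing" signal that plain mean/median
    # imputation throws away.
    x = np.asarray(x, dtype=float)
    missing = np.isnan(x)
    filled = np.where(missing, np.nanmedian(x), x)
    return np.column_stack([filled, missing.astype(float)])
```

Tree libraries that route NaNs down a learned default branch get this signal for free, which is why they often need no imputation at all.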
Interview questions to prep · outliers & drift
- How do you detect and handle outliers using box plots, robust scaling, winsorization, or model choice?
- Why does multicollinearity destabilize linear models, and why are tree models less sensitive?
- How would you distinguish true data drift from a one-off outlier spike in production?
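Winsorization from the list above can be sketched in a few lines, assuming numpy; the percentile thresholds are illustrative defaults. Unlike dropping outliers, clipping keeps every row's rank while capping its leverage on a linear fit.

```python
import numpy as np

def winsorize(x, lower_pct=5, upper_pct=95):
    # Clip values to the given percentiles instead of removing them:
    # extreme points keep their ordering but lose their leverage.
    x = np.asarray(x, dtype=float)
    lo, hi = np.percentile(x, [lower_pct, upper_pct])
    return np.clip(x, lo, hi)
```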
References & further reading
- scikit-learn user guide
- Eugene Yan, applied ML writing