Day 26 of 133
Feature engineering deep dive + DSA Linked List finish
Scaling, encoding, missing values, feature interactions.
DSA · NeetCode Linked List
- Reverse Nodes in k-Group
Interview questions to prep
- Walk through your pointer hazards — what breaks if you lose track of the head or a prev pointer?
- Can you do this in-place (O(1) extra space)? What's the trick?
- How would you detect / handle a cycle, and prove your method's correctness?
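The pointer-hazard and in-place questions above come together in Reverse Nodes in k-Group. A minimal sketch (not from the notes): the reversal itself touches only `prev`/`cur` pointers, so it is in-place; the recursion adds O(n/k) stack, which an iterative version with a dummy head would remove. `ListNode` is the usual stand-in node class.

```python
class ListNode:
    def __init__(self, val=0, nxt=None):
        self.val = val
        self.next = nxt

def reverse_k_group(head, k):
    # Count whether a full group of k nodes remains; if not, leave it untouched.
    node, count = head, 0
    while node and count < k:
        node, count = node.next, count + 1
    if count < k:
        return head
    # Reverse exactly k nodes, tracking prev/cur so neither pointer is lost.
    prev, cur = None, head
    for _ in range(k):
        cur.next, prev, cur = prev, cur, cur.next
    # The old head is now this group's tail; link it to the reversed remainder.
    head.next = reverse_k_group(cur, k)
    return prev  # prev is the new head of this group
```

The tuple assignment inside the loop is the whole "trick": `cur.next` is rewired before `cur` advances, so the rest of the list is never orphaned.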
ML · Feature engineering
Interview questions to prep · scaling & transforms
- Why do tree models generally not need feature scaling, and what edge cases exist (e.g. regularized gradient-boosting implementations)?
- When would you apply a log transform vs Box-Cox?
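The log vs Box-Cox trade-off can be sketched directly: `log1p` is a fixed transform for non-negative, right-skewed data, while Box-Cox fits a power parameter lambda (lambda = 0 recovers the plain log) but requires strictly positive inputs. A minimal sketch; the helper names and sample values are illustrative, not from the notes.

```python
import numpy as np

def log_transform(x):
    # log1p handles zeros gracefully; requires x >= 0.
    return np.log1p(np.asarray(x, dtype=float))

def box_cox(x, lam):
    # Box-Cox with a given lambda; requires strictly positive x.
    # lam == 0 recovers the plain log; other lambdas tune the strength.
    # (In practice scipy/sklearn fit lam by maximum likelihood.)
    x = np.asarray(x, dtype=float)
    return np.log(x) if lam == 0 else (x**lam - 1.0) / lam

# Right-skewed toy sample: one large value dominates the raw scale.
x = np.array([1.0, 2.0, 3.0, 5.0, 8.0, 13.0, 100.0])
```

After either transform the 100.0 no longer dwarfs the rest, which is exactly when linear models and distance-based methods benefit.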
Interview questions to prep · encoding
- Compare one-hot, target, frequency, and hashing encoders — trade-offs in cardinality and leakage.
- Why is target encoding leak-prone and how does k-fold target encoding fix it?
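A sketch of the k-fold fix, assuming a pandas DataFrame: each row's encoding uses category means fitted only on the other folds, so the row's own label never leaks into its feature value. Function and column names here are hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import KFold

def kfold_target_encode(df, col, target, n_splits=3, seed=0):
    # Out-of-fold target encoding: for each fold, fit category means on the
    # OTHER folds, then map them onto this fold's rows.
    enc = pd.Series(np.nan, index=df.index, dtype=float)
    global_mean = df[target].mean()
    kf = KFold(n_splits=n_splits, shuffle=True, random_state=seed)
    for fit_idx, enc_idx in kf.split(df):
        means = df.iloc[fit_idx].groupby(col)[target].mean()
        enc.iloc[enc_idx] = df.iloc[enc_idx][col].map(means).to_numpy()
    # Categories unseen in a fit fold fall back to the global mean.
    return enc.fillna(global_mean)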
Interview questions to prep · missing values
- When is mean/median imputation harmful?
- Why do tree models often handle missing values natively while linear models cannot?
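One way the harm shows up: plain mean/median imputation erases the "was missing" signal, which matters whenever missingness is informative. A sketch (names illustrative) that imputes the median but keeps an indicator column so a linear model can still use that signal:

```python
import numpy as np

def impute_with_flag(x):
    # Median imputation plus a missingness-indicator column.
    # The flag preserves the "was missing" signal that plain mean/median
    # imputation throws away.
    x = np.asarray(x, dtype=float)
    missing = np.isnan(x)
    filled = np.where(missing, np.nanmedian(x), x)
    return np.column_stack([filled, missing.astype(float)])
```

Tree libraries that route NaNs down a learned default branch get this signal for free, which is why they often need no imputation at all.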
Interview questions to prep · outliers & drift
- How do you detect and handle outliers using box plots, robust scaling, winsorization, or model choice?
- Why does multicollinearity destabilize linear models, and why are tree models less sensitive?
- How would you distinguish true data drift from a one-off outlier spike in production?
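Winsorization from the list above can be sketched in a few lines, assuming numpy; the percentile thresholds are illustrative defaults. Unlike dropping outliers, clipping keeps every row's rank while capping its leverage on a linear fit.

```python
import numpy as np

def winsorize(x, lower_pct=5, upper_pct=95):
    # Clip values to the given percentiles instead of removing them:
    # extreme points keep their ordering but lose their leverage.
    x = np.asarray(x, dtype=float)
    lo, hi = np.percentile(x, [lower_pct, upper_pct])
    return np.clip(x, lo, hi)
```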
References & further reading
- scikit-learn user guide
- Eugene Yan, applied ML writing