Day 50 of 133
NLP foundations: tokenization & embeddings
Subword tokenization (BPE, WordPiece, SentencePiece) and the shift from static word2vec embeddings to contextual BERT embeddings.
DSA · NeetCode Backtracking
- Letter Combinations of a Phone Number (DSA · Backtracking)
Interview questions to prep
- Walk through your pruning strategy — what subtrees do you skip and why is it safe?
- Where does memoization apply? Could this be a DP problem in disguise?
- What's the worst-case time complexity, and what's the depth of the recursion stack?
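To anchor the complexity and recursion-depth answers above, here is a minimal backtracking sketch for the problem (names and structure are my own, not taken from NeetCode). Worst case is O(4^n · n) for n digits, and the recursion stack is at most n deep.

```python
# A minimal backtracking sketch for Letter Combinations of a Phone Number.
DIGIT_TO_LETTERS = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}

def letter_combinations(digits: str) -> list[str]:
    if not digits:
        return []
    results: list[str] = []
    path: list[str] = []

    def backtrack(i: int) -> None:
        # Base case: one letter chosen per digit -> record the combination.
        if i == len(digits):
            results.append("".join(path))
            return
        for letter in DIGIT_TO_LETTERS[digits[i]]:
            path.append(letter)   # choose
            backtrack(i + 1)      # explore
            path.pop()            # un-choose (backtrack)

    backtrack(0)
    return results

# Example: letter_combinations("23") -> ["ad", "ae", "af", "bd", ..., "cf"]
```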
- N-Queens (DSA · Backtracking)
Interview questions to prep
- How do you check 'queen attacks me' in O(1) using the diagonal-set trick?
- What's the state-space size, and how much does pruning actually save in practice?
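A minimal sketch of the diagonal-set trick, assuming a solution-counting variant of the problem: queens on the same "/" diagonal share row + col, queens on the same "\" diagonal share row - col, so three hash sets answer "is this square attacked?" in O(1).

```python
# O(1) attack checks for N-Queens via column and diagonal hash sets.
def solve_n_queens(n: int) -> int:
    cols: set[int] = set()    # occupied columns
    diag1: set[int] = set()   # occupied "/" diagonals (row + col)
    diag2: set[int] = set()   # occupied "\" diagonals (row - col)
    count = 0

    def backtrack(row: int) -> None:
        nonlocal count
        if row == n:
            count += 1
            return
        for col in range(n):
            # O(1) pruning: skip any square already attacked.
            if col in cols or (row + col) in diag1 or (row - col) in diag2:
                continue
            cols.add(col); diag1.add(row + col); diag2.add(row - col)
            backtrack(row + 1)
            cols.discard(col); diag1.discard(row + col); diag2.discard(row - col)

    backtrack(0)
    return count

# Example: solve_n_queens(8) -> 92
```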
NLP · Foundations
Interview questions to prep
- Compare BPE, WordPiece, and SentencePiece tokenizers.
- Why does tokenizer choice affect cross-lingual performance?
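As a rough aid for the tokenizer comparison above, here is a toy BPE training loop over a whitespace-split corpus. It is only a sketch of the frequency-based merge idea; WordPiece instead picks merges by likelihood gain, and SentencePiece works on raw text (no pre-tokenization) with BPE or unigram-LM segmentation. The corpus and helper names are illustrative assumptions.

```python
# Toy byte-pair-encoding (BPE) training: repeatedly merge the most frequent
# adjacent symbol pair across the corpus.
from collections import Counter

def train_bpe(corpus: list[str], num_merges: int) -> list[tuple[str, str]]:
    # Represent each word as a tuple of symbols (characters to start with).
    vocab = Counter(tuple(word) for word in corpus)
    merges: list[tuple[str, str]] = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by word frequency.
        pairs: Counter = Counter()
        for word, freq in vocab.items():
            for a, b in zip(word, word[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)
        # Apply the chosen merge everywhere it occurs.
        new_vocab: Counter = Counter()
        for word, freq in vocab.items():
            merged, i = [], 0
            while i < len(word):
                if i + 1 < len(word) and (word[i], word[i + 1]) == best:
                    merged.append(word[i] + word[i + 1])
                    i += 2
                else:
                    merged.append(word[i])
                    i += 1
            new_vocab[tuple(merged)] += freq
        vocab = new_vocab
    return merges

# Example: train_bpe(["low", "lower", "lowest", "low"], num_merges=3)
# learns merges like ("l", "o") then ("lo", "w"), building a "low" subword.
```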
Interview questions to prep
- Walk through how word2vec (skip-gram) is trained.
- How are contextual embeddings (BERT) different from static ones (word2vec)?
- What problem does negative sampling solve in skip-gram training?
- Compare CBOW, skip-gram, and GloVe — what objective is each optimizing?
- When would n-grams or TF-IDF outperform dense embeddings in a production baseline?
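For the skip-gram and negative-sampling questions above, a bare-bones NumPy update step helps make the objective concrete. The vocabulary size, dimensions, and uniform negative sampling below are illustrative simplifications (word2vec actually draws negatives from a smoothed unigram distribution).

```python
# One SGD step of skip-gram with negative sampling (SGNS):
# pull the (center, context) pair together, push k random negatives away.
import numpy as np

rng = np.random.default_rng(0)
vocab_size, dim, k, lr = 1000, 50, 5, 0.025
W_in = rng.normal(scale=0.01, size=(vocab_size, dim))   # center-word vectors
W_out = rng.normal(scale=0.01, size=(vocab_size, dim))  # context-word vectors

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sgns_step(center: int, context: int) -> None:
    """SGD on  -log sigma(u_ctx . v)  -  sum_neg log sigma(-u_neg . v)."""
    negatives = rng.integers(0, vocab_size, size=k)
    v = W_in[center]
    targets = np.concatenate(([context], negatives))
    labels = np.concatenate(([1.0], np.zeros(k)))   # 1 = real pair, 0 = negative
    u = W_out[targets]                              # (k+1, dim)
    scores = sigmoid(u @ v)                         # (k+1,)
    grad = (scores - labels)[:, None]               # dLoss / d(u . v)
    W_in[center] -= lr * (grad * u).sum(axis=0)
    W_out[targets] -= lr * grad * v

# Usage: call sgns_step(center_id, context_id) over (center, context) pairs
# sampled from sliding windows in the training corpus.
```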
Interview questions to prep
- Compare token-level (NER) vs sequence-level (classification) tasks.
- Why is dependency parsing harder than POS tagging, and where does it still matter today?
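A quick way to see the token-level vs sequence-level distinction is where each prediction attaches in a spaCy pipeline. The sketch below assumes the `en_core_web_sm` model has been downloaded (`python -m spacy download en_core_web_sm`) and is only meant to show the shape of the outputs.

```python
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("Apple acquired a London startup for $1 billion.")

# Token-level predictions: one label per token (POS tag, dependency arc).
for token in doc:
    print(token.text, token.pos_, token.dep_, token.head.text)

# NER: labeled spans over tokens.
for ent in doc.ents:
    print(ent.text, ent.label_)

# Sequence-level classification (e.g. sentiment) would instead assign one
# label to the whole doc, typically via a text-classification component.
```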
Interview questions to prep
- Compare bag-of-words and TF-IDF for sentiment analysis — what signal does TF-IDF add?
- When would stemming hurt compared with lemmatization?
- How would you build a strong non-LLM NLP baseline before using transformers?
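For the non-LLM baseline question, a common starting point is TF-IDF n-gram features feeding a linear classifier. The scikit-learn sketch below uses placeholder toy data; real use would add a proper train/validation split and hyperparameter search.

```python
# TF-IDF word/bigram features + logistic regression: a strong non-LLM baseline.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts = ["great movie, loved it", "terrible plot and worse acting"]  # toy data
labels = [1, 0]  # 1 = positive, 0 = negative

baseline = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2), sublinear_tf=True, min_df=1),
    LogisticRegression(max_iter=1000),
)
baseline.fit(texts, labels)
print(baseline.predict(["loved the acting"]))
```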
References & further reading
- Hugging Face NLP course (Hugging Face)
- CS224n: NLP with Deep Learning (Stanford)
- The Illustrated Transformer (Jay Alammar)
- 75Hard GenAI/LLM Challenge: Text representation techniques