Day 53 of 133

Pretrained models (BERT, T5, GPT) + DSA Graphs

Encoder-only vs encoder-decoder vs decoder-only; MLM vs causal LM.

DSA · NeetCode Graphs

  • Surrounded Regions · DSA · Graphs

    Interview questions to prep

    1. Is this BFS, DFS, or Union-Find? Defend the choice over the other two.
    2. Walk through complexity in terms of V and E. Where do those costs come from?
    3. How would you handle disconnected components, self-loops, or duplicate edges?
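
    A minimal sketch for question 1's preferred answer: border BFS. BFS sidesteps the recursion-depth risk of DFS on large boards, and Union-Find is heavier machinery than a single connectivity sweep needs. The code below is an illustrative sketch of the standard approach, not a vetted solution; names are my own.

      from collections import deque

      def solve(board: list[list[str]]) -> None:
          """Capture surrounded regions in place (Surrounded Regions).

          Any 'O' reachable from the border can never be captured, so BFS
          from every border 'O', mark those cells safe, then flip the rest.
          Time O(V) with E = O(4V) on a grid; extra space O(V) for the queue.
          """
          if not board or not board[0]:
              return
          rows, cols = len(board), len(board[0])
          queue = deque()

          # Seed the queue with every border 'O'.
          for r in range(rows):
              for c in range(cols):
                  on_border = r in (0, rows - 1) or c in (0, cols - 1)
                  if on_border and board[r][c] == "O":
                      board[r][c] = "S"        # temporary "safe" marker
                      queue.append((r, c))

          # Spread the safe marker through connected 'O' cells.
          while queue:
              r, c = queue.popleft()
              for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                  if 0 <= nr < rows and 0 <= nc < cols and board[nr][nc] == "O":
                      board[nr][nc] = "S"
                      queue.append((nr, nc))

          # Cells still marked 'O' are surrounded; safe cells revert to 'O'.
          for r in range(rows):
              for c in range(cols):
                  if board[r][c] == "O":
                      board[r][c] = "X"
                  elif board[r][c] == "S":
                      board[r][c] = "O"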
  • Rotting Oranges · DSA · Graphs

    Interview questions to prep

    1. Is this BFS, DFS, or Union-Find? Defend the choice over the other two.
    2. Walk through complexity in terms of V and E. Where do those costs come from?
    3. How would you handle disconnected components, self-loops, or duplicate edges?
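
    For question 1 here the natural answer is multi-source BFS: every initially rotten orange enters the queue at minute 0, so each BFS level is one minute of spread; DFS cannot model simultaneous spread, and Union-Find has no notion of time. A hedged sketch of that standard approach (illustrative, not a vetted solution):

      from collections import deque

      def oranges_rotting(grid: list[list[int]]) -> int:
          """Minutes until no fresh orange remains, or -1 (Rotting Oranges).

          Grid values: 0 empty, 1 fresh, 2 rotten. Multi-source BFS from all
          rotten cells; time and space are O(V) for V = rows * cols.
          """
          rows, cols = len(grid), len(grid[0])
          queue = deque()
          fresh = 0

          for r in range(rows):
              for c in range(cols):
                  if grid[r][c] == 2:
                      queue.append((r, c, 0))   # (row, col, minute rotted)
                  elif grid[r][c] == 1:
                      fresh += 1

          minutes = 0
          while queue:
              r, c, minute = queue.popleft()
              minutes = max(minutes, minute)
              for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                  if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 1:
                      grid[nr][nc] = 2          # rot each fresh cell exactly once
                      fresh -= 1
                      queue.append((nr, nc, minute + 1))

          return minutes if fresh == 0 else -1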

NLP · Pretrained models (BERT, T5, GPT)

  • BERT: MLM + NSP, encoder-only · Deep Learning · Devlin et al.

    Interview questions to prep

    1. Walk through BERT's MLM and NSP objectives.
    2. Why is BERT bidirectional while GPT is left-to-right?
    3. How would you fine-tune BERT for token classification (NER)?
    4. Why is masking used in masked-language-model training, and how is it different from causal masking?
    5. How would you fine-tune DistilBERT for IMDB sentiment and check whether compression hurt quality?
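
    A hedged sketch for question 5, assuming the Hugging Face transformers and datasets libraries; the checkpoint is the public distilbert-base-uncased, while batch size, epoch count, and the subsample sizes are illustrative assumptions chosen only to keep the run short:

      from datasets import load_dataset
      from transformers import (AutoModelForSequenceClassification,
                                AutoTokenizer, Trainer, TrainingArguments)

      tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
      model = AutoModelForSequenceClassification.from_pretrained(
          "distilbert-base-uncased", num_labels=2)

      # IMDB: binary sentiment over movie reviews; tokenize once up front.
      imdb = load_dataset("imdb").map(
          lambda batch: tokenizer(batch["text"], truncation=True, max_length=256),
          batched=True)

      trainer = Trainer(
          model=model,
          args=TrainingArguments(output_dir="distilbert-imdb",
                                 per_device_train_batch_size=16,
                                 num_train_epochs=1),
          train_dataset=imdb["train"].shuffle(seed=0).select(range(2000)),
          eval_dataset=imdb["test"].shuffle(seed=0).select(range(1000)),
          tokenizer=tokenizer,   # also gives Trainer its padding collator
      )
      trainer.train()
      # Reports eval loss; pass compute_metrics for accuracy, then run the
      # same pipeline with bert-base-uncased to see what distillation cost.
      print(trainer.evaluate())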
  • T5 / BART: encoder-decoder · Deep Learning · Raffel et al.

    Interview questions to prep

    1. What is T5's text-to-text framing, and what does it enable?
    2. Compare BART vs T5 for summarization.
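
    To make question 1 concrete, a hedged sketch of the text-to-text framing, assuming Hugging Face transformers (plus sentencepiece) and the public t5-small checkpoint; the task prefixes come from the T5 paper, the input strings are illustrative:

      from transformers import T5ForConditionalGeneration, T5Tokenizer

      tokenizer = T5Tokenizer.from_pretrained("t5-small")
      model = T5ForConditionalGeneration.from_pretrained("t5-small")

      # One seq2seq interface covers summarization, translation, and even
      # classification: every task is "prefix + input text -> output text".
      prompts = (
          "summarize: The storm knocked out power to thousands of homes ...",
          "translate English to German: The house is small.",
          "cola sentence: The books is on the table.",  # acceptability as text
      )
      for prompt in prompts:
          ids = tokenizer(prompt, return_tensors="pt").input_ids
          out = model.generate(ids, max_new_tokens=32)
          print(tokenizer.decode(out[0], skip_special_tokens=True))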
  • GPT: decoder-only, causal LM · Deep Learning · OpenAI

    Interview questions to prep

    1. Why did decoder-only models win the LLM race?
    2. What does the causal-LM objective give you that masked-LM doesn't, and vice versa?
    3. Explain teacher forcing during training and autoregressive decoding during inference.
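
    A hedged sketch for question 3, contrasting teacher forcing with autoregressive decoding on the public gpt2 checkpoint via Hugging Face transformers; prompt text and step count are illustrative:

      import torch
      from transformers import GPT2LMHeadModel, GPT2TokenizerFast

      tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
      model = GPT2LMHeadModel.from_pretrained("gpt2")
      model.eval()

      with torch.no_grad():
          # Teacher forcing: feed the full gold sequence once. The causal mask
          # lets position t attend only to tokens <= t, and labels are shifted
          # internally, so one pass scores every next-token prediction.
          gold = tokenizer("the cat sat on the mat", return_tensors="pt").input_ids
          loss = model(gold, labels=gold).loss   # mean next-token cross-entropy
          print(f"teacher-forced loss: {loss.item():.3f}")

          # Autoregressive decoding: generate one token at a time, feeding each
          # prediction back in; mistakes can compound (exposure bias).
          ids = tokenizer("the cat sat on", return_tensors="pt").input_ids
          for _ in range(5):
              next_id = model(ids).logits[0, -1].argmax()   # greedy step
              ids = torch.cat([ids, next_id.view(1, 1)], dim=1)
          print(tokenizer.decode(ids[0]))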

References & further reading
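
  • Devlin, J., Chang, M.-W., Lee, K., Toutanova, K. (2019). BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. arXiv:1810.04805.
  • Raffel, C., et al. (2020). Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer. arXiv:1910.10683.
  • Lewis, M., et al. (2020). BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension. arXiv:1910.13461.
  • Radford, A., Narasimhan, K., Salimans, T., Sutskever, I. (2018). Improving Language Understanding by Generative Pre-Training. OpenAI.
  • Sanh, V., Debut, L., Chaumond, J., Wolf, T. (2019). DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter. arXiv:1910.01108.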