Day 56 of 133

Deep learning wrap + DSA Graphs finish

Re-record CV + NLP breadth answers; identify shaky topics for re-study.

DSA · NeetCode Graphs

  • Interview questions to prep (BFS component-count sketch after this list)

    1. Is this BFS, DFS, or Union-Find? Defend the choice over the other two.
    2. Walk through complexity in terms of V and E. Where do those costs come from?
    3. How would you handle disconnected components, self-loops, or duplicate edges?
  • Graph Valid Tree (DSA · Graphs)

    Interview questions to prep (Union-Find sketch after this list)

    1. Is this BFS, DFS, or Union-Find? Defend the choice over the other two.
    2. Walk through complexity in terms of V and E. Where do those costs come from?
    3. How would you handle disconnected components, self-loops, or duplicate edges?
  • Word Ladder (DSA · Graphs)

    Interview questions to prep (BFS + wildcard-bucket sketch after this list)

    1. Why BFS over DFS for shortest path?
    2. What's the trick with the wildcard pattern (e.g., 'h*t') to build neighbors efficiently?
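
For the first (untitled) graph card, a minimal sketch assuming plain Python and the standard library only; `count_components` is a hypothetical helper, shown because it answers the disconnected-components / self-loops / duplicate-edges question directly.

```python
from collections import deque

def count_components(n, edges):
    """Count connected components in an undirected graph with nodes 0..n-1."""
    adj = [set() for _ in range(n)]
    for u, v in edges:
        if u != v:                 # a self-loop never adds a reachable node
            adj[u].add(v)
            adj[v].add(u)          # sets silently deduplicate repeated edges

    seen = [False] * n
    components = 0
    for start in range(n):
        if seen[start]:
            continue
        components += 1            # every unvisited start is a new component
        queue = deque([start])
        seen[start] = True
        while queue:               # plain BFS; O(V + E) over the whole graph
            node = queue.popleft()
            for nxt in adj[node]:
                if not seen[nxt]:
                    seen[nxt] = True
                    queue.append(nxt)
    return components

print(count_components(5, [(0, 1), (1, 0), (2, 2), (3, 4)]))  # -> 3
```

Set-based adjacency makes duplicate edges a no-op and skipping u == v edges makes self-loops irrelevant to connectivity; the traversal cost is the usual O(V + E).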
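
For the Graph Valid Tree card, a hedged Union-Find sketch (`valid_tree` and the example inputs are illustrative, not from the source): a graph is a valid tree exactly when it has n - 1 edges and no edge closes a cycle.

```python
def valid_tree(n, edges):
    """Graph Valid Tree via Union-Find: n - 1 edges plus no cycle implies connected."""
    if len(edges) != n - 1:        # too few edges -> disconnected, too many -> cycle
        return False

    parent = list(range(n))

    def find(x):                   # path compression keeps finds near O(1) amortized
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru == rv:               # endpoints already connected -> this edge closes a cycle
            return False
        parent[ru] = rv            # union the two components

    return True

print(valid_tree(5, [(0, 1), (0, 2), (0, 3), (1, 4)]))  # -> True
print(valid_tree(5, [(0, 1), (1, 2), (2, 3), (1, 3)]))  # -> False: last edge closes a cycle, node 4 stays isolated
```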
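
For the Word Ladder card, a sketch of BFS plus the wildcard-bucket trick, assuming the usual LeetCode-style signature (`ladder_length` is an assumed name): grouping words under patterns like 'h*t' turns neighbor lookup into a dictionary access instead of an all-pairs comparison.

```python
from collections import defaultdict, deque

def ladder_length(begin, end, word_list):
    """Shortest transformation sequence length via BFS; 0 if unreachable."""
    words = set(word_list)
    if end not in words:
        return 0

    buckets = defaultdict(list)    # 'h*t' -> ['hot', 'hit', ...]
    for word in words | {begin}:
        for i in range(len(word)):
            buckets[word[:i] + '*' + word[i + 1:]].append(word)

    queue = deque([(begin, 1)])
    visited = {begin}
    while queue:                   # BFS explores by increasing ladder length,
        word, steps = queue.popleft()   # so the first time we reach `end` is optimal
        if word == end:
            return steps
        for i in range(len(word)):
            for nxt in buckets[word[:i] + '*' + word[i + 1:]]:
                if nxt not in visited:
                    visited.add(nxt)
                    queue.append((nxt, steps + 1))
    return 0

print(ladder_length("hit", "cog", ["hot", "dot", "dog", "lot", "log", "cog"]))  # -> 5
```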

DL · Neural network foundations

  • Interview questions to prep (2-layer MLP forward-pass sketch after this list)

    1. Walk me through forward pass through a 2-layer MLP for binary classification.
    2. Why can't a single perceptron solve XOR — and how does adding a hidden layer fix it?
  • Interview questions to prep (activation-function sketch after this list)

    1. Compare ReLU, Leaky ReLU, GELU, and SwiGLU — when does each shine?
    2. Why did ReLU largely replace sigmoid/tanh in deep networks?
    3. What is the dying ReLU problem and how do you mitigate it?
  • Interview questions to prep (Xavier vs He init sketch after this list)

    1. Why does poor initialization cause vanishing or exploding gradients?
    2. Compare Xavier vs He initialization — which goes with which activation and why?
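
For the MLP card, a NumPy sketch with hand-picked illustrative weights (the weight values below are hypothetical, chosen only to show that one hidden ReLU layer separates XOR where a single perceptron cannot).

```python
import numpy as np

def mlp_forward(x, W1, b1, W2, b2):
    """Forward pass of a 2-layer MLP for binary classification. x: (batch, d_in)."""
    h = np.maximum(0.0, x @ W1 + b1)        # (batch, d_hidden); ReLU supplies the non-linearity
    logits = h @ W2 + b2                    # (batch, 1)
    return 1.0 / (1.0 + np.exp(-logits))    # sigmoid -> P(y = 1 | x)

# Hypothetical hand-set weights that solve XOR, just to make the point concrete:
# h1 = relu(x1 + x2), h2 = relu(x1 + x2 - 1), output logit = h1 - 2*h2 - 0.5.
W1 = np.array([[1.0, 1.0], [1.0, 1.0]])
b1 = np.array([0.0, -1.0])
W2 = np.array([[1.0], [-2.0]])
b2 = np.array([-0.5])

x = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
print(mlp_forward(x, W1, b1, W2, b2).round(2))  # high for (0,1) and (1,0), low for (0,0) and (1,1)
```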
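
For the activation-function card, rough NumPy versions of ReLU, Leaky ReLU, GELU (tanh approximation), and SiLU, plus SwiGLU written as the gated unit it actually is; the function names, shapes, and demo values are illustrative.

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def leaky_relu(x, alpha=0.01):
    # the small negative slope keeps gradients flowing and mitigates "dying ReLU"
    return np.where(x > 0, x, alpha * x)

def gelu(x):
    # tanh approximation of GELU, common in transformer implementations
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x ** 3)))

def silu(x):
    # SiLU / swish, the gate used inside SwiGLU
    return x / (1.0 + np.exp(-x))

def swiglu(x, W, V):
    # SwiGLU is a gated FFN unit rather than a pointwise activation: SiLU(xW) * (xV)
    return silu(x @ W) * (x @ V)

x = np.linspace(-3.0, 3.0, 7)
print(relu(x).round(2), leaky_relu(x).round(2), gelu(x).round(2), silu(x).round(2), sep="\n")

rng = np.random.default_rng(0)
W, V = rng.normal(size=(7, 4)), rng.normal(size=(7, 4))
print(swiglu(x[None, :], W, V).shape)  # (1, 4)
```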
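
For the initialization card, a small NumPy experiment (layer sizes and depth are arbitrary) that pushes activations through a stack of ReLU layers to show why He's extra factor of 2, versus Xavier's 2 / (fan_in + fan_out), matters for ReLU networks.

```python
import numpy as np

def xavier_init(fan_in, fan_out, rng):
    # Xavier/Glorot: variance 2 / (fan_in + fan_out), matched to roughly linear activations (tanh, sigmoid)
    return rng.normal(0.0, np.sqrt(2.0 / (fan_in + fan_out)), size=(fan_in, fan_out))

def he_init(fan_in, fan_out, rng):
    # He/Kaiming: variance 2 / fan_in; the extra factor of 2 compensates for ReLU zeroing half the inputs
    return rng.normal(0.0, np.sqrt(2.0 / fan_in), size=(fan_in, fan_out))

rng = np.random.default_rng(0)
x = rng.normal(size=(1024, 512))
for name, init in [("xavier", xavier_init), ("he", he_init)]:
    h = x
    for _ in range(10):                              # 10 ReLU layers, no other tricks
        h = np.maximum(0.0, h @ init(512, 512, rng))
    print(name, float(h.std()))                      # He stays near the input scale; Xavier shrinks layer by layer
```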

DL · ViT, CLIP, multimodal

  • Vision Transformer (ViT) (Deep Learning · Google)

    Interview questions to prep (patch-embedding sketch after this list)

    1. How does ViT tokenize an image, and what's the role of the [CLS] token?
    2. When does a ViT beat a CNN, and when does its hunger for training data hurt it?
  • Interview questions to prep (CLIP contrastive-loss sketch after this list)

    1. How does CLIP enable zero-shot image classification?
    2. Walk me through CLIP's contrastive training objective.
  • Interview questions to prep (early vs late fusion sketch after this list)

    1. How do multimodal LLMs like LLaVA fuse vision encoders with language models?
    2. Compare early fusion vs late fusion in vision-language models — what does each cost in compute and quality?
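
For the ViT card, a NumPy sketch assuming a 224x224 input with 16x16 patches: how an image becomes a token sequence and where the [CLS] token goes. The embedding matrix and dimensions are illustrative, and positional embeddings are only noted in a comment.

```python
import numpy as np

def patchify(images, patch=16):
    """Split an image batch into non-overlapping, flattened patches.
    images: (batch, H, W, C) with H and W divisible by `patch`."""
    b, h, w, c = images.shape
    x = images.reshape(b, h // patch, patch, w // patch, patch, c)
    x = x.transpose(0, 1, 3, 2, 4, 5)                    # (b, h/p, w/p, p, p, c)
    return x.reshape(b, (h // patch) * (w // patch), patch * patch * c)

rng = np.random.default_rng(0)
imgs = rng.normal(size=(2, 224, 224, 3))                 # batch of 2 "images"
W_embed = rng.normal(size=(16 * 16 * 3, 768)) * 0.02     # hypothetical patch-embedding matrix
tokens = patchify(imgs) @ W_embed                        # (2, 196, 768): 14 x 14 patch tokens
cls = np.zeros((2, 1, 768))                              # learnable [CLS] token, prepended;
tokens = np.concatenate([cls, tokens], axis=1)           # its final state feeds the classification head
print(tokens.shape)                                      # (2, 197, 768); a real ViT adds positional embeddings here
```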
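
For the CLIP card, a NumPy sketch of the symmetric contrastive objective over an image-text similarity matrix (`clip_contrastive_loss` is an assumed name; the temperature is the commonly cited 0.07). Zero-shot classification then reduces to picking the class-name prompt whose text embedding is most similar to the image embedding.

```python
import numpy as np

def clip_contrastive_loss(img_emb, txt_emb, temperature=0.07):
    """Matched image/text pairs sit on the diagonal of the similarity matrix;
    train with symmetric cross-entropy over rows and columns.
    img_emb, txt_emb: (batch, d), assumed to come from separate encoders."""
    img = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    txt = txt_emb / np.linalg.norm(txt_emb, axis=1, keepdims=True)
    logits = img @ txt.T / temperature          # (batch, batch) scaled cosine similarities

    def cross_entropy(l):                       # targets are the diagonal indices 0..batch-1
        l = l - l.max(axis=1, keepdims=True)
        log_probs = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -log_probs[np.arange(len(l)), np.arange(len(l))].mean()

    return 0.5 * (cross_entropy(logits) + cross_entropy(logits.T))

rng = np.random.default_rng(0)
print(round(float(clip_contrastive_loss(rng.normal(size=(8, 64)), rng.normal(size=(8, 64)))), 3))
```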
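
For the fusion card, a NumPy sketch with made-up dimensions: early fusion in the LLaVA style (project vision tokens into the LM embedding space and concatenate with text embeddings) next to a late-fusion similarity score in the CLIP style, to make the compute/quality trade-off concrete.

```python
import numpy as np

rng = np.random.default_rng(0)

# Early fusion (LLaVA-style, simplified): project vision tokens into the LM's
# embedding space and prepend them to the text embeddings, so every LM layer
# attends over both modalities (better grounding, more tokens to process).
vision_tokens = rng.normal(size=(1, 196, 1024))      # from a frozen vision encoder
projector = rng.normal(size=(1024, 4096)) * 0.02     # hypothetical linear connector
text_embeds = rng.normal(size=(1, 32, 4096))         # embedded text prompt
lm_input = np.concatenate([vision_tokens @ projector, text_embeds], axis=1)
print("early fusion LM input:", lm_input.shape)      # (1, 228, 4096)

# Late fusion (CLIP-style, simplified): each modality is encoded separately and
# only pooled vectors interact, e.g. via one similarity score (cheap, but the
# language side never attends to fine-grained image detail).
img_vec, txt_vec = rng.normal(size=(512,)), rng.normal(size=(512,))
score = img_vec @ txt_vec / (np.linalg.norm(img_vec) * np.linalg.norm(txt_vec))
print("late fusion similarity:", round(float(score), 3))
```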

DL · Attention & Transformer

  • Self-attention (Q, K, V) end-to-end (Deep Learning · Jay Alammar)

    Interview questions to prep (scaled dot-product attention sketch after this list)

    1. Walk me through self-attention (Q, K, V) end-to-end.
    2. Why divide by √d_k inside softmax?
    3. What does multi-head attention buy you over a single head?
  • Interview questions to prep (transformer-block sketch after this list)

    1. Walk me through the transformer block: attention → add+norm → FFN → add+norm.
    2. Compare absolute vs relative vs RoPE positional encodings.
  • Sparse, linear, FlashAttention, MQA, GQA (Deep Learning · FlashAttention)

    Interview questions to prep (GQA KV-sharing sketch after this list)

    1. What problem does FlashAttention solve, and how?
    2. Compare MHA, MQA, and GQA — KV-cache trade-offs.
  • Interview questions to prep (causal multi-head attention sketch after this list)

    1. Implement scaled dot-product self-attention and track the shapes of Q, K, V, scores, and output.
    2. How do masks change self-attention for causal language modeling?
    3. What changes when you split attention into multiple heads and then concatenate them?
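
For the self-attention card, a single-head NumPy sketch (`self_attention` and all shapes are illustrative): Q, K, V projections, the 1/sqrt(d_k) scaling, a row-wise softmax, and the weighted sum over V.

```python
import numpy as np

def self_attention(x, Wq, Wk, Wv):
    """Single-head self-attention: softmax(Q K^T / sqrt(d_k)) V.
    Dividing by sqrt(d_k) keeps the dot products at unit-ish variance so the
    softmax does not saturate into near one-hot weights with tiny gradients."""
    Q, K, V = x @ Wq, x @ Wk, x @ Wv                          # each (seq, d_k) / (seq, d_v)
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                           # (seq, seq)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)   # row-wise softmax over keys
    return weights @ V                                        # (seq, d_v)

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 32))                                  # 5 tokens, model dim 32
Wq, Wk, Wv = (rng.normal(size=(32, 16)) * 0.1 for _ in range(3))
print(self_attention(x, Wq, Wk, Wv).shape)                    # (5, 16)
```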
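
For the transformer-block card, a NumPy sketch of the post-norm layout named in the question (attention, add & norm, FFN, add & norm); the attention function is passed in so the block itself stays a few lines, and the weights are illustrative.

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    mu = x.mean(-1, keepdims=True)
    return (x - mu) / np.sqrt(x.var(-1, keepdims=True) + eps)

def transformer_block(x, attn, W1, W2):
    """Post-norm block: attention -> add & norm -> position-wise FFN -> add & norm.
    `attn` is any self-attention function mapping (seq, d) -> (seq, d)."""
    x = layer_norm(x + attn(x))                       # residual around attention
    ffn = np.maximum(0.0, x @ W1) @ W2                # 2-layer FFN with ReLU
    return layer_norm(x + ffn)                        # residual around the FFN

rng = np.random.default_rng(0)
d, seq = 32, 5
W1, W2 = rng.normal(size=(d, 4 * d)) * 0.02, rng.normal(size=(4 * d, d)) * 0.02
x = rng.normal(size=(seq, d))
print(transformer_block(x, lambda h: h, W1, W2).shape)   # identity "attention" just to check shapes: (5, 32)
```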
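
For the MQA/GQA card, a NumPy sketch of the one mechanical difference from MHA: K/V have fewer heads than the queries and get repeated across query groups at attention time, which is exactly what shrinks the KV cache (`group_kv_heads` and the sizes are assumptions).

```python
import numpy as np

def group_kv_heads(k, v, n_query_heads):
    """Repeat each K/V head across its query group so shapes line up with the queries.
    k, v: (n_kv_heads, seq, d_head). MQA is the n_kv_heads == 1 case."""
    n_kv_heads = k.shape[0]
    repeat = n_query_heads // n_kv_heads          # queries served per cached KV head
    return np.repeat(k, repeat, axis=0), np.repeat(v, repeat, axis=0)

rng = np.random.default_rng(0)
k = rng.normal(size=(2, 128, 64))                 # GQA: only 2 KV heads are cached
v = rng.normal(size=(2, 128, 64))
k8, v8 = group_kv_heads(k, v, n_query_heads=8)    # now aligned with 8 query heads
print(k8.shape)                                   # (8, 128, 64) at attention time
# MHA would cache 8 KV heads, this GQA setup caches 2, MQA would cache 1: a 4x / 8x smaller KV cache.
```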
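
For the implementation card, a NumPy sketch of multi-head causal self-attention with the shape of every intermediate spelled out in comments; the causal mask is what adapts self-attention to autoregressive language modeling. All names and dimensions are illustrative.

```python
import numpy as np

def causal_multi_head_attention(x, Wq, Wk, Wv, Wo, n_heads):
    """x: (seq, d_model); Wq/Wk/Wv/Wo: (d_model, d_model); d_head = d_model // n_heads."""
    seq, d_model = x.shape
    d_head = d_model // n_heads

    def split(t):                                             # (seq, d_model) -> (heads, seq, d_head)
        return t.reshape(seq, n_heads, d_head).transpose(1, 0, 2)

    Q, K, V = split(x @ Wq), split(x @ Wk), split(x @ Wv)     # each (heads, seq, d_head)
    scores = Q @ K.transpose(0, 2, 1) / np.sqrt(d_head)       # (heads, seq, seq)

    mask = np.triu(np.ones((seq, seq), dtype=bool), k=1)      # True strictly above the diagonal
    scores = np.where(mask, -1e9, scores)                     # causal: no attending to future positions

    weights = np.exp(scores - scores.max(-1, keepdims=True))
    weights /= weights.sum(-1, keepdims=True)                 # softmax over keys
    out = weights @ V                                         # (heads, seq, d_head)
    out = out.transpose(1, 0, 2).reshape(seq, d_model)        # concatenate the heads
    return out @ Wo                                           # final projection, (seq, d_model)

rng = np.random.default_rng(0)
d = 64
W = [rng.normal(size=(d, d)) * 0.1 for _ in range(4)]
print(causal_multi_head_attention(rng.normal(size=(10, d)), *W, n_heads=8).shape)  # (10, 64)
```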

References & further reading