Deep Learning
Understand the transformer stack deeply enough to explain scaling, context handling, and attention trade-offs.

Learning objectives
- Explain self-attention, positional encoding, and multi-head attention (see the sketch after this list)
- Contrast encoder-decoder setups with decoder-only LLMs
- Reason about context length, memory, and inference cost (see the cost arithmetic after this list)
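A minimal NumPy sketch of the first objective, tying together sinusoidal positional encoding, scaled dot-product self-attention, and the multi-head split. All shapes, names, and the one-matrix-per-projection layout are illustrative assumptions, not any particular library's API:

```python
import numpy as np

def sinusoidal_positions(seq_len, d_model):
    """Classic sinusoidal positional encoding (assumes even d_model)."""
    pos = np.arange(seq_len)[:, None]              # (seq_len, 1)
    i = np.arange(d_model // 2)[None, :]           # (1, d_model/2)
    angles = pos / (10000 ** (2 * i / d_model))    # (seq_len, d_model/2)
    enc = np.zeros((seq_len, d_model))
    enc[:, 0::2] = np.sin(angles)                  # even dims get sin
    enc[:, 1::2] = np.cos(angles)                  # odd dims get cos
    return enc

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)        # subtract max for stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_self_attention(x, Wq, Wk, Wv, Wo, n_heads, causal=False):
    """x: (seq_len, d_model); each weight matrix: (d_model, d_model)."""
    seq_len, d_model = x.shape
    d_head = d_model // n_heads

    # Project, then split the model dimension into heads: (n_heads, seq_len, d_head).
    def split(w):
        return (x @ w).reshape(seq_len, n_heads, d_head).transpose(1, 0, 2)

    q, k, v = split(Wq), split(Wk), split(Wv)
    # Scaled dot-product scores per head: (n_heads, seq_len, seq_len).
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)
    if causal:
        # Decoder-only models mask the future so token t attends only to positions <= t.
        mask = np.triu(np.ones((seq_len, seq_len), dtype=bool), k=1)
        scores = np.where(mask, -1e9, scores)
    out = softmax(scores) @ v                      # (n_heads, seq_len, d_head)
    # Concatenate heads and apply the output projection.
    return out.transpose(1, 0, 2).reshape(seq_len, d_model) @ Wo

rng = np.random.default_rng(0)
seq_len, d_model, n_heads = 8, 64, 4
x = rng.normal(size=(seq_len, d_model)) + sinusoidal_positions(seq_len, d_model)
W = [rng.normal(size=(d_model, d_model)) / np.sqrt(d_model) for _ in range(4)]
y = multi_head_self_attention(x, *W, n_heads=n_heads, causal=True)
print(y.shape)  # (8, 64)
```

The `causal=True` branch is also the crux of the second bullet: decoder-only LLMs run exactly this masked self-attention at every layer, while encoder-decoder models pair unmasked self-attention in the encoder with masked self-attention plus cross-attention in the decoder.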
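For the third objective, a back-of-the-envelope script makes the costs concrete: attention materializes an O(seq_len²) score matrix per head per layer, and autoregressive decoding stores a key/value cache that grows linearly with context. The configuration below is an assumed 7B-class decoder (32 layers, 32 KV heads, head size 128, fp16), not the numbers of any specific model:

```python
def kv_cache_bytes(n_layers, n_kv_heads, d_head, seq_len, batch=1, bytes_per_elt=2):
    """Memory held by the key/value cache during autoregressive decoding.
    The factor of 2 covers keys and values; bytes_per_elt=2 assumes fp16/bf16."""
    return 2 * n_layers * n_kv_heads * d_head * seq_len * batch * bytes_per_elt

# Illustrative 7B-class configuration (assumed, not a specific model):
cfg = dict(n_layers=32, n_kv_heads=32, d_head=128)
for seq_len in (4_096, 32_768, 131_072):
    gib = kv_cache_bytes(seq_len=seq_len, **cfg) / 2**30
    print(f"{seq_len:>7} tokens -> {gib:6.1f} GiB of KV cache")
# 4096 tokens already cost ~2 GiB here; 131072 tokens cost ~64 GiB.
```

Reducing `n_kv_heads` via grouped-query attention, or quantizing cache entries below fp16, shrinks these numbers proportionally, which is why long-context serving typically leans on both.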