Day 48 of 133
Generative vision (VAE, GAN, Diffusion / Stable Diffusion)
ELBO, mode collapse, DDPM forward/reverse, latent diffusion.
DSA · NeetCode Backtracking
- Combination Sum II
Interview questions to prep
- Walk through your pruning strategy — what subtrees do you skip and why is it safe?
- Where does memoization apply? Could this be a DP problem in disguise?
- What's the worst-case time complexity, and what's the depth of the recursion stack?
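The questions above can be grounded in a minimal sketch (function name and structure are my own, not a reference solution). Sorting first enables two things: skipping duplicate values at the same recursion depth, and breaking out of the loop as soon as a candidate overshoots the remaining target, which safely prunes every later subtree since all later candidates are at least as large.

```python
def combination_sum_ii(candidates, target):
    """Return all unique combinations summing to target; each candidate used once."""
    candidates.sort()  # sorting enables both duplicate skipping and pruning
    results, path = [], []

    def backtrack(start, remaining):
        if remaining == 0:
            results.append(path[:])
            return
        for i in range(start, len(candidates)):
            # Skip duplicate values at the same tree depth to avoid repeated combos.
            if i > start and candidates[i] == candidates[i - 1]:
                continue
            # Prune: candidates are sorted, so every later value also overshoots.
            if candidates[i] > remaining:
                break
            path.append(candidates[i])
            backtrack(i + 1, remaining - candidates[i])
            path.pop()

    backtrack(0, target)
    return results

# combination_sum_ii([10, 1, 2, 7, 6, 1, 5], 8)
# -> [[1, 1, 6], [1, 2, 5], [1, 7], [2, 6]]
```

Worst case is still exponential (each candidate is in or out, O(2^n) subsets) with recursion depth at most n, which answers the complexity question; memoization does not apply cleanly because the set of remaining candidates, not just the remaining sum, defines a subproblem.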
- Word Search
Interview questions to prep
- Walk through DFS with a 'visited' marker on the board (in-place vs aux). Trade-offs?
- How does this scale to Word Search II, matching a trie of many target words in one pass?
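A sketch of the in-place variant asked about above (names are my own). Overwriting the visited cell with a sentinel and restoring it on backtrack costs O(1) extra space; the trade-off versus an auxiliary `visited` set is that it mutates the caller's board mid-search (a problem if the board is shared or read-only).

```python
def exist(board, word):
    """DFS from every cell; mark visited in place, restore on backtrack."""
    rows, cols = len(board), len(board[0])

    def dfs(r, c, i):
        if i == len(word):
            return True
        if not (0 <= r < rows and 0 <= c < cols) or board[r][c] != word[i]:
            return False
        saved, board[r][c] = board[r][c], '#'  # in-place visited marker (no aux set)
        found = (dfs(r + 1, c, i + 1) or dfs(r - 1, c, i + 1) or
                 dfs(r, c + 1, i + 1) or dfs(r, c - 1, i + 1))
        board[r][c] = saved  # restore so other paths may reuse this cell
        return found

    return any(dfs(r, c, 0) for r in range(rows) for c in range(cols))
```

For Word Search II, the same DFS walks a trie of all target words instead of one string, so one board traversal matches many words at once.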
DL · Generative vision (VAE, GAN, Diffusion)
Interview questions to prep
- Walk through VAE's evidence lower bound (ELBO).
- Why is the reparameterization trick necessary?
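The two VAE questions above can be made concrete with a small NumPy sketch (function names are my own; a real VAE would use a DL framework with autodiff). The ELBO is the reconstruction term minus a KL regularizer, and the reparameterization trick rewrites sampling as z = mu + sigma * eps so gradients can flow through mu and log_var while eps stays a fixed noise input.

```python
import numpy as np

def reparameterize(mu, log_var, rng):
    """Reparameterization trick: z = mu + sigma * eps. Sampling is pushed
    into a fixed noise input eps, so gradients reach mu and log_var."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

def gaussian_kl(mu, log_var):
    """Closed-form KL(q(z|x) || N(0, I)) for a diagonal-Gaussian encoder;
    this is the regularization term of the ELBO."""
    return -0.5 * np.sum(1.0 + log_var - mu**2 - np.exp(log_var))

# ELBO = E_q[log p(x|z)] - KL(q(z|x) || p(z)): reconstruction minus KL.
# When the posterior matches the prior (mu=0, log_var=0), the KL term is zero.
assert gaussian_kl(np.zeros(4), np.zeros(4)) == 0.0
```

Without the trick, `z ~ N(mu, sigma^2)` is a non-differentiable sampling node and the encoder gets no gradient; with it, the only stochastic node is eps, which has no learnable parameters.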
Interview questions to prep
- What is mode collapse in GANs and what fixes it?
- Walk me through Wasserstein GAN — why does the new loss stabilize training?
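A minimal loss-level comparison for the WGAN question above (NumPy sketch, names my own; a trainable critic would also need a Lipschitz constraint via weight clipping or gradient penalty). The key difference: the WGAN critic outputs unbounded scores rather than sigmoid probabilities, so its loss keeps giving the generator a useful gradient even when real and fake samples are easily separated.

```python
import numpy as np

def gan_d_loss(d_real, d_fake):
    """Standard discriminator loss: -E[log D(x)] - E[log(1 - D(G(z)))].
    Saturates (vanishing generator gradient) once D is confident."""
    return -np.mean(np.log(d_real) + np.log(1.0 - d_fake))

def wgan_critic_loss(c_real, c_fake):
    """WGAN critic maximizes E[C(x)] - E[C(G(z))] over raw (unbounded)
    scores, approximating the Wasserstein-1 distance; the gradient
    stays informative even for well-separated distributions."""
    return -(np.mean(c_real) - np.mean(c_fake))
```

Mode collapse, the other question, is the generator mapping many z values to a few outputs; the smoother Wasserstein objective (plus fixes like minibatch discrimination or unrolled updates) is one standard mitigation.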
Interview questions to prep
- Walk me through the forward and reverse processes in DDPM.
- What does latent diffusion (Stable Diffusion) do differently?
- How does text conditioning enter Stable Diffusion through the text encoder and cross-attention?
- Compare text-to-image and text-to-video diffusion — what new temporal consistency problems appear?
- Where would you optimize a diffusion product first: denoising steps, scheduler, latent resolution, batching, or distillation?
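The DDPM forward process asked about above has a closed form worth knowing cold: with a beta schedule and alpha_bar_t the cumulative product of (1 - beta), any x_t is reachable from x_0 in one step. A NumPy sketch under the linear schedule from the DDPM paper (function names are my own):

```python
import numpy as np

def make_schedule(T=1000, beta_start=1e-4, beta_end=0.02):
    """Linear beta schedule; alpha_bar is the cumulative product of (1 - beta)
    that the closed-form forward process uses."""
    betas = np.linspace(beta_start, beta_end, T)
    alpha_bar = np.cumprod(1.0 - betas)
    return betas, alpha_bar

def q_sample(x0, t, alpha_bar, rng):
    """Forward process in one jump: x_t = sqrt(ab_t) * x0 + sqrt(1 - ab_t) * eps."""
    eps = rng.standard_normal(x0.shape)
    x_t = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps
    return x_t, eps
```

At large t, alpha_bar[t] is near zero, so x_t is essentially pure noise; the reverse process trains a network to predict eps and steps back from x_t toward x_0. Latent diffusion runs this same loop in a VAE's latent space instead of pixel space, which is why latent resolution and denoising-step count are the first optimization levers in the product question above.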
References & further reading
- Papers with Code — SOTA leaderboards
- fast.ai — Practical Deep Learning
- Dive into Deep Learning (d2l.ai)
- 75Hard GenAI/LLM Challenge — Text-to-image Stable Diffusion walkthrough