Day 48 of 133
Generative vision (VAE, GAN, Diffusion / Stable Diffusion)
ELBO, mode collapse, DDPM forward/reverse, latent diffusion.
DSA · NeetCode Backtracking
- Combination Sum II
Interview questions to prep
- Walk through your pruning strategy — what subtrees do you skip and why is it safe?
- Where does memoization apply? Could this be a DP problem in disguise?
- What's the worst-case time complexity, and what's the depth of the recursion stack?
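The questions above can be grounded in a minimal sketch (function name and structure are my own, not a reference solution). Sorting first enables two things: skipping duplicate values at the same recursion depth, and breaking out of the loop as soon as a candidate overshoots the remaining target, which safely prunes every later subtree since all later candidates are at least as large.

```python
def combination_sum_ii(candidates, target):
    """Return all unique combinations summing to target; each candidate used once."""
    candidates.sort()  # sorting enables both duplicate skipping and pruning
    results, path = [], []

    def backtrack(start, remaining):
        if remaining == 0:
            results.append(path[:])
            return
        for i in range(start, len(candidates)):
            # Skip duplicate values at the same tree depth to avoid repeated combos.
            if i > start and candidates[i] == candidates[i - 1]:
                continue
            # Prune: candidates are sorted, so every later value also overshoots.
            if candidates[i] > remaining:
                break
            path.append(candidates[i])
            backtrack(i + 1, remaining - candidates[i])
            path.pop()

    backtrack(0, target)
    return results

# combination_sum_ii([10, 1, 2, 7, 6, 1, 5], 8)
# -> [[1, 1, 6], [1, 2, 5], [1, 7], [2, 6]]
```

Worst case is still exponential (each candidate is in or out, O(2^n) subsets) with recursion depth at most n, which answers the complexity question; memoization does not apply cleanly because the set of remaining candidates, not just the remaining sum, defines a subproblem.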
- Word Search
Interview questions to prep
- Walk through DFS with a 'visited' marker on the board (in-place vs aux). Trade-offs?
- How does this scale to Word Search II, matching a trie of many target words in one pass?
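A sketch of the in-place variant asked about above (names are my own). Overwriting the visited cell with a sentinel and restoring it on backtrack costs O(1) extra space; the trade-off versus an auxiliary `visited` set is that it mutates the caller's board mid-search (a problem if the board is shared or read-only).

```python
def exist(board, word):
    """DFS from every cell; mark visited in place, restore on backtrack."""
    rows, cols = len(board), len(board[0])

    def dfs(r, c, i):
        if i == len(word):
            return True
        if not (0 <= r < rows and 0 <= c < cols) or board[r][c] != word[i]:
            return False
        saved, board[r][c] = board[r][c], '#'  # in-place visited marker (no aux set)
        found = (dfs(r + 1, c, i + 1) or dfs(r - 1, c, i + 1) or
                 dfs(r, c + 1, i + 1) or dfs(r, c - 1, i + 1))
        board[r][c] = saved  # restore so other paths may reuse this cell
        return found

    return any(dfs(r, c, 0) for r in range(rows) for c in range(cols))
```

For Word Search II, the same DFS walks a trie of all target words instead of one string, so one board traversal matches many words at once.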
DL · Generative vision (VAE, GAN, Diffusion)
Interview questions to prep
- Walk through VAE's evidence lower bound (ELBO).
- Why is the reparameterization trick necessary?
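The two VAE questions above can be made concrete with a small NumPy sketch (function names are my own; a real VAE would use a DL framework with autodiff). The ELBO is the reconstruction term minus a KL regularizer, and the reparameterization trick rewrites sampling as z = mu + sigma * eps so gradients can flow through mu and log_var while eps stays a fixed noise input.

```python
import numpy as np

def reparameterize(mu, log_var, rng):
    """Reparameterization trick: z = mu + sigma * eps. Sampling is pushed
    into a fixed noise input eps, so gradients reach mu and log_var."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

def gaussian_kl(mu, log_var):
    """Closed-form KL(q(z|x) || N(0, I)) for a diagonal-Gaussian encoder;
    this is the regularization term of the ELBO."""
    return -0.5 * np.sum(1.0 + log_var - mu**2 - np.exp(log_var))

# ELBO = E_q[log p(x|z)] - KL(q(z|x) || p(z)): reconstruction minus KL.
# When the posterior matches the prior (mu=0, log_var=0), the KL term is zero.
assert gaussian_kl(np.zeros(4), np.zeros(4)) == 0.0
```

Without the trick, `z ~ N(mu, sigma^2)` is a non-differentiable sampling node and the encoder gets no gradient; with it, the only stochastic node is eps, which has no learnable parameters.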
Interview questions to prep
- What is mode collapse in GANs and what fixes it?
- Walk me through Wasserstein GAN — why does the new loss stabilize training?
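A minimal loss-level comparison for the WGAN question above (NumPy sketch, names my own; a trainable critic would also need a Lipschitz constraint via weight clipping or gradient penalty). The key difference: the WGAN critic outputs unbounded scores rather than sigmoid probabilities, so its loss keeps giving the generator a useful gradient even when real and fake samples are easily separated.

```python
import numpy as np

def gan_d_loss(d_real, d_fake):
    """Standard discriminator loss: -E[log D(x)] - E[log(1 - D(G(z)))].
    Saturates (vanishing generator gradient) once D is confident."""
    return -np.mean(np.log(d_real) + np.log(1.0 - d_fake))

def wgan_critic_loss(c_real, c_fake):
    """WGAN critic maximizes E[C(x)] - E[C(G(z))] over raw (unbounded)
    scores, approximating the Wasserstein-1 distance; the gradient
    stays informative even for well-separated distributions."""
    return -(np.mean(c_real) - np.mean(c_fake))
```

Mode collapse, the other question, is the generator mapping many z values to a few outputs; the smoother Wasserstein objective (plus fixes like minibatch discrimination or unrolled updates) is one standard mitigation.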
Interview questions to prep
- Walk me through the forward and reverse processes in DDPM.
- What does latent diffusion (Stable Diffusion) do differently?
- How does text conditioning enter Stable Diffusion through the text encoder and cross-attention?
- Compare text-to-image and text-to-video diffusion — what new temporal consistency problems appear?
- Where would you optimize a diffusion product first: denoising steps, scheduler, latent resolution, batching, or distillation?
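The DDPM forward process asked about above has a closed form worth knowing cold: with a beta schedule and alpha_bar_t the cumulative product of (1 - beta), any x_t is reachable from x_0 in one step. A NumPy sketch under the linear schedule from the DDPM paper (function names are my own):

```python
import numpy as np

def make_schedule(T=1000, beta_start=1e-4, beta_end=0.02):
    """Linear beta schedule; alpha_bar is the cumulative product of (1 - beta)
    that the closed-form forward process uses."""
    betas = np.linspace(beta_start, beta_end, T)
    alpha_bar = np.cumprod(1.0 - betas)
    return betas, alpha_bar

def q_sample(x0, t, alpha_bar, rng):
    """Forward process in one jump: x_t = sqrt(ab_t) * x0 + sqrt(1 - ab_t) * eps."""
    eps = rng.standard_normal(x0.shape)
    x_t = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps
    return x_t, eps
```

At large t, alpha_bar[t] is near zero, so x_t is essentially pure noise; the reverse process trains a network to predict eps and steps back from x_t toward x_0. Latent diffusion runs this same loop in a VAE's latent space instead of pixel space, which is why latent resolution and denoising-step count are the first optimization levers in the product question above.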
References & further reading
- Papers with Code — SOTA leaderboards
- fast.ai — Practical Deep Learning
- Dive into Deep Learning (d2l.ai)
- 75Hard GenAI/LLM Challenge — Text-to-image Stable Diffusion walkthrough