LLM Evaluation Operations

Build operational eval suites with golden datasets, adversarial tests, trace grading, human review, and online feedback loops.

Recommended on day 5395 minutes · Advanced

Learning objectives

• Create eval datasets that represent real and adversarial usage
• Grade retrieval, tool use, reasoning traces, final answers, latency, and cost
• Use eval gates for prompt, model, and retrieval releases
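
One way to picture an eval gate is as a function that compares a candidate release against the current baseline on the same eval cases and returns reasons to block. The sketch below is illustrative only: the `EvalResult` fields, the `"golden"` and `"adversarial"` tags, and the threshold values are assumptions, not part of this lesson or any particular library.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class EvalResult:
    case_id: str
    tag: str           # hypothetical labels: "golden" or "adversarial"
    passed: bool
    latency_ms: float
    cost_usd: float


def pass_rate(results, tag):
    """Fraction of cases with the given tag that passed (0.0 if none)."""
    subset = [r for r in results if r.tag == tag]
    return sum(r.passed for r in subset) / len(subset) if subset else 0.0


def release_gate(baseline, candidate, max_drop=0.02, max_ratio=1.2):
    """Return a list of reasons to block the release; empty means pass.

    Assumes both result lists are non-empty. Thresholds are illustrative.
    """
    failures = []
    # Quality: the candidate may not lose more than max_drop pass rate per tag.
    for tag in ("golden", "adversarial"):
        base, cand = pass_rate(baseline, tag), pass_rate(candidate, tag)
        if cand < base - max_drop:
            failures.append(f"{tag} pass rate fell from {base:.2f} to {cand:.2f}")
    # Latency and cost: mean may not exceed max_ratio times the baseline mean.
    for metric in ("latency_ms", "cost_usd"):
        base = mean(getattr(r, metric) for r in baseline)
        cand = mean(getattr(r, metric) for r in candidate)
        if cand > max_ratio * base:
            failures.append(f"mean {metric} rose from {base:.4g} to {cand:.4g}")
    return failures


# Made-up results for a prompt change evaluated on three cases.
baseline = [
    EvalResult("g1", "golden", True, 800, 0.002),
    EvalResult("g2", "golden", True, 900, 0.002),
    EvalResult("a1", "adversarial", True, 850, 0.002),
]
candidate = [
    EvalResult("g1", "golden", True, 820, 0.002),
    EvalResult("g2", "golden", False, 880, 0.002),
    EvalResult("a1", "adversarial", True, 840, 0.002),
]

for reason in release_gate(baseline, candidate):
    print("BLOCK:", reason)
```

The same gate can run for prompt, model, and retrieval changes, since it only looks at per-case results and not at what produced them.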

Interview prompts

• How do you stop prompt changes from regressing existing customers?
• What belongs in a trace-level evaluation?
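
As a starting point for the second prompt, a trace-level evaluation can score each stage of a request separately instead of only the final answer. The sketch below is a minimal example under stated assumptions: the `Trace` and `Expected` schemas, the keyword-based answer check, and all sample values are hypothetical. Grading reasoning steps is left out because it typically needs a model-based or human grader.

```python
from dataclasses import dataclass


@dataclass
class Trace:
    """What one request actually did (hypothetical schema)."""
    retrieved_ids: list[str]
    tool_calls: list[str]
    final_answer: str
    latency_ms: float
    cost_usd: float


@dataclass
class Expected:
    """What the eval case expects (hypothetical schema)."""
    relevant_ids: set[str]
    required_tools: list[str]
    answer_keywords: list[str]
    max_latency_ms: float
    max_cost_usd: float


def grade_trace(trace: Trace, expected: Expected) -> dict[str, float]:
    """Score each stage of the trace on a 0.0 to 1.0 scale."""
    # Retrieval: recall of the expected documents.
    hits = expected.relevant_ids & set(trace.retrieved_ids)
    recall = len(hits) / len(expected.relevant_ids) if expected.relevant_ids else 1.0
    # Tool use: every required tool was called at least once.
    tools_ok = all(t in trace.tool_calls for t in expected.required_tools)
    # Final answer: crude keyword check; a real suite would use a stronger grader.
    answer = trace.final_answer.lower()
    answer_ok = all(k.lower() in answer for k in expected.answer_keywords)
    return {
        "retrieval_recall": recall,
        "tool_use": float(tools_ok),
        "final_answer": float(answer_ok),
        "latency": float(trace.latency_ms <= expected.max_latency_ms),
        "cost": float(trace.cost_usd <= expected.max_cost_usd),
    }


# Made-up trace and expectations for a single support question.
trace = Trace(
    retrieved_ids=["doc-1", "doc-7"],
    tool_calls=["search_orders"],
    final_answer="Your refund was issued on Monday.",
    latency_ms=1200,
    cost_usd=0.004,
)
expected = Expected(
    relevant_ids={"doc-1", "doc-2"},
    required_tools=["search_orders"],
    answer_keywords=["refund"],
    max_latency_ms=2000,
    max_cost_usd=0.01,
)

grades = grade_trace(trace, expected)
print(grades)
```

Keeping the scores per stage makes it easier to tell a retrieval miss from a tool-use or answer failure when a case regresses.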

Prerequisites

• LLM Evaluation and Guardrails
• RAG Architecture