The LLM Training Roadmap
From raw data to refined intelligence. How a model trained on a messy internet scrape becomes a helpful assistant.
Pre-training (The Foundation)
The model is exposed to trillions of words and learns by playing "fill in the blank" across a vast crawl of internet text: given the words so far, predict the next one.
Medical Student Analogy
"Think of a medical student reading every textbook ever written. They know all the biology, but don't know how to talk to a patient yet."
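The "fill in the blank" game can be illustrated with a toy word-level bigram counter. This is only a sketch under strong simplifying assumptions: real LLMs use neural networks over subword tokens rather than word counts, and the corpus and the function names `train_bigram` and `predict_next` are made up for illustration.

```python
from collections import Counter, defaultdict

def train_bigram(text):
    """Count, for each word, which words follow it in the training text."""
    counts = defaultdict(Counter)
    words = text.lower().split()
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts

def predict_next(counts, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]

# Hypothetical miniature "internet" used only for this illustration.
corpus = "the patient has a fever . the patient has a cough . the doctor has a plan ."
model = train_bigram(corpus)
print(predict_next(model, "patient"))  # prints "has"
```

Like the student who has only read textbooks, this model can continue a sentence but has no notion of answering a question.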
Fine-tuning (Specialization)
Refined on high-quality data (Q&A, Instructions). Transforms a "guesser" into a "helper".
Residency Analogy
"This is the student's residency. They learn the specific task of being a doctor: diagnosing and giving instructions."
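As a rough sketch of what instruction data for this stage might look like, the snippet below turns one Q&A pair into a training example. The prompt template, the whitespace "tokenizer", and the function name `build_example` are all assumptions for illustration. Masking the loss so that only the response tokens are trained on is one common choice, not the only one.

```python
def build_example(instruction, response):
    """Build a token list and a loss mask for one instruction/response pair.

    Assumption: a hypothetical "Instruction: ... Response:" template and
    whitespace splitting stand in for a real chat template and tokenizer.
    """
    prompt_tokens = f"Instruction: {instruction} Response:".split()
    response_tokens = response.split()
    tokens = prompt_tokens + response_tokens
    # 0 = ignore this position in the loss (prompt), 1 = train on it (response).
    loss_mask = [0] * len(prompt_tokens) + [1] * len(response_tokens)
    return tokens, loss_mask

tokens, loss_mask = build_example(
    "Name one symptom of flu.",
    "Fever is a common symptom.",
)
for token, flag in zip(tokens, loss_mask):
    print(flag, token)
```

The training objective is still next-token prediction. What changes is the data: the model now practices producing helpful answers instead of continuing arbitrary text.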
Safety Training (Guardrails)
Learning to refuse harmful requests and minimize bias.
Hippocratic Oath Analogy
"Hospital ethics training. Ensuring the powerful knowledge is used responsibly and safely."