Eleven beginner-friendly topics—from “what is an LLM?” through training, decoding, running models on your laptop, and Mixture of Experts. Each lesson adds context in plain language and uses diagrams from the reference material (watermarks removed).
If you can finish someone’s sentence, you already understand the core idea behind ChatGPT and every other large language model.
Before LLMs, a new AI feature often meant building a brand-new model and pipeline. Today one general model can wear many hats.
“Large” is not marketing fluff—it usually means billions of parameters, huge datasets, and serious compute. Scale changes what the model can do.
Under the hood: turn text into tokens, run them through a deep Transformer stack, and train billions of weights on a cluster of GPUs.
Once the architecture is defined, training turns random weights into a helpful assistant in stages, from pre-training through alignment and reasoning.
At generation time the model assigns a probability to every token in its vocabulary; we then choose one using sampling rules and knobs like temperature.
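To make the temperature knob concrete, here is a minimal sketch in plain Python. The three-word vocabulary and the logit values are invented for illustration, and `softmax_with_temperature` and `sample_next_token` are hypothetical helper names, not part of any library. The idea is only that dividing the raw scores by a temperature before normalising makes the distribution sharper (low temperature) or flatter (high temperature).

```python
import math
import random

def softmax_with_temperature(logits, temperature=1.0):
    """Turn raw scores into probabilities, scaled by a temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_next_token(vocab, logits, temperature=1.0, rng=None):
    """Draw one token at random, weighted by its probability."""
    rng = rng or random.Random()
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(vocab, weights=probs, k=1)[0]

# Toy example (assumed values): scores for the word after "The cat sat on the".
vocab = ["mat", "sofa", "moon"]
logits = [2.0, 1.0, 0.1]

cold = softmax_with_temperature(logits, temperature=0.5)  # sharper
warm = softmax_with_temperature(logits, temperature=2.0)  # flatter
token = sample_next_token(vocab, logits, temperature=1.0, rng=random.Random(0))
```

In this sketch, `cold` puts more probability on "mat" than `warm` does, which is why low temperatures feel more predictable and high temperatures more varied.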
APIs expose levers that shape the same next-token choice—learn what each dial does before you blame the model.
Predicting probabilities is only half the story—you still need a strategy to pick each next token.
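As a rough illustration of what a strategy means here, the sketch below contrasts two common ones: greedy decoding (always take the most likely token) and top-k sampling (sample only among the k most likely). The function names and the probability values are assumptions made up for this example; real decoders work over vocabularies of many thousands of tokens.

```python
import random

def greedy_pick(probs):
    """Return the index of the single most probable token."""
    return max(range(len(probs)), key=lambda i: probs[i])

def top_k_pick(probs, k, rng=None):
    """Sample an index from the k most probable tokens only."""
    if not 1 <= k <= len(probs):
        raise ValueError("k must be between 1 and len(probs)")
    rng = rng or random.Random()
    # Indices of the k highest-probability tokens.
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    # random.choices treats weights as relative, so no renormalisation is needed.
    return rng.choices(top, weights=[probs[i] for i in top], k=1)[0]

# Toy next-token distribution (assumed values).
probs = [0.1, 0.6, 0.3]

greedy_choice = greedy_pick(probs)                           # deterministic
sampled_choice = top_k_pick(probs, k=2, rng=random.Random(0))  # never index 0
```

Greedy decoding gives the same answer every time; top-k keeps some variety while ruling out the unlikely tail. With `k=1`, top-k reduces to greedy.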
Big “teacher” models can train smaller “student” models—useful when you need speed and cost savings without starting from zero.
Run models on your laptop for privacy, offline demos, and fast prompt iteration—without sending every test to the cloud.
MoE models can be huge on paper but activate only a slice of their parameters per token, scaling capacity without paying the full dense cost every time.
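The "slice of parameters per token" idea can be sketched as top-k routing: a router scores every expert, only the best k run, and their outputs are blended. Everything below is a toy under stated assumptions. The experts are simple scalar functions rather than neural networks, the router scores are passed in directly instead of being computed from the token by a learned layer, and `route_token` and `moe_layer` are hypothetical names.

```python
import math

def softmax(xs):
    """Normalise scores into probabilities."""
    peak = max(xs)
    exps = [math.exp(x - peak) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def route_token(router_logits, k=2):
    """Pick the k highest-scoring experts and renormalise their gate weights."""
    gates = softmax(router_logits)
    top = sorted(range(len(gates)), key=lambda i: gates[i], reverse=True)[:k]
    norm = sum(gates[i] for i in top)
    return {i: gates[i] / norm for i in top}

def moe_layer(x, experts, router_logits, k=2):
    """Run only the selected experts and return their weighted sum."""
    weights = route_token(router_logits, k)
    return sum(w * experts[i](x) for i, w in weights.items())

# Four toy experts (assumed); only two will run for this token.
called = []
def make_expert(idx, fn):
    def expert(x):
        called.append(idx)  # record which experts actually execute
        return fn(x)
    return expert

experts = [
    make_expert(0, lambda x: x + 1),
    make_expert(1, lambda x: 2 * x),
    make_expert(2, lambda x: -x),
    make_expert(3, lambda x: x * x),
]
router_logits = [0.1, 2.0, -1.0, 1.5]  # assumed router scores for one token

selected = route_token(router_logits, k=2)
output = moe_layer(3.0, experts, router_logits, k=2)
```

Here the model "owns" four experts but pays for only two per token, which is the trade the lesson describes: total capacity grows with the number of experts while per-token compute grows with k.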