sharpbyte.dev

Large language models, explained clearly

Eleven beginner-friendly topics—from “what is an LLM?” through training, decoding, running models on your laptop, and Mixture of Experts. Each lesson adds context in plain language and uses diagrams from the reference material (watermarks removed).

Topic 1

What is an LLM?

If you can finish someone’s sentence, you already understand the core idea behind ChatGPT and every other large language model.

Read topic →
Topic 2

Need for LLMs

Before LLMs, every new AI feature often meant a brand-new model and pipeline. Today one general model can wear many hats.

Read topic →
Topic 3

What makes an LLM “large”?

“Large” is not marketing fluff—it usually means billions of parameters, huge datasets, and serious compute. Scale changes what the model can do.

Read topic →
Topic 4

How are LLMs built?

Under the hood: turn text into tokens, run them through a deep Transformer stack, and train billions of weights on a cluster of GPUs.

Read topic →
Topic 5

How to train an LLM from scratch

After the architecture exists, training turns random weights into a helpful assistant—in clear stages from pre-training through alignment and reasoning.

Read topic →
Topic 6

How do LLMs work?

At generation time the model assigns a probability to every token in its vocabulary; a sampling rule then picks one, shaped by knobs like temperature.

Read topic →
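For a quick feel of the Topic 6 idea, here is a minimal sketch of temperature-scaled sampling. The three-word vocabulary and the scores (logits) are made up for illustration; a real model produces one score per token in a vocabulary of many thousands.

```python
import math
import random

def softmax_with_temperature(logits, temperature=1.0):
    """Turn raw scores into probabilities; lower temperature sharpens them."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_next_token(vocab, logits, temperature=1.0, rng=None):
    """Draw one token at random, weighted by its probability."""
    rng = rng or random.Random()
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(vocab, weights=probs, k=1)[0]

# Hypothetical toy vocabulary and scores (assumptions, not real model output).
vocab = ["cat", "dog", "car"]
logits = [2.0, 1.0, 0.1]

sharp = softmax_with_temperature(logits, temperature=0.5)  # more peaked
flat = softmax_with_temperature(logits, temperature=2.0)   # more even
token = sample_next_token(vocab, logits, temperature=1.0, rng=random.Random(0))
```

With the lower temperature, the top-scoring token takes a larger share of the probability; with the higher one, the choices even out and the output gets more varied.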
Topic 7

7 LLM generation parameters

APIs expose levers that shape the same next-token choice—learn what each dial does before you blame the model.

Read topic →
Topic 8

4 LLM text generation strategies

Predicting probabilities is only half the story—you still need a strategy to pick each next token.

Read topic →
Topic 9

3 techniques to train an LLM using another LLM

Big “teacher” models can train smaller “student” models—useful when you need speed and cost savings without starting from zero.

Read topic →
Topic 10

4 ways to run LLMs locally

Run models on your laptop for privacy, offline demos, and fast prompt iteration—without sending every test to the cloud.

Read topic →
Topic 11

Transformer vs. Mixture of Experts

MoE models can be huge on paper but activate only a slice of their parameters per token, so capacity scales without paying the full dense compute cost on every token.

Read topic →
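The "only a slice per token" idea from Topic 11 can be sketched as top-k routing. This is a toy illustration under stated assumptions: the four "experts" are plain functions, and the router scores are fixed numbers here, whereas a real MoE layer computes them from the token with a learned router. The names `top_k_route` and `moe_layer` are hypothetical.

```python
import math

def top_k_route(router_logits, k=2):
    """Pick the k highest-scoring experts and normalize their weights."""
    top = sorted(range(len(router_logits)),
                 key=lambda i: router_logits[i], reverse=True)[:k]
    m = max(router_logits[i] for i in top)
    exps = {i: math.exp(router_logits[i] - m) for i in top}
    total = sum(exps.values())
    return {i: e / total for i, e in exps.items()}

def moe_layer(x, experts, router_logits, k=2):
    """Run only the selected experts and mix their outputs by weight."""
    weights = top_k_route(router_logits, k)
    out = 0.0
    for i, w in weights.items():
        out += w * experts[i](x)  # unselected experts are never called
    return out, sorted(weights)

# Hypothetical experts and router scores (assumptions for illustration).
experts = [lambda x: x + 1, lambda x: x * 2, lambda x: -x, lambda x: x ** 2]
router_logits = [0.1, 2.0, -1.0, 2.0]

y, active = moe_layer(3.0, experts, router_logits, k=2)
```

Four experts exist, but only two run for this input. That is the sense in which total parameter count can grow while per-token compute stays closer to that of a smaller dense model.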
How to read this track: start at topic 1 if you are new; each page builds on ideas from earlier cards. Continue with Prompt engineering (eight topics from the same deck); a RAG (retrieval-augmented generation) track is planned next in the same format.