Track 1 · LLMs · Blue
LLMs for application builders
Everything you need before RAG or agents: how transformers work at inference time, how to call models safely, and how tokens and context windows drive cost and reliability. Read the guides in order—later chapters assume you can run a minimal chat completion and estimate token cost.
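As a rough illustration of how tokens and context windows drive cost, here is a minimal sketch in Python. Everything in it is an assumption made for the example, not any provider's API or pricing. The `Pricing` values are hypothetical. The 4-characters-per-token ratio is only a rule of thumb for English text. The helper names `estimate_tokens`, `estimate_cost`, and `fits_context` are made up here. A real tokenizer and your provider's published rates should replace these placeholders.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """Hypothetical prices in USD per million tokens (not real provider rates)."""
    input_per_million: float
    output_per_million: float


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate; assumes about 4 characters per token for English."""
    return math.ceil(len(text) / chars_per_token)


def estimate_cost(input_tokens: int, output_tokens: int, pricing: Pricing) -> float:
    """Cost in USD for one request, billing input and output tokens separately."""
    total = (input_tokens * pricing.input_per_million
             + output_tokens * pricing.output_per_million)
    return total / 1_000_000


def fits_context(input_tokens: int, max_output_tokens: int, context_window: int) -> bool:
    """The prompt and the reserved output budget must share one context window."""
    return input_tokens + max_output_tokens <= context_window


if __name__ == "__main__":
    pricing = Pricing(input_per_million=3.0, output_per_million=15.0)  # hypothetical
    cost = estimate_cost(input_tokens=2_000, output_tokens=500, pricing=pricing)
    print(f"estimated cost: ${cost:.4f}")  # prints: estimated cost: $0.0135
    print(fits_context(2_000, 500, context_window=8_000))  # prints: True
```

Output tokens are priced separately from input tokens in this sketch because many providers bill them at different rates. The context check reserves room for the reply up front, since a prompt that fills the whole window leaves no space for output.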
Guides in this track
Five deep-dive chapters plus this overview. All guides are live.
Reading order: LLMs explained → APIs & tokens → Embeddings → Multimodal → Model selection & cost
- 01 · LLMs explained: Next-token prediction, transformer intuition, temperature, context windows, training pipeline, model families, first API call.
- 02 · APIs, tokens & context: OpenAI, Claude, Bedrock, Gemini; token counting; structured output; error handling.
- 03 · Embeddings & semantic search: Dense vectors, cosine similarity, embedding models, ANN indexes, caching.
- 04 · Multimodal (vision, audio & documents): Image input, PDF pipelines, OCR, Whisper, structured document extraction.
- 05 · Model selection & cost optimization: Model cascades, prompt caching, batch API, local inference, cost templates.