Thirteen topics from the same reference PDF as LLMs, Prompt engineering, and Fine-tuning: vector stores, full RAG workflows, chunking, architecture patterns, HyDE, REFRAG, cache-augmented generation, and the path to agent memory.
Retrieval, augmentation, and generation—grounding LLMs without retraining on every update.
Why embeddings cluster by meaning—and how vector stores enable similarity search over unstructured data.
Static training corpora, private data, context windows—and where vector memory fits.
Chunk, embed, store, query, retrieve, re-rank, and generate—end-to-end pipeline anatomy.
Fixed-size, semantic, recursive, structure-aware, and LLM-driven chunking—trade-offs and when to test.
Choose among steering, retrieval, and weight updates, or hybrid RAG + fine-tuning, along the knowledge and adaptation axes.
Naive, multimodal, HyDE, corrective, graph, hybrid, adaptive, and agentic patterns—at a glance.
Agents rewrite queries, choose sources, iterate, and self-check—beyond one-shot retrieve-and-read.
Hypothetical document embeddings align queries with answer-like text for better dense retrieval.
Three ways to add knowledge: full weights, adapters, or retrieval—pros, costs, and limits.
Meta’s relevance-aware pipeline: compress chunks, filter them with an RL policy, and selectively expand before the decoder.
Cache stable knowledge in KV memory; keep volatile facts on the retrieval path.
Read-only retrieval → tool-mediated retrieval → read/write memory for personalization and continual learning.