sharpbyte.dev
RAG · topic 2 of 13

Vector databases & embeddings

Why embeddings cluster by meaning—and how vector stores enable similarity search over unstructured data.

From unstructured data to vectors

With RAG, the language model grounds its responses in information retrieved from the vector database, which is expected to be reliable. This reduces the likelihood of hallucinations and makes responses more accurate and contextually relevant. It also means we don't have to retrain the LLM every time new data arrives, since new information can be added to the database instead, which makes the model more "real-time" in its responses. To understand how RAG works in practice, we first need to understand vector databases, the storage layer that powers retrieval.

What are vector databases? Simply put, a vector database stores unstructured data (text, images, audio, video, etc.) in the form of vector embeddings. Each data point, whether a word, a document, an image, or any other entity, is transformed into a numerical vector by a machine learning model (covered later). This numerical vector is called an embedding, and the model is trained so that these vectors capture the essential features and characteristics of the underlying data.
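
To make the flow concrete, here is a minimal sketch of the retrieve-then-prompt loop described above. Everything in it is an illustrative assumption: `toy_embed` is a word-count stand-in for a trained embedding model (so it captures word overlap, not real semantics), `ToyVectorStore` is a hypothetical in-memory list rather than a real vector database, and `build_prompt` only assembles the text that would be sent to an LLM.

```python
import math
from collections import Counter


def toy_embed(text, vocab):
    """Stand-in for a trained embedding model: word counts over a fixed vocabulary."""
    counts = Counter(text.lower().split())
    return [float(counts[word]) for word in vocab]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorStore:
    """Hypothetical in-memory store: keeps each vector next to its original text."""

    def __init__(self, vocab):
        self.vocab = vocab
        self.rows = []  # list of (vector, original text)

    def add(self, text):
        self.rows.append((toy_embed(text, self.vocab), text))

    def search(self, query, k=1):
        # Brute-force scan: score every stored vector against the query vector.
        q = toy_embed(query, self.vocab)
        ranked = sorted(self.rows, key=lambda row: cosine(q, row[0]), reverse=True)
        return [text for _, text in ranked[:k]]


def build_prompt(question, passages):
    """Assemble a grounded prompt; the LLM call itself is out of scope here."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


vocab = ["refund", "days", "shipping", "free", "orders", "over"]
store = ToyVectorStore(vocab)
store.add("refund requests are accepted within 30 days")
store.add("shipping is free on orders over 50 dollars")

question = "how many days do I have to request a refund"
passages = store.search(question, k=1)
prompt = build_prompt(question, passages)
print(prompt)
```

Adding a new fact means calling `store.add(...)` again; nothing about the language model changes, which is the sense in which RAG avoids retraining.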

How RAG couples an LLM with a vector store so answers stay grounded in indexed documents.

Consider word embeddings, for instance. In the embedding space, we may find that the embeddings of fruits lie close to each other, while cities form another cluster, and so on. This shows that embeddings can learn the semantic characteristics of the entities they represent (provided they are trained appropriately).

Once the embeddings are stored in a vector database, we can retrieve the original objects that are most similar to a query over our unstructured data. In other words, encoding unstructured data lets us run operations such as similarity search, clustering, and classification over it, which are difficult with traditional databases. For example, when an e-commerce website recommends similar items or searches for a product based on an input query, we're (in most cases) interacting with a vector database behind the scenes.
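
The clustering idea can be sketched with a few hand-made vectors. The three-dimensional numbers below are invented for illustration only (real embeddings are learned by a model and typically have hundreds of dimensions), and `nearest` is a hypothetical helper that does a brute-force cosine-similarity scan rather than the indexed search a real vector database would use.

```python
import math

# Made-up 3-d "embeddings": fruits lean on the first axis, cities on the second.
embeddings = {
    "apple":  [0.90, 0.10, 0.00],
    "banana": [0.80, 0.20, 0.10],
    "mango":  [0.85, 0.15, 0.05],
    "paris":  [0.10, 0.90, 0.20],
    "tokyo":  [0.00, 0.80, 0.30],
    "berlin": [0.05, 0.85, 0.25],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(word, k=2):
    """Return the k other words whose vectors are most similar to `word`."""
    query = embeddings[word]
    scored = [
        (other, cosine_similarity(query, vector))
        for other, vector in embeddings.items()
        if other != word
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [other for other, _ in scored[:k]]


print(nearest("apple"))  # the two closest neighbours are the other fruits
print(nearest("tokyo"))  # the two closest neighbours are the other cities
```

With these toy vectors, a query for a fruit returns fruits and a query for a city returns cities, which is the behaviour a recommendation or semantic search backend relies on.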

Embeddings place similar meaning nearby in vector space (e.g., clusters of related concepts).
Similarity search over embeddings powers recommendations, semantic search, and retrieval backends.

Key takeaways

  • Embeddings turn text (or other modalities) into vectors that capture semantic similarity.
  • Vector DBs power nearest-neighbor retrieval for RAG and recommendation-style workloads.