Context engineering · topic 3 of 5

Build a Context Engineering workflow

Reference pipeline: ingest, memory, web and paper search, filter, kickoff, and Streamlit demo (PDF 159–168).

We'll build a multi-agent research assistant using context engineering principles. The agent gathers its context from four sources: documents, memory, web search, and arXiv. Here's our workflow:

Reference multi-agent research flow: documents, memory, web, arXiv, aggregation, filtering, and response.

● User submits query.

● Fetch context from docs, web, arXiv API, and memory.

● Pass the aggregated context to an agent for filtering.

● Pass the filtered context to another agent to generate a response.

● Save the final response to memory.
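
The five steps above can be sketched as one orchestration function. This is a minimal, framework-free illustration and not the CrewAI code from the deck: `run_workflow`, `ContextItem`, and every helper below are hypothetical names, the fetchers return canned text instead of calling real services, and a keyword overlap check stands in for the filtering agent.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class ContextItem:
    source: str  # "docs", "memory", "web", or "arxiv"
    text: str


# A fetcher takes the query and returns the context found in one source.
Fetcher = Callable[[str], List[ContextItem]]


def run_workflow(
    query: str,
    fetchers: Dict[str, Fetcher],
    filter_context: Callable[[str, List[ContextItem]], List[ContextItem]],
    synthesize: Callable[[str, List[ContextItem]], str],
    save_to_memory: Callable[[str, str], None],
) -> str:
    # Fetch context from every source and aggregate it.
    aggregated: List[ContextItem] = []
    for fetch in fetchers.values():
        aggregated.extend(fetch(query))
    # One agent filters out irrelevant context.
    relevant = filter_context(query, aggregated)
    # Another agent generates the response from the filtered context.
    answer = synthesize(query, relevant)
    # Save the final response to memory.
    save_to_memory(query, answer)
    return answer


# Hypothetical stand-ins so the sketch runs without any external service.
memory_log: List[Tuple[str, str]] = []


def demo_fetchers() -> Dict[str, Fetcher]:
    return {
        "docs": lambda q: [ContextItem("docs", "Milvus stores chunk embeddings.")],
        "memory": lambda q: [ContextItem("memory", "The user prefers short answers.")],
        "web": lambda q: [ContextItem("web", "Today's weather is sunny.")],
        "arxiv": lambda q: [ContextItem("arxiv", "A paper on embeddings and retrieval.")],
    }


def _tokens(text: str) -> set:
    return set(re.findall(r"\w+", text.lower()))


def keyword_filter(query: str, items: List[ContextItem]) -> List[ContextItem]:
    # Stand-in for the filtering agent: keep items sharing a word with the query.
    return [item for item in items if _tokens(query) & _tokens(item.text)]


def list_sources(query: str, items: List[ContextItem]) -> str:
    # Stand-in for the response agent: report which sources survived filtering.
    sources = ", ".join(sorted({item.source for item in items}))
    return f"Answer to {query!r} based on: {sources}"


def remember(query: str, answer: str) -> None:
    memory_log.append((query, answer))
```

Calling `run_workflow("How are embeddings stored?", demo_fetchers(), keyword_filter, list_sources, remember)` keeps only the docs and arXiv items here. In a real pipeline each stand-in would be replaced by the corresponding tool or agent.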

Tech stack:

● Tensorlake to get RAG-ready data from complex docs

● Zep for memory

● Firecrawl for web search

● Milvus for vector DB

● CrewAI for orchestration

Let's go!

Example stack from the deck: Tensorlake, Zep, Firecrawl, Milvus, and CrewAI.

Context engineering (CE) involves creating dynamic systems that offer:

● The right info

● The right tools

● In the right format

This ensures the LLM can effectively complete the task.

#1) Crew flow

We'll follow a top-down approach to understand the code, starting with an outline of the flow. Note that this is one of many blueprints for implementing a context engineering workflow; your pipeline will likely vary with the use case.

Top-down Crew flow blueprint for the context engineering pipeline (one of many possible implementations).

#2) Prepare data for RAG

Tensorlake turns source documents into RAG-ready markdown chunks per section.

We use Tensorlake to convert the document into RAG-ready markdown chunks for each section. The extracted data can be directly embedded and stored in a vector DB without further processing.
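
To make "one chunk per section" concrete, here is a minimal sketch that splits already-converted markdown at its headings. It does not use or imitate Tensorlake's API. The function name `chunk_markdown_by_section` and the `section`/`text` keys are assumptions made for illustration, and the sketch ignores edge cases such as `#` lines inside fenced code.

```python
import re
from typing import Dict, List

HEADING = re.compile(r"^#{1,6}\s+(.*)$")


def _append_chunk(chunks: List[Dict[str, str]], title: str, body: List[str]) -> None:
    text = "\n".join(body).strip()
    if text:  # skip sections that contain no body text
        chunks.append({"section": title, "text": text})


def chunk_markdown_by_section(markdown: str) -> List[Dict[str, str]]:
    """Return one chunk per heading-delimited section, with the heading as metadata."""
    chunks: List[Dict[str, str]] = []
    title, body = "Preamble", []  # text before the first heading, if any
    for line in markdown.splitlines():
        match = HEADING.match(line)
        if match:
            # A new heading closes the previous section.
            _append_chunk(chunks, title, body)
            title, body = match.group(1).strip(), []
        else:
            body.append(line)
    _append_chunk(chunks, title, body)  # close the last section
    return chunks
```

Each returned dictionary carries the section title alongside the text, which is the kind of metadata that can be stored next to the embedding in the indexing step.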

#3) Indexing and retrieval

Now that we have RAG-ready chunks along with their metadata, we store them in a self-hosted Milvus vector database. At query time, we retrieve the top-k chunks most similar to the query.

Indexing and retrieval: embed chunks with metadata in Milvus and fetch top-k for the query.
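
Top-k retrieval can be illustrated without a database. The sketch below is a brute-force, in-memory stand-in and not Milvus or its client API: `InMemoryIndex` is a hypothetical class, cosine similarity is an assumed metric, and the vectors are expected to come from an embedding model that is not shown.

```python
import math
from typing import Dict, List, Tuple

Vector = List[float]
Row = Tuple[Vector, str, Dict[str, str]]


def cosine_similarity(a: Vector, b: Vector) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class InMemoryIndex:
    """Toy stand-in for a vector DB collection (brute-force search)."""

    def __init__(self) -> None:
        self.rows: List[Row] = []

    def insert(self, vector: Vector, text: str, metadata: Dict[str, str]) -> None:
        # Store the embedding together with the chunk text and its metadata.
        self.rows.append((vector, text, metadata))

    def search(self, query_vector: Vector, top_k: int = 3) -> List[Tuple[float, str, Dict[str, str]]]:
        # Score every stored chunk against the query, then keep the best top_k.
        scored = [
            (cosine_similarity(query_vector, vector), text, metadata)
            for vector, text, metadata in self.rows
        ]
        scored.sort(key=lambda row: row[0], reverse=True)
        return scored[:top_k]
```

A vector database performs the same ranking, typically with approximate nearest-neighbor indexes so that search does not have to scan every row.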

#4) Build memory layer

Zep acts as the core memory layer of our workflow. It creates temporal knowledge graphs to organize and retrieve context for each interaction. We use it to store and retrieve context from chat history and user data.

Memory layer: Zep builds temporal knowledge graphs over chat and user data.
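
The idea behind temporal memory can be shown with timestamped facts: when a newer fact about the same subject and relation arrives, the latest one valid at query time wins. The sketch below is only an illustration of that idea. It is not Zep's API or data model, and `Fact` and `TemporalMemory` are hypothetical names.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass(frozen=True)
class Fact:
    subject: str
    predicate: str
    obj: str
    valid_from: datetime


class TemporalMemory:
    """Toy store of (subject, predicate, object) facts with a start time."""

    def __init__(self) -> None:
        self.facts: List[Fact] = []

    def add(self, subject: str, predicate: str, obj: str, valid_from: datetime) -> None:
        self.facts.append(Fact(subject, predicate, obj, valid_from))

    def current(self, subject: str, predicate: str, as_of: datetime) -> Optional[Fact]:
        # Among matching facts already valid at `as_of`, the most recent one wins.
        candidates = [
            fact
            for fact in self.facts
            if fact.subject == subject
            and fact.predicate == predicate
            and fact.valid_from <= as_of
        ]
        return max(candidates, key=lambda fact: fact.valid_from, default=None)

    def about(self, subject: str, as_of: datetime) -> List[Fact]:
        # One current fact per predicate known for this subject.
        predicates = sorted({f.predicate for f in self.facts if f.subject == subject})
        found = (self.current(subject, predicate, as_of) for predicate in predicates)
        return [fact for fact in found if fact is not None]
```

Keeping older facts instead of overwriting them lets the memory answer both "what is true now" and "what was true then".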

#5) Firecrawl web search

We use Firecrawl web search to fetch the latest news and developments related to the user query. Firecrawl's v2 endpoint provides 10x faster scraping, semantic crawling, and image search, turning any website into LLM-ready data.

Live web context via Firecrawl (fast scrape and LLM-ready pages).

#6) ArXiv API search

To further support research queries, we use the arXiv API to retrieve relevant results from their data repository based on the user query.

Research papers through the arXiv API to broaden beyond crawled web text.
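
A minimal sketch of such a search using only the Python standard library is shown below. It relies on the public arXiv query endpoint, which returns an Atom feed. The helper names, the `all:` field prefix, and the choice of returned fields are assumptions for illustration, not the deck's implementation.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET
from typing import Dict, List, Union

ARXIV_API_URL = "http://export.arxiv.org/api/query"
ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}


def build_arxiv_url(query: str, max_results: int = 5) -> str:
    # "all:" searches across all fields (title, abstract, authors, ...).
    params = {"search_query": f"all:{query}", "start": 0, "max_results": max_results}
    return f"{ARXIV_API_URL}?{urllib.parse.urlencode(params)}"


def _clean(text: str) -> str:
    # Feed titles and abstracts contain line breaks; collapse all whitespace.
    return " ".join(text.split())


def parse_arxiv_feed(atom_xml: Union[str, bytes]) -> List[Dict[str, str]]:
    root = ET.fromstring(atom_xml)
    papers = []
    for entry in root.findall("atom:entry", ATOM_NS):
        papers.append(
            {
                "title": _clean(entry.findtext("atom:title", default="", namespaces=ATOM_NS)),
                "summary": _clean(entry.findtext("atom:summary", default="", namespaces=ATOM_NS)),
                "url": entry.findtext("atom:id", default="", namespaces=ATOM_NS).strip(),
            }
        )
    return papers


def search_arxiv(query: str, max_results: int = 5) -> List[Dict[str, str]]:
    # Performs a network request; callers should handle timeouts and HTTP errors.
    with urllib.request.urlopen(build_arxiv_url(query, max_results), timeout=30) as response:
        return parse_arxiv_feed(response.read())
```

Separating URL building and feed parsing from the network call keeps the parsing logic testable offline.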

#7) Filter context

Now, we pass our combined context to the context evaluation agent that filters out irrelevant context. This filtered context is then passed to the synthesizer agent that generates the final response.

Filter stage: an evaluation agent drops irrelevant context before a synthesizer writes the final answer.
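
One way to wire up the evaluation step is to number the context items in a prompt, ask the model for the ids worth keeping, and parse its reply defensively. The sketch below assumes a hypothetical `llm` callable that maps a prompt string to a reply string. The prompt wording and the fail-open behavior are illustrative choices, not the deck's agent definition.

```python
import json
from typing import Callable, Dict, List


def build_filter_prompt(query: str, items: List[Dict[str, str]]) -> str:
    lines = [
        "You are a context evaluation agent.",
        f"Query: {query}",
        "Reply with a JSON list of the ids of the context items relevant to the query.",
        "Context items:",
    ]
    for idx, item in enumerate(items):
        lines.append(f"[{idx}] ({item['source']}) {item['text']}")
    return "\n".join(lines)


def filter_context(
    query: str,
    items: List[Dict[str, str]],
    llm: Callable[[str], str],
) -> List[Dict[str, str]]:
    reply = llm(build_filter_prompt(query, items))
    try:
        keep = json.loads(reply)
    except json.JSONDecodeError:
        return items  # fail open: keep everything if the reply is not valid JSON
    if not isinstance(keep, list):
        return items
    # Ignore ids that are not integers or that fall outside the item list.
    valid = {
        i
        for i in keep
        if isinstance(i, int) and not isinstance(i, bool) and 0 <= i < len(items)
    }
    return [item for idx, item in enumerate(items) if idx in valid]
```

Failing open trades precision for robustness: a malformed reply passes all context to the synthesizer instead of silently dropping it. A stricter pipeline might retry the call instead.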

#8) Kick off the workflow

Finally, we kick off our context engineering workflow with a query. For this query, the RAG tool, powered by Tensorlake, turned out to be the most relevant source for the LLM's response.

Kickoff example: which tool (e.g., RAG via Tensorlake) carried most of the signal for the answer.

We also translated this workflow into a Streamlit app that:

● Provides citations with links and metadata.

● Provides insights into relevant sources.

Streamlit UI sketch: citations, metadata, and per-source transparency for debugging the workflow.
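
The two features above reduce to simple formatting over the filtered context. The sketch below is plain Python with no Streamlit dependency. The item keys (`source`, `title`, `url`) and both function names are assumptions for illustration, not the app's actual code.

```python
from collections import Counter
from typing import Dict, List


def format_citations(items: List[Dict[str, str]]) -> str:
    """Return one numbered citation per line, with a link when one is available."""
    lines = []
    for number, item in enumerate(items, start=1):
        label = f"[{number}] {item['title']} ({item['source']})"
        lines.append(f"{label}: {item['url']}" if item.get("url") else label)
    return "\n".join(lines)


def source_breakdown(items: List[Dict[str, str]]) -> Dict[str, int]:
    """Count how many context items each source contributed."""
    return dict(Counter(item["source"] for item in items))
```

In a Streamlit app, the citation string could be displayed with a call such as `st.text(format_citations(items))`, and the per-source counts show at a glance which tool carried most of the context.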

The workflow explained above is one of many possible blueprints. Your implementation can vary.

Key takeaways

Production workflows chain document prep, vector/RAG, memory, search, and a filter step.
Shipping a UI makes citations and intermediate context legible for debugging.