sharpbyte.dev

Agent optimization with Opik

Tracing, evaluation, and iterative improvement loops for agent runs using Opik from the deck’s examples (PDF 256–260).


Illustration from the AI Agents chapter of the course deck.

Developers often iterate on prompts by hand to find one that works well. This does not scale, and a prompt tuned for one model can perform worse on another.

The Opik Agent Optimizer toolkit automates this search: it optimizes prompts for LLM apps automatically. The idea is to start with an initial prompt and an evaluation dataset, and let an LLM iteratively improve the prompt based on the evaluation results.

Illustration from the AI Agents chapter of the course deck.
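The loop described above can be sketched in plain Python. This is a toy illustration under stated assumptions, not Opik's implementation: `optimize_prompt`, `toy_score`, and `toy_propose` are hypothetical names, and both stand-in functions replace what would be LLM calls in a real run (one to score a prompt against the dataset, one to rewrite it).

```python
from typing import Callable, List, Tuple


def optimize_prompt(
    initial_prompt: str,
    score_prompt: Callable[[str], float],
    propose_prompts: Callable[[str, float], List[str]],
    rounds: int = 3,
) -> Tuple[str, float, float]:
    """Return (best_prompt, best_score, baseline_score)."""
    # Step 1: score the starting prompt to establish a baseline.
    baseline = score_prompt(initial_prompt)
    best_prompt, best_score = initial_prompt, baseline

    # Step 2: repeatedly ask for rewrites of the current best prompt
    # and keep a candidate only if it scores strictly higher.
    for _ in range(rounds):
        for candidate in propose_prompts(best_prompt, best_score):
            score = score_prompt(candidate)
            if score > best_score:
                best_prompt, best_score = candidate, score
    return best_prompt, best_score, baseline


# Hypothetical stand-in for "run the prompt over the dataset and score it".
def toy_score(prompt: str) -> float:
    keywords = ("concise", "answer", "question")
    return sum(word in prompt.lower() for word in keywords) / len(keywords)


# Hypothetical stand-in for "ask a reasoning model to rewrite the prompt".
def toy_propose(prompt: str, score: float) -> List[str]:
    return [prompt + " Be concise.", prompt + " Think step by step."]


best, best_score, baseline = optimize_prompt(
    "Provide an answer to the question.", toy_score, toy_propose
)
print(f"baseline={baseline:.2f} best={best_score:.2f} prompt={best!r}")
```

The greedy "keep only strict improvements" rule is one simple design choice for the sketch; a real optimizer may track several candidates per round or use the evaluation feedback to guide the rewrites.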

To begin, install Opik and its optimizer package, and configure Opik:

Illustration from the AI Agents chapter of the course deck.

Next, import all the required classes and functions from opik and opik_optimizer:

Illustration from the AI Agents chapter of the course deck.

● LevenshteinRatio → The metric used to score how closely the LLM’s output matches the expected output for a given input.

● MetaPromptOptimizer → An algorithm that uses a reasoning model to critique and iteratively refine your initial instruction prompt.

● tiny_test → A basic test dataset with input-output pairs.

Next, define an evaluation dataset:

Illustration from the AI Agents chapter of the course deck.

Moving on, configure the evaluation metric, which tells the optimizer how to score the LLM’s outputs against the given label:

Illustration from the AI Agents chapter of the course deck.
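To make the scoring idea concrete, here is a minimal pure-Python sketch of an edit-distance similarity between an output and its label. It is an assumption about the general technique, not Opik's code: the function names are hypothetical, and the library's LevenshteinRatio may normalize the distance differently.

```python
def levenshtein_distance(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    or substitutions needed to turn `a` into `b`."""
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(
                min(
                    previous[j] + 1,         # delete char_a
                    current[j - 1] + 1,      # insert char_b
                    previous[j - 1] + cost,  # substitute, or match at no cost
                )
            )
        previous = current
    return previous[-1]


def similarity_ratio(output: str, reference: str) -> float:
    """Map the distance to [0, 1], where 1.0 means identical strings.
    Assumption: normalize by the longer string's length."""
    longest = max(len(output), len(reference))
    if longest == 0:
        return 1.0  # two empty strings are identical
    return 1.0 - levenshtein_distance(output, reference) / longest


print(similarity_ratio("Paris", "Paris"))
print(round(similarity_ratio("kitten", "sitting"), 3))
```

A score like this rewards outputs that are close to the label character by character, which suits short, exact answers better than free-form text.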

Next, define your base prompt, which is the initial instruction that the MetaPromptOptimizer will try to enhance:

Illustration from the AI Agents chapter of the course deck.

Next, instantiate a MetaPromptOptimizer, specifying the model to use in the optimization process:

Illustration from the AI Agents chapter of the course deck.

Finally, call optimizer.optimize_prompt(...) with the dataset, metric configuration, and prompt to start the optimization process:

Illustration from the AI Agents chapter of the course deck.

It starts by evaluating the initial prompt, which sets the baseline:

Illustration from the AI Agents chapter of the course deck.
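As a rough sketch of what a baseline evaluation involves: run the prompt on every dataset item, score each output against its label, and average the scores. Everything here is illustrative and assumed rather than taken from Opik: the `text` and `label` field names, the canned `toy_model`, and the exact-match metric are all hypothetical stand-ins.

```python
from statistics import mean
from typing import Callable, Dict, List


def evaluate_prompt(
    prompt: str,
    dataset: List[Dict[str, str]],
    model: Callable[[str, str], str],
    metric: Callable[[str, str], float],
) -> float:
    """Average the metric over every dataset item for one prompt."""
    scores = [
        metric(model(prompt, item["text"]), item["label"])
        for item in dataset
    ]
    return mean(scores)


def exact_match(output: str, reference: str) -> float:
    """1.0 if the strings match, ignoring case and outer whitespace."""
    return 1.0 if output.strip().lower() == reference.strip().lower() else 0.0


def toy_model(prompt: str, text: str) -> str:
    """Hypothetical stand-in for an LLM call: returns canned answers."""
    canned = {
        "What is the capital of France?": "Paris",
        "What is 2 + 2?": "four",
    }
    return canned.get(text, "")


toy_dataset = [
    {"text": "What is the capital of France?", "label": "Paris"},
    {"text": "What is 2 + 2?", "label": "4"},
]

baseline_score = evaluate_prompt(
    "Provide an answer to the question.", toy_dataset, toy_model, exact_match
)
print(f"baseline score: {baseline_score}")
```

Every later candidate prompt is scored the same way, so the baseline gives the number each candidate has to beat.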

Then it iterates through several candidate prompts (written by an LLM), evaluates them, and prints the best one. You can invoke result.display() to see a summary of the optimization, including the best prompt found and its score:

Illustration from the AI Agents chapter of the course deck.

The optimization results are also available in the Opik dashboard for further analysis and visualization:

Illustration from the AI Agents chapter of the course deck.

And that’s how you can use the Opik Agent Optimizer to improve the performance and efficiency of your LLM apps.

Note: While we used GPT-4o, everything here can also be run locally, since the optimizer works with other LLMs and Opik is fully open-source.

Key takeaways

  • You cannot optimize what you cannot trace across tool calls and prompts.
  • Offline evals plus live telemetry close the loop on regressions.