Tracing, evaluation, and iterative improvement loops for agent runs using Opik from the deck’s examples (PDF 256–260).
Agent optimization with Opik
Developers often iterate on prompts by hand until they find one that works well. This does not scale, and a prompt tuned for one model can perform worse on another. Let's learn how to use the Opik Agent Optimizer toolkit, which lets you automatically optimize prompts for LLM apps. The idea is to start with an initial prompt and an evaluation dataset, and let an LLM iteratively improve the prompt based on evaluations.
To begin, install Opik and its optimizer package with pip install opik opik-optimizer, then configure Opik by running opik configure.
Next, import all the required classes and functions from opik and opik_optimizer:
● LevenshteinRatio → Our metric to evaluate the prompt’s effectiveness in generating a precise output for the given input.
● MetaPromptOptimizer → An algorithm that uses a reasoning model to critique and iteratively refine your initial instruction prompt.
● tiny_test → A basic test dataset with input-output pairs.
Next, define an evaluation dataset:
Moving on, configure the evaluation metric, which tells the optimizer how to score the LLM’s outputs against the given label:
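To make the scoring concrete, here is a minimal pure-Python sketch of a Levenshtein-style ratio between an LLM output and its reference label. It uses one common definition (insertions and deletions cost 1, so a substitution effectively costs 2, normalized by the combined length). Whether Opik's LevenshteinRatio uses exactly this normalization is an assumption, and the function names below are illustrative, not part of the library.

```python
def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence, via dynamic programming."""
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def levenshtein_ratio(output: str, reference: str) -> float:
    """Similarity in [0, 1]; 1.0 means the two strings are identical.

    Assumed definition: 1 - indel_distance / (len(output) + len(reference)),
    which simplifies to 2 * LCS / total length.
    """
    total = len(output) + len(reference)
    if total == 0:
        return 1.0  # two empty strings are treated as identical
    return 2 * lcs_length(output, reference) / total


# A terse answer matches the label exactly; a verbose one is penalized.
print(levenshtein_ratio("Paris", "Paris"))
print(levenshtein_ratio("The answer is Paris.", "Paris"))
```

A metric like this rewards prompts that make the model return exactly the expected string, which is why it suits tasks with short, precise labels.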
Next, define your base prompt, which is the initial instruction that the MetaPromptOptimizer will try to enhance:
Next, instantiate a MetaPromptOptimizer, specifying the model to use in the optimization process:
Finally, the optimizer.optimize_prompt(...) method is invoked with the dataset, metric configuration, and prompt to start the optimization process:
It starts by evaluating the initial prompt, which sets the baseline. It then iterates through several candidate prompts (written by the LLM), evaluates them, and prints the best-scoring one. You can call result.display() to see a summary of the optimization, including the best prompt found and its score.
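The baseline-then-iterate loop described above can be sketched in plain Python. This is a toy illustration under stated assumptions, not Opik's implementation: the dataset field names, optimize_prompt_sketch, fake_task_model, and fake_propose are all hypothetical, the two "models" are deterministic stand-ins for real LLM calls, and difflib's SequenceMatcher stands in for the metric.

```python
from difflib import SequenceMatcher

# Hypothetical evaluation data: input-output pairs (field names are assumptions).
DATASET = [
    {"text": "What is the capital of France?", "label": "Paris"},
    {"text": "What is 2 + 2?", "label": "4"},
]
ANSWERS = {item["text"]: item["label"] for item in DATASET}


def similarity(output, reference):
    """Stand-in string-similarity metric in [0, 1]."""
    return SequenceMatcher(None, output, reference).ratio()


def fake_task_model(prompt, question):
    """Toy stand-in for the LLM under test: terse only when told to be concise."""
    answer = ANSWERS[question]
    return answer if "concise" in prompt.lower() else f"The answer is {answer}."


def fake_propose(prompt, score, n):
    """Toy stand-in for the reasoning model that rewrites the current best prompt."""
    suffixes = [" Be concise.", " Explain your reasoning."]
    return [prompt + suffix for suffix in suffixes[:n]]


def evaluate(prompt, dataset):
    """Average metric score of one prompt over the whole dataset."""
    scores = [
        similarity(fake_task_model(prompt, item["text"]), item["label"])
        for item in dataset
    ]
    return sum(scores) / len(scores)


def optimize_prompt_sketch(initial_prompt, dataset, rounds=3, candidates_per_round=2):
    """Score the baseline, then keep whichever candidate scores strictly higher."""
    best_prompt = initial_prompt
    best_score = evaluate(initial_prompt, dataset)  # baseline
    history = [(best_prompt, best_score)]
    for _ in range(rounds):
        for candidate in fake_propose(best_prompt, best_score, candidates_per_round):
            score = evaluate(candidate, dataset)
            history.append((candidate, score))
            if score > best_score:
                best_prompt, best_score = candidate, score
    return best_prompt, best_score, history


best_prompt, best_score, history = optimize_prompt_sketch(
    "Provide an answer to the question.", DATASET
)
print(f"baseline score: {history[0][1]:.3f}")
print(f"best prompt:    {best_prompt!r} (score {best_score:.3f})")
```

In a real run, both stand-ins would be LLM calls, and the proposer would see the evaluation results so it can critique the current prompt rather than append fixed suffixes.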
The optimization results are also available in the Opik dashboard for further analysis and visualization.
And that's how you can use Opik Agent Optimizer to enhance the performance and efficiency of your LLM apps. Note: although we used GPT-4o, everything here can run fully locally, since you can use any other LLM and Opik is fully open-source.