Agent Evaluation
Through its Scouter integration, Opsml provides tools for running offline LLM evaluations and comparisons. These are useful when you want to (1) compare and benchmark different prompts or agents, or (2) evaluate different versions of prompts and agent services that you may already be running in production.