GenAI Evaluation
Through its Scouter integration, Opsml provides tools for running offline LLM evaluations and comparisons. This is useful when you want to (1) compare and benchmark different prompts or agents, or (2) evaluate different versions of prompts and GenAI services that you may already be running in production.