Assertion gates
Understanding Dependencies vs Conditional Gates
Scouter’s evaluation framework supports two types of task relationships:
Task Dependencies (Default Behavior)
Task dependencies allow downstream tasks to consume results from upstream tasks, regardless of whether the upstream task passed or failed. The downstream task executes as long as the dependency completes.
flowchart LR
A["LLM Judge<br/><b>empathy_assessment</b>"]
B["Assertion<br/><b>empathy_score_threshold</b>"]
C["✓ Task Result"]
A -->|"Completes (pass or fail)"| B
B -->|"Executes using upstream result"| C
Example:
# Upstream LLM judge; its result is available to downstream tasks.
empathy_assessment = LLMJudgeTask(
    id="empathy_assessment",
    prompt=empathy_prompt,
    expected_value={"shows_empathy": True},
)

# Runs once empathy_assessment completes, whether it passed or failed.
empathy_score_threshold = AssertionTask(
    id="empathy_score_threshold",
    context_path="empathy_assessment.score",
    operator=ComparisonOperator.GreaterThanOrEqual,
    expected_value=5,
    depends_on=["empathy_assessment"],
)

Conditional Gates (condition=True)
Conditional gates act as control flow mechanisms. Downstream tasks only execute if the gate passes. If the gate fails, all dependent tasks are skipped.
flowchart LR
D["Gate<br/><b>input_validation</b>"]
E["Task<br/><b>process_data</b>"]
F["Task<br/><b>analyze_results</b>"]
G["✗ Skipped"]
D -->|"✓ Passes"| E
D -.->|"✗ Fails"| G
E --> F
Example:
# Gate: condition=True means dependents run only if this assertion passes.
input_validation = AssertionTask(
    id="input_validation",
    context_path="input.query",
    operator=ComparisonOperator.IsNotEmpty,
    expected_value=True,
    condition=True,
)

# Skipped whenever input_validation fails.
process_query = LLMJudgeTask(
    id="process_query",
    prompt=query_prompt,
    expected_value={"valid": True},
    depends_on=["input_validation"],
)

Key Differences
| Aspect | Data Dependency | Conditional Gate |
|---|---|---|
| Flag | condition=False (default) | condition=True |
| Behavior | Downstream task always executes after completion | Downstream task only executes if gate passes |
| Use Case | Pass results between tasks for processing | Control whether expensive operations run |
| Result Storage | Stored regardless of pass/fail | Only stored if task passes |
| Comparison Impact | Task appears in all workflow runs | Task may be missing from some runs |
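The difference in the Behavior row can be illustrated with a small, self-contained model. This is only a sketch in plain Python, not Scouter's implementation: `ToyTask` and `run_tasks` are hypothetical names, tasks are assumed to be listed in dependency order, and the sketch assumes a skip propagates to tasks further downstream. It does not model result storage.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class ToyTask:
    """Hypothetical stand-in for an evaluation task (not Scouter's API)."""
    id: str
    check: Callable[[Dict[str, bool]], bool]  # returns True when the task passes
    depends_on: List[str] = field(default_factory=list)
    condition: bool = False  # True marks the task as a conditional gate


def run_tasks(tasks: List[ToyTask]) -> Dict[str, str]:
    """Run tasks in list order, which is assumed to respect depends_on."""
    by_id = {t.id: t for t in tasks}
    status: Dict[str, str] = {}
    passed: Dict[str, bool] = {}
    for task in tasks:
        # A task is blocked if a dependency was skipped (assumed propagation)
        # or if a dependency is a gate that failed.
        blocked = any(
            status[dep] == "skipped"
            or (by_id[dep].condition and status[dep] == "failed")
            for dep in task.depends_on
        )
        if blocked:
            status[task.id] = "skipped"
            continue
        ok = task.check(passed)
        passed[task.id] = ok
        status[task.id] = "passed" if ok else "failed"
    return status


def build(gate: bool) -> List[ToyTask]:
    """Same two-task workflow; only the upstream condition flag differs."""
    return [
        ToyTask("upstream", check=lambda results: False, condition=gate),
        ToyTask("downstream", check=lambda results: True, depends_on=["upstream"]),
    ]


plain = run_tasks(build(gate=False))
gated = run_tasks(build(gate=True))
print(plain)  # {'upstream': 'failed', 'downstream': 'passed'}
print(gated)  # {'upstream': 'failed', 'downstream': 'skipped'}
```

With the flag left at its default, the failing upstream task does not stop the downstream task. With `condition=True`, the same failure causes the downstream task to be skipped.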
Practical Example: Multi-Stage Evaluation
flowchart TB
subgraph s1["Stage 1: Parallel Evaluation"]
direction LR
A["LLM<br/><b>empathy</b>"]
B["Assertion<br/><b>acknowledges</b>"]
C["LLM<br/><b>accuracy</b>"]
end
subgraph s2["Stage 2: Conditional Logic"]
direction LR
D["Assertion<br/><b>empathy_threshold</b><br/>condition=false"]
E["Gate<br/><b>accuracy_gate</b><br/>condition=true"]
end
subgraph s3["Stage 3: Gated Operations"]
F["LLM<br/><b>deep_analysis</b>"]
G["✗ Skipped"]
end
A --> D
C --> E
E -->|"✓ Passes"| F
E -.->|"✗ Fails"| G
# Stage 1: independent LLM judges
empathy_assessment = LLMJudgeTask(
    id="empathy_assessment",
    prompt=empathy_prompt,
    expected_value={"shows_empathy": True},
)

technical_accuracy = LLMJudgeTask(
    id="technical_accuracy",
    prompt=accuracy_prompt,
    expected_value=True,
    ...
)

# Stage 2: a plain data dependency (condition defaults to False)
empathy_score_threshold = AssertionTask(
    id="empathy_score_threshold",
    context_path="empathy_assessment.score",
    operator=ComparisonOperator.GreaterThanOrEqual,
    expected_value=5,
    depends_on=["empathy_assessment"],
)

# Stage 2: a conditional gate on the accuracy result
accuracy_gate = AssertionTask(
    id="accuracy_gate",
    context_path="technical_accuracy.is_accurate",
    operator=ComparisonOperator.Equals,
    expected_value=True,
    depends_on=["technical_accuracy"],
    condition=True,
)

# Stage 3: runs only if accuracy_gate passes
deep_analysis = LLMJudgeTask(
    id="deep_analysis",
    prompt=deep_analysis_prompt,
    expected_value=True,
    depends_on=["accuracy_gate"],
    ...
)

Execution Flow:
- Stage 1: `empathy_assessment` and `technical_accuracy` run in parallel
- Stage 2:
  - `empathy_score_threshold` always runs (uses empathy score regardless of pass/fail)
  - `accuracy_gate` evaluates the technical accuracy result
- Stage 3:
  - If `accuracy_gate` passes: `deep_analysis` executes
  - If `accuracy_gate` fails: `deep_analysis` is skipped (saves LLM cost)
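The staging described above can be reproduced from the `depends_on` lists alone. The sketch below uses the standard-library `graphlib` module (Python 3.9+) to group the five task ids into stages and to find what a failing gate would skip. It is an illustration only: how Scouter actually schedules tasks is not stated here, and `stages` and `skipped_when_gate_fails` are hypothetical helper names.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set

# Task id -> ids it depends on, copied from the example above.
DEPENDS_ON: Dict[str, List[str]] = {
    "empathy_assessment": [],
    "technical_accuracy": [],
    "empathy_score_threshold": ["empathy_assessment"],
    "accuracy_gate": ["technical_accuracy"],
    "deep_analysis": ["accuracy_gate"],
}


def stages(depends_on: Dict[str, List[str]]) -> List[List[str]]:
    """Group tasks into stages; tasks within one stage have no mutual dependencies."""
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()
    result: List[List[str]] = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only for stable output
        result.append(ready)
        sorter.done(*ready)
    return result


def skipped_when_gate_fails(gate: str, depends_on: Dict[str, List[str]]) -> Set[str]:
    """Collect every task downstream of a failed gate (assumes skips propagate)."""
    skipped: Set[str] = set()
    changed = True
    while changed:
        changed = False
        for task, deps in depends_on.items():
            if task not in skipped and any(d == gate or d in skipped for d in deps):
                skipped.add(task)
                changed = True
    return skipped


plan = stages(DEPENDS_ON)
skipped = skipped_when_gate_fails("accuracy_gate", DEPENDS_ON)
for number, group in enumerate(plan, start=1):
    print(f"Stage {number}: {group}")
print(f"Skipped if accuracy_gate fails: {sorted(skipped)}")
```

For the five tasks above this yields three stages, with `deep_analysis` alone in the last one and as the only task skipped when `accuracy_gate` fails.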