How do you investigate an agent incident when outputs are non-deterministic?

Question

Accepted Answer

Start by logging everything needed to reconstruct each interaction: the inputs, every intermediate step, and the environmental context (for example, the agent version and sampling settings). With that record you can analyze the agent's intermediate reasoning after the fact, even when the final output differs from run to run. Next, look for statistical patterns across multiple incidents: which inputs, external data, or system states correlate with the undesirable outputs? Then test hypotheses with controlled experiments, isolating and changing one variable at a time and repeating each condition enough times to separate a real effect from run-to-run noise. Human evaluation of a sample of divergent outputs helps categorize the non-determinism, distinguishing acceptable variation from critical failures. Finally, comparing agent versions and their performance metrics over time shows when a behavior first appeared and helps narrow down why.
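As a rough illustration of the logging and pattern-detection steps, here is a minimal Python sketch. Everything in it is an assumption made for the example rather than part of any particular agent framework: the `RunRecord` fields, the helper names, and the choice of a pooled two-proportion z statistic as the comparison. It only shows the shape of the idea: store one structured record per run, group failure counts by some attribute, and check whether two groups differ by more than noise would suggest.

```python
import json
import math
from collections import defaultdict
from dataclasses import dataclass, field, asdict


@dataclass
class RunRecord:
    """One agent interaction (hypothetical schema, adapt to your system)."""
    run_id: str
    agent_version: str
    prompt: str
    steps: list = field(default_factory=list)    # intermediate steps / tool calls
    context: dict = field(default_factory=dict)  # e.g. sampling settings, data source ids
    output: str = ""
    failed: bool = False                         # label from human or automated review


def to_log_line(record):
    """Serialize a record as one JSON line so runs can be replayed and queried later."""
    return json.dumps(asdict(record), sort_keys=True)


def failure_counts_by(records, key_fn):
    """Group records by key_fn(record) and return {key: (failures, total)}."""
    counts = defaultdict(lambda: [0, 0])
    for r in records:
        bucket = counts[key_fn(r)]
        bucket[0] += int(r.failed)
        bucket[1] += 1
    return {k: (f, n) for k, (f, n) in counts.items()}


def two_proportion_z(f1, n1, f2, n2):
    """Pooled z statistic for the difference between two failure rates."""
    pooled = (f1 + f2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return 0.0 if se == 0 else (f1 / n1 - f2 / n2) / se


# Toy data: version "a" fails 2 of 10 runs, version "b" fails 6 of 10.
records = (
    [RunRecord(f"a{i}", "a", "same prompt", failed=i < 2) for i in range(10)]
    + [RunRecord(f"b{i}", "b", "same prompt", failed=i < 6) for i in range(10)]
)
by_version = failure_counts_by(records, lambda r: r.agent_version)
z = two_proportion_z(*by_version["a"], *by_version["b"])
```

The same grouping function can key on anything in the record (a tool that was called, a retrieved document, a sampling setting), which is how candidate correlations surface before you design a controlled experiment. With samples this small the z statistic is only a rough signal, so treat it as a prompt for more runs rather than a verdict.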