AI Safety & Agent Evaluation · Intermediate · Available
Measuring Agent Reliability in Long-Running Tasks
Wednesday, June 10, 2026 · 3:30 PM – 4:30 PM · Hall D1
How to design metrics that capture multi-step success, partial credit, and recovery behavior for agents that run for hours.
Topic: Reliability
Time of day: Afternoon
Speaker: Ben Carter
Company: Sentinel AI
Room: Hall D1
Session ID: S020