AI Safety & Agent Evaluation · Intermediate · Available

Measuring Agent Reliability in Long-Running Tasks

Wednesday, June 10, 2026 · 3:30 PM – 4:30 PM · Hall D1

How to design metrics that capture multi-step success, partial credit, and recovery behavior for agents that run for hours.

Topic: Reliability
Time of day: Afternoon
Speaker: Ben Carter
Company: Sentinel AI
Room: Hall D1
Session ID: S020