service
community involvement, organizing, and reviewing.
Competitions & shared evaluations
Organizer & judge, Sierra τ²-Bench Custom Track, AgentX–AgentBeats Competition · Berkeley RDI · Fall 2025 – Spring 2026
A two-phase public competition hosted by Berkeley RDI in conjunction with the Agentic AI MOOC, aimed at building shared, reproducible benchmarks for agentic AI and the agents that compete on them. The Sierra-sponsored τ²-Bench Custom Track challenges teams to build purple agents that perform well on τ²-Bench, our dual-control benchmark in which agents must execute complex tool-using tasks while sustaining long, coherent conversations in real-world customer-service domains.