MCPMark Verified
Stabilized, version-pinned MCPMark task set for reproducible MCP tool-use evaluation.
MCPMark Verified is the stabilized default task set of MCPMark, a comprehensive benchmark suite that stress-tests model and agent capabilities in real-world MCP tool use. The Verified set pins each environment to fixed MCP server versions and stabilizes the verifiers for the standard tasks, so that evaluation results are reproducible.