MCPMark Verified
Stabilized, version-pinned MCPMark task set for reproducible MCP tool-use evaluation.
MCPMark Verified is the stabilized, version-pinned standard task set used by default in MCPMark, an open-source benchmark for evaluating agentic models in real Model Context Protocol (MCP) tool environments such as Notion, GitHub, Filesystem, Postgres, and Playwright.