BrowseComp-Plus
A More Fair and Transparent Evaluation Benchmark of Deep-Research Agent
BrowseComp-Plus is an evaluation benchmark for deep-research agents, derived from OpenAI's BrowseComp browsing-agent benchmark. It measures an AI agent's ability to find hard-to-locate information and return precise answers. It is best treated as an evaluation benchmark rather than an end-user SaaS product.