SWE-Marathon
Benchmarking and evaluation for coding agents
Open-source software for benchmarking and evaluating coding agents on long-horizon software engineering tasks.