Open-source benchmark software that evaluates whether coding agents can rebuild programs from scratch given only a compiled executable and documentation.