PostTrainBench
Measuring how well AI agents can post-train language models
An open benchmark and accompanying website for evaluating whether AI/CLI agents can post-train small base language models under a fixed compute and time budget, with a leaderboard that reports results across multiple benchmarks.