BullshitBench
Benchmark for bullshit-free AI answers
A benchmark-style software product for evaluating how well AI systems avoid bullshit and unsupported answers.

