Leaderboard
BeHonest
A pioneering benchmark designed to comprehensively assess honesty in LLMs.
A comprehensive benchmarking platform designed to evaluate large models' mathematical abilities across 20 fields, with nearly 30,000 math problems.