Leaderboard
MathEval
MathEval is a comprehensive benchmarking platform designed to evaluate large models' mathematical abilities across 20 fields and nearly 30,000 math problems.
The platform focuses on understanding how these models perform in various scenarios and on analyzing the results from an interpretability perspective.