Leaderboard
BeHonest
A pioneering benchmark designed to comprehensively assess honesty in LLMs.
A benchmark that evaluates large language models' ability to answer medical questions across multiple languages.