Leaderboard
MMedBench
a benchmark that evaluates large language models' ability to answer medical questions across multiple languages.
a benchmark for evaluating large language models (LLMs) on tasks involving both textual and visual imagination.