Leaderboard
CompMix
a benchmark evaluating QA methods that operate over a mixture of heterogeneous input sources (KB, text, tables, infoboxes).
a benchmark evaluating QA methods that operate over a mixture of heterogeneous input sources (KB, text, tables, infoboxes).
a benchmark that evaluates large language models' ability to answer medical questions across multiple languages.