Leaderboard
CompMix
a benchmark evaluating QA methods that operate over a mixture of heterogeneous input sources (KB, text, tables, infoboxes).
a multimodal question-answering benchmark designed to evaluate AI models' cognitive ability to understand human beliefs and goals.