Leaderboard
AlpacaEval
An automatic evaluator for instruction-following language models, using the Nous benchmark suite.
A Swedish language understanding benchmark that evaluates natural language processing (NLP) models on tasks such as argumentation analysis, semantic similarity, and textual entailment.