FELM
Leaderboard

FELM is a meta-benchmark that evaluates how well factuality evaluators assess the outputs of large language models (LLMs).
