
Evaluation
lighteval
A lightweight LLM evaluation suite that Hugging Face has been using internally.
This repository contains code to quantitatively evaluate instruction-tuned models such as Alpaca and Flan-T5 on held-out tasks.