Oobabooga Benchmark


A leaderboard-style benchmark for evaluating large language models (LLMs).

Relevant Sites

ACLUE

An evaluation benchmark focused on ancient Chinese language comprehension.

InfiBench

A benchmark designed to evaluate large language models (LLMs) on their ability to answer real-world coding-related questions.

TAT-QA

A large-scale question-answering benchmark focused on real-world financial data, integrating both tabular and textual information.

MathEval

A comprehensive benchmarking platform designed to evaluate large models' mathematical abilities across 20 fields and nearly 30,000 math problems.

VisualWebArena

A benchmark designed to assess the performance of multimodal web agents on realistic, visually grounded tasks.

M3CoT

A benchmark that evaluates large language models on a variety of multimodal reasoning tasks, including language, natural and social sciences, physical and social commonsense, temporal reasoning, algebra, and geometry.
