EleutherAI/lm-evaluation-harness

lm-evaluation-harness

A framework for few-shot evaluation of language models.

Builder

EleutherAI

EleutherAI • ai-lab

Stars

12,748

Using upstream star count

Forks

3,301

Using upstream fork count

Open Issues

Activity Score

0/100

0 commits in 30d

Created

Aug 28, 2020

Project creation date

README Summary

[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.10256836.svg)](https://doi.org/10.5281/zenodo.10256836)

AI Dev Skills

Unmapped

Benchmark Design, Few-shot Learning, Language Model Evaluation, Model Comparison, Model Performance Assessment, Natural Language Processing, Prompt Engineering, Standardized Testing Frameworks, Statistical Analysis

Taxonomy

AI Trends

Language Model Evaluation, AI Safety, Responsible AI, Model Interpretability, Benchmark Standardization

Recent Activity

Updated 2 months ago

Fix correctness issues in Arabic normalization and prompt loading (#3589)

Rin • Mar 16, 2026

7507703

Skip caching None responses in async generation path (#3633)

Joshua Swanson • Mar 16, 2026

d47ed3e

replace all CohereForAI with CohereLabs (#3631)

Júlia Falcão • Mar 16, 2026

6e23116

Quality

production

Quality: high
Maturity: production

PM Skills

Cost & Efficiency, User Experience, Scale & Reliability, Data & Evaluation, Developer Platform, AI-Native Architecture

Languages

Python 100.0%

Timeline

Project created: Aug 28, 2020
Forked: Mar 13, 2026
Your last push: 2 months ago
Upstream last push: 23 days ago
Tracked since: Mar 17, 2026

