Reporium

openai/human-eval

human-eval

Code for the paper "Evaluating Large Language Models Trained on Code"

View on GitHub • Upstream: openai/human-eval

Builder

OpenAI

openai • ai-lab

Stars

3,246

Using upstream star count

Forks

442

Using upstream fork count

Open Issues

0

Activity Score

0/100

0 commits in 30d

Created

Jul 6, 2021

Project creation date

README Summary

This is an evaluation harness for the HumanEval problem solving dataset described in the paper "[Evaluating Large Language Models Trained on Code](https://arxiv.org/abs/2107.03374)".

AI Dev Skills

Unmapped

Automated Code Testing • Code Generation Assessment • Large Language Model Evaluation • Machine Learning Benchmarking • Natural Language to Code Translation • Programming Language Understanding

Tags

Automated Code Testing • Code Generation Assessment • Large Language Model Evaluation • Machine Learning Benchmarking • Natural Language to Code Translation • Programming Language Understanding • Evals • Forked • HumanEval • OpenAI • Python • Research / Papers • Security

Taxonomy

AI Trends

Code Generation Models • AI-Assisted Programming • Language Model Evaluation

Category

Evals & Benchmarking • Foundation Models • Dev Tools & Automation • Learning Resources • Security & Safety

Deployment Context

Self-hosted

Industries

Developer Tools

Modalities

Code • Text

Skill Areas

Large Language Model Evaluation • Code Generation Assessment • Natural Language to Code Translation • Automated Code Testing • Machine Learning Benchmarking • Programming Language Understanding

Tag

Evals • Forked • HumanEval • OpenAI • Python • Research / Papers • Security

Use Cases

Code Generation Evaluation • Language Model Benchmarking • AI Coding Assistant Assessment • Model Performance Comparison

Recent Activity

Updated 1 year ago

7 Days

0

30 Days

0

90 Days

0

Quality

Quality: high
Maturity: research

Categories

Evals & Benchmarking (Primary) • Dev Tools & Automation • Learning Resources • Security & Safety • Foundation Models • Search & Knowledge • Other AI / ML

PM Skills

Data & Evaluation

Languages

Python 100.0%

Timeline

Project created: Jul 6, 2021
Forked: Mar 14, 2026
Your last push: 1 year ago
Upstream last push: 1 year ago
Tracked since: Jan 17, 2025
