
openai/human-eval

human-eval

Code for the paper "Evaluating Large Language Models Trained on Code"

Builder

OpenAI

openai • ai-lab

Stars

3,185

Using upstream star count

Forks

442

Using upstream fork count

Open Issues

0

Activity Score

0/100

0 commits in 30d

Created

Jul 6, 2021

Project creation date

README Summary

HumanEval is a dataset and evaluation framework for measuring the code generation capabilities of large language models. It consists of 164 hand-written programming problems with unit tests, designed to evaluate whether models can generate functionally correct Python code from natural language descriptions. The repository provides tools for running evaluations and measuring pass@k metrics for code completion tasks.
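The pass@k metric mentioned above comes from the paper: for each problem, generate n ≥ k samples, count the number c that pass the unit tests, and estimate pass@k = E[1 − C(n−c, k)/C(n, k)] over all problems. Below is a minimal standalone sketch of that estimator in Python; the repository ships its own implementation, so this version is illustrative only.

import numpy as np

def pass_at_k(n: int, c: int, k: int) -> float:
    # Unbiased per-problem pass@k: 1 - C(n - c, k) / C(n, k),
    # computed in a numerically stable product form.
    # n: total samples generated, c: samples that passed the tests.
    if n - c < k:
        return 1.0  # every size-k subset contains a passing sample
    return 1.0 - np.prod(1.0 - k / np.arange(n - c + 1, n + 1))

# Example: 200 samples for one problem, 37 of which pass its tests.
print(f"pass@1  = {pass_at_k(200, 37, 1):.4f}")   # 0.1850
print(f"pass@10 = {pass_at_k(200, 37, 10):.4f}")

Per the upstream README, the harness itself is invoked as evaluate_functional_correctness samples.jsonl, where each JSONL line holds a task_id and a completion; the completions are executed against each problem's unit tests (the README strongly recommends sandboxing, since this runs untrusted model-generated code).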

AI Dev Skills

Unmapped

Large Language Model Evaluation, Code Generation Assessment, Natural Language to Code Translation, Automated Code Testing, Machine Learning Benchmarking, Programming Language Understanding

Tags

Large Language Model Evaluation, Code Generation Assessment, Natural Language to Code Translation, Automated Code Testing, Machine Learning Benchmarking, Programming Language Understanding, AI-Assisted Programming, Code Generation Evaluation, Code Generation Models, AI Coding Assistant Assessment, Self-hosted, Developer Tools, Code, Model Performance Comparison, Text, Language Model Evaluation, Language Model Benchmarking, Python


Recent Activity

Updated 1 year ago

7 Days

0

30 Days

0

90 Days

0

Quality

high

Maturity

research

Categories

Evals & Benchmarking (Primary), Dev Tools & Automation, NLP & Text, Coding & Dev Tools, Other AI / ML, Foundation Models

PM Skills

Developer Platform

Languages

Python 100.0%

Timeline

Project created
Jul 6, 2021
Forked
Mar 14, 2026
Your last push
1 year ago
Upstream last push
1 year ago
Tracked since
Jan 17, 2025
