openai/human-eval
human-eval
Code for the paper "Evaluating Large Language Models Trained on Code"
Builder

OpenAI
openai • ai-lab
Stars
3,185
Using upstream star count
Forks
442
Using upstream fork count
Open Issues
0
Activity Score
0/100
0 commits in 30d
Created
Jul 6, 2021
Project creation date
README Summary
HumanEval is a dataset and evaluation framework for measuring the code generation capabilities of large language models. It consists of 164 hand-written programming problems with unit tests, designed to evaluate whether models can generate functionally correct Python code from natural language descriptions. The repository provides tools for running evaluations and measuring pass@k metrics for code completion tasks.
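For reference, the paper defines pass@k with an unbiased estimator: generate n >= k samples per problem, count the c samples that pass the unit tests, and average 1 - C(n-c, k) / C(n, k) over problems. A minimal standalone sketch of that estimator in Python (a sketch of the formula from the paper, not necessarily the repository's exact API):

import numpy as np

def pass_at_k(n: int, c: int, k: int) -> float:
    # Unbiased pass@k estimator from "Evaluating Large Language Models Trained on Code".
    # n: samples generated per problem, c: samples that passed the unit tests, k: the k in pass@k.
    if n - c < k:
        # Fewer than k failing samples: any k drawn samples must include a passing one.
        return 1.0
    # 1 - C(n-c, k) / C(n, k), evaluated as a numerically stable running product.
    return 1.0 - float(np.prod(1.0 - k / np.arange(n - c + 1, n + 1)))

# Example: 200 samples per problem, 43 of which pass, estimating pass@10.
print(pass_at_k(n=200, c=43, k=10))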
AI Dev Skills
Unmapped
Tags
Taxonomy
Deployment Context
Industries
Skill Areas
Recent Activity
Updated 1 year ago
7 Days
0
30 Days
0
90 Days
0
Quality
- Quality
- high
- Maturity
- research
Categories
PM Skills
Languages
Timeline
- Project created
- Jul 6, 2021
- Forked
- Mar 14, 2026
- Your last push
- 1 year ago
- Upstream last push
- 1 year ago
- Tracked since
- Jan 17, 2025
Similar Repos
pgvector cosine similarity · $0
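The similar-repos lookup above is labeled as pgvector cosine similarity. A hypothetical sketch of such a query, assuming a Postgres table repos(full_name text, embedding vector) maintained by the dashboard; the table, column, and connection details are assumptions, not taken from this page:

import psycopg2

QUERY = """
SELECT full_name,
       1 - (embedding <=> (SELECT embedding FROM repos WHERE full_name = %s)) AS cosine_similarity
FROM repos
WHERE full_name <> %s
ORDER BY embedding <=> (SELECT embedding FROM repos WHERE full_name = %s)  -- <=> is pgvector's cosine-distance operator
LIMIT 5;
"""

with psycopg2.connect("dbname=repo_catalog") as conn:  # connection string is illustrative
    with conn.cursor() as cur:
        cur.execute(QUERY, ("openai/human-eval",) * 3)
        for name, score in cur.fetchall():
            print(name, round(score, 3))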