Reporium

openai/simple-evals

simple-evals

Builder

OpenAI

openai • ai-lab

Stars

4,506

Using upstream star count

Forks

490

Using upstream fork count

Open Issues

0

Activity Score

0/100

0 commits in 30d

Created

Apr 11, 2024

Project creation date

README Summary

**July 2025**: `simple-evals` will no longer be updated for new models or benchmark results. The repo will continue to host reference implementations for **HealthBench**, **BrowseComp**, and **SimpleQA**.

AI Dev Skills

Unmapped

AI Testing, Benchmarking, Model Evaluation, Performance Assessment

Tags

AI Testing, Benchmarking, Model Evaluation, Performance Assessment, Anthropic / Claude, Claude, Curated List, Evals, Forked, GPT, Google AI, HumanEval, Large Language Models, Llama, MMLU, Open Source, OpenAI, Prompt Engineering, Python, Research / Papers

Taxonomy

AI Trends

AI Safety, Model Evaluation

category

Foundation Models, AI Agents, Evals & Benchmarking, Cloud & Platforms, Learning Resources

Deployment Context

Self-hosted, Cloud API

Modalities

Text

Skill Areas

Model Evaluation, Benchmarking, Performance Assessment, AI Testing

tag

Anthropic / Claude, Benchmarking, Claude, Curated List, Evals, Forked, GPT, Google AI, HumanEval, Large Language Models, Llama, MMLU, Open Source, OpenAI, Prompt Engineering, Python, Research / Papers

Use Cases

AI Model Benchmarking, Performance Comparison, Model Selection, Evaluation Pipeline

Recent Activity

Updated 10 months ago

7 Days

0

30 Days

0

90 Days

0

Quality

Quality: medium
Maturity: prototype

Categories

Foundation Models (Primary), AI Agents, Evals & Benchmarking, Search & Knowledge, Other AI / ML, Cloud & Platforms, Learning Resources

PM Skills

Data & Evaluation

Languages

Python 100.0%

Timeline

Project created: Apr 11, 2024
Forked: Mar 14, 2026
Your last push: 10 months ago
Upstream last push: 1 month ago
Tracked since: Jul 31, 2025
