Reporium

confident-ai/deepeval

deepeval

The LLM Evaluation Framework

Upstream: confident-ai/deepeval

Builder

Confident AI

confident-ai • startup

Stars

15,660

Using upstream star count

Forks

1,466

Using upstream fork count

Open Issues

0

Activity Score

0/100

0 commits in 30d

Created

Aug 10, 2023

Project creation date

AI Dev Skills

Unmapped

AI Safety Testing, Automated Testing Pipelines, Bias Detection and Measurement, Hallucination Detection, LLM Evaluation Metrics, Model Benchmarking, Model Performance Testing, Prompt Engineering Validation, Retrieval-Augmented Generation Evaluation, Toxicity Assessment

Tags

AI Safety Testing, Automated Testing Pipelines, Bias Detection and Measurement, Hallucination Detection, LLM Evaluation Metrics, Model Benchmarking, Model Performance Testing, Prompt Engineering Validation, Retrieval-Augmented Generation Evaluation, Toxicity Assessment, AI Agents, Backend, Benchmarking, Curated List, DeepEval, DeepSeek, Evals, Forked, HuggingFace, HumanEval, LangChain, Large Language Models, LlamaIndex, MMLU, OpenAI, Python, RAGAS, Real-Time / Streaming, Red Teaming, Roadmap, Tool Use, Tutorial

Taxonomy

AI Trends

AI Safety, LLM Evaluation, Responsible AI, MLOps, AI Testing and Validation, Compound AI Systems

Category

Evals & Benchmarking, Foundation Models, AI Agents, RAG & Retrieval, Inference & Serving, Dev Tools & Automation, Learning Resources, Security & Safety

Deployment Context

Cloud API, Self-hosted, CI/CD Pipelines, Development Environment

Industries

Developer Tools, AI/ML Platforms, Enterprise Software, Research and Academia

Modalities

Text

Skill Areas

LLM Evaluation Metrics, Model Performance Testing, Bias Detection and Measurement, Hallucination Detection, Toxicity Assessment, Retrieval-Augmented Generation Evaluation, Prompt Engineering Validation, AI Safety Testing, Model Benchmarking, Automated Testing Pipelines

Tag

AI Agents, Backend, Benchmarking, Curated List, DeepEval, DeepSeek, Evals, Forked, HuggingFace, HumanEval, LangChain, Large Language Models, LlamaIndex, MMLU, OpenAI, Python, RAGAS, Real-Time / Streaming, Red Teaming, Roadmap, Tool Use, Tutorial

Use Cases

LLM Application Testing, Model Quality Assurance, AI Safety Validation, Automated Performance Benchmarking, Bias and Fairness Testing, Hallucination Detection, RAG System Evaluation, Prompt Optimization Testing, Model Comparison and Selection

Recent Activity

Updated 3 months ago

7 Days

0

30 Days

0

90 Days

0

Merge pull request #1 from confident-ai/main

kimmymakesmoves • Feb 6, 2026

94d9c07

new release

Jeffrey Ip • Feb 4, 2026

6bcc63c

.

Jeffrey Ip • Feb 4, 2026

8d20552

Quality

Quality: high
Maturity: production

Categories

Evals & Benchmarking (Primary), RAG & Retrieval, Inference & Serving, Dev Tools & Automation, Learning Resources, Security & Safety, Foundation Models, AI Agents, Search & Knowledge, Other AI / ML

PM Skills

Safety & Alignment, Scale & Reliability, Data & Evaluation, Developer Platform, AI-Native Architecture

Languages

Python 100.0%

Timeline

Project created: Aug 10, 2023
Forked: Nov 8, 2025
Your last push: 3 months ago
Upstream last push: 20 days ago
Tracked since: Feb 6, 2026

Similar Repos

pgvector cosine similarity · $0
