Reporium

anthropics/hh-rlhf

hh-rlhf

Human preference data for "Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback"


Builder

Anthropic

anthropics • ai-lab

Stars

1,839

Using upstream star count

Forks

159

Using upstream fork count

Open Issues

0

Activity Score

0/100

0 commits in 30d

Created

Apr 10, 2022

Project creation date

README Summary

Note: This GitHub repo is now deprecated in favor of the HuggingFace hosted repository, which contains the same data: https://huggingface.co/datasets/Anthropic/hh-rlhf


AI Dev Skills

Unmapped

AI Safety and Alignment, Constitutional AI, Dataset Curation and Management, Human-AI Interaction Design, Human Preference Learning, Language Model Training, Preference Modeling, Reinforcement Learning from Human Feedback (RLHF)

Tags

AI Safety and Alignment, Constitutional AI, Dataset Curation and Management, Human-AI Interaction Design, Human Preference Learning, Language Model Training, Preference Modeling, Reinforcement Learning from Human Feedback (RLHF), Anthropic / Claude, Forked, HuggingFace, Large Language Models, Red Teaming, Reinforcement Learning, Research / Papers, RLHF

Taxonomy

AI Trends

AI Safety, Human-AI Alignment, Constitutional AI, Responsible AI Development

Category

Foundation Models, Model Training, Evals & Benchmarking, Learning Resources, Security & Safety

Deployment Context

Research Infrastructure, Model Training Pipeline

Modalities

Text

Skill Areas

Reinforcement Learning from Human Feedback (RLHF), Human Preference Learning, AI Safety and Alignment, Language Model Training, Dataset Curation and Management, Human-AI Interaction Design, Preference Modeling, Constitutional AI

Tag

Anthropic / Claude, Forked, HuggingFace, Large Language Models, RLHF, Red Teaming, Reinforcement Learning, Research / Papers

Use Cases

Training Safe AI Assistants, Human Preference Data Collection, AI Safety Research, Language Model Alignment, Conversational AI Development

Recent Activity

Updated 11 months ago

7 Days: 0
30 Days: 0
90 Days: 0

Quality

Quality: high
Maturity: research

Categories

Evals & Benchmarking (Primary), Search & Knowledge, Foundation Models, Model Training, Learning Resources, Security & Safety

PM Skills

Safety & Alignment

Languages

No language breakdown recorded.

Timeline

Project created: Apr 10, 2022
Forked: Mar 14, 2026
Your last push: 11 months ago
Upstream last push: 11 months ago
Tracked since: Jun 17, 2025

Similar Repos

pgvector cosine similarity · $0
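
The "Similar Repos" panel is labeled as using pgvector cosine similarity. As a rough illustration of what that metric computes, here is a minimal sketch in plain Python. The repo names, the embedding vectors, and the `rank_similar` helper are hypothetical and are not taken from Reporium; in pgvector itself the `<=>` operator returns cosine distance, which is 1 minus the similarity computed here.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def rank_similar(query, candidates):
    """Sort candidate names by similarity to the query, most similar first.

    `candidates` maps a (hypothetical) repo name to its embedding vector.
    """
    scored = [(name, cosine_similarity(query, vec)) for name, vec in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical 2-dimensional embeddings, for illustration only.
example_candidates = {
    "repo-a": [0.0, 1.0],
    "repo-b": [1.0, 0.1],
    "repo-c": [-1.0, 0.0],
}
ranking = rank_similar([1.0, 0.0], example_candidates)
```

Real embeddings would have hundreds of dimensions, and the ranking would typically run inside the database rather than in application code.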
