anthropics/hh-rlhf

hh-rlhf

Human preference data for "Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback"

Builder

Anthropic

anthropics • ai-lab

Stars

1,833

Using upstream star count

Forks

153

Using upstream fork count

Open Issues

0

Activity Score

0/100

0 commits in 30d

Created

Apr 10, 2022

Project creation date

README Summary

This repository contains human preference data used for training AI assistants through reinforcement learning from human feedback (RLHF). The dataset includes conversations between humans and AI assistants, with human annotations indicating which responses are more helpful and harmless. This data was used in Anthropic's research on training AI systems to be both helpful and safe through human feedback.
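As a rough illustration of what "annotations indicating which responses are more helpful and harmless" can look like in code, here is a minimal sketch that parses preference records into pairs. It assumes each record is a JSON Lines object with `chosen` and `rejected` text fields; that layout, the `PreferencePair` class, the `parse_preference_lines` helper, and the sample record are all assumptions made for illustration and are not stated in the summary above.

```python
import json
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass(frozen=True)
class PreferencePair:
    chosen: str    # response the annotator preferred
    rejected: str  # response the annotator did not prefer


def parse_preference_lines(lines: Iterable[str]) -> Iterator[PreferencePair]:
    """Yield one PreferencePair per non-empty JSON Lines record."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        # Field names "chosen" and "rejected" are assumptions for this sketch.
        yield PreferencePair(chosen=record["chosen"], rejected=record["rejected"])


# Hypothetical sample records, invented for illustration only.
SAMPLE_LINES = [
    json.dumps({
        "chosen": "Human: Hi\n\nAssistant: Hello! How can I help?",
        "rejected": "Human: Hi\n\nAssistant: What do you want?",
    }),
    "",
]

pairs = list(parse_preference_lines(SAMPLE_LINES))
print(len(pairs))
```

Keeping each comparison as an explicit pair makes it straightforward to feed the data to a preference model later, whatever the real on-disk format turns out to be.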

AI Dev Skills

Unmapped

Reinforcement Learning from Human Feedback (RLHF) • Human Preference Learning • AI Safety and Alignment • Language Model Training • Dataset Curation and Management • Human-AI Interaction Design • Preference Modeling • Constitutional AI

Tags

Reinforcement Learning from Human Feedback (RLHF) • Human Preference Learning • AI Safety and Alignment • Language Model Training • Dataset Curation and Management • Human-AI Interaction Design • Preference Modeling • Constitutional AI • Text • Model Training Pipeline • AI Safety Research • Training Safe AI Assistants • Human-AI Alignment • Responsible AI Development • Human Preference Data Collection • Language Model Alignment • AI Safety • Research Infrastructure • Conversational AI Development

Taxonomy

Recent Activity

Updated 10 months ago

7 Days

0

30 Days

0

90 Days

0

Quality

Quality
high
Maturity
research

Categories

MLOps & Infrastructure (Primary) • Learning Resources • Evals & Benchmarking • ML Platform & Infrastructure • Safety & Alignment • Search & Knowledge • Other AI / ML • Model Training

PM Skills

Scale & Reliability • Developer Platform

Languages

No language breakdown recorded.

Timeline

Project created
Apr 10, 2022
Forked
Mar 14, 2026
Your last push
10 months ago
Upstream last push
10 months ago
Tracked since
Jun 17, 2025
