
huggingface/trl


Train transformer language models with reinforcement learning.

Builder

HuggingFace


huggingface • ai-lab

Stars

17,894

Using upstream star count

Forks

2,604

Using upstream fork count

Open Issues

0

Activity Score

0/100

94 commits in 30d

Created

Mar 27, 2020

Project creation date

README Summary

TRL (Transformer Reinforcement Learning) is a library for training transformer language models using reinforcement learning techniques. It provides tools and algorithms for fine-tuning language models with human feedback, including PPO (Proximal Policy Optimization) and other RL methods. The library integrates seamlessly with Hugging Face's transformers ecosystem for scalable RL training of language models.
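To illustrate the preference-learning idea behind methods like DPO that the library implements, here is a minimal, self-contained sketch of the DPO loss for a single chosen/rejected response pair. This is an illustration of the underlying formula only, not TRL's API; the function name and example log-probabilities are invented for demonstration.

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one preference pair (illustrative helper, not TRL's API).

    Each argument is the summed log-probability of a full response under
    the trainable policy or the frozen reference model; beta controls how
    far the policy may drift from the reference.
    """
    # Implicit reward margin: how much more the policy prefers the chosen
    # response over the rejected one, relative to the reference model.
    logits = beta * ((policy_chosen_logp - ref_chosen_logp)
                     - (policy_rejected_logp - ref_rejected_logp))
    # -log(sigmoid(logits)), written stably as log(1 + exp(-logits))
    return math.log1p(math.exp(-logits))

# When the policy ranks the chosen response higher than the reference does,
# the loss is smaller than when it prefers the rejected response:
low = dpo_loss(-5.0, -9.0, -6.0, -6.0)   # policy agrees with the preference
high = dpo_loss(-9.0, -5.0, -6.0, -6.0)  # policy contradicts the preference
```

Minimizing this loss pushes the policy's log-probability of preferred responses up and of rejected responses down, without training a separate reward model.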

AI Dev Skills

Unmapped

Reinforcement Learning from Human Feedback (RLHF) · Proximal Policy Optimization (PPO) · Direct Preference Optimization (DPO) · Transformer Fine-tuning · Language Model Training · Human Preference Learning · Policy Gradient Methods · Reward Modeling · Constitutional AI

Tags

Reinforcement Learning from Human Feedback (RLHF) · Proximal Policy Optimization (PPO) · Direct Preference Optimization (DPO) · Transformer Fine-tuning · Language Model Training · Human Preference Learning · Policy Gradient Methods · Reward Modeling · Constitutional AI · Chat Model Fine-tuning · AI Safety Training · Instruction Following Training · AI Safety · Language Model Alignment · On-premise · Self-hosted · Conversational AI Development · Large Language Models · Human-AI Alignment · Instruction Following · Text · Cloud API · Python

Recent Activity

Updated 27 days ago

7 Days: 4
30 Days: 94
90 Days: 416

Quality

Quality: high
Maturity: production

Categories

Foundation Models (Primary) · Evals & Benchmarking · Other AI / ML · Model Training · Safety & Alignment

PM Skills

Scale & Reliability

Languages

Python 100.0%

Timeline

Project created: Mar 27, 2020
Forked: Mar 13, 2026
Your last push: 27 days ago
Upstream last push: 6 days ago
Tracked since: Mar 17, 2026
