
openai/weak-to-strong

weak-to-strong

This repository implements weak-to-strong generalization, a research framework where weaker AI models are used to supervise and train stronger models.

Builder

OpenAI

openai • ai-lab

Stars

2,552

Using upstream star count

Forks

310

Using upstream fork count

Open Issues

0

Activity Score

0/100

0 commits in 30d

Created

Dec 13, 2023

Project creation date

README Summary

This repository implements weak-to-strong generalization, a research framework where weaker AI models are used to supervise and train stronger models. The work explores whether strong models can learn to perform better than their weak supervisors, which has implications for AI alignment and safety as models become more capable than humans.
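As a rough illustration of how "performing better than the weak supervisor" can be quantified, the sketch below computes the fraction of the weak-to-strong performance gap that is recovered. The function name, argument names, and accuracy values are hypothetical and are not taken from this repository's code; the formula follows the "performance gap recovered" idea described in the accompanying research.

```python
def performance_gap_recovered(
    weak_acc: float,
    weak_to_strong_acc: float,
    strong_ceiling_acc: float,
) -> float:
    """Fraction of the weak-vs-strong gap closed by weak supervision.

    weak_acc:            accuracy of the weak supervisor
    weak_to_strong_acc:  accuracy of the strong model trained on weak labels
    strong_ceiling_acc:  accuracy of the strong model trained on ground truth
    """
    gap = strong_ceiling_acc - weak_acc
    if gap <= 0:
        # Without a positive gap there is nothing to recover.
        raise ValueError("strong ceiling must exceed weak accuracy")
    return (weak_to_strong_acc - weak_acc) / gap


# Hypothetical accuracies, chosen only for illustration.
pgr = performance_gap_recovered(
    weak_acc=0.60,
    weak_to_strong_acc=0.75,
    strong_ceiling_acc=0.90,
)
print(f"PGR: {pgr:.2f}")
```

A value near 1 would mean the strong model almost matches its ground-truth ceiling despite weak labels; a value near 0 would mean it merely imitates the weak supervisor.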

AI Dev Skills

Unmapped

AI Safety Research, Model Alignment, Weak-to-Strong Generalization, Supervised Learning, Neural Network Training, AI Governance, Machine Learning Research Methodology

Tags

AI Safety Research, Model Alignment, Weak-to-Strong Generalization, Supervised Learning, Neural Network Training, AI Governance, Machine Learning Research Methodology, AI Safety, Scalable Oversight, Model Generalization Studies, AI Alignment, Weak Supervision Experiments, AI Safety Evaluation, Text, AI Model Alignment Research, Self-hosted, Research Environment, Python

Taxonomy

Recent Activity

Updated 1 year ago

7 Days

0

30 Days

0

90 Days

0

Quality

Quality: medium
Maturity: research

Categories

Safety & Alignment (Primary), Search & Knowledge, Other AI / ML, Learning Resources, Evals & Benchmarking, Model Training, Computer Vision

PM Skills

Product Discovery

Languages

Python 100.0%

Timeline

Project created: Dec 13, 2023
Forked: Mar 14, 2026
Your last push: 1 year ago
Upstream last push: 1 year ago
Tracked since: May 19, 2024
