NVIDIA/Megatron-LM

Megatron-LM

Ongoing research training transformer models at scale

Builder

NVIDIA

NVIDIA • big-tech

Stars

15,894

Using upstream star count

Forks

3,785

Using upstream fork count

Open Issues

0

Activity Score

0/100

0 commits in 30d

Created

Mar 21, 2019

Project creation date

README Summary

NVIDIA's Megatron-LM is a large-scale transformer model training framework designed for efficient distributed training of language models. It implements model parallelism techniques and optimizations specifically for training very large transformer architectures on multiple GPUs across multiple nodes. The framework focuses on scaling transformer models beyond what can fit on a single GPU through advanced parallelization strategies.
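The idea of splitting a layer across devices, as described in the summary above, can be sketched in a few lines. This is a toy illustration only, not Megatron-LM's API: plain Python lists stand in for GPU tensors, and the helper names (matmul, split_columns, column_parallel_matmul) are hypothetical. It assumes a column-wise split of a weight matrix, which is one common form of model parallelism.

```python
# Toy illustration only: plain Python lists stand in for GPU tensors.
# All function names here are hypothetical and not part of Megatron-LM.

def matmul(x, w):
    """Multiply a (rows x inner) matrix by an (inner x cols) matrix."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*w)] for row in x]


def split_columns(w, num_shards):
    """Split a weight matrix into equal column blocks, one per simulated device."""
    cols = len(w[0])
    if cols % num_shards != 0:
        raise ValueError("column count must be divisible by num_shards")
    width = cols // num_shards
    return [[row[i * width:(i + 1) * width] for row in w] for i in range(num_shards)]


def column_parallel_matmul(x, w, num_shards):
    """Compute x @ w by giving each simulated device one column block of w."""
    partials = [matmul(x, shard) for shard in split_columns(w, num_shards)]
    # Concatenate per-device outputs along the column axis
    # (a real multi-GPU system would use a collective such as all-gather here).
    return [sum((p[r] for p in partials), []) for r in range(len(x))]


x = [[1.0, 2.0], [3.0, 4.0]]
w = [[1.0, 0.0, 2.0, 1.0], [0.0, 1.0, 1.0, 3.0]]

full = matmul(x, w)
sharded = column_parallel_matmul(x, w, num_shards=2)
print(full == sharded)
```

Each simulated device holds only half of the weight columns, yet the concatenated result matches the unsharded product, which is why this kind of split lets a layer exceed the memory of a single device.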

AI Dev Skills

Unmapped

Large Language Model Training, Distributed Deep Learning, Model Parallelism, Data Parallelism, Pipeline Parallelism, Transformer Architecture, GPU Computing, Mixed Precision Training, Gradient Accumulation, Distributed Systems, High Performance Computing, Memory Optimization

Tags

Large Language Model Training, Distributed Deep Learning, Model Parallelism, Data Parallelism, Pipeline Parallelism, Transformer Architecture, GPU Computing, Mixed Precision Training, Gradient Accumulation, Distributed Systems, High Performance Computing, Memory Optimization, Cloud Infrastructure, Foundation Models, Distributed Training Research, Multi-billion Parameter Model Training, Large Language Model Pre-training, Scaling Laws, Large Language Models, Distributed AI Training, Scalable Transformer Training, Text, Multi-GPU Clusters, Foundation Model Development, Python

Taxonomy

Recent Activity

Updated 27 days ago

7 Days

0

30 Days

0

90 Days

0

Quality

high

Maturity

research

Categories

Foundation Models (primary), MLOps & Infrastructure, Dev Tools & Automation, Cloud & Platforms, Learning Resources, Inference & Serving, ML Platform & Infrastructure, Search & Knowledge, Other AI / ML, Model Training

PM Skills

Scale & Reliability, Developer Platform

Languages

Python 100.0%

Timeline

Project created
Mar 21, 2019
Forked
Mar 14, 2026
Your last push
27 days ago
Upstream last push
6 days ago
Tracked since
Mar 17, 2026
