NVIDIA/Megatron-LM
Ongoing research training transformer models at scale
Builder
NVIDIA • big-tech
Stars: 15,894 (upstream star count)
Forks: 3,785 (upstream fork count)
Open Issues: 0
Activity Score: 0/100 (0 commits in 30d)
Created: Mar 21, 2019 (project creation date)
README Summary
NVIDIA's Megatron-LM is a large-scale transformer model training framework designed for efficient distributed training of language models. It implements model parallelism techniques and optimizations specifically for training very large transformer architectures on multiple GPUs across multiple nodes. The framework focuses on scaling transformer models beyond what can fit on a single GPU through advanced parallelization strategies.
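The core parallelization idea the summary describes can be sketched in a few lines. This is a hypothetical illustration, not Megatron-LM's actual API: tensor model parallelism splits a linear layer's weight matrix column-wise across devices, each device computes its slice of the output independently, and the slices are gathered back together. The function name and NumPy stand-in for real multi-GPU communication are assumptions for clarity.

```python
import numpy as np

def column_parallel_linear(x, weight, num_partitions):
    """Sketch of a column-parallel linear layer (hypothetical helper).

    Splits `weight` into column shards (one per simulated device),
    computes each partial output independently, then concatenates the
    results — the concatenation stands in for the all-gather that a
    real multi-GPU implementation would perform.
    """
    shards = np.split(weight, num_partitions, axis=1)  # one shard per "device"
    partial_outputs = [x @ w for w in shards]          # independent matmuls
    return np.concatenate(partial_outputs, axis=1)     # gather the slices

# The sharded computation matches the single-device result.
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))    # batch of activations
w = rng.standard_normal((8, 16))   # full weight matrix
assert np.allclose(column_parallel_linear(x, w, 4), x @ w)
```

Because each shard's matmul touches only its own columns of the weight, no single device ever needs to hold the full matrix, which is how models larger than one GPU's memory become trainable.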
AI Dev Skills
Unmapped
Tags
Taxonomy
Deployment Context
Modalities
Skill Areas
Recent Activity
Updated 27 days ago
- 7 days: 0 commits
- 30 days: 0 commits
- 90 days: 0 commits
Quality
- Quality: high
- Maturity: research
Categories
PM Skills
Languages
Timeline
- Project created: Mar 21, 2019
- Forked: Mar 14, 2026
- Your last push: 27 days ago
- Upstream last push: 6 days ago
- Tracked since: Mar 17, 2026
Similar Repos
Computed via pgvector cosine similarity.