NVIDIA/Megatron-LM
Ongoing research training transformer models at scale
Builder
NVIDIA
NVIDIA • big-tech
Stars
16,507
Using upstream star count
Forks
4,013
Using upstream fork count
Open Issues
0
Activity Score
0/100
0 commits in 30d
Created
Mar 21, 2019
Project creation date
Megatron-LM and Megatron Core
=============================
Updated 2 months ago
7 Days
0
30 Days
0
90 Days
20
[Megatron-FSDP] Support 'auto' argument which defaults to pre-MixedPrecisionPolicy be… (#3810)
Cory Ye • Mar 17, 2026
Use fp32 state dtypes for Mamba inference functional test (#3888)
Keshav Santhanam • Mar 16, 2026
Fix quantize.py script and support packed sequences in pretrain_gpt.py (#3564)
Asha Anoosheh • Mar 16, 2026