NVIDIA/Megatron-LM

Megatron-LM

Ongoing research training transformer models at scale

Builder

NVIDIA

NVIDIA • big-tech

Stars

16,507

Using upstream star count

Forks

4,013

Using upstream fork count

Open Issues

Activity Score

0/100

0 commits in 30d

Created

Mar 21, 2019

Project creation date

README Summary

Megatron-LM and Megatron Core

AI Dev Skills

Unmapped

Taxonomy

AI Trends

Large Language Models, Foundation Models, Scaling Laws, Distributed AI Training

Recent Activity

Updated 2 months ago

[Megatron-FSDP] Support 'auto' argument which defaults to pre-MixedPrecisionPolicy be… (#3810)

Cory Ye • Mar 17, 2026

ff70b24

Use fp32 state dtypes for Mamba inference functional test (#3888)

Keshav Santhanam • Mar 16, 2026

72b10a8

Fix quantize.py script and support packed sequences in pretrain_gpt.py (#3564)

Asha Anoosheh • Mar 16, 2026

f89744b

Quality

Quality: high
Maturity: research

PM Skills

Cost & Efficiency, User Experience, Data & Evaluation

Languages

Python 100.0%

Timeline

Project created: Mar 21, 2019
Forked: Mar 14, 2026
Your last push: 2 months ago
Upstream last push: 16 days ago
Tracked since: Mar 17, 2026

Similar Repos

pgvector cosine similarity · $0
