NVIDIA/TransformerEngine
A library for accelerating Transformer models on NVIDIA GPUs, including support for 8-bit and 4-bit floating point (FP8 and FP4) precision on Hopper, Ada, and Blackwell GPUs, providing better performance with lower memory utilization in both training and inference.
Builder

NVIDIA • big-tech
Stars
3,253
Using upstream star count
Forks
684
Using upstream fork count
Open Issues
0
Activity Score
0/100
0 commits in 30d
Created
Sep 20, 2022
Project creation date
README Summary
NVIDIA TransformerEngine is a library designed to accelerate Transformer model training and inference on NVIDIA GPUs by leveraging reduced-precision formats such as FP8 and FP4. It specifically targets the Hopper, Ada, and Blackwell GPU architectures, delivering improved performance while reducing memory consumption in both training and inference workloads for Transformer-based models.
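To make the "8-bit floating point" idea concrete: the FP8 E4M3 format commonly used for such workloads packs 1 sign bit, 4 exponent bits, and 3 mantissa bits, giving a maximum finite value of 448 and very coarse spacing between representable numbers. The following is a minimal pure-Python sketch of rounding a value to its nearest E4M3-representable neighbor; it is an illustration of the numeric format, not TransformerEngine's API, and it assumes the standard E4M3 parameters (bias 7, minimum normal exponent -6, max finite 448).

```python
import math

def fp8_e4m3_round(x: float) -> float:
    """Round x to the nearest FP8 E4M3 value (1 sign, 4 exponent,
    3 mantissa bits, bias 7, max finite value 448)."""
    if x == 0.0:
        return 0.0
    sign = -1.0 if x < 0 else 1.0
    mag = min(abs(x), 448.0)                    # clamp to E4M3's largest finite value
    exp = max(math.floor(math.log2(mag)), -6)   # -6 is the minimum normal exponent
    step = 2.0 ** (exp - 3)                     # 3 mantissa bits -> spacing of 2^(exp - 3)
    return sign * round(mag / step) * step

# With only 3 mantissa bits, values in [1, 2) are spaced 0.125 apart:
print(fp8_e4m3_round(1.3))     # 1.25
print(fp8_e4m3_round(1000.0))  # 448.0 (clamped to the max finite value)
```

The clamp at 448 illustrates why FP8 training pipelines pair the format with per-tensor scaling: raw activations and gradients easily exceed this narrow range, so they are rescaled into it before quantization.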
AI Dev Skills
Unmapped
Tags
Taxonomy
Deployment Context
Modalities
Skill Areas
Recent Activity
Updated 27 days ago
7 Days
0
30 Days
0
90 Days
0
Quality
- Quality: high
- Maturity: production
Categories
PM Skills
Languages
Timeline
- Project created: Sep 20, 2022
- Forked: Mar 14, 2026
- Your last push: 27 days ago
- Upstream last push: 6 days ago
- Tracked since: Mar 17, 2026
Similar Repos