NVIDIA/FasterTransformer
FasterTransformer
Transformer-related optimization, including BERT and GPT
Builder

NVIDIA
NVIDIA • big-tech
Stars
6,410
Using upstream star count
Forks
934
Using upstream fork count
Open Issues
0
Activity Score
0/100
0 commits in 30d
Created
Apr 2, 2021
Project creation date
README Summary
FasterTransformer is NVIDIA's C++/CUDA library of highly optimized implementations of popular Transformer models, including the BERT and GPT architectures. It delivers significant inference speedups through custom and fused CUDA kernels, mixed-precision (FP16/INT8) execution, and efficient GPU memory management.
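For illustration only, here is a minimal sketch of the kind of optimization the summary describes: a fused bias-add + GELU CUDA kernel in FP16, which folds two elementwise steps that follow a GEMM into a single pass so the intermediate activation never takes an extra round trip through global memory. The kernel and launcher names below are hypothetical and are not FasterTransformer's actual API; they only illustrate the kernel-fusion and mixed-precision techniques the library applies.

#include <cuda_fp16.h>
#include <cuda_runtime.h>
#include <math.h>

// Hypothetical fused bias + GELU kernel in half precision (illustrative only).
// Each thread handles one element of a [batch * seq_len, hidden] activation.
__global__ void fused_bias_gelu_fp16(__half* out, const __half* in,
                                     const __half* bias, int hidden, int n)
{
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx >= n) return;

    // Accumulate in FP32 for accuracy, store back in FP16.
    float x = __half2float(in[idx]) + __half2float(bias[idx % hidden]);

    // tanh approximation of GELU.
    float cdf = 0.5f * (1.0f + tanhf(0.7978845608f * (x + 0.044715f * x * x * x)));
    out[idx] = __float2half(x * cdf);
}

// Hypothetical launcher: n = batch * seq_len * hidden elements.
void launch_fused_bias_gelu(__half* out, const __half* in, const __half* bias,
                            int hidden, int n, cudaStream_t stream)
{
    const int threads = 256;
    const int blocks  = (n + threads - 1) / threads;
    fused_bias_gelu_fp16<<<blocks, threads, 0, stream>>>(out, in, bias, hidden, n);
}

Fusing the bias add and activation this way is a standard GPU optimization: the combined kernel is memory-bound, so eliminating one full read and write of the activation tensor roughly halves its cost compared with running the two steps separately.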
AI Dev Skills
Unmapped
Transformer Architecture Optimization · CUDA Kernel Development · GPU Memory Management · Model Inference Acceleration · BERT Optimization · GPT Optimization · Deep Learning Systems Engineering · High-Performance Computing · Model Serving Infrastructure
Tags
Transformer Architecture Optimization · CUDA Kernel Development · GPU Memory Management · Model Inference Acceleration · BERT Optimization · GPT Optimization · Deep Learning Systems Engineering · High-Performance Computing · Model Serving Infrastructure · Latency-critical NLP Applications · Text · Production GPT Deployment · High-throughput Language Model Serving · Self-hosted · Efficient AI Infrastructure · Production AI Systems · Cloud API · Large-scale BERT Inference · Model Optimization · Real-time Text Generation · On-premise · C++
Taxonomy
Deployment Context: Self-hosted · On-premise · Cloud API
Modalities: Text
Skill Areas
Recent Activity
Updated 2 years ago
7 Days
0
30 Days
0
90 Days
0
Quality
- Quality: high
- Maturity: production
Categories
NLP & Text (Primary) · Other AI / ML · Dev Tools & Automation · Inference & Serving · ML Platform & Infrastructure · Foundation Models
PM Skills
Developer Platform
Languages
C++ 100.0%
Timeline
- Project created: Apr 2, 2021
- Forked: Mar 14, 2026
- Your last push: 2 years ago
- Upstream last push: 2 years ago
- Tracked since: Mar 27, 2024
Similar Repos
pgvector · cosine similarity