Library/FasterTransformerForked

NVIDIA/FasterTransformer

FasterTransformer

Transformer-related optimization, including BERT, GPT

Builder

NVIDIA • big-tech

Stars

6,410

Using upstream star count

Forks

934

Using upstream fork count

Open Issues

0

Activity Score

0/100

0 commits in 30d

Created

Apr 2, 2021

Project creation date

README Summary

FasterTransformer is NVIDIA's C++/CUDA library providing highly optimized implementations of popular Transformer models, including BERT and GPT architectures. It delivers significant inference speedups through CUDA kernel optimizations, mixed-precision execution, and efficient memory management.
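
The summary mentions CUDA-level optimizations such as kernel fusion and mixed precision. As a rough, hypothetical illustration (not code from the FasterTransformer repository), the CUDA sketch below fuses the bias addition and GELU activation of a transformer feed-forward layer into a single kernel, avoiding an extra round trip to global memory; the kernel name, tensor shapes, and launch parameters are assumptions made for this example.

// Hypothetical sketch, not FasterTransformer source: fused add-bias + GELU kernel.
#include <cuda_runtime.h>
#include <cmath>
#include <cstdio>
#include <vector>

__device__ float gelu(float x) {
    // tanh approximation of GELU, as commonly used in BERT/GPT implementations
    const float c = 0.7978845608028654f;  // sqrt(2/pi)
    return 0.5f * x * (1.0f + tanhf(c * (x + 0.044715f * x * x * x)));
}

// One thread per element: add the per-column bias, then apply GELU in place.
// Fusing both steps avoids materializing the intermediate tensor in global memory.
__global__ void add_bias_gelu(float* out, const float* bias, int rows, int cols) {
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx < rows * cols) {
        out[idx] = gelu(out[idx] + bias[idx % cols]);
    }
}

int main() {
    const int rows = 4, cols = 8, n = rows * cols;
    std::vector<float> h_out(n, 1.0f), h_bias(cols, 0.5f);

    float *d_out, *d_bias;
    cudaMalloc(&d_out, n * sizeof(float));
    cudaMalloc(&d_bias, cols * sizeof(float));
    cudaMemcpy(d_out, h_out.data(), n * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemcpy(d_bias, h_bias.data(), cols * sizeof(float), cudaMemcpyHostToDevice);

    add_bias_gelu<<<(n + 255) / 256, 256>>>(d_out, d_bias, rows, cols);
    cudaMemcpy(h_out.data(), d_out, n * sizeof(float), cudaMemcpyDeviceToHost);

    printf("gelu(1.0 + 0.5) = %f\n", h_out[0]);
    cudaFree(d_out);
    cudaFree(d_bias);
    return 0;
}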

AI Dev Skills

Unmapped

Transformer Architecture Optimization · CUDA Kernel Development · GPU Memory Management · Model Inference Acceleration · BERT Optimization · GPT Optimization · Deep Learning Systems Engineering · High-Performance Computing · Model Serving Infrastructure

Tags

Transformer Architecture Optimization · CUDA Kernel Development · GPU Memory Management · Model Inference Acceleration · BERT Optimization · GPT Optimization · Deep Learning Systems Engineering · High-Performance Computing · Model Serving Infrastructure · Latency-critical NLP Applications · Text · Production GPT Deployment · High-throughput Language Model Serving · Self-hosted · Efficient AI Infrastructure · Production AI Systems · Cloud API · Large-scale BERT Inference · Model Optimization · Real-time Text Generation · On-premise · C++

Taxonomy

Recent Activity

Updated 2 years ago

7 Days

0

30 Days

0

90 Days

0

Quality

Quality
high
Maturity
production

Categories

NLP & Text (Primary) · Other AI / ML · Dev Tools & Automation · Inference & Serving · ML Platform & Infrastructure · Foundation Models

PM Skills

Developer Platform

Languages

C++ 100.0%

Timeline

Project created
Apr 2, 2021
Forked
Mar 14, 2026
Your last push
2 years ago
Upstream last push
2 years ago
Tracked since
Mar 27, 2024
