Library/flash-attention (Forked)

Dao-AILab/flash-attention


Fast and memory-efficient exact attention

Builder

Dao-AILab • individual

Stars

23,111

Using upstream star count

Forks

2,577

Using upstream fork count

Open Issues

0

Activity Score

0/100

0 commits in 30d

Created

May 19, 2022

Project creation date

README Summary

FlashAttention is a fast and memory-efficient implementation of exact attention that reduces memory usage from quadratic to linear in sequence length. It achieves 2-4x speedup over standard attention while being mathematically equivalent, making it ideal for training long sequence transformers. The repository provides CUDA kernels and Python interfaces for easy integration into existing PyTorch models.
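The linear-memory claim comes from FlashAttention's online-softmax trick: instead of materializing the full L×L score matrix, it streams K/V in blocks and keeps only O(L) running statistics, while producing output mathematically equivalent to standard attention. A minimal NumPy sketch of that idea (not the library's actual CUDA kernels; function names and block size are illustrative):

```python
import numpy as np

def naive_attention(q, k, v):
    # Standard attention: materializes the full (L_q, L_k) score matrix,
    # so memory grows quadratically with sequence length.
    scores = q @ k.T / np.sqrt(q.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

def blockwise_attention(q, k, v, block=16):
    # Online-softmax sketch in the spirit of FlashAttention: K/V are
    # processed in blocks, keeping only per-row running statistics
    # (max m, normalizer l, unnormalized output acc) instead of the
    # full score matrix.
    L, d = q.shape
    scale = 1.0 / np.sqrt(d)
    m = np.full(L, -np.inf)                    # running row maxima
    l = np.zeros(L)                            # running softmax normalizers
    acc = np.zeros_like(q, dtype=np.float64)   # unnormalized output
    for start in range(0, k.shape[0], block):
        kb, vb = k[start:start + block], v[start:start + block]
        s = q @ kb.T * scale                   # scores for this block only
        m_new = np.maximum(m, s.max(axis=-1))
        correction = np.exp(m - m_new)         # rescale previous statistics
        p = np.exp(s - m_new[:, None])
        l = l * correction + p.sum(axis=-1)
        acc = acc * correction[:, None] + p @ vb
        m = m_new
    return acc / l[:, None]
```

The two functions produce the same output up to floating-point error, which is what "exact attention" means here: the blockwise version changes the memory footprint, not the math. The real library exposes this through CUDA kernels behind Python entry points such as `flash_attn.flash_attn_func`.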

AI Dev Skills

Unmapped

Transformer Architecture Optimization, Attention Mechanism Design, GPU Kernel Development, CUDA Programming, Memory-Efficient Deep Learning, Computational Complexity Analysis, High-Performance Computing, Algorithm Optimization, Large Language Model Training, Numerical Stability in Neural Networks, Transformer Architecture, Attention Mechanism Optimization, GPU Kernel Programming, CUDA/GPU Acceleration, Mixed Precision Training, Attention Mechanisms, GPU Optimization, Memory Efficiency in Deep Learning, Numerical Stability, Low-level Performance Tuning

Tags

Transformer Architecture Optimization, Attention Mechanism Design, GPU Kernel Development, CUDA Programming, Memory-Efficient Deep Learning, Computational Complexity Analysis, High-Performance Computing, Algorithm Optimization, Large Language Model Training, Numerical Stability in Neural Networks, Transformer Architecture, Attention Mechanism Optimization, GPU Kernel Programming, CUDA/GPU Acceleration, Mixed Precision Training, Attention Mechanisms, GPU Optimization, Memory Efficiency in Deep Learning, Numerical Stability, Low-level Performance Tuning

Recent Activity

Updated 18 days ago

7 Days

0

30 Days

0

90 Days

0

Quality

Quality
high
Maturity
production

Categories

Dev Tools & Automation (Primary), Foundation Models, Model Training, Inference & Serving, Other AI / ML

PM Skills

Developer Platform

Languages

Python 100.0%

Timeline

Project created
May 19, 2022
Forked
Mar 28, 2026
Your last push
18 days ago
Upstream last push
7 days ago
Tracked since
Mar 26, 2026

Similar Repos

pgvector cosine similarity
