xlite-dev/LeetCUDA

leetcuda

📚LeetCUDA: Modern CUDA Learn Notes with PyTorch for Beginners🐑, 200+ CUDA Kernels, Tensor Cores, HGEMM, FA-2 MMA.🎉

Builder

xlite-dev • individual

Stars

10,096

Using upstream star count

Forks

1,024

Using upstream fork count

Open Issues

0

Activity Score

0/100

0 commits in 30d

Created

Dec 17, 2022

Project creation date

README Summary

LeetCUDA is a collection of CUDA learning notes for beginners, featuring over 200 CUDA kernels and integration with PyTorch. The repository covers modern CUDA topics including Tensor Cores, half-precision GEMM (HGEMM), and FlashAttention-2 (FA-2) built on MMA (matrix multiply-accumulate) operations. It serves as both educational material and a set of practical GPU programming examples for machine learning.

AI Dev Skills

Unmapped

CUDA Programming, GPU Computing, Tensor Core Optimization, Half-Precision Matrix Multiplication, FlashAttention Implementation, PyTorch CUDA Integration, Parallel Computing, Deep Learning Optimization, Memory Management, Kernel Development

Tags

CUDA Programming, GPU Computing, Tensor Core Optimization, Half-Precision Matrix Multiplication, FlashAttention Implementation, PyTorch CUDA Integration, Parallel Computing, Deep Learning Optimization, Memory Management, Kernel Development, Deep Learning Acceleration, Memory-Efficient Attention Mechanisms, CUDA Programming Education, PyTorch Custom Operations, PyTorch CUDA Extensions, Self-hosted, Hardware Acceleration, On-premise, GPU Kernel Development, Efficient Transformers, Custom CUDA Kernels, Low-level GPU Optimization, Tensor Core Programming, Attention Mechanism Implementation, Flash Attention Implementation, Developer Tools, Tensor, Education, Cuda

Recent Activity

Updated 21 days ago

7 Days

0

30 Days

0

90 Days

0

Quality

Quality: medium
Maturity: research

Categories

Dev Tools & Automation (Primary), Inference & Serving, Other AI / ML, Foundation Models, Model Training

PM Skills

Developer Platform

Languages

Cuda 100.0%

Timeline

Project created: Dec 17, 2022
Forked: Mar 23, 2026
Your last push: 21 days ago
Upstream last push: 6 days ago
Tracked since: Mar 23, 2026
