Reporium

deepseek-ai/DeepGEMM

DeepGEMM

DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling

Builder

DeepSeek

deepseek-ai • ai-lab

Stars: 7,342 (using upstream star count)

Forks: 1,025 (using upstream fork count)

Open Issues: 0

Activity Score: 0/100 (0 commits in 30d)

Created: —

README Summary

DeepGEMM is a unified, high-performance tensor core kernel library that brings together the key computation primitives of modern large language models — GEMMs (FP8, FP4, BF16), fused MoE with overlapped communication (Mega MoE), MQA scoring for the lightning indexer, HyperConnection (HC), and more — into a single, cohesive CUDA codebase. All kernels are compiled at runtime via a lightweight Just-In-Time (JIT) module, requiring no CUDA compilation during installation.
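To make "fine-grained scaling" concrete, here is a minimal pure-Python sketch of the idea: each small block along the K dimension gets its own scale factor before a low-precision multiply. This is an illustration only, not DeepGEMM's API or kernel design. The function names (`quantize_blocks`, `scaled_dot`, `gemm_nt_sketch`), the block size of 4, and the integer grid of ±127 used as a stand-in for an FP8 format are all assumptions made for the example.

```python
def quantize_blocks(row, block=4, qmax=127):
    """Split a row into blocks along K and give each block its own scale.

    Hypothetical helper: an integer grid in [-qmax, qmax] stands in for FP8.
    """
    q_blocks, scales = [], []
    for start in range(0, len(row), block):
        chunk = row[start:start + block]
        amax = max(abs(v) for v in chunk)
        scale = amax / qmax if amax > 0 else 1.0  # avoid dividing by zero
        scales.append(scale)
        q_blocks.append([round(v / scale) for v in chunk])
    return q_blocks, scales


def scaled_dot(qa, sa, qb, sb):
    """Accumulate each block on the integer grid, then apply both block scales."""
    total = 0.0
    for blk_a, s_a, blk_b, s_b in zip(qa, sa, qb, sb):
        partial = sum(x * y for x, y in zip(blk_a, blk_b))
        total += partial * s_a * s_b
    return total


def gemm_nt_sketch(A, B, block=4):
    """Compute C = A @ B^T, where A is M x K and B is N x K (both rows run along K)."""
    qa = [quantize_blocks(r, block) for r in A]
    qb = [quantize_blocks(r, block) for r in B]
    return [[scaled_dot(a[0], a[1], b[0], b[1]) for b in qb] for a in qa]


# One row whose two K-blocks differ in magnitude by 100x; the exact result is 20.0.
A = [[1.0, 2.0, 3.0, 4.0, 100.0, 200.0, 300.0, 400.0]]
B = [[1.0, 1.0, 1.0, 1.0, 0.01, 0.01, 0.01, 0.01]]

fine = gemm_nt_sketch(A, B, block=4)[0][0]    # one scale per 4-element block
coarse = gemm_nt_sketch(A, B, block=8)[0][0]  # one scale for the whole row
```

With per-block scales, the small-magnitude block keeps its resolution and `fine` lands close to 20.0. With a single scale per row (`block=8`), the small values collapse onto a few grid points and `coarse` drifts noticeably further from the exact result.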

AI Dev Skills

No AI dev skills recorded.

Tags

AI Safety, Active, Benchmarking, C++, DeepSeek, Forked, GPU / CUDA, PyTorch, Python

Taxonomy

category: Model Training, Evals & Benchmarking, Inference & Serving, Security & Safety

tag: AI Safety, Active, Benchmarking, C++, DeepSeek, Forked, GPU / CUDA, PyTorch, Python

Recent Activity

Updated 1 month ago

7 Days: 0

30 Days: 0

90 Days: 2

[Public release 26/04] Introducing Mega MoE, FP4 Indexer and other features/fixes (#304)

Chenggang Zhao • Apr 17, 2026

7f2a703

Fix sync issue of TMEM alloc/dealloc (#292)

Ray Wang • Mar 22, 2026

d30fc36

fix: k_grouped_fp8_gemm_nt_contiguous crashes with n = 768 on H100 (#238)

Xin Qiu • Feb 25, 2026

35c4bc8

Quality

Quality signals are not available for this repo yet.

Categories

Evals & Benchmarking (Primary), Inference & Serving, Security & Safety, Model Training, Safety & Alignment, Other AI / ML

PM Skills

Safety & Alignment, Data & Evaluation

Languages

Cuda 100.0%

Timeline

Project created: —
Forked: Apr 21, 2026
Your last push: 1 month ago
Upstream last push: 25 days ago
Tracked since: Apr 17, 2026
