Library/megablocksForked

Forked from databricks/megablocks

MegaBlocks is a library for efficient training of large-scale sparse transformer models using mixture-of-experts (MoE) architectures.

Builder

databricks • individual

Stars

1,552

Using upstream star count

Forks

225

Using upstream fork count

Open Issues

0

Activity Score

0/100

0 commits in 30d

Created

Jan 26, 2023

Project creation date

README Summary

MegaBlocks is a library for efficient training of large-scale sparse transformer models using mixture-of-experts (MoE) architectures. It provides optimized CUDA kernels and distributed-training support so that sparse models with billions of parameters can be trained efficiently.
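
As a minimal sketch of the workflow the summary describes, the snippet below builds a standalone dropless-MoE (dMoE) layer. The module paths (megablocks.layers.dmoe, megablocks.layers.arguments) and the Arguments fields follow the upstream README from memory; exact names, defaults, input layout, and return values may differ across versions, so treat this as an illustration rather than a verified recipe.

# Hedged sketch: a standalone dropless-MoE layer with MegaBlocks.
# Paths and field names follow the upstream README; verify against your
# installed version before relying on them.
import torch
from megablocks.layers.arguments import Arguments
from megablocks.layers.dmoe import dMoE

args = Arguments(
    hidden_size=1024,        # transformer hidden dimension
    ffn_hidden_size=4096,    # per-expert feed-forward width
    moe_num_experts=8,       # number of experts in the layer
    moe_top_k=2,             # experts each token is routed to
)

# The block-sparse expert kernels target CUDA GPUs in half precision.
layer = dMoE(args).cuda().half()

# Assumed input layout: (sequence, batch, hidden), per Megatron convention.
x = torch.randn(512, 1, 1024, device="cuda", dtype=torch.float16)
out = layer(x)  # some versions return (output, bias); unpack accordingly

The "dropless" formulation is the project's headline idea: rather than dropping tokens that overflow a fixed expert capacity, MoE computation is recast as block-sparse matrix products, so no capacity factor needs tuning.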

AI Dev Skills

Unmapped

Mixture of Experts Architecture, Sparse Neural Networks, CUDA Kernel Optimization, Large Language Model Training, Transformer Architecture, Distributed Training, Memory Optimization, GPU Computing

Tags

Mixture of Experts Architecture, Sparse Neural Networks, CUDA Kernel Optimization, Large Language Model Training, Transformer Architecture, Distributed Training, Memory Optimization, GPU Computing, On-premise, Large Language Models, Cloud API, Self-hosted, Multi-billion Parameter Model Scaling, Model Efficiency, Sparse Model Deployment, Sparse Computing, Foundation Models, Text, Efficient Transformer Inference, Python

Recent Activity

Updated 9 months ago

7 Days

0

30 Days

0

90 Days

0

Quality

Quality
high
Maturity
beta

Categories

Inference & Serving (Primary), Other AI / ML, Foundation Models, Model Training

PM Skills

Scale & Reliability

Languages

Python 100.0%

Timeline

Project created
Jan 26, 2023
Forked
Mar 16, 2026
Your last push
9 months ago
Upstream last push
20 days ago
Tracked since
Jun 26, 2025
