Reporium

databricks/megablocks

megablocks

Builder

databricks • individual

Stars

1,566

Using upstream star count

Forks

228

Using upstream fork count

Open Issues

0

Activity Score

0/100

0 commits in 30d

Created

Jan 26, 2023

Project creation date

README Summary

MegaBlocks is a light-weight library for mixture-of-experts (MoE) training. The core of the system is efficient "dropless-MoE" ([dMoE](megablocks/layers/dmoe.py), [paper](https://arxiv.org/abs/2211.15841)) and standard [MoE](megablocks/layers/moe.py) layers.

AI Dev Skills

Unmapped

CUDA Kernel Optimization, Distributed Training, GPU Computing, Large Language Model Training, Memory Optimization, Mixture of Experts Architecture, Sparse Neural Networks, Transformer Architecture

Tags

CUDA Kernel Optimization, Distributed Training, GPU Computing, Large Language Model Training, Memory Optimization, Mixture of Experts Architecture, Sparse Neural Networks, Transformer Architecture, Docker, Forked, LLM Serving, Machine Learning, Mistral, NumPy, PyTorch, Research / Papers, Transformers, vLLM

Taxonomy

AI Trends

Large Language Models, Model Efficiency, Sparse Computing, Foundation Models

category

Inference & Serving, Foundation Models, Model Training, MLOps & Infrastructure, Learning Resources, Data Science & Analytics

Deployment Context

Cloud API, Self-hosted, On-premise

Modalities

Text

Skill Areas

Mixture of Experts Architecture, Sparse Neural Networks, CUDA Kernel Optimization, Large Language Model Training, Transformer Architecture, Distributed Training, Memory Optimization, GPU Computing

tag

Docker, Forked, LLM Serving, Machine Learning, Mistral, NumPy, PyTorch, Research / Papers, Transformers, vLLM

Use Cases

Large Language Model Training, Efficient Transformer Inference, Sparse Model Deployment, Multi-billion Parameter Model Scaling

Recent Activity

Updated 11 months ago

7 Days

0

30 Days

0

90 Days

0

Quality

Quality: high
Maturity: beta

Categories

Inference & Serving (Primary), MLOps & Infrastructure, Learning Resources, Data Science & Analytics, Foundation Models, Model Training, Search & Knowledge, Other AI / ML

PM Skills

Cost & Efficiency, Scale & Reliability

Languages

Python 100.0%

Timeline

Project created: Jan 26, 2023
Forked: Mar 16, 2026
Your last push: 11 months ago
Upstream last push: 2 months ago
Tracked since: Jun 26, 2025

Similar Repos

pgvector cosine similarity · $0
