Reporium

mlc-ai/mlc-llm

mlc-llm

Universal LLM Deployment Engine with ML Compilation

View on GitHub ↗ • Upstream: mlc-ai/mlc-llm ↗

Builder

mlc-ai • individual

Stars

22,728

Using upstream star count

Forks

2,058

Using upstream fork count

Open Issues

0

Activity Score

0/100

0 commits in 30d

Created

Apr 29, 2023

Project creation date

README Summary

[![Installation](https://img.shields.io/badge/docs-latest-green)](https://llm.mlc.ai/docs/) [![License](https://img.shields.io/badge/license-apache_2-blue)](https://github.com/mlc-ai/mlc-llm/blob/main/LICENSE) [![Join Discord](https://img.shields.io/badge/Join-Discord-7289DA?logo=discord&logoColor=white)](https://discord.gg/9Xpy2HGBuD) [![Related Repository: WebLLM](https://img.shields.io/badge/Related_Repo-WebLLM-fafbfc?logo=github)](https://github.com/mlc-ai/web-llm/)

AI Dev Skills

Unmapped

Cross-Platform Model Inference • GPU Programming and Optimization • Hardware-Specific Acceleration • Large Language Model Deployment • ML Compilation and Optimization • Mobile AI Deployment • Quantization Techniques • Runtime Optimization • WebAssembly Integration

Tags

Cross-Platform Model Inference • GPU Programming and Optimization • Hardware-Specific Acceleration • Large Language Model Deployment • ML Compilation and Optimization • Mobile AI Deployment • Quantization Techniques • Runtime Optimization • WebAssembly Integration • Deep Learning • Forked • GPU / CUDA • Inference • JavaScript • Large Language Models • Machine Learning • Mobile • OpenAI • Python

Taxonomy

AI Trends

On-device AI • Edge AI • Model Optimization • Cross-platform AI

Category

Foundation Models • Inference & Serving

Deployment Context

Edge/Mobile • Browser/WASM • Self-hosted • Consumer Hardware

Modalities

Text

Skill Areas

Large Language Model Deployment • ML Compilation and Optimization • Cross-Platform Model Inference • GPU Programming and Optimization • WebAssembly Integration • Mobile AI Deployment • Quantization Techniques • Runtime Optimization • Hardware-Specific Acceleration

Tag

Deep Learning • Forked • GPU / CUDA • Inference • JavaScript • Large Language Models • Machine Learning • Mobile • OpenAI • Python

Use Cases

Cross-platform LLM deployment • Mobile AI application development • Web-based language model inference • Edge computing with LLMs • Consumer GPU optimization for AI models • Multi-backend model serving

Recent Activity

Updated 2 months ago

7 Days

0

30 Days

0

90 Days

6

[FIX] Rename T.alloc_buffer to T.sblock_alloc_buffer (#3457)

Ruihang Lai • Mar 18, 2026

20d7fb3

[Bug] Replace alloc_buffer with sblock_alloc_buffer and temporarily bypass CSE (#3454)

Akaash Parthasarathy • Mar 18, 2026

2eb1f12

Feature/embedding/metadata abstraction (#3452)

Xijing Wang • Mar 17, 2026

39c5716

Quality

Quality: high
Maturity: beta

Categories

Inference & Serving (Primary) • Other AI / ML • Foundation Models • Edge & Mobile AI

PM Skills

Cost & Efficiency

Languages

Python 100.0%

Timeline

Project created: Apr 29, 2023
Forked: Mar 22, 2026
Your last push: 2 months ago
Upstream last push: 22 days ago
Tracked since: Mar 18, 2026
