Reporium

jingyaogong/minimind

minimind

🚀🚀 "Large language model": train a small 26M-parameter GPT completely from scratch in just 2 hours! 🌏

Upstream: jingyaogong/minimind

Builder

jingyaogong • individual

Stars

50,817

Using upstream star count

Forks

6,494

Using upstream fork count

Open Issues

0

Activity Score

0/100

0 commits in 30d

Created

Jul 27, 2024

Project creation date

README Summary

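* This open-source project aims to train **MiniMind**, a tiny language model of only 25.8M, completely from scratch for a cost of just 3 RMB and 2 hours of training.
* The **MiniMind** series is extremely lightweight: the smallest version is $\frac{1}{7000}$ the size of GPT-3, with the goal that even an ordinary personal GPU can train it quickly.
* The project also open-sources a minimal LLM structure, with code for the whole pipeline: an extended shared mixture-of-experts (MoE) design, dataset cleaning, pretraining (Pretrain), supervised fine-tuning (SFT), LoRA fine-tuning, direct preference optimization (DPO), reinforcement learning training (RLAIF: PPO/GRPO, etc.), and model distillation.
* **MiniMind** has also been extended to a vision multimodal VLM: [MiniMind-V](https://github.com/jingyaogong/minimind-v).
* All core algorithm code is reimplemented from scratch in native PyTorch and does not depend on abstract interfaces provided by third-party libraries.
* This is both a full-stage open-source reproduction of a large language model and a tutorial for getting started with LLMs.
* The authors hope the project serves as a starting example for everyone, so that people can share the joy of creating and help advance the wider AI community.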
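
The size comparison with GPT-3 in the summary can be checked with a quick back-of-the-envelope calculation. The sketch below assumes that "25.8M" is a parameter count and that "GPT-3" means the 175B-parameter model; the summary itself states neither, and the constant and function names are illustrative only.

```python
# Rough size comparison between MiniMind's smallest model and GPT-3.
# Assumptions (not stated in the summary): 25.8M is a parameter count,
# and GPT-3 refers to the 175B-parameter model.
MINIMIND_PARAMS = 25.8e6
GPT3_PARAMS = 175e9


def size_ratio(small: float, large: float) -> float:
    """Return how many times larger `large` is than `small`."""
    return large / small


ratio = size_ratio(MINIMIND_PARAMS, GPT3_PARAMS)
# Round to the nearest thousand to compare with the quoted 1/7000 figure.
rounded = round(ratio, -3)
print(f"GPT-3 is about {ratio:,.0f}x larger (roughly 1/{rounded:,.0f})")
```

Under these assumptions the ratio comes out a little below 7000, which is consistent with the rounded $\frac{1}{7000}$ figure quoted in the summary.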


AI Dev Skills

Unmapped

Attention Mechanisms · Deep Learning Training Loops · GPT Implementation · Language Model Training · Neural Network Optimization · Small Language Models · Tokenization · Transformer Architecture

Tags

Attention Mechanisms · Deep Learning Training Loops · GPT Implementation · Language Model Training · Neural Network Optimization · Small Language Models · Tokenization · Transformer Architecture · AI Agents · Benchmarking · C++ · DPO · Data Science · Data Visualization · DeepSeek · DeepSpeed · Distillation · Embeddings · Evals · Forked · GPT · GPU / CUDA · GRPO · Healthcare AI · HuggingFace · KV Cache · LLM Serving · LM Eval Harness · Large Language Models · Llama · LoRA / PEFT · MMLU · Mistral · NumPy · Ollama · Open WebUI · OpenAI · Pandas · Planning / CoT · Privacy · Privacy-Preserving AI · Prompt Engineering · PyTorch · Python · Quantization · Qwen · RLHF · Reasoning Models · Reinforcement Learning · Research / Papers · TRL · Tool Use · Transformers · Visualization · Weights & Biases · llama.cpp · vLLM

Taxonomy

AI Trends

Small Language Models · Efficient Training · Educational AI · Lightweight Models

Category

Foundation Models · AI Agents · RAG & Retrieval · Model Training · Evals & Benchmarking · Observability & Monitoring · Inference & Serving · Learning Resources · Industry: Healthcare · Security & Safety · Data Science & Analytics

Deployment Context

Self-hosted · Local Training

Industries

Education · Research

Modalities

Text

Skill Areas

Transformer Architecture · Language Model Training · GPT Implementation · Neural Network Optimization · Deep Learning Training Loops · Tokenization · Attention Mechanisms · Small Language Models


Use Cases

Educational Language Model Training · Proof-of-Concept Text Generation · Research Experimentation · Learning Transformer Implementation · Quick Prototyping of Language Models

Recent Activity

Updated 2 months ago

7 Days

0

30 Days

0

90 Days

0

[update] empty_think_ratio

jingyaogong • Feb 6, 2026

349e74e

[update] empty_think_ratio

jingyaogong • Feb 5, 2026

288e1ac

[feat] data process

jingyaogong • Feb 5, 2026

ccc190d

Quality

Quality: medium
Maturity: research

Categories

RAG & Retrieval (Primary) · Evals & Benchmarking · Observability & Monitoring · Inference & Serving · Learning Resources · Industry: Healthcare · Security & Safety · Data Science & Analytics · Foundation Models · AI Agents · Model Training · Safety & Alignment · Healthcare & Biology · Search & Knowledge · Other AI / ML

PM Skills

Cost & Efficiency · Safety & Alignment · Data & Evaluation · Product Discovery · Developer Platform · AI-Native Architecture

Languages

Python 100.0%

Timeline

Project created: Jul 27, 2024
Forked: Mar 23, 2026
Your last push: 2 months ago
Upstream last push: 18 days ago
Tracked since: Mar 23, 2026

Similar Repos

pgvector cosine similarity · $0
