CelestialCreator/mtp-lm

mtp-lm

Source code accompanying the research paper on training multi-token prediction language models using self-distillation.

Upstream: CelestialCreator/mtp-lm

Builder

CelestialCreator • individual

Stars

Using upstream star count

Forks

Using upstream fork count

Open Issues

Activity Score

0/100

0 commits in 30d

Created

Mar 5, 2026

README Summary

This repository contains the code for the arXiv preprint: [[2602.06019] Multi-Token Prediction via Self-Distillation](https://arxiv.org/abs/2602.06019)

AI Dev Skills

Unmapped

Deep Learning Research, Knowledge Distillation, Language Model Training, Large Language Model Development, Multi-Token Prediction, Neural Network Optimization, Self-Distillation, Transformer Architecture

Taxonomy

AI Trends

Model Efficiency, Knowledge Distillation, Language Model Innovation, Research Reproducibility

Recent Activity

Updated 3 months ago

Reproduce MTP self-distillation on single RTX 5090 (Llama-3.2-1B)

Akshay Mhaskar • Mar 5, 2026

13cb9d1

initial public release

jwkirchenbauer • Feb 21, 2026

167413e

Quality

research

Quality: medium
Maturity: research

PM Skills

Cost & Efficiency, Data & Evaluation, Product Discovery, Developer Platform, AI-Native Architecture

Languages

Python 100.0%

Timeline

Project created: Mar 5, 2026
Forked: Mar 12, 2026
Your last push: 3 months ago
Upstream last push: 3 months ago
Tracked since: Mar 5, 2026


