vllm-project/vllm
vllm
A high-throughput and memory-efficient inference and serving engine for LLMs

vLLM
vllm-project • startup
Stars: 75,076 (using upstream star count)
Forks: 15,115 (using upstream fork count)
Open Issues: 0
Activity Score: 0/100 (1,038 commits in 30 days)
Created: Feb 9, 2023
README Summary
vLLM is a fast, memory-efficient inference and serving engine for large language models (LLMs). It delivers high-throughput serving through continuous batching, optimized CUDA kernels, and support for popular model families such as Llama and GPT. It offers both an offline inference API and an OpenAI-compatible online serving endpoint, making LLM applications easy to deploy and scale, and it maximizes GPU utilization while minimizing memory usage through techniques such as PagedAttention.
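The offline API mentioned above is compact; here is a minimal sketch, assuming the `vllm` package is installed and a supported GPU is available. `facebook/opt-125m` is just a small placeholder checkpoint; any supported model ID works.

```python
from vllm import LLM, SamplingParams

# Load the model; vLLM manages KV-cache paging (PagedAttention)
# and continuous batching internally.
llm = LLM(model="facebook/opt-125m")

# Decoding controls: temperature, nucleus sampling, output length.
sampling_params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=64)

# generate() accepts a batch of prompts and schedules them together.
prompts = [
    "Hello, my name is",
    "The capital of France is",
]
outputs = llm.generate(prompts, sampling_params)

for output in outputs:
    print(output.prompt, "->", output.outputs[0].text)
```

For online serving, the same engine is exposed as an OpenAI-compatible HTTP server (the `vllm serve <model>` CLI in recent releases), so existing OpenAI client code only needs its base URL pointed at the local endpoint.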
AI Dev Skills: Unmapped
Tags
Taxonomy
AI Trends
Deployment Context
Modalities
Skill Areas
Recent Activity
Updated 27 days ago
Commits in last 7 days: 194
Commits in last 30 days: 1,038
Commits in last 90 days: 3,592
Quality
- Quality: high
- Maturity: production
Categories
PM Skills
Languages
Timeline
- Project created: Feb 9, 2023
- Forked: Mar 13, 2026
- Your last push: 27 days ago
- Upstream last push: 6 days ago
- Tracked since: Mar 17, 2026
Similar Repos