Reporium
GraphWikiTaxonomyStacksInsightsTrendsArchitectureAI-NativeFAQ
Ask anything about the repo library…
Loading repo…
←Library/OpenLLM
Library/OpenLLMForked

bentoml/OpenLLM

OpenLLM

Run any open-source LLMs, such as DeepSeek and Llama, as OpenAI compatible API endpoint in the cloud.

View on GitHub↗Upstream bentoml/OpenLLM↗

Builder

bentoml

bentoml

bentoml • individual

Stars

12,337

Using upstream star count

Forks

811

Using upstream fork count

Open Issues

0

Activity Score

0/100

0 commits in 30d

Created

Apr 19, 2023

Project creation date

README Summary

[![License: Apache-2.0](https://img.shields.io/badge/License-Apache%202-green.svg)](https://github.com/bentoml/OpenLLM/blob/main/LICENSE) [![Releases](https://img.shields.io/pypi/v/openllm.svg?logo=pypi&label=PyPI&logoColor=gold)](https://pypi.org/project/openllm) [![CI](https://results.pre-commit.ci/badge/github/bentoml/OpenLLM/main.svg)](https://results.pre-commit.ci/latest/github/bentoml/OpenLLM/main) [![X](https://badgen.net/badge/icon/@bentomlai/000000?icon=twitter&label=Follow)](https://tw

Community Evaluation

Loading…

AI Dev Skills

Unmapped

API Gateway DesignCloud Computing ArchitectureContainerizationDistributed SystemsLarge Language Model DeploymentModel Inference OptimizationModel Serving InfrastructureREST API Development

Tags

API Gateway DesignCloud Computing ArchitectureContainerizationDistributed SystemsLarge Language Model DeploymentModel Inference OptimizationModel Serving InfrastructureREST API DevelopmentDeepSeekDockerForkedHuggingFaceKubernetesLLM ServingLarge Language ModelsLlamaLlamaIndexMLOpsMistralOllamaOpenAIPhiPythonQwenvLLM

Taxonomy

AI Trends

Open Source LLMsModel Serving InfrastructureAPI StandardizationCloud-native AIDemocratized AI Access

category

Foundation ModelsRAG & RetrievalInference & ServingMLOps & Infrastructure

Deployment Context

Cloud APISelf-hostedOn-premiseContainerized

Industries

Developer ToolsCloud ServicesAI Infrastructure

Modalities

Text

Skill Areas

Large Language Model DeploymentAPI Gateway DesignModel Serving InfrastructureCloud Computing ArchitectureContainerizationDistributed SystemsREST API DevelopmentModel Inference Optimization

tag

DeepSeekDockerForkedHuggingFaceKubernetesLLM ServingLarge Language ModelsLlamaLlamaIndexMLOpsMistralOllamaOpenAIPhiPythonQwenvLLM

Use Cases

LLM API HostingModel Serving as a ServiceOpenAI API Drop-in ReplacementMulti-model LLM DeploymentCloud-based Text GenerationScalable AI Inference

Recent Activity

Updated 2 months ago

7 Days

0

30 Days

0

90 Days

0

Quality

beta
Quality
medium
Maturity
beta

Categories

RAG & RetrievalPrimaryInference & ServingMLOps & InfrastructureFoundation ModelsML Platform & InfrastructureSearch & KnowledgeOther AI / ML

PM Skills

Cost & EfficiencyScale & Reliability

Languages

Python100.0%

Timeline

Project created
Apr 19, 2023
Forked
Mar 22, 2026
Your last push
2 months ago
Upstream last push
23 days ago
Tracked since
Mar 16, 2026

Similar Repos

pgvector cosine similarity · $0

Loading…