
bentoml/BentoML

BentoML

The easiest way to serve AI apps and models - Build Model Inference APIs, Job queues, LLM apps, Multi-model pipelines, and more!

Builder

bentoml • individual

Stars

8,557

Using upstream star count

Forks

945

Using upstream fork count

Open Issues

0

Activity Score

0/100

0 commits in 30d

Created

Apr 2, 2019

Project creation date

README Summary

BentoML is a unified AI application framework that simplifies building, shipping, and scaling AI applications in production. It provides tools for creating model inference APIs, job queues, LLM applications, and multi-model pipelines with support for various ML frameworks. The platform offers features like auto-scaling, observability, and deployment across different cloud environments.

AI Dev Skills

Unmapped

Model Serving Infrastructure, API Development for ML, Multi-model Pipeline Architecture, Large Language Model Deployment, Model Inference Optimization, Production ML Systems, Containerized ML Deployment, Distributed Model Serving

Tags

Model Serving Infrastructure, API Development for ML, Multi-model Pipeline Architecture, Large Language Model Deployment, Model Inference Optimization, Production ML Systems, Containerized ML Deployment, Distributed Model Serving, AI Application Containerization, Production AI Systems, Image, Large Language Model Integration, Auto-scaling Infrastructure, On-premise, API Development for ML Models, LLM Deployment, Model Inference API Development, Model Serving and Deployment, Tabular, Video, Production ML Operations, Job Queue Management, Text, Multi-Model Service Architecture, Model Packaging and Versioning, Self-hosted, Production Model Serving, Serverless, Container-based, Cloud API, LLM Application Deployment, Batch Job Processing, Multi-Model Pipeline Architecture, Compound AI Systems, Audio, Multimodal, MLOps, Python

Taxonomy

Recent Activity

Updated 28 days ago

7 Days

0

30 Days

0

90 Days

0

Quality

Quality: high
Maturity: production

Categories

MLOps & Infrastructure (Primary), Dev Tools & Automation, Inference & Serving, ML Platform & Infrastructure, Coding & Dev Tools, Multimodal AI, Other AI / ML, Generative Media, Foundation Models

PM Skills

Scale & Reliability, Developer Platform

Languages

Python 100.0%

Timeline

Project created: Apr 2, 2019
Forked: Mar 22, 2026
Your last push: 28 days ago
Upstream last push: 7 days ago
Tracked since: Mar 16, 2026
