Library/inference
Library/inferenceForked

xorbitsai/inference

inference

Swap GPT for any LLM by changing a single line of code. Xinference lets you run open-source, speech, and multimodal models on cloud, on-prem, or your laptop — all through one unified, production-ready inference API.

Builder

xorbitsai

xorbitsai

xorbitsai • individual

Stars

9,191

Using upstream star count

Forks

812

Using upstream fork count

Open Issues

0

Activity Score

0/100

0 commits in 30d

Created

Jun 14, 2023

Project creation date

README Summary

Xinference is a unified inference platform that allows developers to swap GPT models for any open-source LLM, speech, or multimodal model with just a single line of code change. It provides a production-ready API that can run models across different environments including cloud, on-premises, and local machines. The platform aims to simplify the process of experimenting with and deploying various AI models through a consistent interface.

AI Dev Skills

Unmapped

Large Language Model DeploymentModel Serving InfrastructureAPI Gateway DesignMulti-Model OrchestrationSpeech-to-Text IntegrationMultimodal AI SystemsDistributed Model InferenceProduction MLOps

Tags

Large Language Model DeploymentModel Serving InfrastructureAPI Gateway DesignMulti-Model OrchestrationSpeech-to-Text IntegrationMultimodal AI SystemsDistributed Model InferenceProduction MLOpsOpen Source LLMsModel Performance ComparisonDistributed AI SystemsOn-device AIMulti-Model ExperimentationProduction AI SystemsModel Abstraction LayersTextMultimodalLLM API StandardizationOn-premiseSpeech Recognition IntegrationModel Agnostic InfrastructureCloud APIUnified AI APIsProduction AI Model ServingAI Application DevelopmentAudioProduction ML OperationsMulti-Cloud AI DeploymentCross-Platform AI DeploymentEdge/MobileSelf-hostedPython

Taxonomy

Recent Activity

Updated 23 days ago

7 Days

0

30 Days

0

90 Days

0

Quality

production
Quality
high
Maturity
production

Categories

ML Platform & InfrastructurePrimaryMultimodal AIEdge & Mobile AIOther AI / MLDev Tools & AutomationLearning ResourcesInference & ServingGenerative MediaRoboticsFoundation Models

PM Skills

Developer Platform

Languages

Python100.0%

Timeline

Project created
Jun 14, 2023
Forked
Mar 22, 2026
Your last push
23 days ago
Upstream last push
7 days ago
Tracked since
Mar 21, 2026

Similar Repos

pgvector cosine similarity · $0

Loading…