
predibase/lorax

lorax

Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs

Builder

predibase

predibase • individual

Stars

3,742

Using upstream star count

Forks

310

Using upstream fork count

Open Issues

0

Activity Score

0/100

0 commits in 30d

Created

Oct 20, 2023

Project creation date

README Summary

LoRAX is a multi-LoRA inference server that serves thousands of fine-tuned language models simultaneously with minimal overhead. It loads and unloads LoRA adapters dynamically, on demand, which allows personalized LLM deployments to scale efficiently. The system is built on top of Hugging Face's text-generation-inference and exposes APIs for serving many fine-tuned variants of a base model.
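The on-demand loading and unloading described in the summary can be pictured as a small least-recently-used cache of adapters. The sketch below is illustrative only: `AdapterCache`, `fake_loader`, and the tenant IDs are hypothetical names invented for this example, and nothing here is taken from the LoRAX codebase or claims to match its actual eviction policy.

```python
from collections import OrderedDict
from typing import Callable, Dict, List


class AdapterCache:
    """Toy LRU cache of adapters (hypothetical; not LoRAX's implementation)."""

    def __init__(self, capacity: int, loader: Callable[[str], Dict[str, float]]):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.loader = loader
        # Insertion order doubles as recency order: oldest first, newest last.
        self._resident: "OrderedDict[str, Dict[str, float]]" = OrderedDict()
        self.evicted: List[str] = []

    def get(self, adapter_id: str) -> Dict[str, float]:
        """Return the adapter's weights, loading them on first use."""
        if adapter_id in self._resident:
            # Cache hit: mark the adapter as most recently used.
            self._resident.move_to_end(adapter_id)
            return self._resident[adapter_id]
        # Cache miss: load on demand, then evict the oldest adapter if over capacity.
        weights = self.loader(adapter_id)
        self._resident[adapter_id] = weights
        if len(self._resident) > self.capacity:
            old_id, _ = self._resident.popitem(last=False)
            self.evicted.append(old_id)
        return weights

    def resident_ids(self) -> List[str]:
        """Adapter IDs currently held, least recently used first."""
        return list(self._resident)


def fake_loader(adapter_id: str) -> Dict[str, float]:
    # Stand-in for reading adapter weights from disk or a model hub (assumption).
    return {"scale": float(len(adapter_id))}


cache = AdapterCache(capacity=2, loader=fake_loader)
for request_adapter in ["tenant-a", "tenant-b", "tenant-a", "tenant-c"]:
    cache.get(request_adapter)
```

With room for two adapters, the fourth request forces out whichever adapter was used least recently, while the base model (not modeled here) would stay loaded throughout.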

AI Dev Skills

Unmapped

LoRA Fine-tuning, Large Language Model Serving, Model Inference Optimization, Distributed Systems, Parameter-Efficient Fine-tuning, Multi-tenant ML Systems, GPU Memory Management, Transformer Architecture, Model Quantization, Batch Processing Optimization

Tags

LoRA Fine-tuning, Large Language Model Serving, Model Inference Optimization, Distributed Systems, Parameter-Efficient Fine-tuning, Multi-tenant ML Systems, GPU Memory Management, Transformer Architecture, Model Quantization, Batch Processing Optimization, On-premise, Self-hosted, Cloud API, Domain-specific model hosting, Multi-tenant Model Inference, Personalized model deployment, Text, Scalable AI application backends, Cost-efficient fine-tuned model inference, Distributed Model Serving, Model Serving Infrastructure, Multi-tenant LLM serving, Cloud Infrastructure, Developer Tools, Model Adapter Management, Dynamic Model Loading, Enterprise AI, Parameter Efficient Fine-tuning, GPU Memory Optimization, Multi-tenant AI Systems, Efficient Model Deployment, Python

Taxonomy

Recent Activity

Updated 10 months ago

7 Days

0

30 Days

0

90 Days

0

Quality

Quality: high
Maturity: production

Categories

Other AI / ML (Primary), Dev Tools & Automation, Inference & Serving, ML Platform & Infrastructure, Foundation Models, Model Training

PM Skills

Developer Platform

Languages

Python 100.0%

Timeline

Project created: Oct 20, 2023
Forked: Mar 22, 2026
Your last push: 10 months ago
Upstream last push: 10 months ago
Tracked since: May 21, 2025
