Reporium

predibase/lorax

lorax

Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs


Builder

predibase • individual

Stars

3,785

Using upstream star count

Forks

316

Using upstream fork count

Open Issues

0

Activity Score

0/100

0 commits in 30d

Created

Oct 20, 2023

Project creation date

AI Dev Skills

Unmapped

Batch Processing Optimization, Distributed Systems, GPU Memory Management, Large Language Model Serving, LoRA Fine-tuning, Model Inference Optimization, Model Quantization, Multi-tenant ML Systems, Parameter-Efficient Fine-tuning, Transformer Architecture

Tags

Batch Processing Optimization, Distributed Systems, GPU Memory Management, Large Language Model Serving, LoRA Fine-tuning, Model Inference Optimization, Model Quantization, Multi-tenant ML Systems, Parameter-Efficient Fine-tuning, Transformer Architecture, AI Safety, API, Batching, DPO, Docker, Fine-Tuning, Forked, GPU / CUDA, HuggingFace, Inference, Kubernetes, LLM Serving, Large Language Models, LoRA / PEFT, Mistral, OpenAI, Python, Quantization, Qwen, Real-Time / Streaming, Research / Papers, Roadmap, Structured Output, TGI, Tutorial, vLLM

Taxonomy

AI Trends

Parameter-Efficient Fine-tuning, Model Serving Infrastructure, Multi-tenant AI Systems, Efficient LLM Deployment

Category

Inference & Serving, Foundation Models, AI Agents, Model Training, MLOps & Infrastructure, Dev Tools & Automation, Learning Resources, Security & Safety

Deployment Context

Cloud API, Self-hosted, On-premise

Modalities

Text

Skill Areas

LoRA Fine-tuning, Large Language Model Serving, Model Inference Optimization, Distributed Systems, Parameter-Efficient Fine-tuning, Multi-tenant ML Systems, GPU Memory Management, Transformer Architecture, Model Quantization, Batch Processing Optimization

Tag

AI Safety, API, Batching, DPO, Docker, Fine-Tuning, Forked, GPU / CUDA, HuggingFace, Inference, Kubernetes, LLM Serving, Large Language Models, LoRA / PEFT, Mistral, OpenAI, Python, Quantization, Qwen, Real-Time / Streaming, Research / Papers, Roadmap, Structured Output, TGI, Tutorial, vLLM

Use Cases

Multi-tenant LLM serving, Personalized AI assistants at scale, Domain-specific chatbot deployment, Custom model API endpoints, Enterprise fine-tuned model hosting, A/B testing multiple model variants, Specialized task-specific language models

Recent Activity

Updated 1 year ago

7 Days

0

30 Days

0

90 Days

0

Quality

Quality: high
Maturity: production

Categories

Inference & Serving (Primary), MLOps & Infrastructure, Dev Tools & Automation, Learning Resources, Security & Safety, Foundation Models, AI Agents, Model Training, Safety & Alignment, Search & Knowledge, Other AI / ML

PM Skills

Cost & Efficiency, Safety & Alignment, Scale & Reliability, Developer Platform

Languages

Python 100.0%

Timeline

Project created: Oct 20, 2023
Forked: Mar 22, 2026
Your last push: 1 year ago
Upstream last push: 18 days ago
Tracked since: May 21, 2025

Similar Repos

pgvector cosine similarity · $0
