Reporium

algorithmicsuperintelligence/optillm

optillm

Optimizing inference proxy for LLMs

Upstream: algorithmicsuperintelligence/optillm

Builder

algorithmicsuperintelligence • individual

Stars

4,071

Using upstream star count

Forks

353

Using upstream fork count

Open Issues

0

Activity Score

0/100

3 commits in 30d

Created

—

AI Dev Skills

No AI dev skills recorded.

Tags

AI Agents, API, Active, Anthropic / Claude, Automation, Azure AI, Benchmarking, C++, CLI Tool, Claude, Context Engineering, Database, DeepSeek, Docker, Evals, Forked, GPT, Google AI, HuggingFace, Inference, LLM Serving, Large Language Models, LiteLLM, Llama, LoRA / PEFT, Long Context, MCP, MMLU, Mem0, Mistral, Multi-Agent, NumPy, Ollama, OpenAI, Planning / CoT, Prompt Engineering, Pydantic, Python, Python Web Framework, Qwen, RAG, Real-Time / Streaming, Research / Papers, Security, Simulation, Structured Output, Transformers, llama.cpp

Taxonomy

category

Foundation Models, AI Agents, RAG & Retrieval, Model Training, Evals & Benchmarking, Inference & Serving, Robotics, MLOps & Infrastructure, Dev Tools & Automation, Cloud & Platforms, Learning Resources, Industry: Gaming, Security & Safety, Data Science & Analytics

Recent Activity

Updated 27 days ago

7 Days

0

30 Days

3

90 Days

9

Merge pull request #305 from GoDiao/feature/compact-plugin

Asankhaya Sharma • May 7, 2026

df018d6

Replace gated test model with Qwen2.5-Coder-0.5B-Instruct, bump to 0.3.15

Asankhaya Sharma • May 7, 2026

86d797f

Add compact plugin for auto context compression (closes #249)

GODDiao • May 6, 2026

5c07f46

Quality

Quality signals are not available for this repo yet.

Categories

RAG & Retrieval (Primary), Evals & Benchmarking, Inference & Serving, MLOps & Infrastructure, Dev Tools & Automation, Cloud & Platforms, Learning Resources, Industry: Gaming, Security & Safety, Data Science & Analytics, Foundation Models, AI Agents, Model Training, Robotics, Search & Knowledge, Other AI / ML

PM Skills

Cost & Efficiency, Scale & Reliability, Data & Evaluation, Product Discovery, Developer Platform, AI-Native Architecture

Languages

Python 100.0%

Timeline

Project created
—
Forked
May 14, 2026
Your last push
27 days ago
Upstream last push
27 days ago
Tracked since
May 7, 2026

Similar Repos

pgvector cosine similarity · $0