lyogavin/airllm
AirLLM 70B inference with single 4GB GPU
Builder

lyogavin • individual
Stars
14,778
Using upstream star count
Forks
1,487
Using upstream fork count
Open Issues
0
Activity Score
0/100
0 commits in 30d
Created
Jun 12, 2023
Project creation date
README Summary
AirLLM enables running large language models like 70B parameter models on consumer GPUs with limited VRAM by implementing memory-efficient inference techniques. It allows developers and researchers to run massive LLMs locally without requiring expensive high-memory GPUs, making advanced AI models accessible on single 4GB graphics cards.
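The core idea behind this memory-efficient approach is layer-by-layer inference: rather than loading all 70B parameters at once, only one transformer layer's weights are held in GPU memory at a time, loaded from disk, applied, and then freed before the next layer is loaded. The following is a conceptual sketch of that loop, not the AirLLM API; the shard names and helper callables are hypothetical, for illustration only.

```python
# Conceptual sketch (not the AirLLM API): layer-by-layer inference.
# Only one layer's weights are resident in memory at any moment.

def run_layered_inference(hidden, layer_files, load_layer, apply_layer):
    """Apply each layer in sequence, holding only one layer's weights."""
    for path in layer_files:
        weights = load_layer(path)       # load just this layer from disk
        hidden = apply_layer(weights, hidden)
        del weights                      # free before loading the next layer
    return hidden

# Toy demonstration: "layers" are scalar multipliers in a dict standing
# in for on-disk weight shards (hypothetical file names).
shards = {"layer0.bin": 2, "layer1.bin": 3}
result = run_layered_inference(
    hidden=1,
    layer_files=["layer0.bin", "layer1.bin"],
    load_layer=lambda p: shards[p],
    apply_layer=lambda w, h: w * h,
)
print(result)  # → 6
```

Peak memory therefore scales with the largest single layer rather than the full model, which is how a 70B model can fit a 4GB card at the cost of repeated disk I/O per token.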
AI Dev Skills
Unmapped
Large Language Model Inference · Memory-Efficient Computing · GPU Memory Management · Model Optimization · Transformer Architecture · Low-Resource AI Deployment
Tags
Large Language Model Inference · Memory-Efficient Computing · GPU Memory Management · Model Optimization · Transformer Architecture · Low-Resource AI Deployment · Large Language Model Optimization · Local LLM Deployment · Large Language Model Inference on Consumer Hardware · Edge/Mobile · Cost-Effective AI Development · Self-hosted · Text · On-device AI · Model Quantization · Democratized AI Access · Hardware-Constrained Deep Learning · Research with Limited GPU Resources · Memory-Efficient AI · Small Language Models · Memory-Efficient Model Inference · Jupyter Notebook
Taxonomy
Deployment Context
Modalities
Skill Areas
Recent Activity
Updated 1 month ago
7 Days
0
30 Days
0
90 Days
0
Quality
- Quality: medium
- Maturity: prototype
Categories
Foundation Models (Primary) · Other AI / ML · Inference & Serving · Data Science & Analytics · Edge & Mobile AI · Search & Knowledge · Learning Resources
PM Skills
Developer Platform
Languages
Jupyter Notebook 100.0%
Timeline
- Project created: Jun 12, 2023
- Forked: Mar 16, 2026
- Your last push: 1 month ago
- Upstream last push: 1 month ago
- Tracked since: Mar 10, 2026
Similar Repos
Computed via pgvector cosine similarity