Library/airllm

lyogavin/airllm

AirLLM: 70B-parameter inference on a single 4GB GPU

Builder

lyogavin • individual

Stars

14,778

Using upstream star count

Forks

1,487

Using upstream fork count

Open Issues

0

Activity Score

0/100

0 commits in 30d

Created

Jun 12, 2023

Project creation date

README Summary

AirLLM enables running large language models, including 70B-parameter models, on consumer GPUs with limited VRAM by using memory-efficient inference techniques. It lets developers and researchers run massive LLMs locally without expensive high-memory hardware, making advanced models usable on a single 4GB graphics card.
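For scale: a 70B-parameter model in fp16 needs roughly 70e9 × 2 bytes ≈ 140 GB for the weights alone, far beyond 4 GB of VRAM; but split across an 80-layer stack, only about 140 / 80 ≈ 1.75 GB of weights must be resident at once if each layer is loaded, run, and freed in sequence. A minimal conceptual sketch of that pattern in PyTorch (not AirLLM's actual implementation; the layer count and checkpoint paths are illustrative):

    # Conceptual sketch of layer-by-layer inference (hypothetical, not AirLLM's code):
    # keep only one transformer block resident on the GPU at a time.
    import torch

    NUM_LAYERS = 80               # e.g. a Llama-2-70B-sized stack (illustrative)
    LAYER_DIR = "./layer_shards"  # hypothetical directory of per-layer checkpoints

    @torch.no_grad()
    def layered_forward(hidden: torch.Tensor) -> torch.Tensor:
        """Peak VRAM stays near one layer's weights plus activations."""
        for i in range(NUM_LAYERS):
            # Load a single pickled block from disk straight onto the GPU.
            layer = torch.load(
                f"{LAYER_DIR}/layer_{i}.pt", map_location="cuda", weights_only=False
            )
            hidden = layer(hidden)
            # Drop the block and release its VRAM before loading the next one.
            del layer
            torch.cuda.empty_cache()
        return hidden

The trade-off is throughput: every forward pass re-reads the full 140 GB of weights from disk, which is why this approach suits experimentation rather than high-volume serving.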

AI Dev Skills

Unmapped

Large Language Model Inference · Memory-Efficient Computing · GPU Memory Management · Model Optimization · Transformer Architecture · Low-Resource AI Deployment

Tags

Large Language Model Inference · Memory-Efficient Computing · GPU Memory Management · Model Optimization · Transformer Architecture · Low-Resource AI Deployment · Large Language Model Optimization · Local LLM Deployment · Large Language Model Inference on Consumer Hardware · Edge/Mobile · Cost-Effective AI Development · Self-hosted · Text · On-device AI · Model Quantization · Democratized AI Access · Hardware-Constrained Deep Learning · Research with Limited GPU Resources · Memory-Efficient AI · Small Language Models · Memory-Efficient Model Inference · Jupyter Notebook

Recent Activity

Updated 1 month ago

7 Days

0 commits

30 Days

0 commits

90 Days

0 commits

Quality

Quality: medium
Maturity: prototype

Categories

Foundation Models (Primary) · Other AI / ML · Inference & Serving · Data Science & Analytics · Edge & Mobile AI · Search & Knowledge · Learning Resources

PM Skills

Developer Platform

Languages

Jupyter Notebook: 100.0%

Timeline

Project created: Jun 12, 2023
Forked: Mar 16, 2026
Your last push: 1 month ago
Upstream last push: 1 month ago
Tracked since: Mar 10, 2026

Similar Repos

Computed via pgvector cosine similarity · $0
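Presumably this list is populated by a pgvector cosine-distance query over repo embeddings; a minimal sketch under that assumption (the repos table, embedding column, and connection string are hypothetical, not this dashboard's actual schema):

    import psycopg  # psycopg 3

    # Hypothetical schema: repos(full_name text, embedding vector(768))
    with psycopg.connect("dbname=library") as conn:
        # Fetch the tracked repo's embedding (pgvector returns a '[...]' text literal).
        (target,) = conn.execute(
            "SELECT embedding FROM repos WHERE full_name = %s",
            ("lyogavin/airllm",),
        ).fetchone()
        # <=> is pgvector's cosine-distance operator; smaller distance = more similar,
        # so 1 - distance recovers cosine similarity.
        rows = conn.execute(
            """
            SELECT full_name, 1 - (embedding <=> %s::vector) AS cosine_similarity
            FROM repos
            WHERE full_name <> %s
            ORDER BY embedding <=> %s::vector
            LIMIT 5
            """,
            (target, "lyogavin/airllm", target),
        ).fetchall()
        for name, score in rows:
            print(f"{name}: {score:.3f}")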
