
abetlen/llama-cpp-python

llama-cpp-python

Python bindings for llama.cpp

Builder

abetlen • individual

Stars

10,124

Using upstream star count

Forks

1,347

Using upstream fork count

Open Issues

0

Activity Score

0/100

0 commits in 30d

Created

Mar 23, 2023

Project creation date

README Summary

llama-cpp-python provides Python bindings for llama.cpp, enabling users to run large language models locally with CPU and GPU acceleration. The package offers both a high-level Python API and OpenAI-compatible web server functionality for easy integration with existing applications. It supports various model formats and provides efficient inference capabilities for LLMs without requiring cloud services.
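A minimal sketch of the high-level API described above. The model path is a hypothetical placeholder; you would point it at a real GGUF file downloaded separately, after installing the package with `pip install llama-cpp-python`.

```python
import os

# llama-cpp-python may not be installed in every environment,
# so the import is guarded for this sketch.
try:
    from llama_cpp import Llama
except ImportError:
    Llama = None


def generate(prompt: str, model_path: str = "models/model.gguf") -> str:
    """Run local inference on a GGUF model, if one is available.

    `model_path` is a placeholder; substitute the path to any
    quantized GGUF model on disk.
    """
    if Llama is None or not os.path.exists(model_path):
        return "(llama-cpp-python or model file unavailable)"
    llm = Llama(model_path=model_path, n_ctx=2048)
    out = llm(prompt, max_tokens=64)
    # Completions follow an OpenAI-style response shape.
    return out["choices"][0]["text"]
```

The same package can also expose an OpenAI-compatible HTTP endpoint (`python -m llama_cpp.server`), which is what makes drop-in integration with existing OpenAI clients possible.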

AI Dev Skills

Unmapped

Large Language Model Inference · C++ Python Bindings · Local Model Deployment · CPU/GPU Optimization · Quantized Model Support · Memory-Efficient Inference

Tags

Large Language Model Inference · C++ Python Bindings · Local Model Deployment · CPU/GPU Optimization · Quantized Model Support · Memory-Efficient Inference · Edge AI Deployment · Text · Self-hosted · Custom LLM Integration · On-device AI · Local Text Generation · Small Language Models · Local AI Infrastructure · Edge/Mobile · On-premise · Private AI Applications · Privacy-Preserving AI · Offline Language Processing · Python

Taxonomy

Recent Activity

Updated 8 months ago

7 Days

0

30 Days

0

90 Days

0

Quality

medium

Maturity

production

Categories

Inference & Serving (Primary) · ML Platform & Infrastructure · Edge & Mobile AI · Other AI / ML · Foundation Models

PM Skills

Scale & Reliability

Languages

Python 100.0%

Timeline

Project created
Mar 23, 2023
Forked
Mar 22, 2026
Your last push
8 months ago
Upstream last push
8 days ago
Tracked since
Aug 15, 2025
