Reporium

abetlen/llama-cpp-python

llama-cpp-python

Python bindings for llama.cpp


Builder

abetlen • individual

Stars

10,353

Using upstream star count

Forks

1,412

Using upstream fork count

Open Issues

0

Activity Score

0/100

0 commits in 30d

Created

Mar 23, 2023

Project creation date

AI Dev Skills

Unmapped

CPU/GPU Optimization, C++ Python Bindings, Large Language Model Inference, Local Model Deployment, Memory-Efficient Inference, Quantized Model Support

Tags

CPU/GPU Optimization, C++ Python Bindings, Large Language Model Inference, Local Model Deployment, Memory-Efficient Inference, Quantized Model Support, API, C++, Context Engineering, Docker, Embeddings, Forked, GPU / CUDA, Gemma, HuggingFace, LangChain, Large Language Models, LlamaIndex, Multimodal AI, OpenAI, Pydantic, Python, Python Web Framework, Quantization, Qwen, Speculative Decoding, Structured Output, Tool Use, Transformers, Unsloth, llama.cpp

Taxonomy

AI Trends

On-device AI, Small Language Models, Local AI Infrastructure, Privacy-Preserving AI

category

Foundation Models, AI Agents, RAG & Retrieval, Model Training, Inference & Serving, MLOps & Infrastructure, Dev Tools & Automation

Deployment Context

Self-hosted, Edge/Mobile, On-premise

Modalities

Text

Skill Areas

Large Language Model Inference, C++ Python Bindings, Local Model Deployment, CPU/GPU Optimization, Quantized Model Support, Memory-Efficient Inference

tag

API, C++, Context Engineering, Docker, Embeddings, Forked, GPU / CUDA, Gemma, HuggingFace, LangChain, Large Language Models, LlamaIndex, Multimodal AI, OpenAI, Pydantic, Python, Python Web Framework, Quantization, Qwen, Speculative Decoding, Structured Output, Tool Use, Transformers, Unsloth, llama.cpp

Use Cases

Local Text Generation, Private AI Applications, Offline Language Processing, Custom LLM Integration, Edge AI Deployment

Recent Activity

Updated 9 months ago

7 Days

0

30 Days

0

90 Days

0

Quality

Quality: medium
Maturity: production

Categories

RAG & Retrieval (Primary), Inference & Serving, MLOps & Infrastructure, Dev Tools & Automation, Foundation Models, AI Agents, Model Training, Multimodal AI, Search & Knowledge, Other AI / ML

PM Skills

Cost & Efficiency, User Experience, Scale & Reliability, Product Discovery, Developer Platform, AI-Native Architecture

Languages

Python 100.0%

Timeline

Project created: Mar 23, 2023
Forked: Mar 22, 2026
Your last push: 9 months ago
Upstream last push: 16 days ago
Tracked since: Aug 15, 2025

Similar Repos

pgvector cosine similarity · $0