IST-DASLab/gptq

gptq

Code for the ICLR 2023 paper "GPTQ: Accurate Post-Training Quantization for Generative Pre-trained Transformers".

Builder

IST-DASLab • organization

Stars

2,284

Using upstream star count

Forks

194

Using upstream fork count

Open Issues

0

Activity Score

0/100

0 commits in 30d

Created

Oct 19, 2022

Project creation date

README Summary

GPTQ is a one-shot post-training quantization method for Generative Pre-trained Transformers that achieves accurate 3- and 4-bit weight quantization without retraining. The repository contains the implementation of the GPTQ algorithm presented at ICLR 2023, which compresses large language models while largely preserving their accuracy. It provides efficient quantization routines designed specifically for transformer architectures.
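The core idea is easiest to see in miniature. Below is a toy sketch of GPTQ-style column-by-column quantization with error feedback, assuming a precomputed inverse Hessian `H_inv` of the layer inputs; the function names are illustrative, and this is not the repository's optimized implementation (which works in blocks and uses a Cholesky factorization for speed):

```python
import torch

def quantize_rtn(w: torch.Tensor, bits: int = 4) -> torch.Tensor:
    # Symmetric round-to-nearest quantization of one weight column.
    qmax = 2 ** (bits - 1) - 1
    scale = w.abs().max().clamp(min=1e-8) / qmax
    return torch.clamp(torch.round(w / scale), -qmax - 1, qmax) * scale

def gptq_like_quantize(W: torch.Tensor, H_inv: torch.Tensor,
                       bits: int = 4) -> torch.Tensor:
    # Quantize columns left to right; after quantizing column j,
    # nudge the not-yet-quantized columns to compensate for the
    # rounding error, scaled by the inverse-Hessian row. This error
    # feedback is the core idea behind GPTQ.
    W = W.clone()
    Q = torch.zeros_like(W)
    for j in range(W.shape[1]):
        q = quantize_rtn(W[:, j], bits)
        Q[:, j] = q
        err = (W[:, j] - q) / H_inv[j, j]
        W[:, j + 1:] -= err.unsqueeze(1) * H_inv[j, j + 1:].unsqueeze(0)
    return Q

# Usage: with an identity H_inv the update vanishes and the result
# reduces to plain round-to-nearest, which makes the role of the
# Hessian term easy to isolate.
W = torch.randn(64, 64)
Q = gptq_like_quantize(W, torch.eye(64))
```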

AI Dev Skills

Unmapped

Post-training Quantization • Model Compression • Large Language Model Optimization • Transformer Architecture • Neural Network Pruning • Low-precision Arithmetic • GPU Acceleration • Memory Optimization

Tags

Post-training Quantization • Model Compression • Large Language Model Optimization • Transformer Architecture • Neural Network Pruning • Low-precision Arithmetic • GPU Acceleration • Memory Optimization • Small Language Models • Edge/Mobile • Efficient AI • Large Language Model Deployment • Mobile AI Applications • On-premise • Cost-effective Model Serving • Self-hosted • Research in Model Compression • Text • Memory-constrained Inference • Edge Computing AI • On-device AI • Python

Recent Activity

Updated 2 years ago

7 Days

0

30 Days

0

90 Days

0

Quality

Quality
high
Maturity
research

Categories

Foundation Models (Primary) • Model Training • Learning Resources • Inference & Serving • Edge & Mobile AI • Search & Knowledge • Other AI / ML

PM Skills

Product Discovery

Languages

Python 100.0%

Timeline

Project created
Oct 19, 2022
Forked
Mar 22, 2026
Your last push
2 years ago
Upstream last push
2 years ago
Tracked since
Mar 27, 2024

Similar Repos

pgvector cosine similarity · $0
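The similarity list is apparently ranked by pgvector cosine distance over repo embeddings. A minimal sketch of what such a query might look like; the table and column names (`repos`, `full_name`, `embedding`) are assumptions for illustration, not this dashboard's actual schema:

```python
import psycopg2

# Hypothetical schema: repos(full_name text, embedding vector(384))
# with the pgvector extension installed; <=> is pgvector's
# cosine-distance operator, so smaller values mean more similar.
conn = psycopg2.connect("dbname=dashboard")
with conn, conn.cursor() as cur:
    cur.execute(
        """
        SELECT full_name,
               embedding <=> (SELECT embedding FROM repos
                              WHERE full_name = %(repo)s) AS dist
        FROM repos
        WHERE full_name <> %(repo)s
        ORDER BY dist
        LIMIT 5
        """,
        {"repo": "IST-DASLab/gptq"},
    )
    for full_name, dist in cur.fetchall():
        print(full_name, dist)
```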
