casper-hansen/AutoAWQ
AutoAWQ
AutoAWQ implements the AWQ algorithm for 4-bit quantization with a 2x speedup during inference. Documentation:
Builder
casper-hansen • individual
Stars
2,319
Using upstream star count
Forks
298
Using upstream fork count
Open Issues
0
Activity Score
0/100
0 commits in 30d
Created
Aug 25, 2023
Project creation date
README Summary
AutoAWQ is a Python library that implements the AWQ (Activation-aware Weight Quantization) algorithm for 4-bit quantization of large language models. It reports a 2x speedup during inference while aiming to preserve model accuracy. The library provides APIs for quantizing popular transformer models and running inference on the quantized result.
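To make the idea in the summary concrete, here is a minimal pure-Python sketch of group-wise 4-bit quantization combined with activation-aware channel scaling. It is an illustration under stated assumptions, not AutoAWQ's implementation: every function name is hypothetical, the group size of 4 is chosen only to keep the example small, and the scaling exponent `alpha` is fixed at 0.5 here, whereas the AWQ method is understood to search for it.

```python
# Illustrative sketch only; these helpers are hypothetical and are not part of AutoAWQ.

def quantize_group(weights, n_bits=4):
    """Asymmetric min-max quantization of one group of floats to integer codes."""
    q_max = 2 ** n_bits - 1                      # 15 for 4-bit codes
    w_min, w_max = min(weights), max(weights)
    scale = (w_max - w_min) / q_max
    if scale == 0.0:                             # constant group: avoid dividing by zero
        scale = 1.0
    zero = min(q_max, max(0, round(-w_min / scale)))
    codes = [min(q_max, max(0, round(w / scale) + zero)) for w in weights]
    return codes, scale, zero


def dequantize_group(codes, scale, zero):
    """Map integer codes back to approximate float weights."""
    return [(c - zero) * scale for c in codes]


def fake_quantize_row(row, group_size=4, n_bits=4):
    """Quantize then dequantize a weight row, one group at a time."""
    out = []
    for start in range(0, len(row), group_size):
        codes, scale, zero = quantize_group(row[start:start + group_size], n_bits)
        out.extend(dequantize_group(codes, scale, zero))
    return out


def channel_scales(mean_abs_activations, alpha=0.5):
    """Per-input-channel scales s_j = mean|x_j| ** alpha (alpha fixed here by assumption)."""
    return [max(a, 1e-8) ** alpha for a in mean_abs_activations]


def awq_style_dot(row, x, mean_abs_activations, group_size=4):
    """Approximate dot(row, x) using scaled, fake-quantized weights.

    Weights are multiplied by s before quantization and the activations are
    divided by s, so the product is unchanged in exact arithmetic while
    channels with large activations get finer effective resolution.
    """
    s = channel_scales(mean_abs_activations)
    scaled_row = [w * sj for w, sj in zip(row, s)]
    q_row = fake_quantize_row(scaled_row, group_size)
    scaled_x = [xj / sj for xj, sj in zip(x, s)]
    return sum(w * xj for w, xj in zip(q_row, scaled_x))


if __name__ == "__main__":
    row = [-0.8, -0.3, 0.0, 0.4, 0.7, 0.1, -0.5, 0.2]
    x = [1.0] * 8
    exact = sum(w * xj for w, xj in zip(row, x))
    approx = awq_style_dot(row, x, mean_abs_activations=[1.0] * 8)
    print(f"exact={exact:.4f} approx={approx:.4f}")
```

Real kernels pack the 4-bit codes into wider integers and fuse dequantization into the matrix multiply; the sketch leaves that out and keeps only the arithmetic.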
AI Dev Skills
Unmapped
Recent Activity
Updated 11 months ago
7 Days
0
30 Days
0
90 Days
0
Quality
- Quality
- high
- Maturity
- production
Timeline
- Project created
- Aug 25, 2023
- Forked
- Mar 22, 2026
- Your last push
- 11 months ago
- Upstream last push
- 11 months ago
- Tracked since
- May 11, 2025