Half-Quadratic Quantization for LLMs without calibration data. Use when quantizing models to 4/3/2-bit precision without needing calibration datasets, for fast quantization workflows, or when deploying with vLLM or HuggingFace Transformers.
Rating: 7.6 · Installs: 0 · Category: AI & LLM
Excellent skill documentation for HQQ quantization. The description clearly captures when to use this skill versus alternatives. The SKILL.md is exceptionally well structured, with clear sections covering installation, basic usage, core concepts, and multiple integration paths (HuggingFace, vLLM, PEFT). It provides comprehensive code examples for common workflows, including quantization, serving, and fine-tuning. The skill addresses a genuine pain point: calibration-free quantization that would otherwise require extensive CLI token usage to coordinate multiple tools. Strong practical value, with backend selection guides, best practices, and troubleshooting. One caveat: the underlying quantization techniques are established rather than cutting-edge, but the skill's packaging and multi-framework integration add significant convenience.
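To make the "calibration-free" point concrete, here is a minimal pure-NumPy sketch of per-group affine quantization, where the scale and zero-point are derived from the weights alone, with no calibration dataset. This is an illustration of the general round-to-nearest baseline, not the actual HQQ half-quadratic optimization, and the function names are hypothetical:

```python
import numpy as np

def quantize_group(w, nbits=4):
    # Calibration-free: scale and zero-point come only from the
    # weight group's own min/max, not from any calibration data.
    qmax = 2 ** nbits - 1
    wmin, wmax = float(w.min()), float(w.max())
    scale = (wmax - wmin) / qmax if wmax > wmin else 1.0
    zero = wmin
    q = np.clip(np.round((w - zero) / scale), 0, qmax).astype(np.uint8)
    return q, scale, zero

def dequantize_group(q, scale, zero):
    # Reconstruct an approximation of the original weights.
    return q.astype(np.float32) * scale + zero

rng = np.random.default_rng(0)
w = rng.normal(size=64).astype(np.float32)      # one weight group
q, scale, zero = quantize_group(w, nbits=4)
w_hat = dequantize_group(q, scale, zero)
max_err = float(np.abs(w - w_hat).max())        # bounded by scale / 2
```

HQQ replaces this naive min/max fit with a half-quadratic solver that optimizes the zero-point against the quantization error, which is why it needs no calibration data yet outperforms plain round-to-nearest at 4/3/2-bit precision.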