intel/neural-compressor

SOTA low-bit LLM quantization (INT8/FP8/MXFP8/INT4/MXFP4/NVFP4) & sparsity; leading model compression techniques on PyTorch, TensorFlow, and ONNX Runtime

Python · 2,643 stars · 305 forks · Apache-2.0

auto-tuning, awq, fp4, gptq, int4, int8, knowledge-distillation, large-language-models, low-precision, mxformat, post-training-quantization, pruning, quantization, quantization-aware-training, smoothquant, sparsegpt, sparsity

Rankings

Daily: #5835

Weekly: #10223

Monthly: #10938

Engagement

[Charts: total stars; daily new stars]

Get badge

This repository is tracked by Trending Repos. The badge upgrades automatically if it ever cracks the top 100.

Markdown

![Trending Repos](https://trending-repos.com/badge/intel/neural-compressor.svg)

HTML

<img src="https://trending-repos.com/badge/intel/neural-compressor.svg" alt="Trending Repos" />

Raw URL

https://trending-repos.com/badge/intel/neural-compressor.svg
