SOTA low-bit LLM quantization (INT8/FP8/MXFP8/INT4/MXFP4/NVFP4) & sparsity; leading model compression techniques on PyTorch, TensorFlow, and ONNX Runtime
This repository is tracked by Trending Repos. The badge upgrades automatically if it ever cracks the top 100.
<img src="https://trending-repos.com/badge/intel/neural-compressor.svg" alt="Trending Repos" />https://trending-repos.com/badge/intel/neural-compressor.svg