A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit and 4-bit floating point (FP8 and FP4) precision on Hopper, Ada and Blackwell GPUs, to provide better performance with lower memory utilization in both training and inference.
This repository is tracked by Trending Repos. The badge upgrades automatically if it ever cracks the top 100.
<img src="https://trending-repos.com/badge/NVIDIA/TransformerEngine.svg" alt="Trending Repos" />https://trending-repos.com/badge/NVIDIA/TransformerEngine.svg