TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT LLM also contains components to create Python and C++ runtimes that orchestrate the inference execution in a performant way.
<a href="https://trending-repos.com/repositories/NVIDIA/TensorRT-LLM"><img src="https://trending-repos.com/badge/NVIDIA/TensorRT-LLM.svg" alt="Trending Repos" /></a>