Lightning Talk: Accelerated Inference in PyTorch 2.X with Torch...- George Stefanakis & Dheeraj Peri
Автор: PyTorch
Загружено: 24 окт. 2023 г.
Просмотров: 2 527 просмотров
Lightning Talk: Accelerated Inference in PyTorch 2.X with Torch-TensorRT - George Stefanakis & Dheeraj Peri, NVIDIA
Torch-TensorRT accelerates the inference of deep learning models in PyTorch targeting NVIDIA GPUs. Torch-TensorRT now leverages Dynamo, the graph capture technology introduced in PyTorch 2.0, to offer a new and more pythonic user experience as well as to upgrade the existing compilation workflow. The new user experience includes Just-In-Time compilation and support for arbitrary Python code (like dynamic control flow, complex I/O, and external libraries) used within your model, while still accelerating performance. A single line of code provides easy and robust acceleration of your model with full flexibility to configure the compilation process without ever leaving PyTorch: torch.compile(model, backend=”tensorrt”) The existing API has also been revamped to use Dynamo export under the hood, providing you with the same Ahead-of-Time whole-graph acceleration with fallback for custom operators and dynamic shape support as in previous versions: torch_tensorrt.compile(model, inputs=example_inputs) We will present descriptions of both paths as well as features coming soon. All of our work is open source and available at https://github.com/pytorch/TensorRT.

Доступные форматы для скачивания:
Скачать видео mp4
-
Информация по загрузке: