371 questions
Tooling
1
vote
2
replies
47
views
Is using TRT in Tensorflow 2.18+ possible?
I am aware that TensorFlow has announced that they will no longer support TRT. A while back they announced "Starting with TensorFlow 2.18, support for TensorRT will be dropped. TensorFlow 2.17 ...
0
votes
0
answers
83
views
TensorRT: enqueueV3 fails when using dynamic shapes and Green Contexts
I am trying to benchmark TensorRT inference using CUDA Green Contexts and splitting SMs. My code runs fine when I generate the .engine with fixed input shapes, but it fails when I build the engine ...
3
votes
1
answer
119
views
TensorRT PWC-Net Causing 2.4km Trajectory Error in iSLAM - Original PyTorch Works Fine
Problem Statement
My iSLAM system works correctly with the original PyTorch PWC-Net but produces catastrophic trajectory errors (2.4km ATE RMSE) when I replace it with a TensorRT-converted version. ...
0
votes
0
answers
148
views
TensorRT DLA Engine Build Fails for PWC-Net on Jetson NX - Missing Layer Support?
I'm converting a PWC-Net optical flow model to run on Jetson NX DLA using the iSLAM framework, but the TensorRT engine build fails during DLA optimization.
Environment
Hardware: NVIDIA Jetson NX
...
0
votes
1
answer
55
views
TensorRT new feature: how to add Debug Tensor?
In TensorRT > 10.8, there is a new feature for adding debug tensors while the network is executing. https://docs.nvidia.com/deeplearning/tensorrt/10.8.0/inference-library/advanced.html#debug-tensors
I ...
0
votes
2
answers
79
views
Equivalent of venv's "--system-site-packages" in Anaconda environment
This is a basic question about Anaconda (Miniconda).
When using venv, I was able to import TensorRT by using "python3 -m venv </path/to/create_environment> --system-site-packages". If I ...
0
votes
0
answers
119
views
TensorRT Access Violation Error (0xC0000005) at nvinfer_10.dll - How to Resolve?
Environment:
OS: Windows Operating System
TensorRT Version: TensorRT-10.3.0.26
NVIDIA CUDA Version: 12.6
cuDNN Version: 9.8
GPU: RTX 3050ti laptop GPU
Issue Description:
I am encountering an "...
0
votes
0
answers
103
views
Got Segmentation fault (core dumped) after running IExecutionContext.execute_async_v3()
I used the following commands to convert an ONNX model to a TRT engine, where the input.onnx file is the original model:
polygraphy surgeon sanitize --fold-constants ./input.onnx -o output.onnx
...
1
vote
0
answers
106
views
Error loading model using Torch TensorRT in Libtorch on Windows
Environment
Libtorch 2.5.0.dev (latest nightly) (built with CUDA 12.4)
CUDA 12.4
TensorRT 10.1.0.27
PyTorch 2.4.0+cu124
Torch-TensorRT 2.4.0
Python 3.12.8
Windows 10
Compile Torch-TensorRT with ...
0
votes
0
answers
280
views
Inference server with TensorRT - Error Code 1: CuTensor (Internal cuTensor permutate execute failed) Cuda Runtime (invalid resource handle)
I'm trying to implement a Python inference server on Jetson for remote image classification. I've generated an .engine via TRT from an .onnx template; a client script sends an image for classification ...
0
votes
1
answer
226
views
Unable to install Deepstream SDK due to unmet dependencies
I am trying to install DeepStream 7.0 on Ubuntu 22.04.
I installed CUDA and TensorRT:
CUDA: 12.4
TensorRT: 8.6.1.6
However, I am getting this error while installing the .deb file - saying a bunch of ...
1
vote
0
answers
146
views
When I use C++ to load the custom TensorRT plugin of mmdeploy, a "segmentation error" occurs
My env:
Ubuntu 20.04
CUDA 11.3
tensorRT 8.5.1.7
g++ 9.4
I can be sure that there is no problem with my libmmdeploy_tensorrt_ops.so, because I have loaded it in Python in the same environment and no ...
0
votes
0
answers
48
views
Can't install tensorrt [duplicate]
No matter what I've done, I can't seem to install tensorrt.
I'm on Manjaro, and installed cuda and cudnn using pacman. I'm attempting to install tensorrt from AUR (via sudo pamac install tensorrt) ...
0
votes
1
answer
124
views
Failed to install TensorRT-LLM by poetry add
OS: Ubuntu 22.04.1
Poetry version: 1.8.4
I want to install TensorRT-LLM with Poetry; the command is
poetry add tensorrt-llm
But it raises errors:
Using version ^0.14.0 for tensorrt-llm
Updating ...
0
votes
0
answers
91
views
NVIDIA Jetson Orin FastAI2 model optimization with TensorRT and Torch2TRT incorrect Batch size
I have a Jetson Orin with the latest version of Jetpack 6.0 with CUDA 12 running on Ubuntu 22.04.
I have installed PyTorch and it has CUDA support installed:
Python 3.10.12 (main, Sep 11 2024, 15:47:...
1
vote
0
answers
94
views
How to do random sampling in TensorRT-LLM?
An LLM was fine-tuned to generate news headlines.
Inference is done with either the vLLM or the TensorRT-LLM framework. Whereas vLLM produces highly diverse records (headlines) within a batch and in ...
1
vote
0
answers
394
views
Generate Dynamic batch size engine with tensorrt for DLA based CNN Inference
So I am new to using tensorrt, especially for DLA. I have a Resnet50 model which I am converting to ONNX format (using python). Then I use tensorrt CLI to get the engine file. Now, I want to execute ...
1
vote
1
answer
33
views
Some TensorRT conv layer forward blocked by cudaMemcpyAsync from another thread
See this nsys profile:
I have observed that during the forward pass of some layers in TensorRT execution, a lock is acquired before launching the kernel.
I attempted to determine the specific lock ...
0
votes
0
answers
346
views
Tensorrt installation issues in WSL
I'm trying to get tensorrt working with Tensorflow 2.17, yet after trying all of the official and unofficial instructions several times I still can't get it to work and I'm at the edge of sanity.
I've ...
1
vote
0
answers
109
views
"cuStreamSynchronize failed: an illegal memory access was encountered"
When I run my TensorRT engine, created from the ONNX file of a single transformer layer, I get an error:
Traceback (most recent call last):
File "/home/ipa/seokwon/mywork/llm/myproject/...
1
vote
1
answer
3k
views
Converting a PyTorch ONNX model to TensorRT engine - Jetson Orin Nano
I'm trying to convert a ViT-B/32 Vision Transformer model from the UNICOM repository on a Jetson Orin Nano. The model's Vision Transformer class and source code is here.
I use the following code to ...
1
vote
0
answers
136
views
High GPU memory usage with onnx model in tensorrt8
I have a code like this.
import pycuda.driver as cuda
import pycuda.autoinit
import tensorrt as trt
from cryptography.fernet import Fernet
import zipfile
import io
import os
TRT_LOGGER = trt.Logger(...
0
votes
0
answers
490
views
TensorRT inference with Triton Server Docker
I'm studying how to use the combination of TensorRT and Triton. I'm working on this server: NVIDIA-SMI 535.161.08 Driver Version: 535.161.08 CUDA Version: 12.2 Ubuntu 22.04 and I've ...
1
vote
0
answers
543
views
How do I set the TensorRT context for inference using execute_async_v3()
I cannot implement inference with TensorRT context.execute_async_v3(...). There are many examples using context.execute_async_v2(…). However, v2 is now deprecated.
The TensorRT developer page says to:...
1
vote
0
answers
141
views
Issues with Conversion of keypointrcnn_resnet50_fpn Torchvision Model from ONNX to TensorRT Engine
I am having a lot of difficulties converting a keypointrcnn_resnet50_fpn model from ONNX to TensorRT. I have done an extensive search on how to do so, and couldn't find anything that enables me to ...
0
votes
1
answer
780
views
TensorFlow can't find TensorRT
I'm using tensorflow 2.16.1 on Ubuntu 22.04, tensorrt 10.0.0b6, but when I import tensorflow I get the warning:
W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find ...
0
votes
1
answer
2k
views
[ LINUX ]Tensorflow-GPU not working - TF-TRT Warning: Could not find TensorRT
I have been struggling to download all the drivers required for the tensorflow-gpu library. I want to compile my model using the GPU instead of the CPU. I am using Linux Mint. This is my neofetch
...
1
vote
0
answers
217
views
Torch cannot find cudnn_adv_train64_8.dll while building Tensor RT Engine for trt-llm-rag-windows
I am trying to install trt-llm-rag-windows following the guide on its github repo and encountered the issue while trying to build the trt engine with the following command:
python build.py --model_dir ...
1
vote
0
answers
808
views
Can I make a Huggingface trainer work with an Intel GPU?
I am trying to fine-tune a language model using the Huggingface libraries, following their guide (with another model and different data, but I don't think this is the crucial point). I am doing this ...
0
votes
0
answers
1k
views
AttributeError: 'RecursiveScriptModule' object has no attribute 'config' when using HF pipeline with TensorRT model
Step 1: I first traced a Roberta model and saved it.
batch_size = 4
batched_indexed_tokens = [[101, 64]*64]*batch_size
batched_attention_masks = [[1, 1]*64]*batch_size
tokens_tensor = torch.tensor(...
1
vote
1
answer
599
views
Inference speed isn't improved with tensor-rt compared to regular cuda
I'm trying to use the tensor-rt framework to enhance the inference speed of my deep learning model. I've created a very simple python code to test tensor-rt with pytorch.
import torch
import argparse
...
0
votes
1
answer
111
views
Why didn't the TF-TRT converter work for my model?
I wanted to convert my trained model for better inference performance, by using TF-TRT.
I used the nvidia tensorflow docker image, and had no problem with running test code.
Test code is from here: ...
3
votes
0
answers
814
views
tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
I am getting this warning from tensorflow with the following code:
import tensorrt
import tensorflow as tf
tf.config.list_physical_devices('GPU')
I have looked at some of the posts about this but ...
0
votes
0
answers
181
views
pytorch convert a conv2d layer to tensorrt results in fp16 != fp32
I tried to convert a conv2d layer to TensorRT and found that different params can result in different accuracy between fp16 and fp32. Could anyone give me some suggestions?
You can reproduce ...
1
vote
0
answers
488
views
infer using mixed precision in tensorrt
I'm currently using DETR for object detection. I want to convert it as follows:
pytorch -> onnx -> tensorrt
I have the code to do so and tested the model, achieving the same performance in all ...
0
votes
1
answer
995
views
CUDA Execution Provider in ONNX makes error where combining TensorRT with ONNX
Moving from CUDA ONNX to TensorRT code in Python
I got an error while running a model from ONNX (with the CUDA provider) and a model from TensorRT in the same code.
I got these errors:
2023-11-26 11:46:35....
1
vote
0
answers
697
views
I installed TensorRT and CUDNN on Windows for using with yoloV8. Error: Unable to load library: nvinfer_builder_resource.dll
Description
I installed TensorRT and CUDNN on Windows for using with yoloV8.
Following nvidia documentation (zip installation): TensorRT installation documentation
But when I ran this code on Python3....
1
vote
1
answer
2k
views
Is it better to run YOLOv5 multiple times or with a higher batch size?
I need to process 6 images at once, 10 times per second, and I use YOLOv5 for this.
But I'm new to this topic and I'm a bit confused about batch sizes for inference.
As far as I understood it, with higher ...
2
votes
0
answers
934
views
How to create an INT8 calibration table for the TensorRT execution provider of the ONNX runtime?
I exported a torch model to ONNX and want to run it with the ONNX runtime on an NVidia Jetson SoC. This works well with different backends (CPU, CUDA, and TensorRT) and different precisions (FP32 and ...
0
votes
0
answers
173
views
C++ - Stuck with YoloV4, ONNX and TensorRT
I'm doing some detection using YoloV4/C++/OpenCV and it's running pretty well.
However, to reduce time consumption, I'm trying to move everything to NVIDIA TensorRT and I'm feeling lost there.
I ...
2
votes
0
answers
513
views
Why does TensorRT enqueueV2 take longer time when using more isolated threads in C++?
OS : Windows 10
CUDA : version 11.5
TensorRT : 8.6.1.6
OpenCV : 4.8.0 built with CUDA
Driver version: Most recent driver (545.84)
In my app, multiple cameras are going to be streamed. Each camera will ...
1
vote
1
answer
864
views
webAI-user.bat Python error message after installing TensorRT
I am running Stable Diffusion Automatic1111 on an Nvidia card with 12 GB of VRAM. I just completed the installation of TensorRT Extension. However, every time I launch webAI-user.bat I get the error ...
2
votes
2
answers
2k
views
TensorRT seems to lack common functionality
I've recently come across an amazing tool called TensorRT, but because I don't have an NVIDIA GPU on my laptop, I decided to use Google Colab instead to play around with this technology.
I used ...
0
votes
0
answers
1k
views
How can I convert onnx model to engine model supporting a GPU with different compute capability on another pc
I’m using a laptop to convert an ONNX model to an engine model, and then run the engine model on a GPU.
My laptop’s GPU is “NVIDIA GeForce RTX 3060 Laptop GPU”, whose compute capability is 8.6.
The ...
0
votes
1
answer
959
views
Install tensorrt on google colab
I am trying to install tensorrt in my Google Colab notebook. I chose the GPU runtime type and ran the following command:
import os
import torch
When I run
torch.cuda.is_available()
it returns "...
0
votes
1
answer
186
views
I can't find the right path Latent-diffusion Install to work on my GPU
First time asking; I usually find answers, but now I am really stuck.
I can't make it work again. I have Tumbleweed and a 4090 GPU.
When I run pip list, I receive the following:
.
.
nvidia-tensorrt 99....
2
votes
0
answers
864
views
YOLOX - Quantize int8 and convert to TensorRT engine
I have been trying to quantize YOLOX from float32 to int8. After that, I want to convert that ONNX output into a TensorRT engine.
The quantization process seems OK; however, I get several different ...
0
votes
2
answers
2k
views
importing tensorrt gives module not found error
Importing tensorrt gives a module-not-found error. Here are some commands I ran in my terminal. I am working on a Jetson Xavier NX developer kit. TensorRT is installed by default with ...
0
votes
1
answer
2k
views
Build engine TensorRT on Jetson Nano
I have the code below to build an engine (the engine file's extension is .engine, not .trt) to use TensorRT on Jetson Nano. Although I configured the engine file to use FP16, when I run inference, I could ...
0
votes
1
answer
528
views
Assertion failed: inputs.at(0).isInt32() && "For range operator with dynamic inputs, this version of TensorRT only supports INT32!"
I am trying to export an .engine from ONNX for the pretrained yolov8m model but ran into a trtexec issue. Note that I am targeting a model that supports dynamic batch size.
I got the onnx by following the ...