I'm converting the PWC-Net optical flow model used in the iSLAM framework to run on the DLA of a Jetson NX, but the TensorRT engine build fails during network validation when DLA is enabled.
Environment
- Hardware: NVIDIA Jetson NX
- Framework: iSLAM (PyTorch-based SLAM system)
- TensorRT: 8.2.1
- CUDA: 11.4
- Model: PWC-Net for optical flow estimation
File Structure
```
iSLAM/
├── models/stereo_cvt_tartanvo_1914.pkl
├── Network/
│   ├── convert_dla_final.py    # Place conversion script here
│   ├── PWC.py
│   └── dla_module_wrapper.py   # Place DLA wrapper here
```
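For context, the ONNX file consumed by the conversion script is produced with a standard `torch.onnx.export` call roughly like the sketch below (the `PWCDCNet` class name, input shape, and tensor names are placeholders from my setup, not something defined by iSLAM):

```python
# Rough sketch of the PyTorch -> ONNX export step; names and shapes are placeholders.
import torch
from Network.PWC import PWCDCNet  # assumed class name inside Network/PWC.py

def export_pwcnet_onnx(checkpoint_path, onnx_path, height=448, width=640):
    model = PWCDCNet()
    state = torch.load(checkpoint_path, map_location="cpu")
    # The iSLAM checkpoint bundles more than the flow net, so key filtering
    # may be needed; strict=False is only used to keep the sketch short.
    model.load_state_dict(state, strict=False)
    model.eval()

    # PWC-Net consumes an image pair; here passed as one 6-channel tensor.
    dummy = torch.randn(1, 6, height, width)
    torch.onnx.export(
        model, dummy, onnx_path,
        opset_version=11,           # bump if the warping op needs a newer opset
        input_names=["image_pair"],
        output_names=["flow"],
    )
```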
Conversion Code
File: Network/convert_dla_final.py
```python
def build_tensorrt_engine(onnx_path):
    import tensorrt as trt

    logger = trt.Logger(trt.Logger.WARNING)
    builder = trt.Builder(logger)
    network = builder.create_network(
        1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
    parser = trt.OnnxParser(network, logger)
    with open(onnx_path, 'rb') as model_file:
        parser.parse(model_file.read())

    config = builder.create_builder_config()
    config.max_workspace_size = 1 << 30

    # DLA configuration - THIS IS WHERE IT FAILS
    if builder.num_DLA_cores > 0:
        config.default_device_type = trt.DeviceType.DLA
        config.DLA_core = 0
        config.flags |= 1 << int(trt.BuilderFlag.FP16)
        config.flags |= 1 << int(trt.BuilderFlag.GPU_FALLBACK)

    engine = builder.build_engine(network, config)  # Returns None
    return engine
```
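To help narrow this down, here is a diagnostic sketch I plan to run that dumps ONNX parser errors and asks TensorRT, layer by layer, whether it considers each layer DLA-eligible (`can_run_on_DLA` and the parser error accessors are the documented TRT 8.2 Python calls I'm relying on):

```python
import tensorrt as trt

def report_dla_support(onnx_path):
    """Print ONNX parser errors and which layers TensorRT says can run on DLA."""
    logger = trt.Logger(trt.Logger.WARNING)
    builder = trt.Builder(logger)
    network = builder.create_network(
        1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
    parser = trt.OnnxParser(network, logger)

    with open(onnx_path, 'rb') as f:
        if not parser.parse(f.read()):
            # Surface parse failures instead of silently building a partial network.
            for i in range(parser.num_errors):
                print("parser error:", parser.get_error(i).desc())
            return

    config = builder.create_builder_config()
    # DLA only runs FP16/INT8, so enable FP16 before querying eligibility.
    config.set_flag(trt.BuilderFlag.FP16)
    config.default_device_type = trt.DeviceType.DLA
    config.DLA_core = 0

    for i in range(network.num_layers):
        layer = network.get_layer(i)
        target = "DLA" if config.can_run_on_DLA(layer) else "GPU"
        print(f"{target}  {layer.type}  {layer.name}")
```

If the Conv/LeakyReLU layers come back marked GPU here, that would at least suggest the problem is a precision or shape constraint rather than the op type itself.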
Error Output
```
[TensorRT] ERROR: DLA does not support layer: PWN_/conv1a/Conv
[TensorRT] ERROR: DLA does not support layer: PWN_/leaky_relu_1/LeakyRelu
[TensorRT] ERROR: Network validation failed.
```
Model Architecture
The PWC-Net uses these layer types:
- Conv2d layers (conv1a through conv6a)
- LeakyReLU activations
- Correlation layers for cost volume
- Warping operations
- Upsampling layers
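Because the correlation and warping steps are not standard convolutional ops, I suspect they export as op types DLA has no mapping for. To see what actually ended up in the ONNX graph, an op inventory like the sketch below should help (uses the `onnx` Python package; nothing here is iSLAM-specific):

```python
from collections import Counter
import onnx

def list_onnx_ops(onnx_path):
    """Count ONNX op types so unusual ones (custom correlation, GridSample, etc.) stand out."""
    model = onnx.load(onnx_path)
    counts = Counter(node.op_type for node in model.graph.node)
    for op_type, n in counts.most_common():
        print(f"{op_type:20s} {n}")
```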
Specific Question
Which PWC-Net layers are incompatible with Jetson NX DLA, and how do I configure TensorRT to automatically fall back to GPU for unsupported operations while keeping supported layers on DLA?
Conv2d and LeakyReLU are layer types I'd expect DLA to support, yet those are exactly the layers the errors flag. I've enabled GPU_FALLBACK, but build_engine still returns None instead of producing a mixed DLA/GPU engine.
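If automatic fallback keeps producing a None engine, my fallback plan is to pin the offending layers to the GPU explicitly with `set_device_type` instead of relying only on the GPU_FALLBACK flag (a sketch; whether this is the intended way to get a mixed DLA/GPU engine is part of what I'm asking):

```python
import tensorrt as trt

def place_layers(network, config):
    """Keep DLA-eligible layers on DLA and explicitly push the rest to the GPU."""
    config.set_flag(trt.BuilderFlag.FP16)
    config.set_flag(trt.BuilderFlag.GPU_FALLBACK)
    config.default_device_type = trt.DeviceType.DLA
    config.DLA_core = 0

    for i in range(network.num_layers):
        layer = network.get_layer(i)
        if not config.can_run_on_DLA(layer):
            # Explicit per-layer assignment instead of relying on automatic fallback.
            config.set_device_type(layer, trt.DeviceType.GPU)
```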
Expected Behavior
The engine should build successfully, with DLA-compatible layers placed on the DLA and unsupported layers automatically falling back to GPU execution.