I cannot get inference working with TensorRT's context.execute_async_v3(...). There are many examples that use context.execute_async_v2(...), but v2 is now deprecated.
The TensorRT developer documentation says to specify buffers for inputs and outputs with "context.set_tensor_address(name, ptr)".
The API also has "context.set_input_shape(name, tuple(input_batch.shape))" and "context.set_output_allocator()", but after days of mucking around I have got nowhere.
Can someone please provide an example or a suggestion?
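
For reference, here is my best guess at the v3 flow, pieced together from the docs. It is only a sketch: it assumes pycuda for memory management, a serialized "model.engine" file, a single-input engine, and a placeholder input shape, all of which are stand-ins for my actual setup.

```python
import numpy as np
import pycuda.autoinit  # noqa: F401 -- creates and activates a CUDA context
import pycuda.driver as cuda
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

# "model.engine" and the input shape below are placeholders for my real setup.
with open("model.engine", "rb") as f:
    engine = trt.Runtime(TRT_LOGGER).deserialize_cuda_engine(f.read())
context = engine.create_execution_context()

input_batch = np.random.rand(1, 3, 224, 224).astype(np.float32)

stream = cuda.Stream()
device_buffers = {}  # tensor name -> device allocation
host_outputs = {}    # tensor name -> page-locked host array for results

for i in range(engine.num_io_tensors):
    name = engine.get_tensor_name(i)
    if engine.get_tensor_mode(name) == trt.TensorIOMode.INPUT:
        # With dynamic shapes, the concrete input shape must be set
        # before output shapes can be queried. (Assumes a single input.)
        context.set_input_shape(name, input_batch.shape)
        device_buffers[name] = cuda.mem_alloc(input_batch.nbytes)
        cuda.memcpy_htod_async(device_buffers[name],
                               np.ascontiguousarray(input_batch), stream)
    else:
        shape = tuple(context.get_tensor_shape(name))
        dtype = trt.nptype(engine.get_tensor_dtype(name))
        host_outputs[name] = cuda.pagelocked_empty(shape, dtype)
        device_buffers[name] = cuda.mem_alloc(host_outputs[name].nbytes)
    # v3 has no bindings list: every I/O tensor address is registered by name.
    context.set_tensor_address(name, int(device_buffers[name]))

# Enqueue inference on the stream, then copy the outputs back to the host.
context.execute_async_v3(stream.handle)
for name, host_arr in host_outputs.items():
    cuda.memcpy_dtoh_async(host_arr, device_buffers[name], stream)
stream.synchronize()
```

As I understand it, "set_output_allocator()" is only needed when output shapes are data-dependent and cannot be pre-allocated as above, but I have not tried that path. Is the sketch above on the right track?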