I'm deploying a SageMaker inference pipeline composed of two PyTorch models (model_1 and model_2), and I'm wondering whether it's possible to pass the same input to both of the models that make up the pipeline.

What I have in mind would work more or less as follows:
Invoke the endpoint, sending a binary-encoded payload (namely `payload_ser`), for example:

```python
client.invoke_endpoint(EndpointName=ENDPOINT, ContentType='application/x-npy', Body=payload_ser)
```
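To make this concrete, the payload is the kind of thing sketched below; the array is just a placeholder and the exact serialization on my side may differ, but `application/x-npy` is essentially the `.npy` byte format:

```python
import io

import numpy as np

# Sketch: serialize a NumPy array into 'application/x-npy' bytes.
# The array here is only a placeholder for model_1's real input.
model_input = np.random.rand(1, 3, 224, 224).astype(np.float32)

buffer = io.BytesIO()
np.save(buffer, model_input)
payload_ser = buffer.getvalue()  # passed as Body to invoke_endpoint
```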
The first model parses the payload with the `input_fn` function, runs the predictor on it, and returns the output of the predictor. As a simplified example:

```python
import json

def input_fn(request_body, request_content_type):
    # Deserialize the binary payload into something the model can consume.
    if request_content_type == "application/x-npy":
        return some_function_to_parse_input(request_body)
    raise ValueError(f"Unsupported content type: {request_content_type}")

def predict_fn(input_object, predictor):
    # Run model_1 on the parsed input.
    outputs = predictor(input_object)
    return outputs

def output_fn(predictions, response_content_type):
    # Serialize model_1's predictions for the next container in the pipeline.
    return json.dumps(predictions)
```
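For what it's worth, I've been sanity-checking these handlers locally along these lines, assuming the container chains them as `input_fn` -> `predict_fn` -> `output_fn` (the parser and predictor below are stand-ins I made up):

```python
import io

import numpy as np

# Stand-in for the parsing helper elided above: for 'application/x-npy' this
# is essentially np.load on the raw bytes.
def some_function_to_parse_input(request_body):
    return np.load(io.BytesIO(request_body), allow_pickle=False)

# Stand-in for the real PyTorch model.
def dummy_predictor(x):
    return {"mean": float(np.asarray(x).mean())}

buf = io.BytesIO()
np.save(buf, np.ones((2, 2), dtype=np.float32))

parsed = input_fn(buf.getvalue(), "application/x-npy")
predictions = predict_fn(parsed, dummy_predictor)
print(output_fn(predictions, "application/json"))  # '{"mean": 1.0}'
```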
The second model gets as payload both the original payload (`payload_ser`) and the output of the previous model (`predictions`). Possibly, the `input_fn` function would be used to parse the output of model_1 (as in the "standard case"), but I'd need some way to also make the original payload available to model_2; I sketch below the kind of wiring I'm imagining. In this way, model_2 would use both the original payload and the output of model_1 to make the final prediction and return it to whoever invoked the endpoint.
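The sketch below is only my guess at how this could look, not something I know the pipeline container supports: model_1's handlers would thread the raw request through to `output_fn` and bundle it (base64-encoded) with the predictions, and model_2's `input_fn` would unpack both. All names are mine, and I'm assuming the predictions are already JSON-serializable:

```python
import base64
import io
import json

import numpy as np

# Hypothetical model_1 handlers: keep the raw request bytes alongside the
# parsed input so output_fn can forward both to the next container.
def input_fn(request_body, request_content_type):
    if request_content_type == "application/x-npy":
        parsed = np.load(io.BytesIO(request_body), allow_pickle=False)
        return {"parsed": parsed, "raw": request_body}
    raise ValueError(f"Unsupported content type: {request_content_type}")

def predict_fn(input_object, predictor):
    outputs = predictor(input_object["parsed"])  # assumed JSON-serializable
    return {"predictions": outputs, "raw": input_object["raw"]}

def output_fn(predictions, response_content_type):
    # Bundle model_1's predictions with the original payload (base64-encoded
    # so the raw bytes survive the JSON hop to model_2).
    return json.dumps({
        "model_1_predictions": predictions["predictions"],
        "original_input": base64.b64encode(predictions["raw"]).decode("utf-8"),
    })

# Hypothetical model_2 input_fn (it would simply be called input_fn in
# model_2's own inference script): recover both pieces from model_1's output.
def input_fn_model_2(request_body, request_content_type):
    if request_content_type == "application/json":
        bundle = json.loads(request_body)
        original = np.load(io.BytesIO(base64.b64decode(bundle["original_input"])),
                           allow_pickle=False)
        return original, bundle["model_1_predictions"]
    raise ValueError(f"Unsupported content type: {request_content_type}")
```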
Any idea if this is achievable?