
I'm deploying a SageMaker inference pipeline composed of two PyTorch models (model_1 and model_2), and I'm wondering whether it's possible to pass the same input to both models in the pipeline.

What I have in mind would work more or less as follows:

  1. Invoke the endpoint, sending a binary-encoded payload (namely payload_ser), for example:

    client.invoke_endpoint(EndpointName=ENDPOINT,
                           ContentType='application/x-npy',
                           Body=payload_ser)
    
  2. The first model parses the payload with the input_fn function, runs the predictor on it, and returns the predictor's output. As a simplified example:

    import json

    def input_fn(request_body, request_content_type):
        if request_content_type == "application/x-npy":
            return some_function_to_parse_input(request_body)
        raise ValueError(f"Unsupported content type: {request_content_type}")

    def predict_fn(input_object, predictor):
        outputs = predictor(input_object)
        return outputs

    def output_fn(predictions, response_content_type):
        return json.dumps(predictions)
    
  3. The second model receives as its payload both the original payload (payload_ser) and the output of the previous model (predictions). Presumably the input_fn function would be used to parse the output of model_1 (as in the "standard case"), but I'd need some way to also make the original payload available to model_2. That way, model_2 would use both the original payload and the output of model_1 to make the final prediction and return it to whoever invoked the endpoint.

Any idea if this is achievable?

1 Answer


Sounds like you need an inference DAG. Amazon SageMaker inference pipelines currently support only a chain of handlers, where the output of handler N is the input of handler N+1.

You could change model_1's predict_fn() to return the pair (input_object, outputs). output_fn() will then receive these two objects as the predictions and can serialize both as JSON. model_2's input_fn() will need to know how to parse this pair.
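
A rough sketch of what this could look like, assuming the input and the prediction are tensors/arrays that can be converted to nested lists (the JSON keys here are purely illustrative):

    import json

    # model_1 entry point: return the original input alongside the prediction
    def predict_fn(input_object, predictor):
        outputs = predictor(input_object)
        return (input_object, outputs)

    def output_fn(predictions, response_content_type):
        original_input, outputs = predictions
        # assumes both objects expose .tolist() (e.g. torch tensors / numpy arrays)
        return json.dumps({"original_input": original_input.tolist(),
                           "model_1_output": outputs.tolist()})

    # model_2 entry point: parse the pair produced by model_1
    def input_fn(request_body, request_content_type):
        if request_content_type == "application/json":
            payload = json.loads(request_body)
            return payload["original_input"], payload["model_1_output"]
        raise ValueError(f"Unsupported content type: {request_content_type}")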

Consider implementing this as a generic pipeline-handling mechanism that adds the input to the model's output. That way you can reuse it for all models and pipelines.
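
For example, a hypothetical reusable wrapper (a sketch only) that any entry-point script could apply to its predict_fn to forward the original input together with the prediction:

    # generic pass-through wrapper (hypothetical helper, not a SageMaker API)
    def pass_input_through(fn):
        def wrapped(input_object, predictor):
            outputs = fn(input_object, predictor)
            return (input_object, outputs)
        return wrapped

    @pass_input_through
    def predict_fn(input_object, predictor):
        return predictor(input_object)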

You could also allow the model to be deployed both as a standalone model and as part of a pipeline, and apply the relevant input/output handling behaviour based on the presence of an environment variable (via the Environment dict), which you can specify when creating the inference pipeline's model.
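
As a minimal sketch, assuming an environment variable called IN_PIPELINE (the name is arbitrary) that is set only for the pipeline deployment, the entry-point script could do:

    import os

    # toggle pass-through behaviour based on an environment variable
    IN_PIPELINE = os.environ.get("IN_PIPELINE", "false").lower() == "true"

    def predict_fn(input_object, predictor):
        outputs = predictor(input_object)
        if IN_PIPELINE:
            return (input_object, outputs)  # pipeline: forward the input too
        return outputs                      # standalone: prediction only

The variable can then be set per container when creating the pipeline model, e.g. with boto3 (the image, artifact, and role values below are placeholders):

    import boto3

    sm = boto3.client("sagemaker")
    sm.create_model(
        ModelName="my-inference-pipeline",
        ExecutionRoleArn=role_arn,  # placeholder
        Containers=[
            {"Image": model_1_image, "ModelDataUrl": model_1_artifacts,
             "Environment": {"IN_PIPELINE": "true"}},
            {"Image": model_2_image, "ModelDataUrl": model_2_artifacts,
             "Environment": {"IN_PIPELINE": "true"}},
        ],
    )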


5 Comments

Thanks for your reply. My problem with having model_1 dump both the predictions and the original input as JSON is that my input might be a serialised binary file sent with Postman, and it seems this type of byte-encoded file can't be dumped into a JSON object (or at least, every time I tried I got a message saying the object is not JSON-serialisable).
The alternative of converting the binary file (which is an image) into a Python list before dumping it into JSON also doesn't work for me, because it makes the returned payload too big (larger than the maximum allowed by SageMaker) and the inference pipeline returns an error when I try to invoke it...
A good workaround would be to find a way to dump the original binary payload into the JSON payload returned by model_1, but I didn't find a good way of doing this because of the incompatibility between JSON and binary encoding that I mentioned above...
How about having model_1 return "application/x-npy" instead of JSON? It could be a two-dimensional array containing the original NumPy input and the prediction (see the sketch after these comments). In any case, SageMaker doesn't provide this functionality on your behalf.
That could be an option, I'll give it a try and see how it goes. There are a few workarounds to this problem, but it's unfortunate that the simplest isn't really an option. Thanks for your input anyway!
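
A sketch of the approach suggested in the comment above, assuming both arrays are CPU, NumPy-convertible and fit within the response size limit. np.savez is used here instead of a single stacked array so the two arrays don't need matching shapes, and the key names are illustrative:

    import io
    import numpy as np

    # model_1: pack the original input and the prediction into one binary payload
    def output_fn(predictions, response_content_type):
        original_input, outputs = predictions
        buffer = io.BytesIO()
        np.savez(buffer,
                 original_input=np.asarray(original_input),
                 model_1_output=np.asarray(outputs))
        return buffer.getvalue()

    # model_2: unpack both arrays again
    def input_fn(request_body, request_content_type):
        data = np.load(io.BytesIO(request_body))
        return data["original_input"], data["model_1_output"]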
