I'm deploying a SageMaker inference pipeline composed of two PyTorch models (model_1 and model_2), and I'm wondering whether it's possible to pass the same input to both of the models that make up the pipeline.

What I have in mind would work more or less as follows:
Invoke the endpoint, sending a binary-encoded payload (namely `payload_ser`), for example:

```python
client.invoke_endpoint(EndpointName=ENDPOINT, ContentType='application/x-npy', Body=payload_ser)
```
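To make this concrete, the payload is the kind of thing sketched below; the array is just a placeholder and the exact serialization on my side may differ, but `application/x-npy` is essentially the `.npy` byte format:

```python
import io

import numpy as np

# Sketch: serialize a NumPy array into 'application/x-npy' bytes.
# The array here is only a placeholder for model_1's real input.
model_input = np.random.rand(1, 3, 224, 224).astype(np.float32)

buffer = io.BytesIO()
np.save(buffer, model_input)
payload_ser = buffer.getvalue()  # passed as Body to invoke_endpoint
```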
The first model parses the payload with the `input_fn` function, runs the predictor on it, and returns the output of the predictor. As a simplified example:

```python
import json

def input_fn(request_body, request_content_type):
    # Deserialize the binary payload into something the model can consume.
    if request_content_type == "application/x-npy":
        return some_function_to_parse_input(request_body)
    raise ValueError(f"Unsupported content type: {request_content_type}")

def predict_fn(input_object, predictor):
    # Run model_1 on the parsed input.
    outputs = predictor(input_object)
    return outputs

def output_fn(predictions, response_content_type):
    # Serialize model_1's predictions for the next container in the pipeline.
    return json.dumps(predictions)
```
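For what it's worth, I've been sanity-checking these handlers locally along these lines, assuming the container chains them as `input_fn` -> `predict_fn` -> `output_fn` (the parser and predictor below are stand-ins I made up):

```python
import io

import numpy as np

# Stand-in for the parsing helper elided above: for 'application/x-npy' this
# is essentially np.load on the raw bytes.
def some_function_to_parse_input(request_body):
    return np.load(io.BytesIO(request_body), allow_pickle=False)

# Stand-in for the real PyTorch model.
def dummy_predictor(x):
    return {"mean": float(np.asarray(x).mean())}

buf = io.BytesIO()
np.save(buf, np.ones((2, 2), dtype=np.float32))

parsed = input_fn(buf.getvalue(), "application/x-npy")
predictions = predict_fn(parsed, dummy_predictor)
print(output_fn(predictions, "application/json"))  # '{"mean": 1.0}'
```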
The second model gets as payload both the original payload (`payload_ser`) and the output of the previous model (`predictions`). Possibly, the `input_fn` function would be used to parse the output of model_1 (as in the "standard case"), but I'd need some way to also make the original payload available to model_2; I sketch below the kind of wiring I'm imagining. In this way, model_2 would use both the original payload and the output of model_1 to make the final prediction and return it to whoever invoked the endpoint.
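The sketch below is only my guess at how this could look, not something I know the pipeline container supports: model_1's handlers would thread the raw request through to `output_fn` and bundle it (base64-encoded) with the predictions, and model_2's `input_fn` would unpack both. All names are mine, and I'm assuming the predictions are already JSON-serializable:

```python
import base64
import io
import json

import numpy as np

# Hypothetical model_1 handlers: keep the raw request bytes alongside the
# parsed input so output_fn can forward both to the next container.
def input_fn(request_body, request_content_type):
    if request_content_type == "application/x-npy":
        parsed = np.load(io.BytesIO(request_body), allow_pickle=False)
        return {"parsed": parsed, "raw": request_body}
    raise ValueError(f"Unsupported content type: {request_content_type}")

def predict_fn(input_object, predictor):
    outputs = predictor(input_object["parsed"])  # assumed JSON-serializable
    return {"predictions": outputs, "raw": input_object["raw"]}

def output_fn(predictions, response_content_type):
    # Bundle model_1's predictions with the original payload (base64-encoded
    # so the raw bytes survive the JSON hop to model_2).
    return json.dumps({
        "model_1_predictions": predictions["predictions"],
        "original_input": base64.b64encode(predictions["raw"]).decode("utf-8"),
    })

# Hypothetical model_2 input_fn (it would simply be called input_fn in
# model_2's own inference script): recover both pieces from model_1's output.
def input_fn_model_2(request_body, request_content_type):
    if request_content_type == "application/json":
        bundle = json.loads(request_body)
        original = np.load(io.BytesIO(base64.b64decode(bundle["original_input"])),
                           allow_pickle=False)
        return original, bundle["model_1_predictions"]
    raise ValueError(f"Unsupported content type: {request_content_type}")
```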
Any idea if this is achievable?