I am using a custom inference script for a Hugging Face embedding model in an AWS SageMaker TorchServe container. My script accepts JSON input in the following format:

{
  "inputs": ["chunk1", "chunk2", "chunk3", "chunk4", ...]
}
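
For context, a minimal sketch of what such a custom inference script could look like, using the standard SageMaker model_fn/predict_fn hooks (the mean-pooling logic and the returned shape here are assumptions for illustration, not the actual script):

# inference.py -- hypothetical sketch of a custom embedding handler.
# model_fn / predict_fn are the standard SageMaker inference hooks;
# the pooling logic and output format are assumptions.
import torch
from transformers import AutoModel, AutoTokenizer

def model_fn(model_dir):
    # Load the tokenizer and model packaged inside model.tar.gz.
    tokenizer = AutoTokenizer.from_pretrained(model_dir)
    model = AutoModel.from_pretrained(model_dir)
    model.eval()
    return tokenizer, model

def predict_fn(data, model_and_tokenizer):
    tokenizer, model = model_and_tokenizer
    # "inputs" holds the list of text chunks from the request JSON.
    chunks = data["inputs"]
    encoded = tokenizer(chunks, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**encoded)
    # Mean-pool token embeddings into one vector per chunk.
    embeddings = outputs.last_hidden_state.mean(dim=1)
    return {"embeddings": embeddings.tolist()}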

When I send a large number of chunks in this JSON, it appears to hit a size limit in the SageMaker TorchServe image, resulting in the following error:

io.netty.handler.codec.CorruptedFrameException: Message size exceed limit: 102970662

Is it possible to increase the size limit in TorchServe settings within AWS SageMaker? If so, how can this be achieved?

I could not find any documentation on this from AWS, so I tried adding a TorchServe config.properties file inside model.tar.gz, but it did not take effect.
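
For reference, on a standalone TorchServe deployment these limits are raised in config.properties with the following keys (values in bytes); this is the kind of file that had no effect when packaged into model.tar.gz on SageMaker:

# config.properties -- standalone TorchServe settings (values in bytes).
# Packaging this file into model.tar.gz did not work in the SageMaker container.
max_request_size=2000000000
max_response_size=2000000000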

1 Answer

I was able to work around this by including environment variables in the transformer definition, like this (more details in the related GitHub discussion):

transformer = huggingface_model.transformer(
    instance_count=1,
    output_path=s3_output,
    instance_type="ml.m5.xlarge",
    assemble_with="Line",
    max_payload=6,
    strategy="SingleRecord",
    env={
        # Give the model server up to an hour per request.
        "SAGEMAKER_MODEL_SERVER_TIMEOUT": "3600",
        # Raise the request/response size limits (in bytes) for both
        # the TorchServe (TS_) and Multi Model Server (MMS_) backends.
        "TS_MAX_RESPONSE_SIZE": "2000000000",
        "TS_MAX_REQUEST_SIZE": "2000000000",
        "MMS_MAX_RESPONSE_SIZE": "2000000000",
        "MMS_MAX_REQUEST_SIZE": "2000000000",
    },
)
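
With the transformer defined, the batch transform job is then launched as usual; a short sketch, where s3_input is a placeholder for the S3 prefix holding the JSON-lines input:

# Launch the batch transform job; s3_input is a hypothetical S3 prefix.
transformer.transform(
    data=s3_input,
    content_type="application/json",
    split_type="Line",  # one JSON record per line, matching strategy="SingleRecord"
)
transformer.wait()  # block until the job completes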

AWS warns that very large payloads can cause the container to run out of memory, so you may want to watch the instance's memory metrics in CloudWatch.
