I am using a custom inference script for a Hugging Face embedding model in an AWS SageMaker TorchServe container. My script accepts JSON input in the following format:

{
  "inputs": ["chunk1", "chunk2", "chunk3", "chunk4", ...]
}
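
For context, a minimal sketch of what such a custom inference script could look like, using the standard SageMaker model_fn/predict_fn hooks (the mean-pooling logic and the returned shape here are assumptions for illustration, not the actual script):

# inference.py -- hypothetical sketch of a custom embedding handler.
# model_fn / predict_fn are the standard SageMaker inference hooks;
# the pooling logic and output format are assumptions.
import torch
from transformers import AutoModel, AutoTokenizer

def model_fn(model_dir):
    # Load the tokenizer and model packaged inside model.tar.gz.
    tokenizer = AutoTokenizer.from_pretrained(model_dir)
    model = AutoModel.from_pretrained(model_dir)
    model.eval()
    return tokenizer, model

def predict_fn(data, model_and_tokenizer):
    tokenizer, model = model_and_tokenizer
    # "inputs" holds the list of text chunks from the request JSON.
    chunks = data["inputs"]
    encoded = tokenizer(chunks, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**encoded)
    # Mean-pool token embeddings into one vector per chunk.
    embeddings = outputs.last_hidden_state.mean(dim=1)
    return {"embeddings": embeddings.tolist()}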

When I send a large number of chunks in this JSON, it appears to hit a size limit in the SageMaker TorchServe image, resulting in the following error:

io.netty.handler.codec.CorruptedFrameException: Message size exceed limit: 102970662

Is it possible to increase the size limit in TorchServe settings within AWS SageMaker? If so, how can this be achieved?

I could not find any documentation on this from AWS, so I tried adding a TorchServe config.properties file inside model.tar.gz, but it did not take effect.
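
For reference, on a standalone TorchServe deployment these limits are raised in config.properties with the following keys (values in bytes); this is the kind of file that had no effect when packaged into model.tar.gz on SageMaker:

# config.properties -- standalone TorchServe settings (values in bytes).
# Packaging this file into model.tar.gz did not work in the SageMaker container.
max_request_size=2000000000
max_response_size=2000000000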

1 Answer

I was able to work around this by including environment variables in the transformer definition, like this (more details in the related GitHub discussion):

transformer = huggingface_model.transformer(
    instance_count=1,
    output_path=s3_output,
    instance_type="ml.m5.xlarge",
    assemble_with="Line",
    max_payload=6,
    strategy="SingleRecord",
    env={
        # Give the model server up to an hour per request.
        "SAGEMAKER_MODEL_SERVER_TIMEOUT": "3600",
        # Raise the request/response size limits (in bytes) for both
        # the TorchServe (TS_) and Multi Model Server (MMS_) backends.
        "TS_MAX_RESPONSE_SIZE": "2000000000",
        "TS_MAX_REQUEST_SIZE": "2000000000",
        "MMS_MAX_RESPONSE_SIZE": "2000000000",
        "MMS_MAX_REQUEST_SIZE": "2000000000",
    },
)
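
With the transformer defined, the batch transform job is then launched as usual; a short sketch, where s3_input is a placeholder for the S3 prefix holding the JSON-lines input:

# Launch the batch transform job; s3_input is a hypothetical S3 prefix.
transformer.transform(
    data=s3_input,
    content_type="application/json",
    split_type="Line",  # one JSON record per line, matching strategy="SingleRecord"
)
transformer.wait()  # block until the job completes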

AWS warns that very large payloads can cause the container to run out of memory, so you may want to watch the instance's memory metrics in CloudWatch.
