39 questions
0 votes · 0 answers · 59 views
SageMaker PyTorch MME ignores entry_point and falls back to default handler, causing ModelLoadError
I'm trying to deploy a custom PyTorch model to a SageMaker Multi-Model Endpoint (MME). My model is saved as a state_dict using torch.save(), so it requires a custom inference.py script to load the ...
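Editor's sketch (not from the thread): the usual cause is that the script never reaches the container — for MME, as far as I can tell, the inference code must travel inside each model.tar.gz (under code/), since an entry_point passed at deploy time is not repacked per model. A minimal sketch of the model_fn the SageMaker PyTorch inference toolkit looks for, where MyModel and model.pth are hypothetical names:

# inference.py -- minimal sketch; MyModel and model.pth are hypothetical names
import os
import torch

from my_model import MyModel  # hypothetical module defining the architecture

def model_fn(model_dir):
    # The SageMaker PyTorch inference toolkit calls model_fn(model_dir)
    # instead of its default loader when this script is found, which is
    # what a state_dict checkpoint needs.
    model = MyModel()
    state = torch.load(os.path.join(model_dir, "model.pth"), map_location="cpu")
    model.load_state_dict(state)
    return model.eval()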
0 votes · 0 answers · 99 views
How can I properly load a LoRA weight into a pretrained Stable Diffusion model on TorchServe and enable parallel inference?
I'm attempting to serve a pretrained Stable Diffusion model with LoRA weights applied using TorchServe. However, the LoRA weights don't seem to load properly, and I'm not sure why. Could anyone help ...
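Editor's sketch (not from the thread): one common way to apply LoRA weights with Hugging Face diffusers, assuming the weights are in a diffusers-compatible format; the model id and file path are placeholders:

import torch
from diffusers import StableDiffusionPipeline

# Load the base pipeline, then apply the LoRA weights on top of it.
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
)
pipe.load_lora_weights("path/to/lora_weights.safetensors")
pipe.to("cuda")

For parallel inference, TorchServe's usual answer is multiple workers, each holding its own pipeline copy, rather than sharing one pipeline across threads.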
0 votes · 1 answer · 223 views
What is the preferred way to load images from S3 into TorchServe for inference?
I have an image classifier model that I plan to deploy via TorchServe. My question is: what is the ideal way to read and write images from/to S3 buckets, instead of the local filesystem, for ...
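Editor's sketch (not from the thread): a straightforward pattern is to pass S3 URIs in the request and do the I/O with boto3 inside the handler. Bucket and key names are placeholders, and credentials are assumed to be available to the TorchServe process:

import io
import boto3
from PIL import Image

s3 = boto3.client("s3")

def read_image(bucket, key):
    # Download the object body and decode it as an RGB image.
    body = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
    return Image.open(io.BytesIO(body)).convert("RGB")

def write_image(img, bucket, key):
    # Encode to PNG in memory and upload without touching local disk.
    buf = io.BytesIO()
    img.save(buf, format="PNG")
    s3.put_object(Bucket=bucket, Key=key, Body=buf.getvalue())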
2 votes · 1 answer · 253 views
Increase Input length in Sagemaker TorchServe Container
I am using a custom inference script for a Huggingface embedding model in an AWS SageMaker TorchServe container. My script accepts JSON input in the following format:
{
"inputs": ["...
1 vote · 0 answers · 320 views
TorchServe workflow gets stuck with more than 14 Python async requests
I am running a TorchServe container with a workflow pipeline and 2 models.
If I send Python async requests to the pipeline, with more than 14 requests TorchServe gets stuck for a long time and then fails.
But if ...
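Editor's sketch (not from the thread): this shape of failure often means the workflow's queue fills faster than its workers drain it. Independent of server-side tuning, capping in-flight requests on the client is a quick sanity check; the endpooint URL and the limit of 8 are assumptions:

import asyncio
import aiohttp

async def predict(session, sem, payload):
    # The semaphore keeps at most 8 requests in flight at once.
    async with sem:
        async with session.post(
            "http://localhost:8080/wfpredict/pipeline", data=payload
        ) as resp:
            return await resp.json()

async def main(payloads):
    sem = asyncio.Semaphore(8)
    async with aiohttp.ClientSession() as session:
        return await asyncio.gather(*(predict(session, sem, p) for p in payloads))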
-1 votes · 1 answer · 199 views
How do I export my fastai resnet50/vision_learner trained model into torchserve?
My goal is to deploy a model I trained with fastai into TorchServe. I was following this tutorial but got stuck at the part where the author creates the model class for PyTorch.
He mentions that to run our ...
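Editor's sketch (not from the thread): fastai's Learner already wraps a plain PyTorch module in learn.model, so one route is to export that module directly; file names below are hypothetical:

import torch

model = learn.model.eval().cpu()  # learn is the trained vision_learner
torch.save(model.state_dict(), "resnet50_fastai.pth")

# Or TorchScript it via tracing, so TorchServe needs no Python model class:
example = torch.randn(1, 3, 224, 224)
torch.jit.trace(model, example).save("resnet50_fastai.pt")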
0 votes · 1 answer · 131 views
PyTorch Serve: Custom handler not saving inference results
I am creating a custom PyTorch Serve handler for my image-enhancement GAN model. The server successfully loads the model but returns no output when I make a request, and shows no error in the logs. ...
0 votes · 1 answer · 283 views
torchserve : batch_size is always 1 even config.properties specify other value
I think my TorchServe instance loaded config.properties correctly, because the number of workers is 2, as I set. But the batch_size is 1 instead of 20.
Does anyone have an idea what might be going wrong? Thanks!
I have ...
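Editor's note (not from the thread): worth checking that batch_size is being set as a per-model registration property rather than a top-level config.properties key — it takes effect at register time (or via the models JSON in config.properties). A sketch against the management API, with the .mar name as a placeholder:

import requests

resp = requests.post(
    "http://localhost:8081/models",
    params={
        "url": "my_model.mar",
        "batch_size": 20,
        "max_batch_delay": 100,  # ms to wait while filling a batch
        "initial_workers": 2,
    },
)
print(resp.json())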
0 votes · 1 answer · 120 views
It seems like metrics.yaml doesn't apply to my TorchServe service
I'm running TorchServe in WSL2. There are three issues with the metrics:
Even if the metrics_config parameter in ts.config points to a non-existent file, everything works without any problems. It looks like ...
2 votes · 1 answer · 699 views
How to Implement Asynchronous Request Handling in TorchServe for High-Latency Inference Jobs?
I'm currently developing a Rails application that interacts with a TorchServe instance for machine learning inference. The TorchServe server is hosted on-premises and equipped with 4 GPUs. We're ...
0 votes · 1 answer · 313 views
How do I save an image or file using TorchServe?
I'm running a Yolov8 object detector with TorchServe. In my custom_handler, I'm trying to grab the detection output JSON and also get the image of the annotated bounding boxes.
When I run the code ...
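Editor's sketch (not from the thread): nothing stops a handler from writing files itself, so one option is to persist the annotated frame in postprocess and return its path (or a base64 string) in the JSON. The output path and the postprocess signature here are hypothetical:

import os
import uuid
from PIL import Image

def postprocess(self, detections, annotated_array):
    out_path = os.path.join("/tmp/outputs", f"{uuid.uuid4().hex}.png")
    os.makedirs(os.path.dirname(out_path), exist_ok=True)
    Image.fromarray(annotated_array).save(out_path)
    # One entry per request; the client fetches the image via the path.
    return [{"detections": detections, "annotated_image": out_path}]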
0 votes · 0 answers · 422 views
TorchServe error: number of batch response mismatched
We deployed an NER model on an n1-standard-8 machine without a GPU, using the config properties below. When we keep the batch size at 1, it takes more time to process simultaneous requests. When we try to ...
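Editor's note (not from the thread): this error appears to be TorchServe checking that handle() returns exactly one response per request in the batch it handed over. A sketch of keeping the lengths aligned even when individual items fail — run_one is a hypothetical helper:

def handle(self, data, context):
    responses = []
    for item in data:  # data holds one entry per batched request
        try:
            responses.append(self.run_one(item))
        except Exception as exc:
            responses.append({"error": str(exc)})
    assert len(responses) == len(data)
    return responses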
0 votes · 1 answer · 619 views
Why is my torchserve docker image not working on google cloud run?
I have this docker image:
# syntax = docker/dockerfile:1.2
FROM continuumio/miniconda3
# install os dependencies
RUN mkdir -p /usr/share/man/man1
RUN apt-get update && \
DEBIAN_FRONTEND=...
1 vote · 1 answer · 946 views
How to create a handler for Hugging Face model deployment using TorchServe
I'm attempting to serve a pretrained Hugging Face model with TorchServe, and I've managed to save the model as a TorchScript file (.pt). However, I do not know what the handler would look like for such ...
1 vote · 1 answer · 69 views
What is the purpose of creating a Python class inherited from `abc.ABC` but without any `abstractmethod`?
I've read the sources of TorchServe's default handlers and found that BaseHandler inherits from abc.ABC but doesn't have any abstract methods; VisionHandler is the same.
What could be the reason ...
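Editor's note: the effect is easy to demonstrate — inheriting abc.ABC without declaring any @abstractmethod changes nothing at runtime (the class stays instantiable), so it mostly documents intent and lets subclasses mark methods abstract later:

import abc

class Base(abc.ABC):      # no abstract methods declared
    def handle(self):
        return "base"

print(Base().handle())    # instantiation works: nothing is abstract

class Strict(Base):
    @abc.abstractmethod   # subclasses can still opt in to abstractness
    def handle(self): ...
# Strict() would now raise TypeError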
1 vote · 0 answers · 911 views
TorchServe worker dies - Message size exceed limit
Every once in a while a TorchServe worker dies with the following message: io.netty.handler.codec.CorruptedFrameException: Message size exceed limit: 16. When I rerun the request in question, it ...
1 vote · 0 answers · 314 views
TorchServe metrics on Prometheus using Kubernetes
I have a torchserve service running on kubernetes and I am already able to track metrics with it on port 8082. My problem is that from the kubernetes pod I can see it logs hardware metrics like:
[INFO ...
1 vote · 1 answer · 904 views
Is TorchServe best practice for Vertex AI, or overhead?
Currently, I am working with a PyTorch model locally using the following code:
from transformers import pipeline
classify_model = pipeline("zero-shot-classification", model='models/...
0 votes · 1 answer · 852 views
Torchserve custom handler - how to pass a list of tensors for batch inferencing
I am trying to create a custom handler in torchserve and want to also use torchserve's batch capability for parallelism for optimum use of resources. I am not able to find out how to write custom ...
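Editor's sketch (not from the thread): with batching enabled, preprocess receives a list with one element per queued request, so the usual move is to decode each element and torch.stack them into a single batch tensor. decode_one and the tensor shapes are assumptions:

import torch

def preprocess(self, data):
    tensors = [self.decode_one(item) for item in data]  # each -> (C, H, W)
    return torch.stack(tensors).to(self.device)         # -> (N, C, H, W)

def inference(self, batch):
    with torch.no_grad():
        return self.model(batch)

def postprocess(self, output):
    return [row.tolist() for row in output]  # one response per request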
2 votes · 2 answers · 418 views
Separate Python environment per model in TorchServe
Can TorchServe run a separate Python environment for each model?
I have four models in production right now, and they all use the transformers==3.1.0 package.
A new model is about to be put in ...
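Editor's note (not from the thread): as far as I know, a single TorchServe instance shares one Python interpreter across models, so truly isolated environments mean separate instances or containers. What it does offer is installing each model's pip requirements at load time — though the packages still land in the shared environment, so conflicting versions remain a problem:

# config.properties
install_py_dep_per_model=true

# and bundle the model's deps when archiving:
# torch-model-archiver ... --requirements-file requirements.txt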
1 vote · 1 answer · 742 views
TorchServe streaming of inference responses with gRPC
I am trying to send a single request to a TorchServe server and retrieve a stream of responses. The processing of the request takes some time and I would like to receive intermediate updates over ...
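Editor's sketch: newer TorchServe releases expose exactly this — a handler can push intermediate results through send_intermediate_predict_response while the gRPC client reads a response stream. A sketch modeled on the documented streaming API (verify it exists in your TorchServe version):

from ts.protocol.otf_message_handler import send_intermediate_predict_response

def handle(self, data, context):
    for step in range(3):
        # Each call ships one intermediate chunk back to the client stream.
        send_intermediate_predict_response(
            [f"update {step}"], context.request_ids,
            "Intermediate Prediction success", 200, context,
        )
    return ["final result"]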
1 vote · 1 answer · 632 views
Google Vertex AI Prediction: Why is TorchServe showing 0 GPUs?
I have deployed a trained PyTorch model to a Google Vertex AI Prediction endpoint. The endpoint is working fine, giving me predictions, but when I examine its logs in Logs Explorer, I see:
INFO 2023-...
0 votes · 1 answer · 948 views
torchserve model not running and giving a load of errors
I ran the following commands
torch-model-archiver --model-name "bert" --version 1.0 --serialized-file ./bert_model/pytorch_model.bin --extra-files "./bert_model/config.json,./bert_model/...
0 votes · 1 answer · 1k views
A TFX equivalent for Pytorch [closed]
I've recently worked with the TensorFlow Extended (TFX) platform. Since my development background is on the PyTorch stack, I'm looking for well-compatible alternatives to TFX for PyTorch.
While searching for ...
1 vote · 0 answers · 335 views
KServe update from 0.7 to 0.9: my .mar file works on 0.7 but not on 0.9, though the example runs without issue on 0.9
I have been tasked with updating KServe from 0.7 to 0.9. Our company's .mar files run fine on 0.7, and when I update to KServe 0.9 the pods come up without issue. However, when a request is ...
1 vote · 1 answer · 1k views
Extremely slow Bert inference on TorchServe for random requests
I have deployed Bert Hugging Face models via TorchServe on an AWS EC2 GPU instance.
There are enough resources provisioned; usage of everything is consistently below 50%.
TorchServe performs ...
8 votes · 3 answers · 7k views
NVIDIA Triton vs TorchServe for SageMaker Inference
NVIDIA Triton vs TorchServe for SageMaker inference? When should each be recommended?
Both are modern, production-grade inference servers. TorchServe is the DLC default inference server for PyTorch models. ...
1 vote · 1 answer · 755 views
Why doesn't this python aiohttp requests code run asynchronously?
I'm trying to access an API with aiohttp but something is causing this code to block each iteration.
def main():
async with aiohttp.ClientSession() as session:
for i, (image, target) in ...
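Editor's sketch (not from the thread): the usual culprit in this shape of code is awaiting each request inside the loop, which serializes everything. A sketch of the concurrent version — the endpoint URL and payloads are placeholders:

import asyncio
import aiohttp

async def send(session, payload):
    async with session.post(
        "http://localhost:8080/predictions/model", data=payload
    ) as resp:
        return await resp.json()

async def main(payloads):
    async with aiohttp.ClientSession() as session:
        # Create all the coroutines first, then run them together.
        return await asyncio.gather(*(send(session, p) for p in payloads))

results = asyncio.run(main([b"a", b"b", b"c"]))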
2 votes · 1 answer · 4k views
How do I create a custom handler in torchserve?
I am trying to create a custom handler on Torchserve.
The custom handler has been modified as follows
# custom handler file
# model_handler.py
"""
ModelHandler defines a custom model ...
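Editor's sketch (not from the thread): the smallest useful shape is a BaseHandler subclass overriding the three hook methods; device placement and model loading are inherited. The payload handling below is an assumption about the request format:

# model_handler.py
import torch
from ts.torch_handler.base_handler import BaseHandler

class ModelHandler(BaseHandler):
    def preprocess(self, data):
        # TorchServe delivers the request payload under "data" or "body".
        payload = data[0].get("data") or data[0].get("body")
        return torch.tensor(payload).to(self.device)

    def inference(self, inputs):
        with torch.no_grad():
            return self.model(inputs)

    def postprocess(self, output):
        return [output.tolist()]  # one entry per request

It is wired in via torch-model-archiver's --handler model_handler.py.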
0 votes · 1 answer · 403 views
Cannot Mount Local Directory to torchserve docker container
Please bear with me as I am new to docker and have never used torchserve before, so any feedback will help. I am trying to create a .mar file in an existing docker container from a model.pt file, ...
2 votes · 2 answers · 2k views
Logging in Custom Handler for TorchServe
I have written a custom handler for a DL model using TorchServe and am trying to understand how to add manual log messages to the handler. I know that I can simply print any messages and it will ...
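Editor's sketch (not from the thread): beyond print(), the standard logging module works inside a handler and, in my experience, lands in TorchServe's worker/model logs alongside printed output:

import logging
from ts.torch_handler.base_handler import BaseHandler

logger = logging.getLogger(__name__)

class MyHandler(BaseHandler):
    def preprocess(self, data):
        logger.info("received %d request(s)", len(data))
        logger.debug("raw payload: %s", data)
        return super().preprocess(data)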
-1 votes · 1 answer · 1k views
SageMaker PyTorch inference stops at model call on GPU
I deployed a PyTorch model using SageMaker and can successfully query it on a CPU. Deploying it on a GPU leads to an InternalServerError client-side, though. Checking the CloudWatch logs shows that the ...
1 vote · 0 answers · 760 views
Containerized TorchServe worker downloads new serialized file on startup
I am trying to build a container running TorchServe with the pretrained fast-rcnn model for object detection, in an all-in-one Dockerfile, based on this example:
https://github.com/pytorch/serve/tree/...
1 vote · 1 answer · 1k views
TorchServe: How to convert bytes output to tensors
I have a model that is served using TorchServe. I'm communicating with the TorchServe server using gRPC. The final postprocess method of the custom handler defined returns a list which is converted ...
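Editor's sketch (not from the thread): if the handler serialized tensors with torch.save into a buffer before returning them, the gRPC client can rebuild them symmetrically; how your handler actually serializes is the assumption here:

import io
import torch

# Handler side (postprocess): tensor -> bytes
buf = io.BytesIO()
torch.save(torch.randn(2, 3), buf)
raw = buf.getvalue()

# gRPC client side: bytes -> tensor
tensor = torch.load(io.BytesIO(raw))
print(tensor.shape)  # torch.Size([2, 3])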
0 votes · 1 answer · 524 views
How can I create a TorchServe server on Google Colab and use it for prediction?
I tried to create a TorchServe instance on Google Colab, but it takes forever to load, and it seems that I can't even connect to the server. Is it possible to run TorchServe on Colab? Here is what it shows when ...
0 votes · 1 answer · 113 views
Heroku torchserve app crashed, exited with status 0
I'm deploying TorchServe on a Heroku free dyno. The deploy works fine, but the app isn't running properly.
LOG1:
2022-03-17T02:11:10.352655+00:00 heroku[web.1]: Starting process with command `torchserve --...
0 votes · 1 answer · 2k views
How can I register a local model.mar to a running torchserve service?
I have a running torchserve service. According to the docs, I can register a new model at port 8081 with the ManagementAPI. When running curl -X OPTIONS http://localhost:8081, the output also states ...
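Editor's sketch (not from the thread): the detail that usually matters is that the url parameter is resolved against the model_store directory the server was started with, so for a local file you pass just the .mar file name. The file name below is hypothetical:

import requests

resp = requests.post(
    "http://localhost:8081/models",
    params={"url": "my_model.mar", "initial_workers": 1, "synchronous": "true"},
)
print(resp.status_code, resp.json())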
0 votes · 1 answer · 1k views
TorchServe fails to load model in Docker while it runs locally
I have a TorchScript model (.pt) that I can successfully load and serve with TorchServe on my local machine. On the other hand, when trying to deploy it in the official TorchServe Docker image, it will ...
0 votes · 1 answer · 1k views
Deployment with custom handler on Google Cloud Vertex AI
I'm trying to deploy a TorchServe instance on Google Vertex AI platform but as per their documentation (https://cloud.google.com/vertex-ai/docs/predictions/custom-container-requirements#...