I am creating a custom TorchServe handler for my image-enhancement GAN model. The server loads the model successfully, but it returns no output when I make a request, and it shows no error in the logs. It also does not log the messages I added in my handler code.

handler.py

import io
import cv2
import torch
import logging
import numpy as np
from PIL import Image
from typing import List
from ts.torch_handler.base_handler import BaseHandler
from GANModel import GAN
from LD import ldDetector
from fb import alc, pfb

# Define global variables for model paths
MODEL_PATH_FAC2 = 'landmarks.dat'
MODEL_PATH_RET = 'model.pth'
logger = logging.getLogger(__name__)



class ImageHandler(BaseHandler):
    def __init__(self):
        super(ImageHandler, self).__init__()
        self.initialized = False
        self.device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
        self.fc_h = None
        self.fc_enh = None

    def initialize(self, context):
        logger.info("\n\n\n Initialized Successfully")
        self.fc_h = ldDetector(MODEL_PATH_FAC2)
        self.fc_enh = self.get_enhancement_model(self.device)
        self.initialized = True
        logger.info("Initialized Successfully \n\n\n")
    
    def get_enhancement_model(self, device):
        gan = GAN(device)

        loadnet = torch.load(MODEL_PATH_RET)
        name = 'params_ema' if 'params_ema' in loadnet else 'params'
        gan.load_state_dict(loadnet[name], strict=True)
        gan.eval()
        gan = gan.to(device)
        return gan

    def inference(self, image: Image.Image) -> Image.Image:
        logger.info("\n\n\n Inside Inference Function")
        logger.info("Type of received image (PIL Image) = %s", type(image))
        input_tensor = self.preprocess(image)
        output_tensor = self.pi(input_tensor)
        output_image = self.postprocess(output_tensor)
        logger.info("Type of output (PIL Image) = %s", type(output_image))
        return output_image

    def preprocess(self, data) -> Image.Image:
        logger.info("\n\n\n Inside preprocess Function")
        logger.info("Type of received data = ",type(data))
        if isinstance(data, bytes):
        # If data is bytes, assume it's the raw image content
            image = Image.open(io.BytesIO(data))
        else:
            # If data is not bytes, assume it's a file path
            image = Image.open(data)

        print("Preprocessing complete.")
        return image

    def postprocess(self, output: torch.Tensor) -> Image.Image:
        output_np = output.squeeze(0).detach().cpu().numpy().transpose(1, 2, 0)
        output_np = np.clip(output_np, 0, 255)
        # Convert NumPy array to PIL image
        output_image = Image.fromarray(output_np.astype(np.uint8))
        output_image.save("processed_output.jpg")
        return output_image

    def pi(self, input_tensor: torch.Tensor) -> torch.Tensor:
        restored_img, _, _ = self.poi(input_tensor)
        return restored_img

    def poi(self, input_tensor: torch.Tensor):
        img_np = input_tensor.squeeze(0).detach().cpu().numpy().transpose(1, 2, 0)
        face_landmarks, _ = self.fc_h.get_face_landmarks(img_np)
        face_count = len(face_landmarks)
        restored_img = None
        for face_landmark in face_landmarks:
            cropped_face, inverse_affine = alc()
            restored_face = self.fc_enh(torch.from_numpy(cropped_face))
            restored_face = restored_face.squeeze(0).detach().cpu().numpy().transpose(1, 2, 0)
            restored_img = pfb(img_np, restored_face, inverse_affine=inverse_affine)

        return torch.from_numpy(restored_img.transpose(2, 0, 1)).unsqueeze(0), [], []



Command for creating the .mar archive:

torch-model-archiver --model-name facex --version 1.0 --model-file model.py --serialized-file model.pth --handler handler.py --extra-files landmarks.dat,GANModel.py,LD.py,Utils.py

Command for running the server:

torchserve --ncs --start --model-store model_store --ts-config config.properties --models facex.mar

Command for inference:

curl -X POST http://127.0.0.1:8080/predictions/facex -T 0294.png

config.properties

grpc_inference_port=7000
grpc_management_port=7001

ts_log.log

2024-03-19T00:28:13,602 [WARN ] main org.pytorch.serve.util.ConfigManager - Your torchserve instance can access any URL to load models. When deploying to production, make sure to limit the set of allowed_urls in config.properties
2024-03-19T00:28:13,605 [INFO ] main org.pytorch.serve.servingsdk.impl.PluginsManager - Initializing plugins manager...
2024-03-19T00:28:13,646 [INFO ] main org.pytorch.serve.metrics.configuration.MetricConfiguration - Successfully loaded metrics configuration
2024-03-19T00:28:13,754 [INFO ] main org.pytorch.serve.ModelServer - 
Torchserve version: 0.10.0

Temp directory: /tmp

Number of GPUs: 1
Number of CPUs: 16
Max heap size: 7988 M

Config file: config.properties
Inference address: http://127.0.0.1:8080
Management address: http://127.0.0.1:8081
Metrics address: http://127.0.0.1:8082

Initial Models: facex.mar

Netty threads: 0
Netty client threads: 0
Default workers per model: 1
Blacklist Regex: N/A
Maximum Response Size: 6553500
Maximum Request Size: 6553500
Limit Maximum Image Pixels: true
Prefer direct buffer: false
Allowed Urls: [file://.*|http(s)?://.*]
Custom python dependency for model allowed: false
Enable metrics API: true
Metrics mode: LOG
Disable system metrics: false

CPP log config: N/A
Model config: N/A
System metrics command: default
2024-03-19T00:28:13,767 [INFO ] main org.pytorch.serve.servingsdk.impl.PluginsManager -  Loading snapshot serializer plugin...
2024-03-19T00:28:13,794 [INFO ] main org.pytorch.serve.ModelServer - Loading initial models: facex.mar
2024-03-19T00:28:20,040 [DEBUG] main org.pytorch.serve.wlm.ModelVersionedRefs - Adding new version 1.0 for model facex
2024-03-19T00:28:20,041 [DEBUG] main org.pytorch.serve.wlm.ModelVersionedRefs - Setting default version to 1.0 for model facex
2024-03-19T00:28:20,041 [INFO ] main org.pytorch.serve.wlm.ModelManager - Model facex loaded.
2024-03-19T00:28:20,041 [DEBUG] main org.pytorch.serve.wlm.ModelManager - updateModel: facex, count: 1

2024-03-19T00:28:20,047 [INFO ] main org.pytorch.serve.ModelServer - Initialize Inference server with: EpollServerSocketChannel.
2024-03-19T00:28:20,047 [DEBUG] W-9000-facex_1.0 org.pytorch.serve.wlm.WorkerLifeCycle - Worker cmdline: 
2024-03-19T00:28:20,082 [INFO ] main org.pytorch.serve.ModelServer - Inference API bind to: http://127.0.0.1:8080
2024-03-19T00:28:20,082 [INFO ] main org.pytorch.serve.ModelServer - Initialize Management server with: EpollServerSocketChannel.
2024-03-19T00:28:20,082 [INFO ] main org.pytorch.serve.ModelServer - Management API bind to: http://127.0.0.1:8081
2024-03-19T00:28:20,083 [INFO ] main org.pytorch.serve.ModelServer - Initialize Metrics server with: EpollServerSocketChannel.
2024-03-19T00:28:20,083 [INFO ] main org.pytorch.serve.ModelServer - Metrics API bind to: http://127.0.0.1:8082

1 Answer

Not 100% sure if this is the only reason, but I think the postprocess function needs to return a list, whereas you're returning a Pillow object (see https://github.com/pytorch/serve/blob/master/ts/torch_handler/base_handler.py#L375).
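
For context, the default handle flow in base_handler.py looks roughly like this (a simplified paraphrase of the linked file, with timing, metrics, and error handling omitted):

def handle(self, data, context):
    # `data` is a list of requests -- one entry per item in the batch
    data_preprocess = self.preprocess(data)
    output = self.inference(data_preprocess)
    output = self.postprocess(output)
    # TorchServe expects `output` to be a list with one entry per request
    return output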

So you can perhaps try:

def postprocess(self, output: torch.Tensor) -> list:
    output_np = output.squeeze(0).detach().cpu().numpy().transpose(1, 2, 0)
    output_np = np.clip(output_np, 0, 255)
    # Convert NumPy array to PIL image
    output_image = Image.fromarray(output_np.astype(np.uint8))
    output_image.save("processed_output.jpg")
    return [output_image]

Also, if you want to return the processed image, you might want to encode it first (e.g. using base64). I don't think you can send a Pillow object over HTTP as-is.
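
For example, a minimal sketch of a base64-returning postprocess (the JPEG format choice here is just an illustrative assumption):

import base64
import io

def postprocess(self, output: torch.Tensor) -> list:
    output_np = output.squeeze(0).detach().cpu().numpy().transpose(1, 2, 0)
    output_np = np.clip(output_np, 0, 255)
    output_image = Image.fromarray(output_np.astype(np.uint8))
    # Serialize the PIL image to an in-memory buffer, then base64-encode
    # it so it can travel in an HTTP/JSON response
    buffer = io.BytesIO()
    output_image.save(buffer, format="JPEG")
    encoded = base64.b64encode(buffer.getvalue()).decode("utf-8")
    return [encoded]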
