forward() using PyTorch Lightning not giving consistent binary classification results for single vs. multiple images

Question

I have trained a Variational Autoencoder (VAE) with an additional fully connected layer after the encoder for binary image classification. It is set up using PyTorch Lightning. The encoder and decoder are the resnet18 components from the PyTorch Lightning Bolts repo.

import torch
from torch import nn
from pytorch_lightning import LightningModule
from pl_bolts.models.autoencoders.components import (
    resnet18_encoder,
    resnet18_decoder,
)

class VariationalAutoencoder(LightningModule):

...

    self.first_conv: bool = False
    self.maxpool1: bool = False
    self.enc_out_dim: int = 512
    self.encoder = resnet18_encoder(first_conv, maxpool1)
    self.fc_object_identity = nn.Linear(self.enc_out_dim, 1)


    def forward(self, x):
        x_encoded = self.encoder(x)
        mu = self.fc_mu(x_encoded)
        log_var = self.fc_var(x_encoded)
        p, q, z = self.sample(mu, log_var)

        x_classification_score = torch.sigmoid(self.fc_object_identity(x_encoded))

        return self.decoder(z), x_classification_score

variational_autoencoder = VariationalAutoencoder.load_from_checkpoint(
    checkpoint_path=str(checkpoint_file_path)
)

with torch.no_grad():
    predicted_images, classification_score = variational_autoencoder(test_images)

The reconstructions work well for single images and multiple images when passed through forward(). However, when I pass multiple images to forward() I get different results for the classification score than if I pass a single image tensor:

# Image 1 (class=1) [1, 3, 64, 64]
x_classification_score = 0.9857

# Image 2 (class=0) [1, 3, 64, 64]
x_classification_score = 0.0175

# Image 1 and 2 [2, 3, 64, 64]
x_classification_score = [[0.8943],
                          [0.1736]]

Why is this happening?

Please provide the architecture for the encoder. You are probably not running the model in PyTorch's evaluation mode, hence the results are different.
– Szymon Maszke, Commented May 27, 2022 at 21:33
Ah, thanks @szymonmaszke that seems to be it. I have added variational_autoencoder.eval() before the with torch.no_grad(): line and the results are now consistent. So without eval() the network is changing its architecture between inferencing the first image and second one when passing multiple?
– aktabit, Commented May 28, 2022 at 5:28

Szymon Maszke · Accepted Answer · last edited by Witek Bobrowski, 2022-06-23 20:47:35Z

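You are using resnet18, which contains torch.nn.BatchNorm2d layers.

Their behavior depends on whether the module is in train or eval mode. In train mode, each layer computes the mean and variance across the current batch, so the output for one image depends on the other examples in that batch. The architecture itself does not change; only the statistics used for normalization do.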
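The batch dependence can be reproduced with a single BatchNorm2d layer. This is only a minimal sketch: the random tensor `images` is a hypothetical stand-in for the test images, and the layer is a fresh one rather than anything taken from the resnet18 encoder.

```python
import torch
from torch import nn

torch.manual_seed(0)

# A freshly constructed module is in train mode by default.
bn = nn.BatchNorm2d(3)

# Hypothetical stand-in for two test images.
images = torch.randn(2, 3, 8, 8)

with torch.no_grad():
    out_batch = bn(images)        # normalized with statistics over both images
    out_single = bn(images[:1])   # normalized with statistics over image 1 only

# The same image is normalized differently depending on what else is in the batch.
same_in_train_mode = torch.allclose(out_batch[:1], out_single, atol=1e-6)
print(same_in_train_mode)
```

Here `same_in_train_mode` comes out false, which mirrors the differing classification scores in the question.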

In eval mode, the layers instead use the running mean and variance accumulated during training via a moving average. These are independent of the batch, so the results are the same for single and multiple images.
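The fix can be sketched as follows. The small `model` below is a hypothetical stand-in for the encoder plus classification head, not the asker's VariationalAutoencoder, and `images` is random data. The sketch also shows that torch.no_grad() only disables gradient tracking and does not switch the module to eval mode, so eval() has to be called explicitly.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Hypothetical stand-in for an encoder with BatchNorm plus a sigmoid classifier.
model = nn.Sequential(
    nn.Conv2d(3, 4, kernel_size=3, padding=1),
    nn.BatchNorm2d(4),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(4, 1),
    nn.Sigmoid(),
)

images = torch.randn(2, 3, 64, 64)

with torch.no_grad():
    # no_grad() leaves the train/eval flag untouched.
    still_training = model.training

model.eval()  # BatchNorm now uses its running statistics
with torch.no_grad():
    scores_batch = model(images)
    scores_single = torch.cat([model(images[i:i + 1]) for i in range(2)])

# Scores agree whether the images are passed together or one at a time.
consistent_in_eval = torch.allclose(scores_batch, scores_single, atol=1e-5)
print(still_training, consistent_in_eval)
```

For the code in the question, the equivalent change is calling variational_autoencoder.eval() before the with torch.no_grad(): block, as the asker confirmed in the comments.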

edited Jun 23, 2022 at 20:47 by Witek Bobrowski
answered May 28, 2022 at 7:27 by Szymon Maszke
