Consider the following LeNet model for MNIST:

import torch
from torch import nn
import torch.nn.functional as F

class LeNet(nn.Module):
    def __init__(self):
        super(LeNet, self).__init__()
        self.conv1 = nn.Conv2d(1, 20, 5, 1)
        self.conv2 = nn.Conv2d(20, 50, 5, 1)
        self.fc1 = nn.Linear(4*4*50, 500)
        self.fc2 = nn.Linear(500, 10)
        self.criterion = nn.CrossEntropyLoss()
    def forward(self, x):
        x = self.conv1(x)
        x = F.max_pool2d(x, 2, 2)
        x = F.relu(x)
        x = self.conv2(x)
        x = F.max_pool2d(x, 2, 2)
        x = F.relu(x)
        x = x.view(-1, 4*4*50)
        x = self.fc1(x)
        x = self.fc2(x)
        return x

Now I use this model to run a single forward pass on a batch of samples:

network=LeNet()
optimizer = torch.optim.SGD(network.parameters(), lr=0.001, momentum=0.9)
device = torch.device("cpu")
network.to(device)
network.train()
optimizer.zero_grad()
# X_batch= ... some batch of 50 samples pulled from a train_loader defined as
# torch.manual_seed(42)
# training_set = datasets.MNIST('./mnist_data', train=True, download=False, 
#                               transform=transforms.Compose([
#                                   transforms.ToTensor(),
#                                   transforms.Normalize((0.1307,), (0.3081,))]))
# train_loader = torch.utils.data.DataLoader(training_set, 
#                                            batch_size=50, 
#                                            shuffle=False)
logits = network(X_batch)

Note that I pass download=False to the dataset and shuffle=False to the loader, because the data set is already downloaded and I don't want to shuffle it. My problem is that if I run this code twice, I get different values for logits, and I don't understand why, since everything else seems to be unchanged. As an extra check, I also convert X_batch to a NumPy array and verify with numpy.array_equal() that the batch of samples is exactly the same as in the previous execution.

I really can't figure out what I am missing here unless there are precision issues.

1 Answer

The reason is that every time you run this code you call

network = LeNet()

and end up with a different random initialization of the network's weights. If you set the random seed before doing that, e.g. like this:

torch.manual_seed(42)
network = LeNet()

then you should get the same results on the first forward pass, provided you use the same data as input.
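To check this yourself, here is a minimal sketch that builds the network twice with the same seed and twice without reseeding, then compares the logits. It assumes a CPU run; the helper first_logits and the random stand-in batch x_batch are hypothetical names introduced for illustration (the real MNIST batch is not needed to see the effect), and the unused loss attribute is omitted from the model.

```python
import torch
from torch import nn
import torch.nn.functional as F


class LeNet(nn.Module):
    # Same layers as in the question.
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 20, 5, 1)
        self.conv2 = nn.Conv2d(20, 50, 5, 1)
        self.fc1 = nn.Linear(4 * 4 * 50, 500)
        self.fc2 = nn.Linear(500, 10)

    def forward(self, x):
        x = F.relu(F.max_pool2d(self.conv1(x), 2, 2))
        x = F.relu(F.max_pool2d(self.conv2(x), 2, 2))
        x = x.view(-1, 4 * 4 * 50)
        return self.fc2(self.fc1(x))


def first_logits(seed, x):
    # Hypothetical helper: optionally seed the global RNG, build a fresh
    # network, and return the logits of a single forward pass.
    if seed is not None:
        torch.manual_seed(seed)
    return LeNet()(x)


# Stand-in for X_batch: 50 fake MNIST-sized images drawn from a private
# generator, so the input is identical for every call below.
x_batch = torch.randn(50, 1, 28, 28, generator=torch.Generator().manual_seed(0))

with torch.no_grad():
    a = first_logits(42, x_batch)    # seeded before construction
    b = first_logits(42, x_batch)    # same seed again
    c = first_logits(None, x_batch)  # no reseeding: RNG state has moved on
    d = first_logits(None, x_batch)  # and moved on again

same_when_seeded = torch.allclose(a, b)
same_when_unseeded = torch.allclose(c, d)
print(same_when_seeded, same_when_unseeded)
```

The seed has to be set before LeNet() is constructed, because the layer weights are drawn from the global random number generator in the constructor. Seeding only the data pipeline, as in the commented-out loader code in the question, does not help if the network is created at a different point in the random stream.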
