
This example is taken verbatim from the PyTorch documentation. I have some background in deep learning, so it is clear to me that the forward call represents a forward pass: the input passes through the layers until it reaches the end, producing 10 outputs in this case, and you then take the output of the forward pass and compute the loss with the loss function one has defined. What I have forgotten is what exactly the output of forward() is in this scenario.

I thought that the last layer of a neural network should be some sort of activation function like sigmoid() or softmax(), but I don't see either defined anywhere here. Furthermore, in a project I am working on now, I found that softmax() is called later on, after the forward pass. So I just want to clarify what exactly outputs = net(inputs) gives me. From this link, it seems that by default the output of a PyTorch model's forward pass is logits?

import torch
import torchvision
import torchvision.transforms as transforms

transform = transforms.Compose(
    [transforms.ToTensor(),
     transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])

trainset = torchvision.datasets.CIFAR10(root='./data', train=True,
                                        download=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=4,
                                          shuffle=True, num_workers=2)

import torch.nn as nn
import torch.nn.functional as F


class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 16 * 5 * 5)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x


net = Net()

import torch.optim as optim

criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)

for epoch in range(2):  # loop over the dataset multiple times

    running_loss = 0.0
    for i, data in enumerate(trainloader, 0):
        # get the inputs; data is a list of [inputs, labels]
        inputs, labels = data

        # zero the parameter gradients
        optimizer.zero_grad()

        # forward + backward + optimize
        outputs = net(inputs)
        print(outputs)  # inspect the raw forward output
        break           # debugging only: stops before training actually runs
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        # print statistics
        running_loss += loss.item()
        if i % 2000 == 1999:    # print every 2000 mini-batches
            print('[%d, %5d] loss: %.3f' %
                  (epoch + 1, i + 1, running_loss / 2000))
            running_loss = 0.0

print('Finished Training')
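For reference, a minimal sketch of what that print(outputs) shows, using a random dummy batch instead of the loader (the exact numbers depend on the randomly initialized weights):

dummy_batch = torch.randn(4, 3, 32, 32)  # stand-in for one CIFAR-10 batch of 4 images

outputs = net(dummy_batch)
print(outputs.shape)     # torch.Size([4, 10]) -- 10 values per image, one per class
print(outputs[0])        # unbounded real numbers, not probabilities
print(outputs[0].sum())  # does not sum to 1; no softmax has been applied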
2 Comments

  • There is no such thing as a default output of a forward function in PyTorch. Commented Nov 24, 2020 at 15:21
  • When no layer with a nonlinearity is added at the end of the network, the output is basically a real-valued scalar, vector, or tensor. Commented Nov 24, 2020 at 22:54

1 Answer


it seems to me by default the output of a PyTorch model's forward pass is logits

As I can see from the forward pass, yes, your function is returning the raw output of the last layer (the logits):

def forward(self, x):
  x = self.pool(F.relu(self.conv1(x)))
  x = self.pool(F.relu(self.conv2(x)))
  x = x.view(-1, 16 * 5 * 5)
  x = F.relu(self.fc1(x))
  x = F.relu(self.fc2(x))
  x = self.fc3(x)
  return x

So, where is softmax? Right here:

criterion = nn.CrossEntropyLoss()

It's a bit hidden, but the softmax computation is handled inside this loss function: nn.CrossEntropyLoss applies a log-softmax to the raw output of your last layer and then computes the negative log-likelihood loss on the result.
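You can check this equivalence directly. A minimal sketch (with arbitrary logits and labels, not from the tutorial) showing that nn.CrossEntropyLoss gives the same result as log-softmax followed by negative log-likelihood:

import torch
import torch.nn as nn
import torch.nn.functional as F

logits = torch.randn(4, 10)          # raw outputs, like the ones returned by fc3
labels = torch.tensor([3, 0, 9, 1])  # arbitrary class indices

ce = nn.CrossEntropyLoss()(logits, labels)
nll = F.nll_loss(F.log_softmax(logits, dim=1), labels)

print(torch.allclose(ce, nll))  # True: the (log-)softmax happens inside the loss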

This is the softmax calculation:

$$\operatorname{softmax}(z_i) = \frac{e^{z_i}}{\sum_j e^{z_j}}$$

where z_i are the raw outputs of the neural network
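A quick numeric check of that formula against PyTorch's own implementation (illustrative values, chosen arbitrarily):

import torch

z = torch.tensor([2.0, 1.0, 0.1])           # raw outputs z_i
manual = torch.exp(z) / torch.exp(z).sum()  # e^{z_i} / sum_j e^{z_j}

print(manual)                   # tensor([0.6590, 0.2424, 0.0986])
print(torch.softmax(z, dim=0))  # same values
print(manual.sum())             # sums to 1: a valid probability distribution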

So, in conclusion, there is no activation function after your last layer because it's handled by the nn.CrossEntropyLoss class

As for what the raw output of nn.Linear is: each output value of a linear layer is a linear combination of the values coming from the neurons of the previous layer, i.e. a weighted sum of the inputs plus a bias
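A minimal sketch of that claim (sizes chosen arbitrarily, to match the comment below): nn.Linear computes x @ W.T + b.

import torch
import torch.nn as nn

layer = nn.Linear(20, 5)  # 20 inputs -> 5 outputs
x = torch.randn(1, 20)

manual = x @ layer.weight.T + layer.bias  # weighted sum of the inputs, plus a bias
print(torch.allclose(layer(x), manual))   # True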


6 Comments

Thank you! So, if the layer before my final linear layer has 20 neurons/output values, and my linear layer has 5 outputs/classes, I can expect the output of the linear layer to be an array of 5 values, each of which is the linear combination of the 20 values multiplied by the 20 weights, plus a bias?
@ilovewt Yes, that's correct. The raw output is then combined with softmax inside the loss to produce probabilities
To get the softmax predictions, what I did is something like softmax_preds = torch.nn.Softmax(dim=1)(input=raw_outputs).to('cpu').detach().numpy(), because even though nn.CrossEntropyLoss() does incorporate softmax inside, all it gives me is the loss when I call loss = criterion(raw_outputs, labels). Is this right?
@ilovewt Yes, it is correct. Anyway, I suggest you open a new question if you have any new problems or implementation issues that the docs don't clear up (PyTorch is very well documented :) pytorch.org/docs/stable/generated/torch.nn.Softmax.html, pytorch.org/tutorials/beginner/nlp/deep_learning_tutorial.html). It's better to stay on topic for your current question
Feel free to tag me. Unfortunately I'm not that much of a PyTorch expert (I know Keras/TF better :)), but if I know the answer I'll help
