Input batch size doesn't match target batch size in CrossEntropyLoss function

Question

I've been trying to build a model from scratch to recognize handwritten digits from the MNIST dataset with the help of PyTorch and the DataLoader class from FastAI. So far, I've been using a linear model that has 784 inputs (a flattened grayscale 28 by 28 handwritten digit image tensor) and 10 outputs.

simple_linear = torch.nn.Linear(784, 10)

My training data is organized as such:

train_x = torch.cat([stacked_zeros, stacked_ones, stacked_twos, stacked_threes, 
                     stacked_fours, stacked_fives, stacked_sixes, stacked_sevens, 
                     stacked_eights, stacked_nines]).view(-1, 28*28)

train_y = torch.nn.functional.one_hot(tensor([0] * len(zeros) + [1] * len(ones) + [2] * len(twos) + 
                 [3] * len(threes) + [4] * len(fours) + [5] * len(fives) + 
                 [6] * len(sixes) + [7] * len(sevens) + [8] * len(eights) + 
                 [9] * len(nines)).unsqueeze(1))

My x variables have shape [784] while y variables are labeled using one-hot encoded vectors with [1, 10] shape.

The loss function I chose based on research is torch.nn.CrossEntropyLoss and the following code gives me an error:

mnist_loss = torch.nn.CrossEntropyLoss()
mnist_loss(simple_linear(train_x[0]), train_y[0])

ValueError                                Traceback (most recent call last)
<ipython-input-245-03f54a6a43fb> in <module>()
----> 1 tst(simple_linear(x), y)

8 frames
/usr/local/lib/python3.6/dist-packages/torch/nn/functional.py in nll_loss(input, target, weight, size_average, ignore_index, reduce, reduction)
   2260     if input.size(0) != target.size(0):
   2261         raise ValueError('Expected input batch_size ({}) to match target batch_size ({}).'
-> 2262                          .format(input.size(0), target.size(0)))
   2263     if dim == 2:
   2264         ret = torch._C._nn.nll_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index)

ValueError: Expected input batch_size (1) to match target batch_size (10).

I've tried reshaping my x variables and y variables but I always get a similar error. How must my data be structured in order for the loss function to work?

Ivan · Accepted Answer · 2021-01-08 15:08:37Z

torch.nn.CrossEntropyLoss doesn't take one-hot encoded targets!

Just pass the class index for each label instead:

train_y = torch.tensor([0] * len(zeros) + [1] * len(ones) + [2] * len(twos) + 
                 [3] * len(threes) + [4] * len(fours) + [5] * len(fives) + 
                 [6] * len(sixes) + [7] * len(sevens) + [8] * len(eights) + 
                 [9] * len(nines)).unsqueeze(1)
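
If the labels are already one-hot encoded, they can be turned back into class indices rather than rebuilt. The following is a minimal sketch, not the asker's actual pipeline: the tensors `x` and `labels` are made-up stand-ins, and it assumes the loss receives logits of shape (N, 10) together with a 1-D integer target of shape (N,).

```python
import torch

torch.manual_seed(0)

# Hypothetical stand-in data: 6 flattened 28x28 images and their digit labels.
x = torch.rand(6, 28 * 28)
labels = torch.tensor([0, 1, 2, 3, 4, 9])

# One-hot targets like the ones in the question, shape (6, 10).
one_hot = torch.nn.functional.one_hot(labels, num_classes=10)

# Recover the class indices: shape (6,), dtype int64.
targets = one_hot.argmax(dim=1)

model = torch.nn.Linear(28 * 28, 10)
loss_fn = torch.nn.CrossEntropyLoss()

logits = model(x)                # shape (6, 10): one row of class scores per sample
loss = loss_fn(logits, targets)  # scalar tensor
```

Note that the target here has no trailing dimension of size 1; it is a flat vector with one integer per sample.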

Here's a suggestion: you could write everything like this:

dataset = [stacked_zeros, stacked_ones, stacked_twos, stacked_threes,
           stacked_fours, stacked_fives, stacked_sixes, stacked_sevens, 
           stacked_eights, stacked_nines]

train_x = torch.cat(dataset)
train_y = torch.tensor([[i]*d.size(0) for i, d in enumerate(dataset)])
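
The list comprehension above builds one inner list per class, and torch.tensor can only convert nested lists whose rows all have the same length, which the digit classes do not. One possible way around this, sketched here with hypothetical stand-in stacks of deliberately different sizes, is to build a 1-D label tensor per class and concatenate them:

```python
import torch

# Hypothetical stand-ins for the per-digit image stacks; sizes differ on purpose.
dataset = [torch.rand(n, 28, 28) for n in (5, 7, 3)]

# Concatenate all images and flatten each one to 784 values.
train_x = torch.cat(dataset).view(-1, 28 * 28)

# One 1-D label tensor per class, concatenated into a flat vector of shape (15,).
train_y = torch.cat([
    torch.full((d.size(0),), i, dtype=torch.long)
    for i, d in enumerate(dataset)
])
```

The labels end up in the same order as the rows of train_x, since both are built by iterating over the same list.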

2 Comments

pedrolins Over a year ago

I am still encountering problems after these changes. Your suggestion regarding how to organize my train_y variable gave me an error: "ValueError: expected sequence of length 5923 at dim 1 (got 6742)". Also, even after your changes I'm still facing problems with the loss function. Now the error says: "Dimension out of range (expected to be in range of [-1, 0], but got 1)"

Ivan Over a year ago

Ok, forget my suggestion for now then. Can you print out the shapes of train_y and train_x?

pedrolins · Accepted Answer · 2021-01-09 11:41:25Z

The errors disappeared after I:

- Removed the .unsqueeze(1) from my train_y
- Passed a whole batch to the loss function, instead of trying it with a single x and a single y
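
Those two changes can be sketched as follows. The data is random and purely illustrative; only the names simple_linear, mnist_loss, train_x and train_y come from the question. The last line shows one way to still test a single example, by slicing with 0:1 so that the batch dimension is kept.

```python
import torch

torch.manual_seed(0)

# Hypothetical stand-in data: 8 flattened images with integer labels (no unsqueeze).
train_x = torch.rand(8, 28 * 28)
train_y = torch.randint(0, 10, (8,))

simple_linear = torch.nn.Linear(28 * 28, 10)
mnist_loss = torch.nn.CrossEntropyLoss()

# Whole batch: input has shape (8, 10), target has shape (8,).
batch_loss = mnist_loss(simple_linear(train_x), train_y)

# Single sample: slicing with 0:1 keeps the batch dimension, giving (1, 10) and (1,).
single_loss = mnist_loss(simple_linear(train_x[0:1]), train_y[0:1])
```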
