
I have created a multi-class classification neural network. Training and validation iterators were created with the BucketIterator method, with fields {'text_normalized_tweet': TEXT, 'label': LABEL}

TEXT = a tweet
LABEL = a float number (with 3 values: 0, 1, 2)
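For context, a minimal sketch of how such iterators are typically built with torchtext's legacy BucketIterator (an assumed reconstruction; the actual Field and Dataset definitions aren't shown in the question):

from torchtext.legacy import data

#assumed field definitions -- include_lengths=True makes batch.text_normalized_tweet
#a (text, text_lengths) tuple, matching how the batch is unpacked further down
TEXT = data.Field(batch_first=True, include_lengths=True)
LABEL = data.LabelField()

#train_data and valid_data are assumed to be torchtext Dataset objects
train_iterator, valid_iterator = data.BucketIterator.splits(
    (train_data, valid_data),
    batch_size=32,
    sort_key=lambda x: len(x.text_normalized_tweet),
    sort_within_batch=True,
)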

Below is a dummy example of my neural network:

import torch.nn as nn

class MultiClassClassifer(nn.Module):
  #define all the layers used in model
  def __init__(self, vocab_size, embedding_dim, hidden_dim, output_dim):
    
    #Constructor
    super(MultiClassClassifer, self).__init__()

    #embedding layer
    self.embedding = nn.Embedding(vocab_size, embedding_dim)

    #dense layer
    self.hiddenLayer = nn.Linear(embedding_dim, hidden_dim)

    #Batch normalization layer
    self.batchnorm = nn.BatchNorm1d(hidden_dim)

    #output layer
    self.output = nn.Linear(hidden_dim, output_dim)

    #activation layer
    self.act = nn.Softmax(dim=1) #2d-tensor

    #initialize weights of embedding layer
    self.init_weights()

  def init_weights(self):

    initrange = 1.0
    
    self.embedding.weight.data.uniform_(-initrange, initrange)
  
  def forward(self, text, text_lengths):

    embedded = self.embedding(text)

    #packed sequence
    packed_embedded = nn.utils.rnn.pack_padded_sequence(embedded, text_lengths, batch_first=True)

    #packed_embedded is a PackedSequence: [0] is the flattened .data tensor, [1] is .batch_sizes
    tensor, batch_size = packed_embedded[0], packed_embedded[1]

    hidden_1 = self.batchnorm(self.hiddenLayer(tensor))

    return self.act(self.output(hidden_1))

Instantiate the model

INPUT_DIM = len(TEXT.vocab)
EMBEDDING_DIM = 100
HIDDEN_DIM = 64
OUTPUT_DIM = 3

model = MultiClassClassifer(INPUT_DIM, EMBEDDING_DIM, HIDDEN_DIM, OUTPUT_DIM)

When I call

text, text_lengths = batch.text_normalized_tweet
                
predictions = model(text, text_lengths).squeeze()

loss = criterion(predictions, batch.label)

it returns,

ValueError: Expected input batch_size (416) to match target batch_size (32).

model(text, text_lengths).squeeze() = torch.Size([416, 3])
batch.label = torch.Size([32])
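
For context, assuming the criterion is nn.CrossEntropyLoss (it isn't shown in the question): it expects predictions of shape [batch_size, num_classes] and integer class targets of shape [batch_size], so the two batch dimensions must match.

import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()
predictions = torch.randn(32, 3)       #[batch_size, num_classes]
labels = torch.randint(0, 3, (32,))    #[batch_size], integer class ids
loss = criterion(predictions, labels)  #works: both batch sizes are 32

Note that nn.CrossEntropyLoss applies log-softmax internally, so if that is indeed the criterion, the trailing nn.Softmax layer is usually dropped.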

I can see that the two objects have different sizes, but I have no clue how to fix this.

You may find the Google Colab notebook here

Shapes of each input/output tensor of my forward() method:

torch.Size([32, 10, 100]) #self.embedding(text)
torch.Size([320, 100]) #nn.utils.rnn.pack_padded_sequence(embedded, text_lengths, batch_first=True)
torch.Size([320, 64]) #self.batchnorm(self.hiddenLayer(tensor))
torch.Size([320, 3]) #self.act(self.output(hidden_1))
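
The jump from [32, 10, 100] to [320, 100] is the crux: pack_padded_sequence concatenates all time steps of all sequences into one flat .data tensor of shape [sum(text_lengths), embedding_dim] (32 × 10 = 320 here; 32 × 13 = 416 in the failing batch). A toy sketch illustrating this (shapes are my assumption):

import torch
import torch.nn as nn

embedded = torch.randn(4, 5, 100)    #toy batch: [batch=4, seq_len=5, emb=100]
lengths = torch.tensor([5, 5, 5, 5])

packed = nn.utils.rnn.pack_padded_sequence(embedded, lengths, batch_first=True)
print(packed.data.shape)    #torch.Size([20, 100]) -- batch and time flattened together
print(packed.batch_sizes)   #tensor([4, 4, 4, 4, 4])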
  • What's the dimension of model(text, text_lengths)? Why are you using squeeze()? Commented Dec 15, 2021 at 15:19
  • @kkgarg it's torch.Size([416, 3])... I think squeeze can be omitted. I am new to PyTorch, so not all keywords are familiar to me. I have posted the shapes at the end. Commented Dec 15, 2021 at 15:28
  • To debug, I'd start by noting down the dimensions after every step in the forward pass. Finally, the output of model(text, text_lengths) should be [32, 3] if your batch size is 32 and the number of classes is 3. Try to refer to the PyTorch documentation for the individual functions, e.g. pytorch.org/docs/stable/generated/… Commented Dec 15, 2021 at 15:39
  • @kkgarg hidden_1 = self.batchnorm(self.hiddenLayer(tensor)) has shape [32, 13, 3] and output = self.act(self.output(hidden_1)) has shape [32 × 13, 3] = [416, 3] Commented Dec 15, 2021 at 15:43
  • @kkgarg please check my update with the shapes per tensor in forward() Commented Dec 15, 2021 at 15:59

1 Answer


You shouldn't be using the squeeze function after the forward pass; that doesn't make sense here.

After removing the squeeze call, as you can see, the shape of your final output is [320, 3] whereas the criterion expects [32, 3]. One way to fix this is to average the embeddings you obtain for each word after the self.embedding step, as shown below:

def forward(self, text, text_lengths):

    embedded = self.embedding(text)                       #[batch, seq_len, embedding_dim]
    embedded = torch.mean(embedded, dim=1, keepdim=True)  #[batch, 1, embedding_dim]

    packed_embedded = nn.utils.rnn.pack_padded_sequence(embedded, text_lengths, batch_first=True)
    tensor, batch_size = packed_embedded[0], packed_embedded[1]

    hidden_1 = self.batchnorm(self.hiddenLayer(tensor))
    return self.act(self.output(hidden_1))
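
A variant of the same idea (my sketch, not part of the original answer): since averaging already collapses the sequence dimension, the packing step can be dropped entirely, which avoids the PackedSequence bookkeeping:

def forward(self, text, text_lengths):

    embedded = self.embedding(text)          #[batch, seq_len, embedding_dim]
    pooled = embedded.mean(dim=1)            #[batch, embedding_dim] -- sequence dim averaged out

    hidden_1 = self.batchnorm(self.hiddenLayer(pooled))
    return self.act(self.output(hidden_1))   #[batch, output_dim]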

2 Comments

Indeed, averaging the embeddings makes sense and seems to fix the problem. I will accept and upvote your answer. Moving to the next step, loss.backward(), I receive this error: IndexError: select(): index 1 out of range for tensor of size [1, 32, 100] at dimension 0. Shall I create a new question for that, or is it related to the embedding averaging?
That should be a separate question, I guess.
