I am trying to train a neural network with a single hidden layer for a text-based multi-label classification problem.

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

model = Sequential()
model.add(Dense(20, input_dim=400, kernel_initializer='he_uniform', activation='relu'))
model.add(Dense(9, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam')

model.fit(x_train, y_train, verbose=0, epochs=100)

I get the following error:

ValueError: Failed to convert a NumPy array to a Tensor (Unsupported object type numpy.ndarray).

x_train is text data vectorized with 300-dimensional word2vec embeddings, with each instance padded to a length of 400. It contains 462 records.

Some observations on the training data:

print('#### Shape of input numpy array #####')
print(x_train.shape)
print('#### Shape of each element in the array #####')
print(x_train[0].shape)
print('#### Object type for input data #####')
print(type(x_train))
print('##### Object type for first element of input data ####')
print(type(x_train[0]))

#### Shape of input numpy array #####
(462,)
#### Shape of each element in the array #####
(400, 300)
#### Object type for input data #####
<class 'numpy.ndarray'>
##### Object type for first element of input data ####
<class 'numpy.ndarray'>
  • I bet the dtype of x_train is object. Often this results from making an array from a list of arrays that vary in shape. You show the shape of the first element, but is that the shape of all 462? What does np.stack(x_train) do? Commented Aug 7, 2021 at 5:00
  • print(type(x)) usually isn't very useful. print(x.dtype) is more informative. If the dtype is object, then it contains objects, such as other arrays. If it is float, it is a numeric array, and probably something TensorFlow can use. Commented Aug 7, 2021 at 6:03
  • @hpaulj Yes, the dtype was object. Commented Aug 7, 2021 at 18:10

1 Answer

There are three problems.


Problem 1
This is the main problem, and the direct cause of the error.
Something is wrong with how x_train is built or converted (either a bug, or an unusual way of constructing the data): x_train is in fact an array of arrays, rather than a single multi-dimensional array. Going by its shape, TensorFlow "thinks" it has been given a 1-D array, which is not what you want. The solution is to rebuild the array before passing it to fit():

x_train = np.array([np.array(val) for val in x_train])
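As a minimal sketch of what is going on, the following builds an object array of arrays like the one described in the question and then rebuilds it as a numeric array. The sizes are small made-up stand-ins (5 records of shape (4, 3) rather than 462 records of shape (400, 300)), and the names x_obj, x_fixed and x_rebuilt are illustrative only. It assumes every element has the same shape; if the shapes differ, the rebuild does not give a clean numeric array and the padding step has to be fixed first.

```python
import numpy as np

n_records, seq_len, emb_dim = 5, 4, 3  # small stand-ins for 462, 400, 300

# Build a 1-D object array whose elements are themselves 2-D float arrays.
x_obj = np.empty(n_records, dtype=object)
for i in range(n_records):
    x_obj[i] = np.zeros((seq_len, emb_dim), dtype=np.float32)

print(x_obj.shape, x_obj.dtype)      # (5,) object
print({a.shape for a in x_obj})      # {(4, 3)}: all elements share one shape

# Two ways to rebuild a proper 3-D numeric array from it.
x_fixed = np.stack(x_obj)
x_rebuilt = np.array([np.array(val) for val in x_obj])

print(x_fixed.shape, x_fixed.dtype)  # (5, 4, 3) float32
```

Checking the set of element shapes before rebuilding is a quick way to confirm that every record really was padded to the same length.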

Problem 2
A Dense layer expects its input to have shape (batch_size, ..., input_dim), so the last dimension of x_train must equal input_dim. Yours is 300, not 400.
According to your description, the input dimension is the word-vector dimension, which is 300, so you should change input_dim to 300:

model.add(Dense(20, input_dim=300, kernel_initializer='he_uniform', activation='relu'))

Alternatively, provide input_shape directly:

model.add(Dense(20, input_shape=(400, 300), kernel_initializer='he_uniform', activation='relu'))

Problem 3
A Dense (i.e. linear) layer is meant for vector input: it expects each sample to be a one-dimensional vector, so the input usually has shape (batch_size, vector_length). When Dense receives input with more than 2 dimensions (yours has 3), it applies the dense operation along the last dimension only. To quote the official TensorFlow documentation:

Note: If the input to the layer has a rank greater than 2, then Dense computes the dot product between the inputs and the kernel along the last axis of the inputs and axis 1 of the kernel (using tf.tensordot). For example, if input has dimensions (batch_size, d0, d1), then we create a kernel with shape (d1, units), and the kernel operates along axis 2 of the input, on every sub-tensor of shape (1, 1, d1) (there are batch_size * d0 such sub-tensors). The output in this case will have shape (batch_size, d0, units).

This means your y would need to have shape (462, 400, 9), which is most likely not what you are looking for. (If it is, the code in problems 1 and 2 should already have solved your problem.)
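The shape rule can be checked without TensorFlow. The sketch below is a NumPy emulation of what a Dense layer computes on rank-3 input, not Keras itself; the sizes are small stand-ins (d0 = 4 and d1 = 3 instead of 400 and 300) and the random kernel is purely illustrative. The kernel has shape (d1, units), so the contraction runs over the last axis of the input and the first axis of the kernel.

```python
import numpy as np

batch_size, d0, d1, units = 2, 4, 3, 9  # stand-ins; the question has d0=400, d1=300

rng = np.random.default_rng(0)
x = rng.normal(size=(batch_size, d0, d1))
kernel = rng.normal(size=(d1, units))   # shape (last input dimension, units)
bias = np.zeros(units)

# Contract the last axis of x with the first axis of the kernel.
out = np.tensordot(x, kernel, axes=[[2], [0]]) + bias
print(out.shape)  # (2, 4, 9), i.e. (batch_size, d0, units)

# Each (sample, position) pair is transformed independently of the others.
row = x[0, 1] @ kernel + bias
print(np.allclose(out[0, 1], row))  # True
```

With the real dimensions the same rule gives an output of shape (462, 400, 9): one 9-way prediction per padded position, not one per record.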

If you want to apply Dense to the whole 400x300 matrix, you first need to flatten it into a one-dimensional vector, like this:

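import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten

x_train = np.array([np.array(val) for val in x_train])  # rebuild as a 3-D numeric array
model = Sequential()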
model.add(Flatten(input_shape=(400, 300)))
model.add(Dense(20, kernel_initializer='he_uniform', activation='relu'))
model.add(Dense(9, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam')

model.fit(x_train, y_train, verbose=0, epochs=100)

Now the output will have shape (462, 9).
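To see why flattening gives one prediction per record, here is a rough NumPy emulation of the Flatten, Dense(20, relu), Dense(9, sigmoid) stack. It is a sketch under assumptions, not the Keras implementation: the sizes are small stand-ins (5 records of shape (4, 3)), the weights w1 and w2 are random and hypothetical, and biases are left out for brevity.

```python
import numpy as np

n_records, seq_len, emb_dim = 5, 4, 3   # stand-ins for 462, 400, 300
hidden_units, n_labels = 20, 9

rng = np.random.default_rng(0)
x = rng.normal(size=(n_records, seq_len, emb_dim))

# Flatten: every record becomes a single vector of length seq_len * emb_dim.
flat = x.reshape(n_records, -1)

# Dense(20, relu) on the flattened vectors (random weights, no bias).
w1 = rng.normal(size=(seq_len * emb_dim, hidden_units))
hidden = np.maximum(flat @ w1, 0.0)

# Dense(9, sigmoid): one independent probability per label.
w2 = rng.normal(size=(hidden_units, n_labels))
probs = 1.0 / (1.0 + np.exp(-(hidden @ w2)))

print(flat.shape, hidden.shape, probs.shape)  # (5, 12) (5, 20) (5, 9)
```

The first axis survives every step, so with the real data there is one row of 9 label probabilities per record, which matches a y_train of shape (462, 9).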
