I'm working on a digit classifier model using this Kaggle dataset: https://www.kaggle.com/c/digit-recognizer/data?select=test.csv

When fitting the model with np.array objects it works fine, but I can't get it to work with tf.data.Dataset objects. Here's my code using Dataset objects for the train/validation data:

import pandas as pd
import numpy as np
import tensorflow as tf
from tensorflow import keras
from functools import partial
from sklearn import model_selection as ms  # used below as ms.train_test_split


train_df = pd.read_csv('train.csv')

def prepare_data(features_df, labels_df, test_ratio=0.1, val_ratio=0.1):
    features = features_df.to_numpy().reshape(features_df.shape[0], 28, 28)
    features = features[..., np.newaxis]

    labels = labels_df.to_numpy()

    X_train, X_test, y_train, y_test = ms.train_test_split(
        features,
        labels,
        test_size=test_ratio
    )

    X_train, X_valid, y_train, y_valid = ms.train_test_split(
        X_train,
        y_train,
        test_size=val_ratio
    )

    train_ds = tf.data.Dataset.from_tensor_slices((X_train, y_train))
    train_ds = train_ds.shuffle(2048).repeat()

    valid_ds = tf.data.Dataset.from_tensor_slices((X_valid, y_valid))
    valid_ds = valid_ds.shuffle(512).repeat()

    test_ds = tf.data.Dataset.from_tensor_slices((
        X_test,
        y_test
    ))

    return train_ds, valid_ds, test_ds

# Build the datasets from the Kaggle train.csv (the target column is 'label')
train_ds, valid_ds, test_ds = prepare_data(
    train_df.drop(columns=['label']),
    train_df['label']
)


DefaultConv2D = partial(keras.layers.Conv2D,
                        kernel_size=4, activation='relu', padding="SAME")

model = keras.models.Sequential([
    DefaultConv2D(filters=128, kernel_size=7, input_shape=[28, 28, 1]),
    keras.layers.MaxPooling2D(pool_size=2),
    DefaultConv2D(filters=128),
    keras.layers.MaxPooling2D(pool_size=2),
    DefaultConv2D(filters=256),
    keras.layers.MaxPooling2D(pool_size=2),
    keras.layers.Flatten(),
    keras.layers.Dense(units=128, activation='relu'),
    keras.layers.Dropout(0.5),
    keras.layers.Dense(units=64, activation='relu'),
    keras.layers.Dropout(0.5),
    keras.layers.Dense(units=10, activation='softmax'),
])

early_stopping = tf.keras.callbacks.EarlyStopping(
    monitor='val_accuracy',
    verbose=1,
    patience=20,
    mode='max',
    restore_best_weights=True
)

model.compile(loss="sparse_categorical_crossentropy", optimizer="nadam", metrics=["accuracy"])
history = model.fit(
    train_ds,
    epochs=100,
    validation_data=valid_ds,
    callbacks=[early_stopping,],
    steps_per_epoch=64
)

I get this error message:

    ValueError: Input 0 of layer sequential_2 is incompatible with the layer: : expected min_ndim=4, found ndim=3. Full shape received: [28, 28, 1]

But if I change the code to use np.array objects instead, it works just fine:

test_ratio=0.1
val_ratio=0.1

features = features_df.to_numpy().reshape(features_df.shape[0], 28, 28)
features = features[..., np.newaxis]

labels = labels_df.to_numpy()

X_train, X_test, y_train, y_test = ms.train_test_split(
    features,
    labels,
    test_size=test_ratio
)

X_train, X_valid, y_train, y_valid = ms.train_test_split(
    X_train,
    y_train,
    test_size=val_ratio
)


history = model.fit(
    X_train,
    y_train,
    epochs=100,
    validation_data=(X_valid, y_valid),
    callbacks=[early_stopping,],
    steps_per_epoch=64
)

I checked several similar questions, nothing worked so far.

1 Answer

It looks like you forgot to add the .batch() method at the end of your tf.data.Dataset pipelines; the error is about the missing batch dimension. A tf.data.Dataset behaves more like a Python generator than an in-memory array: fit() simply pulls elements from it one step at a time. Because you call .repeat(), the dataset is effectively infinite, so Keras can't infer the length of an epoch and you have to supply steps_per_epoch, with each step consuming one element from the dataset. Since you never batched the data, each element is a single sample of shape [28, 28, 1], whereas the first Conv2D layer expects 4-D input of shape [batch, 28, 28, 1], which is exactly the "expected min_ndim=4, found ndim=3" error you see. When you pass NumPy arrays instead, Keras batches them for you (with a default batch_size of 32), so the batch dimension is always present.
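A minimal sketch of the fix inside prepare_data, assuming a batch size of 64 (chosen here only to line up with the step counts discussed below; any reasonable value works):

train_ds = tf.data.Dataset.from_tensor_slices((X_train, y_train))
train_ds = train_ds.shuffle(2048).repeat().batch(64)   # each element becomes [64, 28, 28, 1]

valid_ds = tf.data.Dataset.from_tensor_slices((X_valid, y_valid))
valid_ds = valid_ds.shuffle(512).repeat().batch(64)

test_ds = tf.data.Dataset.from_tensor_slices((X_test, y_test)).batch(64)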

6 Comments

Thank you! It solved part of the problem. If I exclude validation data, it works, but when I include validation data with .batch(64), it returns a similar error message: ValueError: Input 0 of layer sequential_3 is incompatible with the layer: : expected min_ndim=4, found ndim=3. Full shape received: [28, 28, 1]
After you create your validation data set, could you print it out? It should give you the shape and type of the data set.
Yes, <RepeatDataset shapes: ((28, 28, 1), ()), types: (tf.float64, tf.int64)>
I added validation_steps=64 to fit and it's working now. Shouldn't setting batch on the dataset be the same as passing it as a parameter to the fit function?
That is strange. Usually when you add .batch(), you get a data set with None as the first dimension. And yes, setting batch on the data set is different from passing it as a parameter in .fit().
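For reference, a sketch of the fit call the comment thread converges on; with repeated (infinite) datasets both steps_per_epoch and validation_steps are required, and 64 is simply the value used above:

history = model.fit(
    train_ds,                   # batched + repeated training dataset
    epochs=100,
    steps_per_epoch=64,         # batches drawn per training epoch
    validation_data=valid_ds,   # batched + repeated validation dataset
    validation_steps=64,        # batches drawn per validation pass
    callbacks=[early_stopping]
)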