I'm experimenting with the TensorFlow 2.0 alpha. Training works as expected when I feed NumPy arrays directly, but when I use a tf.data.Dataset instead, I get an input dimension error. I'm using the iris dataset as the simplest example to demonstrate this:
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler, OneHotEncoder
import tensorflow as tf
from tensorflow.python import keras
iris = datasets.load_iris()
scl = StandardScaler()
ohe = OneHotEncoder(categories='auto')
data_norm = scl.fit_transform(iris.data)
data_target = ohe.fit_transform(iris.target.reshape(-1,1)).toarray()
train_data, val_data, train_target, val_target = train_test_split(data_norm, data_target, test_size=0.1)
train_data, test_data, train_target, test_target = train_test_split(train_data, train_target, test_size=0.2)
train_dataset = tf.data.Dataset.from_tensor_slices((train_data, train_target))
train_dataset.batch(32)
test_dataset = tf.data.Dataset.from_tensor_slices((test_data, test_target))
test_dataset.batch(32)
val_dataset = tf.data.Dataset.from_tensor_slices((val_data, val_target))
val_dataset.batch(32)
mdl = keras.Sequential([
    keras.layers.Dense(16, input_dim=4, activation='relu'),
    keras.layers.Dense(8, activation='relu'),
    keras.layers.Dense(8, activation='relu'),
    keras.layers.Dense(3, activation='sigmoid')
])
mdl.compile(
    optimizer=keras.optimizers.Adam(0.01),
    loss=keras.losses.categorical_crossentropy,
    metrics=[keras.metrics.categorical_accuracy]
)
history = mdl.fit(train_dataset, epochs=10, steps_per_epoch=15, validation_data=val_dataset)
and I get the following error:
ValueError: Error when checking input: expected dense_16_input to have shape (4,) but got array with shape (1,)
as if Keras assumed the dataset yields one-dimensional input. If I pass input_dim=1 instead, I get a different error:
InvalidArgumentError: Incompatible shapes: [3] vs. [4]
[[{{node metrics_5/categorical_accuracy/Equal}}]] [Op:__inference_keras_scratch_graph_8223]
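One way to check what the model actually receives is to iterate over the dataset eagerly and print the element shapes. Since from_tensor_slices slices along the first axis, I'd expect each element to be a single example, features of shape (4,) and a one-hot target of shape (3,), rather than a batch:

# Sanity check: print the shape of one element yielded by the dataset.
# from_tensor_slices() slices along the first axis, so each element should be
# a single example (features (4,), one-hot target (3,)), not a batch of 32.
for features, target in train_dataset.take(1):
    print(features.shape, target.shape)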
What is the proper way to use a tf.data.Dataset with a Keras model in TensorFlow 2.0?
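My suspicion is that the calls like train_dataset.batch(32) above have no effect, because Dataset.batch() returns a new dataset rather than batching in place, so fit() still receives unbatched single examples. Is reassigning the batched dataset, as in the sketch below, the intended pattern? (This is my guess, not a confirmed fix.)

# Suspected fix (untested): batch() is not in-place, so keep the returned dataset.
train_dataset = tf.data.Dataset.from_tensor_slices((train_data, train_target)).batch(32)
test_dataset = tf.data.Dataset.from_tensor_slices((test_data, test_target)).batch(32)
val_dataset = tf.data.Dataset.from_tensor_slices((val_data, val_target)).batch(32)

history = mdl.fit(train_dataset, epochs=10, steps_per_epoch=15,
                  validation_data=val_dataset)
# (I'm not sure whether validation_steps is also required in the alpha when
# validation_data is a dataset.)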