
On Kaggle I was given the input data folders.

#Training data
train_datagen = ImageDataGenerator(
        rescale=1./255,
        rotation_range=10,
        zoom_range=0.4,
        horizontal_flip=True,
        validation_split=0.01
        )

train_generator = train_datagen.flow_from_directory(
        '../input/chest-xray-covid19-pneumonia/Data/train',
        target_size=(256, 256),
        batch_size=32,
        class_mode='categorical',
        subset='training'
        )

I had to add some more images to this dataset, so I converted my train_generator to NumPy arrays. Note that calling next() separately for the images and the labels would pull them from different batches and desynchronize them, so both must come from the same call:

x_batches, y_batches = [], []
for _ in range(len(train_generator)):
    batch_x, batch_y = next(train_generator)
    x_batches.append(batch_x)
    y_batches.append(batch_y)
x_train = np.concatenate(x_batches)
y_train = np.concatenate(y_batches)

Then I concatenated the extra GAN-generated images onto these arrays:

gan_images = np.concatenate((x_train,t_x), axis=0)
gan_labels = np.concatenate((y_train,t_y), axis=0)
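As a sanity check on the merge, the concatenated shapes should add up along axis 0. A minimal sketch with small dummy arrays (the sizes here are placeholders standing in for the real 5094 + 100 images of 256×256×3):

```python
import numpy as np

# Tiny stand-ins for the real arrays (real images are 256x256x3; counts shrunk here)
x_train = np.zeros((50, 8, 8, 3), dtype=np.float32)
y_train = np.zeros((50, 3), dtype=np.float32)
t_x = np.ones((10, 8, 8, 3), dtype=np.float32)   # GAN images
t_y = np.ones((10, 3), dtype=np.float32)         # GAN labels

# Concatenate along the sample axis; all trailing dimensions must match
gan_images = np.concatenate((x_train, t_x), axis=0)
gan_labels = np.concatenate((y_train, t_y), axis=0)

print(gan_images.shape)  # (60, 8, 8, 3)
print(gan_labels.shape)  # (60, 3)
```

If the trailing dimensions of x_train and t_x disagree (e.g. the GAN output is a different resolution), np.concatenate raises a ValueError instead of silently misaligning the data.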

Now, how can I convert this back to the train_generator format?

Type of train_generator is keras.preprocessing.image.DirectoryIterator

EDIT

As per the suggestion, I tried:

train_dataset = train_datagen.flow(x_train, y_train)
additional_gan_dataset = train_datagen.flow(t_x, t_y)
abc = np.concatenate((additional_gan_dataset, train_dataset), axis=0)

This gave an OOM error on Kaggle.

Another way I tried:

dataset = train_datagen.flow(gan_images, gan_labels)

history1 = model1.fit(dataset, validation_data=val_generator, verbose=1, epochs=500,
                      callbacks=[early_stopping, reduce_lr, learning_rate_reduction])

It runs, but the accuracy comes out very poor, so I suspect the images have not been merged properly. I had a total of 5094 images and generated another 100. Since these are batched iterators, I cannot verify the contents just by checking the length: len(train_dataset) gives 160, and after merging it gives 163. How can I fix this? How do I inspect these batched datasets properly?
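For what it's worth, those lengths do line up with the image counts: a Keras iterator's len() is the number of batches, i.e. ceil(samples / batch_size). A quick check, assuming batch_size=32 as in the question:

```python
import math

batch_size = 32  # as used in flow_from_directory above
# len() of a Keras iterator is the number of batches: ceil(samples / batch_size)
print(math.ceil(5094 / batch_size))          # 160  (matches len(train_dataset))
print(math.ceil((5094 + 100) / batch_size))  # 163  (matches the merged length)
```

So the merged iterator is the right size, and the poor accuracy likely comes from something else (for example, images and labels extracted out of sync) rather than from the merge itself.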

  • Your code is incorrect, it should be dataset = train_datagen.flow(gan_images, gan_labels) Commented Sep 28, 2022 at 20:55

1 Answer


Use the flow(x_array, y_array) method of your ImageDataGenerator instance:

dataset = train_datagen.flow(gan_images, gan_labels)

That said, unless you need the methods of ImageDataGenerator, or really need a dataset object, you can just pass the arrays directly to .fit().


8 Comments

Hi @Djinn, I need to apply it here: history = model3.fit(train_generator, validation_data=val_generator, verbose=1, epochs=500, callbacks=[early_stopping, reduce_lr, learning_rate_reduction]). Hence I am looking for this. I have applied it the way you told me; let's see.
If that's all you need, with no processing on the dataset after creating it, you still don't necessarily need to convert it to a dataset object. That's just extra overhead. But you could also add images directly to the dataset too, I believe, without needing to convert to arrays.
I tried a LOT. I generated the images using a GAN. Unfortunately, I am not able to add them to the Kaggle training-set folder directly, hence this approach!
Place the new images in a dataset, then use dataset.concatenate(additional_dataset)
ImageDataGenerator is not an object, it's a class, one with an object that you didn't initialize. Follow whatever guide you're following to create your datagens and try to match with the answer. It's exactly like what's in your question.
