Since you say your images have different sizes, resize them as you read them from the directory, and then append them to trainImages.
I'm suggesting two options:
Option 1: Modify loadFiles as follows:
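import os
import cv2
import numpy as np

def loadFiles(path):
    trainImages = []
    for r, d, f in os.walk(path):
        for file in f:
            filepath = os.path.join(r, file)
            img = cv2.imread(filepath, cv2.IMREAD_GRAYSCALE)
            if img is None:
                # cv2.imread returns None for files it cannot decode; skip them
                continue
            # Resize every image to (28, 28) so they can all be stacked together
            img = cv2.resize(img, (28, 28), interpolation=cv2.INTER_CUBIC)
            trainImages.append(img)
    # np.array (not np.ndarray) stacks the images into shape (N, 28, 28)
    trainImagesNumpy = np.array(trainImages)
    return trainImagesNumpy

train = loadFiles(trainPath)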
You can use other interpolation strategies for resizing; cv2.INTER_AREA, for example, is generally recommended when shrinking images. Check the OpenCV Python documentation for the full list.
Also, using os.path.join is good practice for joining the base directory path and the file name, since it is OS-independent. It automatically takes care of the path separators on Windows (backslash) and Unix/Linux (forward slash).
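To illustrate the separator handling, here is a small sketch that calls the Windows and POSIX flavours of join explicitly (ntpath and posixpath are the standard-library modules that os.path resolves to on each platform). The path components are hypothetical and only serve as an example.

```python
import ntpath      # the implementation behind os.path on Windows
import posixpath   # the implementation behind os.path on Unix/Linux/macOS

# Hypothetical path components, for illustration only.
parts = ("data", "train", "cats", "cat001.png")

windows_path = ntpath.join(*parts)
posix_path = posixpath.join(*parts)

print(windows_path)  # data\train\cats\cat001.png
print(posix_path)    # data/train/cats/cat001.png
```

In your own code you would simply call os.path.join and let Python pick the right flavour for the machine it runs on.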
Refer: cv2.resize
Option 2: Use the ImageDataGenerator class in Keras
There are two advantages to using this:
- It loads data in batches.
- You can perform data augmentation very easily using built-in parameters.
Organize your data into train, validation and test directories. Each of the directories must contain subdirectories for each of the n classes.
The directory tree will look as follows (say you are doing a binary classification of cats vs dogs):
.
├── test
│   ├── cats
│   └── dogs
├── train
│   ├── cats
│   └── dogs
└── validation
    ├── cats
    └── dogs
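Before handing these directories to Keras, it can help to confirm that each split really contains one subdirectory per class. The following is a minimal sketch using only the standard library; the helper name count_images_per_class and the list of image extensions are assumptions made for this example, not part of Keras.

```python
from pathlib import Path

# Extensions treated as images here; this set is an assumption, adjust it to your data.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".bmp"}


def count_images_per_class(split_dir):
    """Return {class_name: image_count} for one split directory, e.g. data/train."""
    counts = {}
    for class_dir in sorted(Path(split_dir).iterdir()):
        if not class_dir.is_dir():
            continue  # stray files next to the class folders are ignored
        counts[class_dir.name] = sum(
            1
            for p in class_dir.rglob("*")
            if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
        )
    return counts
```

Calling count_images_per_class('data/train') on the cats vs dogs layout above should return a dictionary with exactly the two keys 'cats' and 'dogs'; a missing or extra key points to a misplaced folder.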
Then initialize a data generator, optionally rescaling the pixel values from the 0-255 range to the 0-1 range:
datagen = keras.preprocessing.image.ImageDataGenerator(rescale=1./255)
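Then read the training, validation and test images in batches with the flow_from_directory method. By default it loads images as RGB; pass color_mode='grayscale' if you want single-channel images, as in Option 1.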
train = datagen.flow_from_directory('data/train', target_size=(28, 28), batch_size=32)
validation = datagen.flow_from_directory('data/validation', target_size=(28, 28), batch_size=32)
test = datagen.flow_from_directory('data/test', target_size=(28, 28), batch_size=32)
Once you've executed the above code, check the message each call prints ("Found N images belonging to M classes.") and make sure it reports the correct number of images and the correct number of classes.
You can then pass the train and validation batches directly to the fit method of your Keras model (fit_generator in older Keras versions), and use the test batches with evaluate or predict. Specify steps_per_epoch and validation_steps while training: a generator can keep producing batches indefinitely, so fit needs to know when an epoch ends. For the same reason, provide the steps argument to the predict method as well. Note that the iterators returned by flow_from_directory know their own length, so Keras can often infer these values when they are omitted.
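The step counts are usually derived from the number of images and the batch size. Here is a minimal sketch of that arithmetic; the helper name steps_for and the dataset sizes are hypothetical and only illustrate the calculation.

```python
import math


def steps_for(num_samples, batch_size):
    """Number of batches needed to see every sample once (the last batch may be smaller)."""
    return math.ceil(num_samples / batch_size)


# Hypothetical dataset sizes, for illustration only.
train_steps = steps_for(2000, 32)
validation_steps = steps_for(800, 32)
print(train_steps, validation_steps)  # 63 25
```

With the generators above, the sample count and batch size are available as train.samples and train.batch_size, so steps_for(train.samples, train.batch_size) would give the value to pass as steps_per_epoch.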
Refer: Keras docs