I am trying to do 10-fold cross-validation on an audio dataset. Each audio file is clipped into small segments, so there are multiple clips from the same file. To avoid overfitting, all clips from a given audio file are assigned to the same fold. The file structure is similar to the UrbanSoundDataset. I am generating MFCC features for each fold and saving them using the following code:
np.savez("{0}/{1}_mfcc".format(save_dir, "fold" + str(fold_id)), features=X, labels=y)
The feature set for each fold has a fixed shape of (rows, 40, 174), where only the number of rows varies between folds. For example, the shape for fold 1 is (534, 40, 174), and the shape for fold 2 is (538, 40, 174). When I load the saved folds, I want to stack the 9 training folds together. For example, if I stack fold 1 and fold 2 from the example above, I should end up with an array of shape (1072, 40, 174).
How can I do that using numpy?
Use numpy.concatenate to concatenate the two arrays along the first dimension (axis=0).
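As a minimal sketch of that suggestion, the snippet below concatenates two stand-in arrays with the shapes given in the question. The helper `load_training_folds` and its `test_fold` and `n_folds` parameters are hypothetical names introduced here for illustration; it assumes the files were written by the `np.savez` call above (which appends `.npz` to the filename) and that folds are numbered from 1.

```python
import numpy as np


def load_training_folds(save_dir, test_fold, n_folds=10):
    """Hypothetical helper: load every fold except test_fold and stack them.

    Assumes files named "<save_dir>/fold<k>_mfcc.npz" with "features" and
    "labels" arrays, as produced by the np.savez call in the question.
    """
    features, labels = [], []
    for fold_id in range(1, n_folds + 1):
        if fold_id == test_fold:
            continue  # hold this fold out for validation
        with np.load("{0}/fold{1}_mfcc.npz".format(save_dir, fold_id)) as data:
            features.append(data["features"])
            labels.append(data["labels"])
    # axis=0 joins along the rows; all other dimensions must match.
    return np.concatenate(features, axis=0), np.concatenate(labels, axis=0)


# Stand-ins for two loaded folds, using the shapes from the question.
# The label arrays are assumed to be 1-D here; adjust if yours are one-hot.
X1 = np.zeros((534, 40, 174), dtype=np.float32)
y1 = np.zeros(534, dtype=np.int64)
X2 = np.ones((538, 40, 174), dtype=np.float32)
y2 = np.ones(538, dtype=np.int64)

X_train = np.concatenate([X1, X2], axis=0)
y_train = np.concatenate([y1, y2], axis=0)

print(X_train.shape)  # (1072, 40, 174)
print(y_train.shape)  # (1072,)
```

Note that `np.stack` would not work here: it adds a new axis and requires all inputs to have identical shapes, whereas `np.concatenate` only requires the non-joined dimensions (40 and 174) to agree.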