I have a dataset called MEL of shape (94824,), in which most instances have shape (99, 13) but some are smaller. It consists of float MEL frequencies. I'm trying to put all the values into an empty NumPy array of shape (94824, 99, 13), so some entries would be left empty. Any suggestions?

type(MEL) = <class 'numpy.ndarray'>
for i in MEL: type(i) = <class 'numpy.ndarray'>
for j in i: type(j) = <class 'numpy.ndarray'>
  • I updated the answer. Please have a look and try the first half of the answer above the line separator. Commented Dec 11, 2018 at 22:10
  • Thanks! It worked! I have a beautiful array of shape (85314, 99) now! Commented Dec 11, 2018 at 22:14
  • Cool, now you can train your network!! Commented Dec 11, 2018 at 22:16

1 Answer

Since your MEL array is not of homogeneous shape, first we need to keep only the arrays that have the common shape (i.e. (99, 13)) and drop the rest. For this, we could use:

# keep only the arrays that have the common shape
filtered = []
for arr in MEL:
    if arr.shape == (99, 13):
        filtered.append(arr)

Then we can initialize an array to hold the results, iterate over the filtered list, and compute the mean of each array over axis 1, which collapses the dimension of size 13:

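import numpy as np

# one row per kept array; the shape must be passed as a single tuple
averaged_arr = np.zeros((len(filtered), 99))

for idx, arr in enumerate(filtered):
    # (99, 13) -> (99,)
    averaged_arr[idx] = np.mean(arr, axis=1)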

This should compute the desired matrix.
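If you would rather keep every instance, as the question originally asked, one alternative to filtering is to zero-pad the shorter arrays into a fixed (99, 13) slot. The following is only a rough sketch under my own assumptions: `pad_to_shape` and `MEL_small` are hypothetical names, zero is an arbitrary fill value, and whether padded frames are acceptable depends on what you train on afterwards.

```python
import numpy as np

def pad_to_shape(arrays, target_shape=(99, 13), fill_value=0.0):
    """Copy each 2-D array into the top-left corner of a fixed-size slot.

    Hypothetical helper: shorter arrays are padded with fill_value,
    and anything larger than target_shape would be truncated.
    """
    out = np.full((len(arrays), *target_shape), fill_value, dtype=float)
    for idx, arr in enumerate(arrays):
        rows = min(arr.shape[0], target_shape[0])
        cols = min(arr.shape[1], target_shape[1])
        out[idx, :rows, :cols] = arr[:rows, :cols]
    return out

# Small stand-in for MEL: an object array holding arrays of differing lengths.
rng = np.random.default_rng(0)
shapes = [(99, 13), (87, 13), (99, 13)]
MEL_small = np.empty(len(shapes), dtype=object)
for idx, shape in enumerate(shapes):
    MEL_small[idx] = rng.standard_normal(shape)

padded = pad_to_shape(MEL_small)
print(padded.shape)
```

With this approach no instance is dropped, but rows 87 to 98 of the second slot are pure padding, so a mean over them would be zero rather than a real measurement.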


Here is a demo to reproduce your setup, assuming all arrays have the same shape:

# inputs 

In [20]: MEL = np.empty(94824, dtype=object)

In [21]: for idx in range(94824):
    ...:     MEL[idx] = np.random.randn(99, 13)

# shape of the array of arrays
In [13]: MEL.shape
Out[13]: (94824,)

# shape of each array
In [15]: MEL[0].shape
Out[15]: (99, 13)

# to hold results
In [17]: averaged_arr = np.zeros((94824, 99))

# compute average
In [18]: for idx, arr in enumerate(MEL):
    ...:     averaged_arr[idx] = np.mean(arr, axis=1)

# check the shape of resultant array
In [19]: averaged_arr.shape
Out[19]: (94824, 99)
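
Once every array has the same shape, the loop can probably be replaced by stacking everything into one 3-D array and averaging over the last axis. This is a sketch on a small stand-in (`MEL_small`, five arrays, is my own hypothetical name) rather than the full dataset; it also checks that the result agrees with the loop version.

```python
import numpy as np

# Small stand-in for MEL with five equally shaped arrays.
rng = np.random.default_rng(0)
MEL_small = np.empty(5, dtype=object)
for idx in range(5):
    MEL_small[idx] = rng.standard_normal((99, 13))

# Stack the (99, 13) arrays into a single (5, 99, 13) array,
# then average over the last axis (the dimension of size 13).
stacked = np.stack(list(MEL_small))
averaged_vec = stacked.mean(axis=2)

# Loop version from the demo, for comparison.
averaged_loop = np.zeros((5, 99))
for idx, arr in enumerate(MEL_small):
    averaged_loop[idx] = np.mean(arr, axis=1)

print(stacked.shape, averaged_vec.shape)
print(np.allclose(averaged_vec, averaged_loop))
```

Note that `np.stack` raises a ValueError if the arrays differ in shape, so it only applies after filtering (or padding).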

12 Comments

Thanks for your response! I'm getting the following error:
TypeError Traceback (most recent call last)
<ipython-input-247-7a545beac940> in <module>()
----> 1 averaged_arr = np.zeros(94824, 99)
TypeError: data type not understood
MEL type = numpy.ndarray
Now it gives me this error: ValueError: could not broadcast input array from shape (87) into shape (99). When I change the 99 in np.zeros to 87, it gives: ValueError: could not broadcast input array from shape (99) into shape (87)
I'm getting the value error stated above... ValueError: could not broadcast input array from shape (87) into shape (99) Apparently not all are shaped (99, 13) but some are shaped (87,13) I think.
Is it okay to get rid of the arrays which are not of shape (99, 13)? The dataset would then be smaller, but that's the only approach that comes to my mind atm
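
For anyone hitting the same two errors from the comments above, here is a small sketch of what I believe triggers them. The first comes from passing the dimensions to `np.zeros` as two separate arguments, so the second one is read as a dtype; the second comes from assigning an 87-element mean into a 99-element row. The sizes are tiny placeholders, and the exact error wording may differ between NumPy versions.

```python
import numpy as np

# 1) np.zeros(4, 99): the second positional argument is the dtype,
#    and 99 is not a valid dtype, so NumPy raises a TypeError.
try:
    np.zeros(4, 99)
except TypeError as exc:
    shape_error = exc

# Passing the shape as a single tuple works.
ok = np.zeros((4, 99))

# 2) A (87, 13) array averaged over axis 1 has 87 values,
#    which cannot be assigned into a row of length 99.
averaged = np.zeros((2, 99))
short = np.ones((87, 13))
try:
    averaged[0] = short.mean(axis=1)
except ValueError as exc:
    broadcast_error = exc

print(type(shape_error).__name__, type(broadcast_error).__name__)
```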