I have a large array (200k samples) loaded from a .wav file with scipy.io.wavfile. I tried to plot a histogram of the data using matplotlib.pyplot.hist with auto binning. It raised the error:

ValueError: Number of samples, -72, must be non-negative.

So I decided to set the bins myself using binwidth=1000:

min_bin = np.min(data[peaks])
max_bin = np.max(data[peaks])
plt.hist(data[peaks], bins=np.arange(min_bin,max_bin, binwidth))

When I do this, I get the warning:

RuntimeWarning: overflow encountered in short_scalars
from scipy.io import wavfile

Here are the type print outs of min_bin, max_bin, data:

Type min_bin: <class 'numpy.int16'> max_bin: <class 'numpy.int16'>
min_bin: -21231 max_bin: 32444
Type data <class 'numpy.ndarray'>

The problem seems to be with np.arange, which fails when I pass it the bin range taken from np.min and np.max of the .wav array. When I type the same min and max values in manually as plain integers, np.arange has no problem. My hypothesis is that it is some sort of addressing error when referencing the .wav array, but I'm not sure why it occurs or how to fix it.

  • Could you print out type(data), min_bin, max_bin, type(min_bin), and type(max_bin) and see what the output is? Commented Oct 21, 2019 at 2:33
  • Sorry for the delay, I have added the requested type print outs Commented Oct 22, 2019 at 4:20

1 Answer

As part of computing the length of the array, numpy.arange calculates stop - start using Python object arithmetic, that is, with the arguments' own subtraction. When stop and start are numpy.int16(32444) and numpy.int16(-21231), the true difference, 53675, is larger than the int16 maximum of 32767, so the subtraction overflows and produces numpy.int16(-11861). This is where the warning comes from. The nonsense value leads numpy.arange to believe that the result should be a length-0 array.
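The wraparound described above can be reproduced directly. This is a minimal sketch using the values from the question; the variable names are illustrative, and np.errstate is used only to silence the overflow warning so the result can be inspected.

```python
import numpy as np

start = np.int16(-21231)
stop = np.int16(32444)

# The true difference does not fit in int16 (maximum 32767).
true_diff = int(stop) - int(start)

# int16 - int16 stays int16 and wraps around modulo 2**16.
# The overflow warning is suppressed here on purpose.
with np.errstate(over="ignore"):
    wrapped = stop - start

# The same wraparound computed by hand with Python ints.
by_hand = (true_diff + 2**15) % 2**16 - 2**15

print(true_diff, wrapped, by_hand)
```

Converting either operand with int() before subtracting avoids the wraparound, because Python ints do not overflow.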

The workaround is simple: convert the arguments to Python ints first, so the subtraction cannot overflow. The dtype of the resulting array can still be set to np.int16 to save space, since that's all you need to store the bin edges.

min_bin = int(np.min(data[peaks]))
max_bin = int(np.max(data[peaks]))
plt.hist(data[peaks], bins=np.arange(min_bin, max_bin, binwidth, dtype=np.int16))
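For anyone who wants to check the workaround without a .wav file, here is a rough sketch on synthetic int16 data (the random data and the seed are assumptions, not the asker's signal). It uses np.histogram so it runs without a display; the same edges array can be passed to plt.hist as bins. One variation from the snippet above, which is my own choice and not part of the original fix: np.arange excludes stop, so the stop value is extended by one binwidth to keep the largest samples inside the last bin, and the edges are left in the default integer dtype since that extended value no longer fits in int16.

```python
import numpy as np

binwidth = 1000

# Synthetic stand-in for data[peaks]: 200k int16 samples (an assumption).
rng = np.random.default_rng(0)
data = rng.integers(-21231, 32445, size=200_000, dtype=np.int16)
data[0], data[1] = -21231, 32444  # pin the min and max from the question

# Convert to Python ints so the arithmetic inside np.arange cannot overflow.
min_bin = int(data.min())
max_bin = int(data.max())

# Extend stop by one binwidth so the last edge is at or above max_bin.
edges = np.arange(min_bin, max_bin + binwidth, binwidth)

counts, _ = np.histogram(data, bins=edges)
print(len(edges), counts.sum())
```

Without the extra binwidth, samples above the last edge would fall outside every bin and be left out of the histogram.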