Read left channel of wav data into numpy array

Question

I'm using pyaudio to take input from a microphone or read a wav file, and analyze the stream while playing it. I want to only analyze the right channel if the input is stereo. I've been able to extract the data and convert to integers using loops:

        levels = []
        length = len(data)
        if channels == 1:
            for i in range(length//2):
                volume = abs(struct.unpack('<h', data[i:i+2])[0])
                levels.append(volume)
        elif channels == 2:
            for i in range(length//4):
                j = 4 * i + 2
                volume = abs(struct.unpack('<h', data[j:j+2])[0])
                levels.append(volume)

I think this is working correctly. It runs without error on a laptop and a Raspberry Pi 3, but it appears to take too long on a Raspberry Pi Zero when simultaneously streaming the output to a speaker. I figure that eliminating the loop and using numpy may help. I assume I need np.ndarray for this, with (CHUNK,) as the first parameter, where CHUNK is my chunk size for analyzing the audio (I'm using 1024). I think the format would be '<h', as in the struct code above. But I'm at a loss as to how to code it correctly for each of the two cases (mono, and right channel only for stereo). How do I create the numpy arrays for each of the two cases?

zvone · Accepted Answer · 2020-07-03 12:45:33Z


You are reading 16-bit integers from a binary file. It looks as though you first read the bytes into the data variable with something like data = f.read(), which is not shown. Then you do:

for i in range(length//2):
    volume = abs(struct.unpack('<h', data[i:i+2])[0])
    levels.append(volume)

BTW, that code is wrong. It should be abs(struct.unpack('<h', data[2*i:2*i+2])[0]); otherwise each 2-byte window advances by only one byte, so you combine bytes from neighbouring samples.
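To make the overlap concrete, here is a small sketch on a made-up buffer of four samples (the values 1 to 4 are only for illustration). It compares the one-byte stride from the question's mono branch with the two-byte stride suggested above:

```python
import struct

# Four little-endian 16-bit samples (illustrative values): 1, 2, 3, 4
data = struct.pack('<4h', 1, 2, 3, 4)
n_samples = len(data) // 2

# One-byte stride, as in the question's mono branch: windows overlap,
# so every other value mixes bytes from two neighbouring samples.
overlapping = [struct.unpack('<h', data[i:i + 2])[0] for i in range(n_samples)]

# Two-byte stride: exactly one window per sample.
correct = [struct.unpack('<h', data[2 * i:2 * i + 2])[0] for i in range(n_samples)]

print(overlapping)  # [1, 512, 2, 768]
print(correct)      # [1, 2, 3, 4]
```

The stereo branch of the question already multiplies the index by 4, so it does not have this problem.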

To do the same with numpy, you can just do this (instead of both f.read() and the whole loop):

data = np.fromfile(f, dtype='<i2')

This is over 100 times faster than the manual thing above in my test on 5 MB of data.

In the second case, you have interleaved left-right-left-right values. Again you can read them all (assuming you have enough memory) and then access only one half:

data = np.fromfile(f, dtype='<i2')
left = data[::2]
right = data[1::2]

This reads and converts every sample even though you need only half of them, but it is still much faster than the loop.
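As a minimal illustration of the slicing, the sketch below uses a small hand-made interleaved array rather than a real file (the sample values are invented). It also shows an equivalent reshape into (frames, channels), which some people find easier to read:

```python
import numpy as np

# Interleaved stereo samples: (L0, R0), (L1, R1), (L2, R2); values are made up
interleaved = np.array([10, -10, 20, -20, 30, -30], dtype='<i2')

left = interleaved[::2]    # every second sample starting at index 0
right = interleaved[1::2]  # every second sample starting at index 1

# Equivalent view: one row per frame, one column per channel
frames = interleaved.reshape(-1, 2)

print(left.tolist())          # [10, 20, 30]
print(right.tolist())         # [-10, -20, -30]
print(frames[:, 1].tolist())  # [-10, -20, -30]
```

Both slices are views onto the original array, so selecting a channel does not copy the samples.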

EDIT: If the data is not coming from a file, np.fromfile can be replaced with np.frombuffer. Then you have this:

channel_data = np.frombuffer(data, dtype='<i2')
if channels == 2:
    channel_data = channel_data[1::2]
levels = np.abs(channel_data)
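The snippet above can be wrapped into a small helper and checked on a synthetic chunk. This is only a sketch: the function name channel_levels and the test bytes are hypothetical, and the widening to int32 is an addition not in the original answer. It is there because np.abs on int16 cannot represent abs(-32768) and returns -32768 for that sample.

```python
import struct
import numpy as np

def channel_levels(data, channels):
    """Absolute sample levels for mono data, or for the right channel of stereo.

    `data` is assumed to hold raw little-endian 16-bit PCM bytes,
    interleaved left/right when channels == 2. (Hypothetical helper.)
    """
    samples = np.frombuffer(data, dtype='<i2')
    if channels == 2:
        samples = samples[1::2]  # right channel only
    # Widen before abs so that -32768 becomes 32768 instead of wrapping
    return np.abs(samples.astype(np.int32))

# Synthetic chunk: two stereo frames, (100, -200) and (300, -32768)
chunk = struct.pack('<4h', 100, -200, 300, -32768)

print(channel_levels(chunk, 1).tolist())  # [100, 200, 300, 32768]
print(channel_levels(chunk, 2).tolist())  # [200, 32768]
```

If the extra copy made by astype matters on a slow device, the widening can be dropped at the cost of that one edge case.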

edited Jul 3, 2020 at 12:45

answered Jul 2, 2020 at 22:01

zvone


2 Comments

ViennaMike Over a year ago

Thanks! The data is passed in chunks from pyaudio, which EITHER takes it from a wav file (passing only the data part of the wav file contents to the variable "data") OR generates it from microphone input. I also need the binary stream so that pyaudio can play it out, so I can't use the file read part. Assuming I have been passed the variable "data", as in my original code, I should use: levels = np.frombuffer(data, dtype='<i2') correct? Then the rest follows as you posted for getting just the left or right channel.

zvone Over a year ago

@ViennaMike Exactly! I added that to the answer also, so you have a complete solution ;)
