I accidentally forgot to convert some NumPy arrays to bytes objects when using PyAudio, but to my surprise it still played audio, even if it sounded a bit off. I wrote a little test script (see below) for playing 1 second of a 440Hz tone, and it seems like writing a NumPy array directly to a PyAudio Stream cuts that tone short.

Can anyone explain why this happens? I thought a NumPy array was a contiguous sequence of bytes with some header information about its dtype and strides, so I would've predicted that PyAudio played the full second of the tone after some garbled audio from the header, not cut the tone off.

# script segment
import pyaudio
import numpy as np
RATE = 48000

p = pyaudio.PyAudio()
stream = p.open(format = pyaudio.paFloat32, channels = 1, rate = RATE, output = True)

TONE = 440
SECONDS = 1
t = 2*np.pi*TONE * np.arange(RATE*SECONDS) / RATE  # integer arange gives exactly RATE*SECONDS samples
sina = np.sin(t).astype(np.float32)
sinb = sina.tobytes()

# console commands segment
stream.write(sinb) # bytes object plays 1 second of 440Hz tone
stream.write(sina) # still plays 440Hz tone, but noticeably shorter than 1 second

1 Answer

The problem is more subtle than you describe. Your first call passes a bytes object of length 192,000. The second call passes a NumPy array of 48,000 float32 values. pyaudio accepts both of them and hands the underlying buffer to portaudio to be played.

However, when you opened the stream, you told pyaudio you were sending paFloat32 data, which has 4 bytes per sample. When you don't pass a frame count, the pyaudio write handler takes the length of the object you gave it and divides by the number of channels times the sample size to determine how many audio samples there are. In your second call, the length of the array is 48,000 (elements, not bytes), which it divides by 4, and thereby tells portaudio "there are 12,000 samples here". At 48,000 samples per second that is a quarter of a second, which is why the tone is cut short.
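The arithmetic above can be sketched without any audio hardware. This is only an illustration of the division described in this answer, not pyaudio's actual code: `inferred_frames` is a hypothetical helper, and a standard-library `array("f")` stands in for the float32 NumPy array, since both report their length in elements rather than bytes.

```python
from array import array

CHANNELS = 1
SAMPLE_WIDTH = 4  # bytes per paFloat32 sample


def inferred_frames(buf, channels=CHANNELS, width=SAMPLE_WIDTH):
    """Hypothetical helper: frame count guessed from len(buf), as described above."""
    return len(buf) // (channels * width)


# Stand-in for the question's float32 NumPy array: 48,000 4-byte floats.
samples = array("f", [0.0] * 48000)
raw = samples.tobytes()  # 192,000 bytes, like sinb in the question

frames_from_bytes = inferred_frames(raw)      # len() counts bytes
frames_from_array = inferred_frames(samples)  # len() counts elements

print(frames_from_bytes, frames_from_array)
```

The bytes object yields the full 48,000 frames, while the array yields only 12,000, matching the shortened tone.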

So everyone understood the format, but there was confusion about the size. If you change the second call to

stream.write(sina, 48000)

then no one has to guess, and it works perfectly fine.
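If you would rather not hard-code 48000, one option is to derive the frame count from the buffer's size in bytes. A minimal sketch, assuming a hypothetical helper named `frame_count` and again using a standard-library `array("f")` as a stand-in for `sina`:

```python
from array import array


def frame_count(buf, channels, sample_width):
    """Hypothetical helper: frames in buf, from its byte size rather than len()."""
    return memoryview(buf).nbytes // (channels * sample_width)


# Stand-in for the float32 NumPy array `sina` from the question.
samples = array("f", [0.0] * 48000)
n_frames = frame_count(samples, channels=1, sample_width=4)
print(n_frames)

# With the question's stream this would be used as:
#     stream.write(sina, frame_count(sina, 1, 4))
```

Because `memoryview(...).nbytes` always counts bytes, the same helper gives the same answer for the bytes object and for the array.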

3 Comments

Need clarification on 3 questions: 1. Are you saying that pyaudio knows to pass the numpy array's data directly? In other words, the header information (dtype, strides, size) isn't being fed into the audio output? 2. The data being passed to stream.write should be the same for the numpy array and bytes object, right? Just 48,000 * 4-byte samples. 3. If 2 is true, then the sole issue is that pyaudio calculates len(sina)/4 and plays only 12,000 * 4-byte samples out of the 48,000 available?
You can look at the source. pyaudio doesn't know anything about numpy. It just passes its incoming parameter through to the C interface for portaudio. That interface asks the object for its "buffer" (you can read about the buffer protocol at docs.python.org/3/c-api/buffer.html). In the case of numpy, it returns a pointer to the data, without any header information. Yes, the sole issue is that pyaudio assumed that the incoming buffer had 1-byte elements, so it only told portaudio to play 1/4 of them.
Thanks for the help!
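The buffer protocol point from the comments can be checked from Python with `memoryview`. This is a small sketch using a standard-library `array("f")` in place of the NumPy array (an assumption made so it runs without numpy); a float32 NumPy array exposes its data through the same protocol.

```python
from array import array

# Four 4-byte floats, all exactly representable in float32.
samples = array("f", [0.0, 0.5, -0.5, 1.0])
view = memoryview(samples)  # asks the object for its buffer

element_count = len(samples)  # what len() reports: elements
item_size = view.itemsize     # bytes per element
byte_count = view.nbytes      # total bytes in the buffer

# The exported buffer is exactly the raw sample data, with no header in front.
same_data = view.tobytes() == samples.tobytes()

print(element_count, item_size, byte_count, same_data)
```

The buffer holds the same bytes either way. Only the length the consumer reads off the object differs.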