How can I plot the input signal from a microphone with matplotlib? I tried plt.plot(frames), but frames is for some reason a string.

a) Why is the frames variable a string list?

b) Why is the data variable a string list?

c) Should they represent the energy/amplitude of a single sample and be integers?

d) Why is the length of data 2048 when I specified a chunk size of 1024?

(I guess it is because I use paInt16, but I still cannot see why it couldn't be 1024.)

I have the following code for microphone input:

import pyaudio
import audioop
import matplotlib.pyplot as plt
import numpy as np
from itertools import izip
import wave


FORMAT = pyaudio.paInt16                # We use 16bit format per sample
CHANNELS = 1
RATE = 44100
CHUNK = 1024                            # 1024 bytes of data read from a buffer
RECORD_SECONDS = 3
WAVE_OUTPUT_FILENAME = "file.wav"

audio = pyaudio.PyAudio()

# start Recording
stream = audio.open(format=FORMAT,
                    channels=CHANNELS,
                    rate=RATE, input=True,
                    frames_per_buffer=CHUNK)

frames = []
for i in range(0, int(RATE / CHUNK * RECORD_SECONDS)):
    data = stream.read(CHUNK)
    frames.append(data)
frames = ''.join(frames)

stream.stop_stream()
stream.close()
audio.terminate()
  • Can you include your import statements with your example code? Also, the frames variable is a string list because you declare it as so, frames = ''.join(frames). You don't need to do that, since you already appended all the frames you need, you have a list. Commented Dec 16, 2015 at 1:04
  • Yes, I am aware of that but data is also a string. I think there is something with struct.unpack that needs to be done, but I have no idea what exactly. Commented Dec 16, 2015 at 1:08
  • But you asked why frames was a list of strings? Anyway, Stream.read() is supposed to return a string, as specified in the API documentation: people.csail.mit.edu/hubert/pyaudio/docs Commented Dec 16, 2015 at 1:15
  • In python3 you need to put a b in front: b"".join(frames) Commented Nov 28, 2019 at 7:02

1 Answer

a) Why is the frames variable a string list?

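As a consequence of b): each read() call returns a string, you append those strings to a list, and ''.join(frames) then merges the list into one long string. That's how you build it in your code.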

b) Why is the data variable a string list?

It is a byte string, that is, just a raw sequence of bytes. That's what read() returns.

c) Should they represent the energy/amplitude of a single sample and be integers?

They are. They're just packed in a byte sequence and not in Python integers.
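To make the packing concrete, here is a minimal sketch using the standard struct module. It does not touch the microphone: the data variable below is a synthetic stand-in for what read() returns, and the little-endian byte order ('<') is an assumption that holds on most desktop machines.

```python
import struct

# Synthetic stand-in for the bytes returned by stream.read():
# four signed 16-bit samples, 2 bytes each, little-endian (assumed).
samples_in = [0, 1000, -1000, 32767]
data = struct.pack('<4h', *samples_in)

# Each sample occupies 2 bytes, so the sample count is len(data) // 2.
count = len(data) // 2
samples_out = list(struct.unpack('<%dh' % count, data))

print(len(data))      # 8
print(samples_out)    # [0, 1000, -1000, 32767]
```

For a whole recording, converting with numpy is usually more convenient than struct, but the idea is the same.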

d) Why is the length of data 2048 when I specified a chunk size of 1024?

1024 is the number of frames, not bytes. With one channel and paInt16, each frame is a single 16-bit sample, i.e. 2 bytes long, so you get 1024 × 2 = 2048 bytes.
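The arithmetic can be written out as a short sketch. SAMPLE_WIDTH is a name introduced here for illustration; it is hard-coded to 2 on the assumption of 16-bit samples, which is what paInt16 means.

```python
CHUNK = 1024        # frames per read(), as in the question
CHANNELS = 1        # mono, as in the question
SAMPLE_WIDTH = 2    # bytes per sample for 16-bit audio (assumed constant)

# One frame holds one sample for every channel.
bytes_per_frame = CHANNELS * SAMPLE_WIDTH
bytes_per_chunk = CHUNK * bytes_per_frame

print(bytes_per_chunk)                 # 2048

# The same chunk recorded in stereo would be twice as long.
print(CHUNK * 2 * SAMPLE_WIDTH)        # 4096
```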

How can I plot the input signal from a microphone with matplotlib? I tried plt.plot(frames), but frames is for some reason a string.

It depends on what you want to plot. The raw amplitude can be obtained by converting the byte string to a numpy array:

fig = plt.figure()
s = fig.add_subplot(111)
# Interpret the raw bytes as signed 16-bit samples (matches paInt16)
amplitude = np.frombuffer(frames, dtype=np.int16)
s.plot(amplitude)
fig.savefig('t.png')

A more useful plot would be a spectrogram:

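fig = plt.figure()
s = fig.add_subplot(111)
# Interpret the raw bytes as signed 16-bit samples (matches paInt16)
amplitude = np.frombuffer(frames, dtype=np.int16)
# Pass the sampling rate so the axes are in seconds and Hz
s.specgram(amplitude, Fs=RATE)
fig.savefig('t.png')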

But you can tinker with amplitude however you want, now that you have a numpy array.
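As one example of such tinkering, here is a rough sketch that scales the samples to floats, builds a time axis in seconds, and computes an RMS level per chunk. The frames value is a synthetic 440 Hz tone standing in for a real recording, and the names normalized, seconds, and rms are made up for this illustration.

```python
import numpy as np

RATE = 44100
CHUNK = 1024

# Synthetic stand-in for the recorded byte string: 4 chunks of a 440 Hz tone.
n = 4 * CHUNK
t = np.arange(n) / RATE
frames = (10000 * np.sin(2 * np.pi * 440 * t)).astype(np.int16).tobytes()

# Same conversion as for a real recording.
amplitude = np.frombuffer(frames, dtype=np.int16)

# Scale int16 samples to floats in [-1, 1).
normalized = amplitude / 32768.0

# Time of each sample in seconds, usable as the x axis of a plot.
seconds = np.arange(len(amplitude)) / RATE

# One RMS value per chunk of 1024 samples (a crude loudness measure).
rms = np.sqrt(np.mean(normalized.reshape(-1, CHUNK) ** 2, axis=1))
print(rms.shape)    # (4,)
```

Plotting seconds against normalized instead of the bare amplitude array gives a time axis in seconds rather than sample indices. The reshape assumes the recording length is a whole number of chunks, which holds for the recording loop in the question.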

4 Comments

why did you have to declare the numpy array with dtype=int16?
numpy can't deduce from a raw byte string what kind of variables are stored there, you have to specify it explicitly. Since the data is read as pyaudio.paInt16, the string contains 2-byte signed integers, and that's what I tell numpy to read.
I see. I am not OP and I know nothing about audio, but is 16-bit a standard sample size, meaning the measured intensity values have 16 bits of precision? And is a "sample" simply one "unit" of sound, the duration of which produces one discrete intensity value?
Kind of standard, I guess, for non-professionals at least. I never used pyaudio myself, but I suspect it would convert the resolution to 16 bit if it wasn't 16-bit in the original file. Yes, a frame is a single measurement (taken 44100 times per second). It may contain several channels (stereo etc), but there's only one in this case.
