I am new to Python, and I am trying to train an audio voice recognition model. I want to read a .wav file and load its contents into NumPy arrays. How can I do that?

Something like this: scipy.io.wavfile.read? Commented Jan 13, 2019 at 23:31

2 Answers

In keeping with @Marco's comment, you can have a look at the SciPy library and, in particular, at scipy.io.

from scipy.io import wavfile

To read your file ('filename.wav'), simply do

output = wavfile.read('filename.wav')

This returns a tuple (which I named 'output'):

  • output[0], the sampling rate (in samples per second)
  • output[1], the sample array you want to analyze (a NumPy array)
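Since the return value is a two-element tuple, it can also be unpacked directly. The following is a minimal sketch of that; the synthetic tone, the temporary file name and the float scaling step are my own additions for illustration (scaling 16-bit samples to roughly [-1, 1) is a common preprocessing step for audio models, but whether your model needs it is an assumption).

```python
import os
import tempfile

import numpy as np
from scipy.io import wavfile

# Build a one-second 440 Hz test tone so the example is self-contained
# (in practice you would already have a .wav file on disk).
rate_out = 16000
t = np.arange(rate_out) / rate_out
tone = (0.5 * np.sin(2 * np.pi * 440 * t) * 32767).astype(np.int16)

path = os.path.join(tempfile.mkdtemp(), 'tone.wav')
wavfile.write(path, rate_out, tone)

# Unpack the tuple instead of indexing output[0] / output[1]
rate, data = wavfile.read(path)

# Duration in seconds = number of frames / frames per second
duration = data.shape[0] / rate

# Optional: scale 16-bit integers to float32 in [-1, 1)
data_float = data.astype(np.float32) / 32768.0
```

For a mono file `data` is 1D; for a multi-channel file it has one column per channel.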

2 Comments

With from scipy.io import wavfile and output = wavfile.read('filename.wav'), would I get the NumPy array?
Yes, output[1] is an np.array with the sound amplitudes (with as many samples per second as indicated by your sample rate, output[0]).
This is possible in a few lines with wave (built in) and numpy. You don't need to use librosa, scipy or soundfile. The last of these gave me problems reading wav files, which is the whole reason I'm writing here now.

import numpy as np
import wave

# Start opening the file with wave
with wave.open('filename.wav') as f:
    # Read the whole file into a buffer. If you are dealing with a large file
    # then you should read it in blocks and process them separately.
    buffer = f.readframes(f.getnframes())
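    # Convert the buffer to a numpy array by checking the size of the sample
    # width in bytes. The output will be a 1D array with interleaved channels.
    # Note: this works for 16- and 32-bit PCM. 8-bit WAV samples are unsigned
    # and numpy has no 24-bit integer type, so those widths need extra handling.
    interleaved = np.frombuffer(buffer, dtype=f'int{f.getsampwidth()*8}')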
    # Reshape it into a 2D array separating the channels in columns.
    data = np.reshape(interleaved, (-1, f.getnchannels()))
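The comment above about large files can be sketched as a generator that reads a fixed number of frames at a time. This is only an illustration: `iter_wav_blocks` and `block_frames` are hypothetical names of my own, and the sketch assumes 16-bit PCM and raises otherwise.

```python
import wave

import numpy as np


def iter_wav_blocks(path, block_frames=4096):
    """Yield successive (frames, channels) int16 arrays from a 16-bit WAV file."""
    with wave.open(str(path), 'rb') as f:
        if f.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        nchannels = f.getnchannels()
        while True:
            # readframes returns at most block_frames frames, and an empty
            # bytes object once the end of the file has been reached.
            buffer = f.readframes(block_frames)
            if not buffer:
                break
            # WAV data is little-endian, hence the explicit '<i2' dtype
            yield np.frombuffer(buffer, dtype='<i2').reshape(-1, nchannels)
```

Each block can then be processed on its own, for example `for block in iter_wav_blocks('filename.wav'): ...`, so the whole file never has to sit in memory at once.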

I like to pack it into a function that also returns the sampling frequency and works with pathlib.Path objects. That way the result can be played using sounddevice:

# play_wav.py
import sounddevice as sd
import numpy as np
import wave

from typing import Tuple
from pathlib import Path


# Utility function that reads the whole `wav` file content into a numpy array
def wave_read(filename: Path) -> Tuple[np.ndarray, int]:
    with wave.open(str(filename), 'rb') as f:
        buffer = f.readframes(f.getnframes())
        inter = np.frombuffer(buffer, dtype=f'int{f.getsampwidth()*8}')
        return np.reshape(inter, (-1, f.getnchannels())), f.getframerate()


if __name__ == '__main__':
    # Play all files in the current directory
    for wav_file in Path().glob('*.wav'):
        print(f"Playing {wav_file}")
        data, fs = wave_read(wav_file)
        sd.play(data, samplerate=fs, blocking=True)
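The dtype trick in `wave_read` covers 16- and 32-bit files. If you also run into 8-bit or 24-bit files, something along these lines might help. It is a rough sketch, not part of the function above: `pcm_bytes_to_array` and `wave_read_any_width` are hypothetical names, and it assumes plain little-endian integer PCM as produced by the wave module.

```python
import wave

import numpy as np


def pcm_bytes_to_array(buffer: bytes, sampwidth: int, nchannels: int) -> np.ndarray:
    """Convert raw little-endian PCM bytes into a (frames, channels) array."""
    if sampwidth == 1:
        # 8-bit WAV samples are unsigned (0..255, silence at 128)
        samples = np.frombuffer(buffer, dtype=np.uint8)
    elif sampwidth == 2:
        samples = np.frombuffer(buffer, dtype='<i2')
    elif sampwidth == 3:
        # numpy has no 24-bit integer type: place each 3-byte sample in the
        # upper three bytes of a little-endian int32, then shift right by 8.
        # The arithmetic shift preserves the sign.
        raw = np.frombuffer(buffer, dtype=np.uint8).reshape(-1, 3)
        padded = np.zeros((raw.shape[0], 4), dtype=np.uint8)
        padded[:, 1:] = raw
        samples = padded.view('<i4').reshape(-1) >> 8
    elif sampwidth == 4:
        samples = np.frombuffer(buffer, dtype='<i4')
    else:
        raise ValueError(f"unsupported sample width: {sampwidth} bytes")
    return samples.reshape(-1, nchannels)


def wave_read_any_width(filename):
    """Like wave_read, but dispatching on the sample width."""
    with wave.open(str(filename), 'rb') as f:
        buffer = f.readframes(f.getnframes())
        data = pcm_bytes_to_array(buffer, f.getsampwidth(), f.getnchannels())
        return data, f.getframerate()
```

Note that 24-bit samples come back as int32 holding values in the 24-bit range, so scale by 2**23 rather than 2**31 if you normalize them.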
