I am now learning to build a TTS project based on Tacotron-2.

The original save_wav(wav, path, sr) function saves a numpy array to a .wav file using:

wav *= 32767 / max(0.01, np.max(np.abs(wav)))
scipy.io.wavfile.write(path, hparams.sample_rate, wav.astype(np.int16))
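
For reference, the scaling step can be sketched on its own as follows. This is only an illustration: to_int16 is a hypothetical helper name, not part of the Tacotron-2 code, and unlike the original line it returns a new array instead of modifying wav in place.

```python
import numpy as np

def to_int16(wav):
    """Peak-normalize a float signal and convert it to 16-bit PCM samples.

    Hypothetical helper: mirrors `wav *= 32767 / max(0.01, np.max(np.abs(wav)))`
    followed by `.astype(np.int16)`, but leaves the input array untouched.
    """
    wav = np.asarray(wav, dtype=np.float64)
    # The 0.01 floor keeps near-silent input from being amplified without bound.
    peak = max(0.01, float(np.max(np.abs(wav))))
    # astype(np.int16) truncates toward zero; it does not round.
    return (wav * (32767 / peak)).astype(np.int16)

pcm = to_int16(np.array([0.0, 0.5, -1.0]))
```

The loudest sample ends up at full scale (magnitude 32767), and everything else is scaled by the same factor.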

However, after obtaining the scaled numpy array with wav *= 32767 / max(0.01, np.max(np.abs(wav))), I want to convert it to an .mp3 file so that it is easier to send back as a streaming response.

Right now, I can convert a .wav bytes object to an .mp3 file, but the problem is that I don't know how to convert the numpy array to a .wav bytes object.

I searched around and found that I apparently need to add a header to the numpy array, but almost every post I looked at uses modules like scipy.io.wavfile and audioop, which first save the numpy array to a .wav file and then read it back with open('filename.wav', 'rb').

(According to the documentation for scipy.io.wavfile.write, the filename parameter should be a string or an open file handle, which, as I understand it, means the generated .wav file will be saved on disk.)

Could anyone give any suggestion on how to achieve this?

2 Answers

Use io.BytesIO

There is a much simpler and more convenient solution: io.BytesIO gives you an in-memory bytes buffer with a file-like interface, so you can write to it and read from it as if it were a file:

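import io
from scipy.io.wavfile import write

# In-memory buffer that behaves like a file opened in binary mode
byte_io = io.BytesIO()
# Writes the WAV header and the samples into the buffer instead of to disk
write(byte_io, <audio_sr>, <audio_numpy_array>)
# Rewind before reading so that read() starts at the first byte
byte_io.seek(0)
result_bytes = byte_io.read()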

Replace <audio_sr> and <audio_numpy_array> with your own sample rate and array of sample values. You can then treat result_bytes as the bytes of a .wav file (as required).

P.S. Also check this simple gist of how to perform values array -> bytes -> values array for wav file.
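
The values array -> bytes -> values array round trip can be sketched roughly like this. It is a minimal illustration rather than the content of the gist: array_to_wav_bytes and wav_bytes_to_array are hypothetical helper names, and the 440 Hz test tone and the 22050 Hz sample rate are assumptions made up for the example. It uses getvalue() instead of read(), since getvalue() returns the whole buffer regardless of the current stream position.

```python
import io

import numpy as np
from scipy.io import wavfile

def array_to_wav_bytes(samples, sample_rate):
    """Encode a numpy array as WAV bytes without touching the disk (hypothetical helper)."""
    buf = io.BytesIO()
    wavfile.write(buf, sample_rate, samples)
    # getvalue() does not depend on where the stream position currently is.
    return buf.getvalue()

def wav_bytes_to_array(wav_bytes):
    """Decode WAV bytes back into (sample_rate, numpy array) (hypothetical helper)."""
    return wavfile.read(io.BytesIO(wav_bytes))

# Assumed example input: one second of a 440 Hz tone as int16 samples.
sample_rate = 22050
t = np.arange(sample_rate) / sample_rate
tone = (0.5 * 32767 * np.sin(2 * np.pi * 440 * t)).astype(np.int16)

wav_bytes = array_to_wav_bytes(tone, sample_rate)
rate_back, tone_back = wav_bytes_to_array(wav_bytes)
```

The resulting wav_bytes can be handed to whatever already converts .wav bytes to .mp3.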

1 Comment

Just an FYI: if you use a similar method and byte_io.read() returns nothing, you might need to seek to the beginning of the buffer with byte_io.seek(0) before reading.

I finally solved this problem by writing new modules based on modified versions of scipy.io.wavfile.write and pydub's audio_segment.py.

Besides, when you want to operate on wave/mp3 bytes without saving them as a .wav/.mp3 file (which is what you normally do through some handy API or Python package), you have to add the header manually. It is not too hard a task if you look into the source code of those excellent packages.
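
As one possible way to get the header added, and not the modified modules mentioned above, the standard-library wave module can write the RIFF/WAVE header into an in-memory buffer. This is a sketch under assumptions: pcm16_to_wav_bytes is a hypothetical helper name, and the input is assumed to already be 16-bit PCM samples (interleaved if there is more than one channel).

```python
import io
import wave

import numpy as np

def pcm16_to_wav_bytes(samples, sample_rate, channels=1):
    """Wrap 16-bit PCM samples in a WAV container in memory (hypothetical helper)."""
    # WAV stores 16-bit samples as little-endian signed integers.
    pcm = np.asarray(samples).astype("<i2")
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wf:
        wf.setnchannels(channels)
        wf.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        wf.setframerate(sample_rate)
        wf.writeframes(pcm.tobytes())
    # wave does not close a file object it did not open itself,
    # so the buffer is still readable here.
    return buf.getvalue()

# Assumed example input: four int16 samples at 22050 Hz.
wav_bytes = pcm16_to_wav_bytes(np.array([0, 1000, -1000, 32767], dtype=np.int16), 22050)
```

For plain PCM data like this, the header written by wave is 44 bytes long, followed by the raw sample bytes.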

1 Comment

Are you able to provide the solution you mentioned? Some WAV files may have a non-standard header, for example, which makes this harder.
