Can I do recognition from numpy array in python SpeechRecognition?

Question

I'm recording audio into a NumPy array dt and then writing it to a .wav file with code like this:

dt = np.int16(dt/np.max(np.abs(dt)) * 32767)
scipy.io.wavfile.write("tmp.wav", samplerate, dt)

After that, I read the file back and run recognition with this code:

import speech_recognition as sr
r = sr.Recognizer()
with sr.AudioFile("tmp.wav") as source:
    audio_text = r.listen(source)
    return r.recognize_google(audio_text, language = lang)

Can I run recognition directly on the NumPy array, without going through a .wav file? Writing and re-reading the file takes extra time.

anroesti · Accepted Answer · 2020-05-22 20:48:49Z

0

Assuming the SpeechRecognition package (imported as speech_recognition) is the module you are using, its documentation says you can pass any file-like object to AudioFile(). File-like objects are objects that support read and write operations.

You should be able to put the byte representation of the WAV file into an io.BytesIO object, which supports these operations, and pass that to your speech recognition module. scipy.io.wavfile.write() supports writing to such file-like objects.

I don't have the package or any WAV files to test it, but let me know if something like this works:

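import io
import scipy.io.wavfile
import speech_recognition as sr

wav_bytes = io.BytesIO()
scipy.io.wavfile.write(wav_bytes, samplerate, dt)  # writes the WAV data into memory, not to disk
with sr.AudioFile(wav_bytes) as source:
    ...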
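To make the idea above concrete, here is a minimal sketch of the whole round trip. It assumes dt is a 1-D array of samples and samplerate is an integer, as in the question. The helper names array_to_wav_buffer and recognize_array are made up for illustration, and the recognition call itself is untested here because it needs network access.

```python
import io

import numpy as np
from scipy.io import wavfile


def array_to_wav_buffer(dt, samplerate):
    """Normalise `dt` to int16 and write it as WAV data into memory."""
    dt = np.asarray(dt, dtype=np.float64)
    peak = np.max(np.abs(dt))
    if peak == 0:
        raise ValueError("cannot normalise an all-zero signal")
    pcm = np.int16(dt / peak * 32767)  # same scaling as in the question

    buf = io.BytesIO()
    wavfile.write(buf, samplerate, pcm)
    buf.seek(0)  # defensive: make sure readers start at the RIFF header
    return buf


def recognize_array(dt, samplerate, lang="en-US"):
    """Hypothetical helper: recognise speech without touching the disk."""
    # Imported lazily so the conversion helper works without the package.
    import speech_recognition as sr

    r = sr.Recognizer()
    with sr.AudioFile(array_to_wav_buffer(dt, samplerate)) as source:
        audio = r.record(source)  # read the entire in-memory file
    return r.recognize_google(audio, language=lang)
```

This uses r.record() rather than r.listen() because the source is a finite file, not a live microphone; that is a design choice in this sketch, not something the question requires.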



4 Comments

Demetry Pascal Over a year ago

I know that I need to convert the NumPy array into some object that SpeechRecognition accepts, but I don't know how to do it: which approach, which functions?

anroesti Over a year ago

Well, I'm not going to write the code for you. Have you tried any of what I suggested, using BytesIO?

Demetry Pascal Over a year ago

It doesn't work. SciPy works only with ndarray, but here is an answer on how to play a NumPy array in PyAudio. The new question is how to use a PyAudio stream in SpeechRecognition methods.

anroesti Over a year ago

I understand what you want to do. Numpy array -> SpeechRecognition method. What I'm telling you is you need to use a file-like object, such as BytesIO for that. This doesn't actually write any files, it's all in memory. The answer you linked does a similar thing; they also use file-like objects.

H_Barrio · Accepted Answer · 2022-03-11 16:35:01Z

0

You can create an AudioData object first; this is the audio source the recognizer needs, built here from an in-memory file-like object:

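import io
from scipy.io.wavfile import write
import speech_recognition

# Write the array as WAV data into an in-memory buffer.
byte_io = io.BytesIO()
write(byte_io, sr, audio_array)
byte_io.seek(0)  # rewind before reading the buffer back
result_bytes = byte_io.read()  # note: includes the WAV header as well as the samples

# The last argument is the sample width: 2 bytes per int16 sample.
audio_data = speech_recognition.AudioData(result_bytes, sr, 2)
r = speech_recognition.Recognizer()
text = r.recognize_google(audio_data)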

audio_array is a 1-D numpy.ndarray with int16 values and sr is the sampling rate.
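Since AudioData takes raw PCM frames plus a sample rate and a sample width, a possible shortcut is to hand it the array's bytes directly and skip the WAV container altogether. The sketch below assumes mono int16 input as described above. pcm_bytes and recognize_int16_array are hypothetical helper names, and the recognition call is untested here because it needs network access.

```python
import numpy as np


def pcm_bytes(audio_array):
    """Return the raw little-endian 16-bit PCM bytes of a 1-D int16 array."""
    audio_array = np.asarray(audio_array)
    if audio_array.ndim != 1 or audio_array.dtype != np.int16:
        raise ValueError("expected a 1-D int16 array (mono audio)")
    return audio_array.astype("<i2").tobytes()


def recognize_int16_array(audio_array, sample_rate, language="en-US"):
    """Hypothetical helper: build AudioData straight from the samples."""
    # Imported lazily so pcm_bytes works without the package installed.
    import speech_recognition

    # Sample width is 2 because each int16 sample occupies 2 bytes.
    audio_data = speech_recognition.AudioData(
        pcm_bytes(audio_array), sample_rate, 2
    )
    r = speech_recognition.Recognizer()
    return r.recognize_google(audio_data, language=language)
```

Compared with reading back a WAV buffer, this passes only the samples, so no header bytes end up being interpreted as audio.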

