I started working with FFmpeg and I want to build a list containing the start and end timestamps of the silence intervals in an audio file. I can already print these intervals with FFmpeg, but I need to format that output so it is a bit more readable, which is why I want to turn it into a list and then print it with a custom function. I know I should probably use a regex here, but I am not sure how to write it, nor how to read the FFmpeg console output in the first place. My silence-detection function looks like this:
import subprocess

def detect_silence_ffmpeg():
    command = r"ffmpeg -i audio.wav -af silencedetect=n=-40dB:d=0.5 -f null - "
    subprocess.call(command, shell=True)
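I assume the first step is to capture that console output instead of just letting it print to the terminal, so the timestamps can be parsed afterwards. This is a rough, untested sketch of what I have in mind (I'm guessing the silencedetect messages go to stderr):

import subprocess

def detect_silence_ffmpeg():
    command = "ffmpeg -i audio.wav -af silencedetect=n=-40dB:d=0.5 -f null -"
    # Keep stderr, since that seems to be where FFmpeg writes the silencedetect lines.
    result = subprocess.run(command, shell=True,
                            stderr=subprocess.PIPE, universal_newlines=True)
    return result.stderr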
For reference, the console output of the original call on a 7-second sample is:
ffmpeg version git-2020-06-03-b6d7c4c Copyright (c) 2000-2020 the FFmpeg developers
built with gcc 9.3.1 (GCC) 20200523
configuration: --enable-gpl --enable-version3 --enable-sdl2 --enable-fontconfig --enable-gnutls --enable-iconv --enable-libass --enable-libdav1d --enable-libbluray --enable-libfreetype --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-libopus --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libsrt --enable-libtheora --enable-libtwolame --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxml2 --enable-libzimg --enable-lzma --enable-zlib --enable-gmp --enable-libvidstab --enable-libvmaf --enable-libvorbis --enable-libvo-amrwbenc --enable-libmysofa --enable-libspeex --enable-libxvid --enable-libaom --disable-w32threads --enable-libmfx --enable-ffnvcodec --enable-cuda-llvm --enable-cuvid --enable-d3d11va --enable-nvenc --enable-nvdec --enable-dxva2 --enable-avisynth --enable-libopenmpt --enable-amf
libavutil 56. 49.100 / 56. 49.100
libavcodec 58. 90.100 / 58. 90.100
libavformat 58. 44.100 / 58. 44.100
libavdevice 58. 9.103 / 58. 9.103
libavfilter 7. 84.100 / 7. 84.100
libswscale 5. 6.101 / 5. 6.101
libswresample 3. 6.100 / 3. 6.100
libpostproc 55. 6.100 / 55. 6.100
Guessed Channel Layout for Input Stream #0.0 : stereo
Input #0, wav, from 'audio.wav':
Metadata:
encoder : Lavf58.44.100
Duration: 00:00:07.34, bitrate: 1411 kb/s
Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 44100 Hz, stereo, s16, 1411 kb/s
Stream mapping:
Stream #0:0 -> #0:0 (pcm_s16le (native) -> pcm_s16le (native))
Press [q] to stop, [?] for help
Output #0, null, to 'pipe:':
Metadata:
encoder : Lavf58.44.100
Stream #0:0: Audio: pcm_s16le, 44100 Hz, stereo, s16, 1411 kb/s
Metadata:
encoder : Lavc58.90.100 pcm_s16le
[silencedetect @ 00000202fc71e680] silence_start: 0
[silencedetect @ 00000202fc71e680] silence_end: 1.16374 | silence_duration: 1.16374
[silencedetect @ 00000202fc71e680] silence_start: 1.94558
[silencedetect @ 00000202fc71e680] silence_end: 3.41345 | silence_duration: 1.46787
[silencedetect @ 00000202fc71e680] silence_start: 3.8578
[silencedetect @ 00000202fc71e680] silence_end: 5.84844 | silence_duration: 1.99063
[silencedetect @ 00000202fc71e680] silence_start: 6.43653
size=N/A time=00:00:07.33 bitrate=N/A speed= 308x
video:0kB audio:1264kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: unknown
[silencedetect @ 00000202fc71e680] silence_end: 7.33868 | silence_duration: 0.902154
This code will eventually run on videos that are an hour or so long, so I really need a cleaner way to present this output than the raw log above. Below is roughly the direction I'm thinking of; any help would be much appreciated :)
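This is my rough, untested idea for the parsing itself (the regex and the helper names are just my own guesses, not something I've verified against every FFmpeg build):

import re

def parse_silence(ffmpeg_output):
    # Grab every silence_start and silence_end value from the captured text.
    starts = re.findall(r"silence_start: (\d+(?:\.\d+)?)", ffmpeg_output)
    ends = re.findall(r"silence_end: (\d+(?:\.\d+)?)", ffmpeg_output)
    # Pair them up in order; zip simply drops a final silence_start with no matching end.
    return [(float(s), float(e)) for s, e in zip(starts, ends)]

def print_intervals(intervals):
    for start, end in intervals:
        print(f"silence from {start:7.2f}s to {end:7.2f}s ({end - start:.2f}s long)")

intervals = parse_silence(detect_silence_ffmpeg())
print_intervals(intervals)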
P.S.: the idea is that this should mainly run on Windows, but if it can be made cross-platform as well, that would be great.
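If it helps with the cross-platform part, I could presumably also pass the command as a list of arguments instead of a single shell string, which (as far as I understand) avoids shell differences between Windows and Linux, as long as ffmpeg is on the PATH:

command = ["ffmpeg", "-i", "audio.wav",
           "-af", "silencedetect=n=-40dB:d=0.5",
           "-f", "null", "-"]
result = subprocess.run(command, stderr=subprocess.PIPE, universal_newlines=True)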