Detect & Record Audio in Python

I need to capture audio clips as WAV files that I can then pass to another piece of Python code for processing. The problem is that I need to determine when there is audio present, record it, stop when it goes silent, and then pass that file to the processing module.

I'm thinking it should be possible with the wave module to detect when there is pure silence and discard it, then, as soon as something other than silence is detected, start recording, and when the line goes silent again, stop the recording.
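
Roughly the kind of per-chunk silence test I have in mind, just as a sketch over an existing WAV file (it assumes 16-bit mono PCM and a threshold value I made up):

import wave
from array import array

THRESHOLD = 500      # made-up value, needs tuning for the actual mic and noise floor
CHUNK_FRAMES = 1024  # frames per chunk

def silent_chunks(path):
    """Yield True/False per chunk: True means the chunk's peak is below THRESHOLD."""
    wf = wave.open(path, 'rb')
    try:
        while True:
            raw = wf.readframes(CHUNK_FRAMES)
            if not raw:
                break
            samples = array('h', raw)  # signed 16-bit samples, assuming mono
            yield max(abs(s) for s in samples) < THRESHOLD
    finally:
        wf.close()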

I just can't quite get my head around it; can anyone get me started with a basic example?

This question is related to: python, wav, audio-recording

As a follow-up to Nick Fortescue's answer, here's a more complete example of how to record from the microphone and process the resulting data:

from sys import byteorder
from array import array
from struct import pack

import pyaudio
import wave

THRESHOLD = 500
CHUNK_SIZE = 1024
FORMAT = pyaudio.paInt16
RATE = 44100

def is_silent(snd_data):
    "Returns 'True' if below the 'silent' threshold"
    return max(snd_data) < THRESHOLD

def normalize(snd_data):
    "Average the volume out"
    MAXIMUM = 16384
    times = float(MAXIMUM)/max(abs(i) for i in snd_data)

    r = array('h')
    for i in snd_data:
        r.append(int(i*times))
    return r

def trim(snd_data):
    "Trim the blank spots at the start and end"
    def _trim(snd_data):
        snd_started = False
        r = array('h')

        for i in snd_data:
            if not snd_started and abs(i)>THRESHOLD:
                snd_started = True
                r.append(i)

            elif snd_started:
                r.append(i)
        return r

    # Trim to the left
    snd_data = _trim(snd_data)

    # Trim to the right
    snd_data.reverse()
    snd_data = _trim(snd_data)
    snd_data.reverse()
    return snd_data

def add_silence(snd_data, seconds):
    "Add silence to the start and end of 'snd_data' of length 'seconds' (float)"
    silence = [0] * int(seconds * RATE)
    r = array('h', silence)
    r.extend(snd_data)
    r.extend(silence)
    return r

def record():
    """
    Record a word or words from the microphone and 
    return the data as an array of signed shorts.

    Normalizes the audio, trims silence from the 
    start and end, and pads with 0.5 seconds of 
    blank sound to make sure VLC et al can play 
    it without getting chopped off.
    """
    p = pyaudio.PyAudio()
    stream = p.open(format=FORMAT, channels=1, rate=RATE,
        input=True, output=True,
        frames_per_buffer=CHUNK_SIZE)

    num_silent = 0
    snd_started = False

    r = array('h')

    while True:
        # little endian, signed short
        snd_data = array('h', stream.read(CHUNK_SIZE))
        if byteorder == 'big':
            snd_data.byteswap()
        r.extend(snd_data)

        silent = is_silent(snd_data)

        if silent and snd_started:
            num_silent += 1
        elif not silent and not snd_started:
            snd_started = True

        if snd_started and num_silent > 30:
            break

    sample_width = p.get_sample_size(FORMAT)
    stream.stop_stream()
    stream.close()
    p.terminate()

    r = normalize(r)
    r = trim(r)
    r = add_silence(r, 0.5)
    return sample_width, r

def record_to_file(path):
    "Records from the microphone and outputs the resulting data to 'path'"
    sample_width, data = record()
    data = pack('<' + ('h'*len(data)), *data)

    wf = wave.open(path, 'wb')
    wf.setnchannels(1)
    wf.setsampwidth(sample_width)
    wf.setframerate(RATE)
    wf.writeframes(data)
    wf.close()

if __name__ == '__main__':
    print("please speak a word into the microphone")
    record_to_file('demo.wav')
    print("done - result written to demo.wav")

The pyaudio website has many examples that are pretty short and clear: http://people.csail.mit.edu/hubert/pyaudio/

Update, 14 December 2019: here is the main example from the website linked above, as of 2017:


"""PyAudio Example: Play a WAVE file."""

import pyaudio
import wave
import sys

CHUNK = 1024

if len(sys.argv) < 2:
    print("Plays a wave file.\n\nUsage: %s filename.wav" % sys.argv[0])
    sys.exit(-1)

wf = wave.open(sys.argv[1], 'rb')

p = pyaudio.PyAudio()

stream = p.open(format=p.get_format_from_width(wf.getsampwidth()),
                channels=wf.getnchannels(),
                rate=wf.getframerate(),
                output=True)

data = wf.readframes(CHUNK)

while data:  # readframes() returns b'' when the file is exhausted
    stream.write(data)
    data = wf.readframes(CHUNK)

stream.stop_stream()
stream.close()

p.terminate()

You might want to look at Csound, too. It has several APIs, including one for Python. It might be able to interact with an A/D interface and gather sound samples.


Thanks to cryo for the improved version, on which I based my tested code below:

Instead of adding silence at the start and end of the recording (values = 0), I keep a short stretch of the original audio. This makes the audio sound more natural, as the volume is > 0. See trim().
I also fixed an issue with the previous code: the accumulated silence counter needs to be cleared once recording resumes.

from array import array
from struct import pack
from sys import byteorder
import copy
import pyaudio
import wave

THRESHOLD = 500  # audio levels not normalised.
CHUNK_SIZE = 1024
SILENT_CHUNKS = int(3 * 44100 / 1024)  # about 3 seconds of silence ends the recording
FORMAT = pyaudio.paInt16
FRAME_MAX_VALUE = 2 ** 15 - 1
NORMALIZE_MINUS_ONE_dB = 10 ** (-1.0 / 20)
RATE = 44100
CHANNELS = 1
TRIM_APPEND = RATE // 4  # keep 1/4 second of original audio at each end (integer, used for slicing)

def is_silent(data_chunk):
    """Returns 'True' if below the 'silent' threshold"""
    return max(data_chunk) < THRESHOLD

def normalize(data_all):
    """Amplify the volume out to max -1dB"""
    # MAXIMUM = 16384
    normalize_factor = (float(NORMALIZE_MINUS_ONE_dB * FRAME_MAX_VALUE)
                        / max(abs(i) for i in data_all))

    r = array('h')
    for i in data_all:
        r.append(int(i * normalize_factor))
    return r

def trim(data_all):
    _from = 0
    _to = len(data_all) - 1
    for i, b in enumerate(data_all):
        if abs(b) > THRESHOLD:
            _from = max(0, i - TRIM_APPEND)
            break

    for i, b in enumerate(reversed(data_all)):
        if abs(b) > THRESHOLD:
            _to = min(len(data_all) - 1, len(data_all) - 1 - i + TRIM_APPEND)
            break

    return copy.deepcopy(data_all[_from:(_to + 1)])

def record():
    """Record a word or words from the microphone and 
    return the data as an array of signed shorts."""

    p = pyaudio.PyAudio()
    stream = p.open(format=FORMAT, channels=CHANNELS, rate=RATE, input=True, output=True, frames_per_buffer=CHUNK_SIZE)

    silent_chunks = 0
    audio_started = False
    data_all = array('h')

    while True:
        # little endian, signed short
        data_chunk = array('h', stream.read(CHUNK_SIZE))
        if byteorder == 'big':
            data_chunk.byteswap()
        data_all.extend(data_chunk)

        silent = is_silent(data_chunk)

        if audio_started:
            if silent:
                silent_chunks += 1
                if silent_chunks > SILENT_CHUNKS:
                    break
            else: 
                silent_chunks = 0
        elif not silent:
            audio_started = True              

    sample_width = p.get_sample_size(FORMAT)
    stream.stop_stream()
    stream.close()
    p.terminate()

    data_all = trim(data_all)  # we trim before normalizing, as the threshold applies to the un-normalized wave (as does is_silent())
    data_all = normalize(data_all)
    return sample_width, data_all

def record_to_file(path):
    "Records from the microphone and outputs the resulting data to 'path'"
    sample_width, data = record()
    data = pack('<' + ('h' * len(data)), *data)

    wave_file = wave.open(path, 'wb')
    wave_file.setnchannels(CHANNELS)
    wave_file.setsampwidth(sample_width)
    wave_file.setframerate(RATE)
    wave_file.writeframes(data)
    wave_file.close()

if __name__ == '__main__':
    print("Wait in silence to begin recording; wait in silence to terminate")
    record_to_file('demo.wav')
    print("done - result written to demo.wav")

import pyaudio
import wave
from array import array

FORMAT = pyaudio.paInt16
CHANNELS = 2
RATE = 44100
CHUNK = 1024
RECORD_SECONDS = 15
FILE_NAME = "RECORDING.wav"

audio = pyaudio.PyAudio()  # instantiate PyAudio

# recording prerequisites
stream = audio.open(format=FORMAT, channels=CHANNELS,
                    rate=RATE,
                    input=True,
                    frames_per_buffer=CHUNK)

# start recording
frames = []

for i in range(0, int(RATE / CHUNK * RECORD_SECONDS)):
    data = stream.read(CHUNK)
    data_chunk = array('h', data)
    vol = max(data_chunk)
    if vol >= 500:
        print("something said")
        frames.append(data)
    else:
        print("nothing")
    print("\n")

# end of recording
stream.stop_stream()
stream.close()
audio.terminate()

# write the recorded frames to a WAV file
wavfile = wave.open(FILE_NAME, 'wb')
wavfile.setnchannels(CHANNELS)
wavfile.setsampwidth(audio.get_sample_size(FORMAT))
wavfile.setframerate(RATE)
wavfile.writeframes(b''.join(frames))
wavfile.close()

I think this will help. It is a simple script that checks each chunk for silence: silent chunks are discarded, while anything above the threshold is kept and written to the file.
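
If you need the recording to stop on its own once the speaker goes quiet, as the question asks, rather than running for a fixed RECORD_SECONDS, the for loop above could be replaced with something along these lines. This is only a sketch that reuses the stream, RATE, CHUNK and array names from the script above; the 500 threshold and the two-second silence limit are arbitrary values you would need to tune.

SILENCE_LIMIT_CHUNKS = int(RATE / CHUNK * 2)  # roughly 2 seconds of silence (assumed value)

frames = []
silent_chunks = 0
started = False

while True:
    data = stream.read(CHUNK)
    vol = max(array('h', data))
    if vol >= 500:               # sound detected: (re)start and reset the silence counter
        started = True
        silent_chunks = 0
        frames.append(data)
    elif started:                # already recording: count trailing silence
        silent_chunks += 1
        frames.append(data)
        if silent_chunks > SILENCE_LIMIT_CHUNKS:
            break                # enough silence, stop recording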


I believe the wave module does not support recording, just processing existing files. You might want to look at PyAudio for actually recording. WAV is about the world's simplest file format. With paInt16 you just get a signed integer representing a level, and closer to 0 is quieter. I can't remember if WAV files are high byte first or low byte, but something like this ought to work (sorry, I'm not really a Python programmer):

from array import array

# you'll probably want to experiment on the threshold;
# it depends on how noisy the signal is
threshold = 10

# 'data' here is a chunk of raw bytes read from the stream,
# e.g. data = stream.read(chunk) in the recording loop below
as_ints = array('h', data)
max_value = max(as_ints)
if max_value > threshold:
    pass  # not silence

PyAudio code for recording kept for reference:

import pyaudio
import sys

chunk = 1024
FORMAT = pyaudio.paInt16
CHANNELS = 1
RATE = 44100
RECORD_SECONDS = 5

p = pyaudio.PyAudio()

stream = p.open(format=FORMAT,
                channels=CHANNELS, 
                rate=RATE, 
                input=True,
                output=True,
                frames_per_buffer=chunk)

print "* recording"
for i in range(0, 44100 / chunk * RECORD_SECONDS):
    data = stream.read(chunk)
    # check for silence here by comparing the level with 0 (or some threshold) for 
    # the contents of data.
    # then write data or not to a file

print "* done"

stream.stop_stream()
stream.close()
p.terminate()