How to get sound input from microphone in Python, and process it on the fly?

Question

Greetings,

I'm trying to write a program in Python which would print a string every time it gets a tap in the microphone. When I say "tap", I mean a loud sudden noise or something similar.

I searched on SO and found this post: Recognising tone of the audio.

I think the PyAudio library would fit my needs, but I'm not quite sure how to make my program wait for an audio signal (realtime microphone monitoring), and when I got one, how to process it (do I need to use a Fourier Transform like it was instructed in the above post)?

Thank you in advance for any help you could give me.

User · Answer

I know it's an old question, but if someone is looking here again, see https://python-sounddevice.readthedocs.io/en/0.4.1/index.html

It has a nice example, "Input to Output Pass-Through", here: https://python-sounddevice.readthedocs.io/en/0.4.1/examples.html#input-to-output-pass-through

and a lot of other examples as well.
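To connect this to the tap question, here is a minimal sketch of microphone monitoring with python-sounddevice. It assumes the library's RawInputStream with 16-bit samples; the names peak_level and monitor and the threshold, sample rate, and duration values are made up for illustration and would need tuning on real hardware.

```python
import array


def peak_level(raw_bytes):
    """Largest absolute sample value in a buffer of 16-bit signed PCM."""
    samples = array.array("h", raw_bytes)  # "h" = signed 16-bit, native byte order
    return max((abs(s) for s in samples), default=0)


def monitor(threshold=5000, samplerate=8000, seconds=10):
    """Print a message whenever a captured block's peak reaches `threshold`.

    All default values are illustrative assumptions, not recommendations.
    """
    # Third-party dependency, imported here so peak_level works without it.
    import sounddevice as sd

    def callback(indata, frames, time, status):
        # sounddevice calls this on its audio thread for every captured block.
        if status:
            print(status)
        level = peak_level(bytes(indata))
        if level >= threshold:
            print("tap! peak =", level)

    with sd.RawInputStream(samplerate=samplerate, channels=1,
                           dtype="int16", callback=callback):
        sd.sleep(int(seconds * 1000))  # keep the stream open while callbacks run
```

Calling monitor() listens on the default input device for about ten seconds. The callback does the "waiting" for you, so the main program only has to keep the stream open.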

User · Answer

If you are using Linux, you can use pyalsaaudio. For Windows, we have PyAudio, and there is also a library called SoundAnalyse.

I found an example for Linux here:

    #!/usr/bin/python
    ## This is an example of a simple sound capture script.
    ##
    ## The script opens an ALSA pcm for sound capture. Set
    ## various attributes of the capture, and reads in a loop,
    ## Then prints the volume.
    ##
    ## To test it out, run it and shout at your microphone:

    # Note: audioop was removed from the standard library in Python 3.13.
    import alsaaudio, time, audioop

    # Open the device in nonblocking capture mode. The last argument could
    # just as well have been zero for blocking mode. Then we could have
    # left out the sleep call in the bottom of the loop
    inp = alsaaudio.PCM(alsaaudio.PCM_CAPTURE, alsaaudio.PCM_NONBLOCK)

    # Set attributes: Mono, 8000 Hz, 16 bit little endian samples
    inp.setchannels(1)
    inp.setrate(8000)
    inp.setformat(alsaaudio.PCM_FORMAT_S16_LE)

    # The period size controls the internal number of frames per period.
    # The significance of this parameter is documented in the ALSA api.
    # For our purposes, it is sufficient to know that reads from the device
    # will return this many frames. Each frame being 2 bytes long.
    # This means that the reads below will return either 320 bytes of data
    # or 0 bytes of data. The latter is possible because we are in nonblocking
    # mode.
    inp.setperiodsize(160)

    while True:
        # Read data from device
        l, data = inp.read()
        if l:
            # Return the maximum of the absolute value of all samples in a fragment.
            # (print is a function in Python 3; the original used the Python 2 statement.)
            print(audioop.max(data, 2))
        time.sleep(.001)

User · Answer

> and when I got one how to process it (do I need to use Fourier Transform like it was instructed in the above post)?

If you want a "tap", then I think you are interested in amplitude more than frequency. So Fourier transforms probably aren't useful for your particular goal. You probably want to make a running measurement of the short-term (say 10 ms) amplitude of the input, and detect when it suddenly increases by a certain delta. You would need to tune the parameters of:

- what is the "short-term" amplitude measurement
- what is the delta increase you look for
- how quickly the delta change must occur

Although I said you're not interested in frequency, you might want to do some filtering first, to filter out especially low and high frequency components. That might help you avoid some "false positives". You could do that with an FIR or IIR digital filter; Fourier isn't necessary.
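The amplitude approach described in this answer could be sketched roughly as follows. This is only an illustration under stated assumptions: the short-term amplitude is taken to be the RMS of a 10 ms window, and the function names (window_rms, detect_taps) and the default values for delta and rise_windows are hypothetical starting points, not values from the answer.

```python
import math
from collections import deque


def window_rms(samples):
    """Root-mean-square amplitude of one window of integer samples."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def detect_taps(samples, rate=8000, window_ms=10, delta=3000.0, rise_windows=2):
    """Return start times (in seconds) of windows where the amplitude jumps.

    window_ms    -- length of the "short-term" amplitude measurement
    delta        -- minimum RMS increase that counts as a tap (assumed value)
    rise_windows -- how many windows the increase may take to occur
    """
    size = rate * window_ms // 1000        # samples per window
    history = deque(maxlen=rise_windows)   # RMS of the most recent windows
    taps = []
    in_tap = False
    for start in range(0, len(samples) - size + 1, size):
        level = window_rms(samples[start:start + size])
        # A jump means the level rose by at least delta over the recent minimum.
        jumped = bool(history) and level - min(history) >= delta
        if jumped and not in_tap:          # report only the first window of a tap
            taps.append(start / rate)
        in_tap = jumped
        history.append(level)
    return taps
```

The three arguments map onto the three tuning parameters listed above. The samples can be any sequence of integers, for example 16-bit values decoded from the microphone data, and the sketch does no filtering, so a high-pass or band-pass stage would have to run before it.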
