[python] Reading *.wav files in Python

I need to analyze sound written in a .wav file. For that I need to transform this file into set of numbers (arrays, for example). I think I need to use the wave package. However, I do not know how exactly it works. For example I did the following:

import wave
w = wave.open('/usr/share/sounds/ekiga/voicemail.wav', 'r')
for i in range(w.getnframes()):
    frame = w.readframes(i)
    print frame

As a result of this code I expected to see sound pressure as function of time. In contrast I see a lot of strange, mysterious symbols (which are not hexadecimal numbers). Can anybody, pleas, help me with that?

This question is related to python audio wav wave

The answer is


IMHO, the easiest way to get audio data from a sound file into a NumPy array is SoundFile:

import soundfile as sf
data, fs = sf.read('/usr/share/sounds/ekiga/voicemail.wav')

This also supports 24-bit files out of the box.

There are many sound file libraries available, I've written an overview where you can see a few pros and cons. It also features a page explaining how to read a 24-bit wav file with the wave module.


Different Python modules to read wav:

There is at least these following libraries to read wave audio files:

The most simple example:

This is a simple example with SoundFile:

import soundfile as sf
data, samplerate = sf.read('existing_file.wav') 

Format of the output:

Warning, the data are not always in the same format, that depends on the library. For instance:

from scikits import audiolab
from scipy.io import wavfile
from sys import argv
for filepath in argv[1:]:
    x, fs, nb_bits = audiolab.wavread(filepath)
    print('Reading with scikits.audiolab.wavread:', x)
    fs, x = wavfile.read(filepath)
    print('Reading with scipy.io.wavfile.read:', x)

Output:

Reading with scikits.audiolab.wavread: [ 0.          0.          0.         ..., -0.00097656 -0.00079346 -0.00097656]
Reading with scipy.io.wavfile.read: [  0   0   0 ..., -32 -26 -32]

SoundFile and Audiolab return floats between -1 and 1 (as matab does, that is the convention for audio signals). Scipy and wave return integers, which you can convert to floats according to the number of bits of encoding, for example:

from scipy.io.wavfile import read as wavread
samplerate, x = wavread(audiofilename)  # x is a numpy array of integers, representing the samples 
# scale to -1.0 -- 1.0
if x.dtype == 'int16':
    nb_bits = 16  # -> 16-bit wav files
elif x.dtype == 'int32':
    nb_bits = 32  # -> 32-bit wav files
max_nb_bit = float(2 ** (nb_bits - 1))
samples = x / (max_nb_bit + 1)  # samples is a numpy array of floats representing the samples 

I needed to read a 1-channel 24-bit WAV file. The post above by Nak was very useful. However, as mentioned above by basj 24-bit is not straightforward. I finally got it working using the following snippet:

from scipy.io import wavfile
TheFile = 'example24bit1channelFile.wav'
[fs, x] = wavfile.read(TheFile)

# convert the loaded data into a 24bit signal

nx = len(x)
ny = nx/3*4    # four 3-byte samples are contained in three int32 words

y = np.zeros((ny,), dtype=np.int32)    # initialise array

# build the data left aligned in order to keep the sign bit operational.
# result will be factor 256 too high

y[0:ny:4] = ((x[0:nx:3] & 0x000000FF) << 8) | \
  ((x[0:nx:3] & 0x0000FF00) << 8) | ((x[0:nx:3] & 0x00FF0000) << 8)
y[1:ny:4] = ((x[0:nx:3] & 0xFF000000) >> 16) | \
  ((x[1:nx:3] & 0x000000FF) << 16) | ((x[1:nx:3] & 0x0000FF00) << 16)
y[2:ny:4] = ((x[1:nx:3] & 0x00FF0000) >> 8) | \
  ((x[1:nx:3] & 0xFF000000) >> 8) | ((x[2:nx:3] & 0x000000FF) << 24)
y[3:ny:4] = (x[2:nx:3] & 0x0000FF00) | \
  (x[2:nx:3] & 0x00FF0000) | (x[2:nx:3] & 0xFF000000)

y = y/256   # correct for building 24 bit data left aligned in 32bit words

Some additional scaling is required if you need results between -1 and +1. Maybe some of you out there might find this useful


u can also use simple import wavio library u also need have some basic knowledge of the sound.


Using the struct module, you can take the wave frames (which are in 2's complementary binary between -32768 and 32767 (i.e. 0x8000 and 0x7FFF). This reads a MONO, 16-BIT, WAVE file. I found this webpage quite useful in formulating this:

import wave, struct

wavefile = wave.open('sine.wav', 'r')

length = wavefile.getnframes()
for i in range(0, length):
    wavedata = wavefile.readframes(1)
    data = struct.unpack("<h", wavedata)
    print(int(data[0]))

This snippet reads 1 frame. To read more than one frame (e.g., 13), use

wavedata = wavefile.readframes(13)
data = struct.unpack("<13h", wavedata)

if its just two files and the sample rate is significantly high, you could just interleave them.

from scipy.io import wavfile
rate1,dat1 = wavfile.read(File1)
rate2,dat2 = wavfile.read(File2)

if len(dat2) > len(dat1):#swap shortest
    temp = dat2
    dat2 = dat1
    dat1 = temp

output = dat1
for i in range(len(dat2)/2): output[i*2]=dat2[i*2]

wavfile.write(OUTPUT,rate,dat)

If you want to procces an audio block by block, some of the given solutions are quite awful in the sense that they imply loading the whole audio into memory producing many cache misses and slowing down your program. python-wavefile provides some pythonic constructs to do NumPy block-by-block processing using efficient and transparent block management by means of generators. Other pythonic niceties are context manager for files, metadata as properties... and if you want the whole file interface, because you are developing a quick prototype and you don't care about efficency, the whole file interface is still there.

A simple example of processing would be:

import sys
from wavefile import WaveReader, WaveWriter

with WaveReader(sys.argv[1]) as r :
    with WaveWriter(
            'output.wav',
            channels=r.channels,
            samplerate=r.samplerate,
            ) as w :

        # Just to set the metadata
        w.metadata.title = r.metadata.title + " II"
        w.metadata.artist = r.metadata.artist

        # This is the prodessing loop
        for data in r.read_iter(size=512) :
            data[1] *= .8     # lower volume on the second channel
            w.write(data)

The example reuses the same block to read the whole file, even in the case of the last block that usually is less than the required size. In this case you get an slice of the block. So trust the returned block length instead of using a hardcoded 512 size for any further processing.


Here's a Python 3 solution using the built in wave module [1], that works for n channels, and 8,16,24... bits.

import sys
import wave

def read_wav(path):
    with wave.open(path, "rb") as wav:
        nchannels, sampwidth, framerate, nframes, _, _ = wav.getparams()
        print(wav.getparams(), "\nBits per sample =", sampwidth * 8)

        signed = sampwidth > 1  # 8 bit wavs are unsigned
        byteorder = sys.byteorder  # wave module uses sys.byteorder for bytes

        values = []  # e.g. for stereo, values[i] = [left_val, right_val]
        for _ in range(nframes):
            frame = wav.readframes(1)  # read next frame
            channel_vals = []  # mono has 1 channel, stereo 2, etc.
            for channel in range(nchannels):
                as_bytes = frame[channel * sampwidth: (channel + 1) * sampwidth]
                as_int = int.from_bytes(as_bytes, byteorder, signed=signed)
                channel_vals.append(as_int)
            values.append(channel_vals)

    return values, framerate

You can turn the result into a NumPy array.

import numpy as np

data, rate = read_wav(path)
data = np.array(data)

Note, I've tried to make it readable rather than fast. I found reading all the data at once was almost 2x faster. E.g.

with wave.open(path, "rb") as wav:
    nchannels, sampwidth, framerate, nframes, _, _ = wav.getparams()
    all_bytes = wav.readframes(-1)

framewidth = sampwidth * nchannels
frames = (all_bytes[i * framewidth: (i + 1) * framewidth]
            for i in range(nframes))

for frame in frames:
    ...

Although python-soundfile is roughly 2 orders of magnitude faster (hard to approach this speed with pure CPython).

[1] https://docs.python.org/3/library/wave.html


If you're going to perform transfers on the waveform data then perhaps you should use SciPy, specifically scipy.io.wavfile.


PyDub (http://pydub.com/) has not been mentioned and that should be fixed. IMO this is the most comprehensive library for reading audio files in Python right now, although not without its faults. Reading a wav file:

from pydub import AudioSegment

audio_file = AudioSegment.from_wav('path_to.wav')
# or
audio_file = AudioSegment.from_file('path_to.wav')

# do whatever you want with the audio, change bitrate, export, convert, read info, etc.
# Check out the API docs http://pydub.com/

PS. The example is about reading a wav file, but PyDub can handle a lot of various formats out of the box. The caveat is that it's based on both native Python wav support and ffmpeg, so you have to have ffmpeg installed and a lot of the pydub capabilities rely on the ffmpeg version. Usually if ffmpeg can do it, so can pydub (which is quite powerful).

Non-disclaimer: I'm not related to the project, but I am a heavy user.


You can accomplish this using the scikits.audiolab module. It requires NumPy and SciPy to function, and also libsndfile.

Note, I was only able to get it to work on Ubunutu and not on OSX.

from scikits.audiolab import wavread

filename = "testfile.wav"

data, sample_frequency,encoding = wavread(filename)

Now you have the wav data


Examples related to python

programming a servo thru a barometer Is there a way to view two blocks of code from the same file simultaneously in Sublime Text? python variable NameError Why my regexp for hyphenated words doesn't work? Comparing a variable with a string python not working when redirecting from bash script is it possible to add colors to python output? Get Public URL for File - Google Cloud Storage - App Engine (Python) Real time face detection OpenCV, Python xlrd.biffh.XLRDError: Excel xlsx file; not supported Could not load dynamic library 'cudart64_101.dll' on tensorflow CPU-only installation

Examples related to audio

How to prevent "The play() request was interrupted by a call to pause()" error? Creating and playing a sound in swift How to play or open *.mp3 or *.wav sound file in c++ program? How to playback MKV video in web browser? Play audio as microphone input HTML embed autoplay="false", but still plays automatically Autoplay an audio with HTML5 embed tag while the player is invisible Playing mp3 song on python Javascript Audio Play on click Play sound on button click android

Examples related to wav

How to handle Uncaught (in promise) DOMException: The play() request was interrupted by a call to pause() Playing .mp3 and .wav in Java? How to play .wav files with java Reading *.wav files in Python Detect & Record Audio in Python

Examples related to wave

Reading *.wav files in Python