[python] Peak-finding algorithm for Python/SciPy

I can write something myself by finding zero-crossings of the first derivative or something, but it seems like a common-enough function to be included in standard libraries. Anyone know of one?

My particular application is a 2D array, but usually it would be used for finding peaks in FFTs, etc.

Specifically, in these kinds of problems, there are multiple strong peaks, and then lots of smaller "peaks" that are just caused by noise that should be ignored. These are just examples; not my actual data:

1-dimensional peaks:

FFT output with peaks

2-dimensional peaks:

Radon transform output with circled peak

The peak-finding algorithm would find the location of these peaks (not just their values), and ideally would find the true inter-sample peak, not just the index with maximum value, probably using quadratic interpolation or something.

Typically you only care about a few strong peaks, so they'd either be chosen because they're above a certain threshold, or because they're the first n peaks of an ordered list, ranked by amplitude.

As I said, I know how to write something like this myself. I'm just asking if there's a pre-existing function or package that's known to work well.

Update:

I translated a MATLAB script and it works decently for the 1-D case, but could be better.

Updated update:

sixtenbe created a better version for the 1-D case.

This question is related to python scipy fft hough-transform

The answer is


Detecting peaks in a spectrum in a reliable way has been studied quite a bit, for example all the work on sinusoidal modelling for music/audio signals in the 80ies. Look for "Sinusoidal Modeling" in the literature.

If your signals are as clean as the example, a simple "give me something with an amplitude higher than N neighbours" should work reasonably well. If you have noisy signals, a simple but effective way is to look at your peaks in time, to track them: you then detect spectral lines instead of spectral peaks. IOW, you compute the FFT on a sliding window of your signal, to get a set of spectrum in time (also called spectrogram). You then look at the evolution of the spectral peak in time (i.e. in consecutive windows).


There are standard statistical functions and methods for finding outliers to data, which is probably what you need in the first case. Using derivatives would solve your second. I'm not sure for a method which solves both continuous functions and sampled data, however.


For those not sure about which peak-finding algorithms to use in Python, here a rapid overview of the alternatives: https://github.com/MonsieurV/py-findpeaks

Wanting myself an equivalent to the MatLab findpeaks function, I've found that the detect_peaks function from Marcos Duarte is a good catch.

Pretty easy to use:

import numpy as np
from vector import vector, plot_peaks
from libs import detect_peaks
print('Detect peaks with minimum height and distance filters.')
indexes = detect_peaks.detect_peaks(vector, mph=7, mpd=2)
print('Peaks are: %s' % (indexes))

Which will give you:

detect_peaks results


I do not think that what you are looking for is provided by SciPy. I would write the code myself, in this situation.

The spline interpolation and smoothing from scipy.interpolate are quite nice and might be quite helpful in fitting peaks and then finding the location of their maximum.


I'm looking at a similar problem, and I've found some of the best references come from chemistry (from peaks finding in mass-spec data). For a good thorough review of peaking finding algorithms read this. This is one of the best clearest reviews of peak finding techniques that I've run across. (Wavelets are the best for finding peaks of this sort in noisy data.).

It looks like your peaks are clearly defined and aren't hidden in the noise. That being the case I'd recommend using smooth savtizky-golay derivatives to find the peaks (If you just differentiate the data above you'll have a mess of false positives.). This is a very effective technique and is pretty easy to implemented (you do need a matrix class w/ basic operations). If you simply find the zero crossing of the first S-G derivative I think you'll be happy.


First things first, the definition of "peak" is vague if without further specifications. For example, for the following series, would you call 5-4-5 one peak or two?

1-2-1-2-1-1-5-4-5-1-1-5-1

In this case, you'll need at least two thresholds: 1) a high threshold only above which can an extreme value register as a peak; and 2) a low threshold so that extreme values separated by small values below it will become two peaks.

Peak detection is a well-studied topic in Extreme Value Theory literature, also known as "declustering of extreme values". Its typical applications include identifying hazard events based on continuous readings of environmental variables e.g. analysing wind speed to detect storm events.


There is a function in scipy named scipy.signal.find_peaks_cwt which sounds like is suitable for your needs, however I don't have experience with it so I cannot recommend..

http://docs.scipy.org/doc/scipy/reference/generated/scipy.signal.find_peaks_cwt.html


To detect both positive and negative peaks, PeakDetect is helpful.

from peakdetect import peakdetect

peaks = peakdetect(data, lookahead=20) 
# Lookahead is the distance to look ahead from a peak to determine if it is the actual peak. 
# Change lookahead as necessary 
higherPeaks = np.array(peaks[0])
lowerPeaks = np.array(peaks[1])
plt.plot(data)
plt.plot(higherPeaks[:,0], higherPeaks[:,1], 'ro')
plt.plot(lowerPeaks[:,0], lowerPeaks[:,1], 'ko')

PeakDetection


Examples related to python

programming a servo thru a barometer Is there a way to view two blocks of code from the same file simultaneously in Sublime Text? python variable NameError Why my regexp for hyphenated words doesn't work? Comparing a variable with a string python not working when redirecting from bash script is it possible to add colors to python output? Get Public URL for File - Google Cloud Storage - App Engine (Python) Real time face detection OpenCV, Python xlrd.biffh.XLRDError: Excel xlsx file; not supported Could not load dynamic library 'cudart64_101.dll' on tensorflow CPU-only installation

Examples related to scipy

Reading images in python Numpy Resize/Rescale Image How to get the indices list of all NaN value in numpy array? ImportError: cannot import name NUMPY_MKL numpy.where() detailed, step-by-step explanation / examples Scikit-learn train_test_split with indices Matplotlib: Specify format of floats for tick labels Installing NumPy and SciPy on 64-bit Windows (with Pip) Can't install Scipy through pip Plotting a fast Fourier transform in Python

Examples related to fft

only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices Plotting a fast Fourier transform in Python How to plot a 2D FFT in Matlab? Understanding Matlab FFT example FFT in a single C-file How do I obtain the frequencies of each value in an FFT? Reliable and fast FFT in Java Peak-finding algorithm for Python/SciPy An implementation of the fast Fourier transform (FFT) in C#

Examples related to hough-transform

Peak-finding algorithm for Python/SciPy