Numpy first occurrence of value greater than existing value

Question

I have a 1D array in numpy and I want to find the index at which a given value is first exceeded. E.g.

    aa = range(-10, 10)

Find the position in aa where the value 5 gets exceeded.

User · Answer

I would go with

    i = np.min(np.where(V >= x))

where V is the vector (1D array), x is the value, and i is the resulting index. Use > instead of >= if the value has to be strictly exceeded.
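A minimal sketch of this approach on the array from the question, assuming V and x as named in the answer. It shows how >= and > differ, and what happens when no element qualifies.

```python
import numpy as np

V = np.arange(-10, 10)   # the example array from the question
x = 5

# np.where returns a tuple of index arrays; np.min reduces over all of them.
i_at_least = int(np.min(np.where(V >= x)))  # first index with V[i] >= x
i_exceeds = int(np.min(np.where(V > x)))    # first index with V[i] > x

# If no element qualifies, np.where yields an empty array and np.min raises.
try:
    np.min(np.where(V > 100))
    no_match_raises = False
except ValueError:
    no_match_raises = True

print(i_at_least, i_exceeds, no_match_raises)
```

Because the no-match case raises, callers that cannot rule it out probably want a try/except around the call.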

User · Answer

    In [34]: a = np.arange(-10, 10)

    In [35]: a
    Out[35]:
    array([-10,  -9,  -8,  -7,  -6,  -5,  -4,  -3,  -2,  -1,   0,   1,   2,
             3,   4,   5,   6,   7,   8,   9])

    In [36]: np.where(a > 5)
    Out[36]: (array([16, 17, 18, 19]),)

    In [37]: np.where(a > 5)[0][0]
    Out[37]: 16
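The double indexing in the session above follows from np.where returning a tuple with one index array per dimension. A short sketch of that, with np.flatnonzero shown as one possible alternative that returns the index array directly (the variable names are illustrative):

```python
import numpy as np

a = np.arange(-10, 10)

hits = np.where(a > 5)          # a 1-tuple: (array of matching indices,)
first_via_where = hits[0][0]    # [0] picks the index array, [0] its first entry

# np.flatnonzero returns the index array itself, so a single [0] is enough.
first_via_flatnonzero = np.flatnonzero(a > 5)[0]

print(first_via_where, first_via_flatnonzero)
```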

User · Answer

Given the sorted content of your array, there is an even faster method: searchsorted.

    import time
    import numpy as np

    N = 10000
    aa = np.arange(-N, N)
    %timeit np.searchsorted(aa, N/2) + 1
    %timeit np.argmax(aa > N/2)
    %timeit np.where(aa > N/2)[0][0]
    %timeit np.nonzero(aa > N/2)[0][0]

Output:

    100000 loops, best of 3: 5.97 µs per loop
    10000 loops, best of 3: 46.3 µs per loop
    10000 loops, best of 3: 154 µs per loop
    10000 loops, best of 3: 154 µs per loop
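The "+ 1" in the timing above gives the right index only when the searched value is itself present in the array. As a hedged sketch (not part of the original answer), the side="right" option of np.searchsorted appears to cover both cases for an ascending sorted array:

```python
import numpy as np

aa = np.arange(-10, 10)  # sorted ascending, as searchsorted requires

# side="right" returns the insertion point after any entries equal to the
# value, i.e. the index of the first element strictly greater than it.
idx_present = np.searchsorted(aa, 5, side="right")    # 5 is in aa
idx_absent = np.searchsorted(aa, 4.8, side="right")   # 4.8 is not in aa

# The default side="left" plus 1 overshoots when the value is absent.
idx_plus_one = np.searchsorted(aa, 4.8) + 1

# When nothing is greater, the result equals len(aa), which is not a valid index.
idx_none = np.searchsorted(aa, 100, side="right")

print(idx_present, idx_absent, idx_plus_one, idx_none)
```

A result equal to len(aa) still has to be checked by the caller before it is used as an index.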

User · Answer

I was also interested in this and I've compared all the suggested answers with perfplot. (Disclaimer: I'm the author of perfplot.)

If you know that the array you're looking through is already sorted, then

    numpy.searchsorted(a, alpha)

is for you. It's an O(log(n)) operation, i.e., the speed hardly depends on the size of the array. You can't get faster than that.

If you don't know anything about your array, you're not going wrong with

    numpy.argmax(a > alpha)

[Plot: already sorted]

[Plot: unsorted]

Code to reproduce the plot:

    import numpy
    import perfplot

    alpha = 0.5
    numpy.random.seed(0)


    def argmax(data):
        return numpy.argmax(data > alpha)


    def where(data):
        return numpy.where(data > alpha)[0][0]


    def nonzero(data):
        return numpy.nonzero(data > alpha)[0][0]


    def searchsorted(data):
        return numpy.searchsorted(data, alpha)


    perfplot.save(
        "out.png",
        # setup=numpy.random.rand,
        setup=lambda n: numpy.sort(numpy.random.rand(n)),
        kernels=[argmax, where, nonzero, searchsorted],
        n_range=[2 ** k for k in range(2, 23)],
        xlabel="len(array)",
    )

User · Answer

I'd like to propose

    np.min(np.append(np.where(aa > 5)[0], np.inf))

This will return the smallest index where the condition is met, while returning infinity if the condition is never met (and where returns an empty array).
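One consequence of appending np.inf is that the result is a float, so it cannot be used directly as an index. A minimal sketch of the idiom with a conversion step; both function names are hypothetical and chosen only for illustration:

```python
import numpy as np

aa = np.arange(-10, 10)

def first_index_or_inf(arr, threshold):
    """The np.append idiom from the answer: float index, or inf if no match."""
    return np.min(np.append(np.where(arr > threshold)[0], np.inf))

def as_index(result):
    """Hypothetical helper: turn the float result into an int index or None."""
    return int(result) if np.isfinite(result) else None

found = first_index_or_inf(aa, 5)      # a float, because np.inf forces float64
missing = first_index_or_inf(aa, 100)  # inf: nothing exceeds 100

print(found, missing, as_index(found), as_index(missing))
```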

User · Answer

Arrays that have a constant step between elements

In case of a range or any other linearly increasing array you can simply calculate the index programmatically; there is no need to iterate over the array at all:

    def first_index_calculate_range_like(val, arr):
        if len(arr) == 0:
            raise ValueError('no value greater than {}'.format(val))
        elif len(arr) == 1:
            if arr[0] > val:
                return 0
            else:
                raise ValueError('no value greater than {}'.format(val))

        first_value = arr[0]
        step = arr[1] - first_value
        # For linearly decreasing arrays or constant arrays we only need to check
        # the first element, because if that does not satisfy the condition
        # no other element will.
        if step <= 0:
            if first_value > val:
                return 0
            else:
                raise ValueError('no value greater than {}'.format(val))

        calculated_position = (val - first_value) / step

        if calculated_position < 0:
            return 0
        # >= rather than >: a position equal to the last index means val equals
        # the last element, so no element is greater.
        elif calculated_position >= len(arr) - 1:
            raise ValueError('no value greater than {}'.format(val))

        return int(calculated_position) + 1

One could probably improve that a bit. I have made sure it works correctly for a few sample arrays and values, but that doesn't mean there couldn't be mistakes in there, especially considering that it uses floats:

    >>> import numpy as np
    >>> first_index_calculate_range_like(5, np.arange(-10, 10))
    16
    >>> np.arange(-10, 10)[16]  # double check
    6

    >>> first_index_calculate_range_like(4.8, np.arange(-10, 10))
    15

Given that it can calculate the position without any iteration, it runs in constant time (O(1)) and can probably beat all other mentioned approaches. However, it requires a constant step in the array; otherwise it will produce wrong results.

General solution using numba

A more general approach would be using a numba function:

    @nb.njit
    def first_index_numba(val, arr):
        for idx in range(len(arr)):
            if arr[idx] > val:
                return idx
        return -1

That will work for any array, but it has to iterate over the array, so in the average case it will be O(n):

    >>> first_index_numba(4.8, np.arange(-10, 10))
    15
    >>> first_index_numba(5, np.arange(-10, 10))
    16

Benchmark

Even though Nico Schlömer already provided some benchmarks, I thought it might be useful to include my new solutions and to test for different "values".

The test setup:

    import numpy as np
    import math
    import numba as nb

    def first_index_using_argmax(val, arr):
        return np.argmax(arr > val)

    def first_index_using_where(val, arr):
        return np.where(arr > val)[0][0]

    def first_index_using_nonzero(val, arr):
        return np.nonzero(arr > val)[0][0]

    def first_index_using_searchsorted(val, arr):
        return np.searchsorted(arr, val) + 1

    def first_index_using_min(val, arr):
        return np.min(np.where(arr > val))

    def first_index_calculate_range_like(val, arr):
        if len(arr) == 0:
            raise ValueError('empty array')
        elif len(arr) == 1:
            if arr[0] > val:
                return 0
            else:
                raise ValueError('no value greater than {}'.format(val))

        first_value = arr[0]
        step = arr[1] - first_value
        if step <= 0:
            if first_value > val:
                return 0
            else:
                raise ValueError('no value greater than {}'.format(val))

        calculated_position = (val - first_value) / step

        if calculated_position < 0:
            return 0
        # >= rather than >: val equal to the last element has no greater element.
        elif calculated_position >= len(arr) - 1:
            raise ValueError('no value greater than {}'.format(val))

        return int(calculated_position) + 1

    @nb.njit
    def first_index_numba(val, arr):
        for idx in range(len(arr)):
            if arr[idx] > val:
                return idx
        return -1

    funcs = [
        first_index_using_argmax,
        first_index_using_min,
        first_index_using_nonzero,
        first_index_calculate_range_like,
        first_index_numba,
        first_index_using_searchsorted,
        first_index_using_where
    ]

    from simple_benchmark import benchmark, MultiArgument

and the plots were generated using:

    %matplotlib notebook
    b.plot()

item is at the beginning

    b = benchmark(
        funcs,
        {2**i: MultiArgument([0, np.arange(2**i)]) for i in range(2, 20)},
        argument_name="array size")

The numba function performs best, followed by the calculate function and the searchsorted function. The other solutions perform much worse.

item is at the end

    b = benchmark(
        funcs,
        {2**i: MultiArgument([2**i-2, np.arange(2**i)]) for i in range(2, 20)},
        argument_name="array size")

For small arrays the numba function performs amazingly fast; for bigger arrays, however, it is outperformed by the calculate function and the searchsorted function.

item is at sqrt(len)

    b = benchmark(
        funcs,
        {2**i: MultiArgument([np.sqrt(2**i), np.arange(2**i)]) for i in range(2, 20)},
        argument_name="array size")

This is more interesting. Again numba and the calculate function perform great; however, this is actually triggering the worst case of searchsorted, which really doesn't work well in this case.

Comparison of the functions when no value satisfies the condition

Another interesting point is how these functions behave if there is no value whose index should be returned:

    arr = np.ones(100)
    value = 2

    for func in funcs:
        print(func.__name__)
        try:
            print('-->', func(value, arr))
        except Exception as e:
            print('-->', e)

With this result:

    first_index_using_argmax
    --> 0
    first_index_using_min
    --> zero-size array to reduction operation minimum which has no identity
    first_index_using_nonzero
    --> index 0 is out of bounds for axis 0 with size 0
    first_index_calculate_range_like
    --> no value greater than 2
    first_index_numba
    --> -1
    first_index_using_searchsorted
    --> 101
    first_index_using_where
    --> index 0 is out of bounds for axis 0 with size 0

Searchsorted, argmax, and numba simply return a wrong value. However, searchsorted and numba at least return an index that is not a valid index for the array (101 and -1), whereas argmax returns 0, which cannot be told apart from a real match.

The functions where, min, nonzero and calculate throw an exception. However, only the exception for calculate actually says anything helpful.

That means one actually has to wrap these calls in an appropriate wrapper function that catches exceptions or invalid return values and handles them appropriately, at least if you aren't sure whether the value could be in the array.

Note: The calculate and searchsorted options only work under special conditions. The "calculate" function requires a constant step and searchsorted requires the array to be sorted. So these could be useful in the right circumstances but aren't general solutions for this problem. In case you're dealing with sorted Python lists, you might want to take a look at the bisect module instead of using NumPy's searchsorted.
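The wrapper mentioned above could look roughly like the following sketch. The name checked_first_index and the choice of returning None are assumptions made for illustration, not part of the original answer; two of the benchmark functions are repeated so the block runs on its own.

```python
import numpy as np

def checked_first_index(func, val, arr):
    """Hypothetical wrapper: return a verified index, or None if there is none.

    `func` is any function taking (val, arr), like those in the benchmark.
    """
    try:
        idx = int(func(val, arr))
    except (ValueError, IndexError):
        return None  # np.min on empty input, [0][0] on empty input, ...
    # Reject out-of-range results (-1, len(arr), ...) and false positives
    # such as argmax returning 0 when no element matches.
    if 0 <= idx < len(arr) and arr[idx] > val:
        return idx
    return None

def first_index_using_argmax(val, arr):
    return np.argmax(arr > val)

def first_index_using_where(val, arr):
    return np.where(arr > val)[0][0]

print(checked_first_index(first_index_using_argmax, 2, np.ones(100)))
print(checked_first_index(first_index_using_where, 5, np.arange(-10, 10)))
```

Validating the returned index against the array costs one extra element access, which keeps the wrapper cheap compared with the search itself.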

User · Answer

This is a little faster (and looks nicer):

    np.argmax(aa > 5)

Since argmax will stop at the first True ("In case of multiple occurrences of the maximum values, the indices corresponding to the first occurrence are returned.") and doesn't save another list.

    In [2]: N = 10000

    In [3]: aa = np.arange(-N, N)

    In [4]: timeit np.argmax(aa > N/2)
    100000 loops, best of 3: 52.3 us per loop

    In [5]: timeit np.where(aa > N/2)[0][0]
    10000 loops, best of 3: 141 us per loop

    In [6]: timeit np.nonzero(aa > N/2)[0][0]
    10000 loops, best of 3: 142 us per loop
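A short sketch of the argmax idiom on the question's array, including the case where no element satisfies the condition. The any() guard is an addition for illustration, not part of the original answer.

```python
import numpy as np

aa = np.arange(-10, 10)

mask = aa > 5
first = int(np.argmax(mask))   # index of the first True

# If no element is True, every entry ties at False and argmax reports 0,
# so check the mask with any() before trusting the result.
empty_mask = aa > 100
misleading = int(np.argmax(empty_mask))
has_match = bool(empty_mask.any())

print(first, misleading, has_match)
```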
