What's the fastest way of checking if a point is inside a polygon in python

Question

I found two main methods to check whether a point lies inside a polygon. One is the ray tracing method used here, which is the most recommended answer; the other is matplotlib's path.contains_points, which seems a bit obscure to me. I will have to check lots of points continuously. Does anybody know whether one of these two is preferable to the other, or whether there are even better third options?

UPDATE:

I checked the two methods and matplotlib looks much faster.

from time import time
import numpy as np
import matplotlib.path as mpltPath

# regular polygon for testing
lenpoly = 100
polygon = [[np.sin(x)+0.5,np.cos(x)+0.5] for x in np.linspace(0,2*np.pi,lenpoly)[:-1]]

# random set of points to test
N = 10000
points = np.random.rand(N,2)


# Ray tracing
def ray_tracing_method(x,y,poly):

    n = len(poly)
    inside = False

    p1x,p1y = poly[0]
    for i in range(n+1):
        p2x,p2y = poly[i % n]
        if y > min(p1y,p2y):
            if y <= max(p1y,p2y):
                if x <= max(p1x,p2x):
                    if p1y != p2y:
                        xints = (y-p1y)*(p2x-p1x)/(p2y-p1y)+p1x
                    if p1x == p2x or x <= xints:
                        inside = not inside
        p1x,p1y = p2x,p2y

    return inside

start_time = time()
inside1 = [ray_tracing_method(point[0], point[1], polygon) for point in points]
print("Ray Tracing Elapsed time: " + str(time()-start_time))

# Matplotlib mplPath
start_time = time()
path = mpltPath.Path(polygon)
inside2 = path.contains_points(points)
print("Matplotlib contains_points Elapsed time: " + str(time()-start_time))

which gives,

Ray Tracing Elapsed time: 0.441395998001
Matplotlib contains_points Elapsed time: 0.00994491577148

The same relative difference was obtained when using a triangle instead of the 100-sided polygon. I will also check shapely, since it looks like a package devoted to these kinds of problems.

User · Answer

If speed is what you need and extra dependencies are not a problem, you may find numba quite useful (it is now pretty easy to install on any platform). The classic ray_tracing approach you proposed can be easily ported to numba by using the numba @jit decorator and casting the polygon to a numpy array. The code should look like:

from numba import jit

@jit(nopython=True)
def ray_tracing(x,y,poly):
    n = len(poly)
    inside = False
    p2x = 0.0
    p2y = 0.0
    xints = 0.0
    p1x,p1y = poly[0]
    for i in range(n+1):
        p2x,p2y = poly[i % n]
        if y > min(p1y,p2y):
            if y <= max(p1y,p2y):
                if x <= max(p1x,p2x):
                    if p1y != p2y:
                        xints = (y-p1y)*(p2x-p1x)/(p2y-p1y)+p1x
                    if p1x == p2x or x <= xints:
                        inside = not inside
        p1x,p1y = p2x,p2y

    return inside

The first execution will take a little longer than any subsequent call:

%%time
polygon=np.array(polygon)
inside1 = [ray_tracing(point[0], point[1], polygon) for point in points]

CPU times: user 129 ms, sys: 4.08 ms, total: 133 ms
Wall time: 132 ms

After compilation, this decreases to:

CPU times: user 18.7 ms, sys: 320 µs, total: 19.1 ms
Wall time: 18.4 ms
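To separate the one-off compilation cost from the steady-state speed, it can help to time the first call on its own and then take the best of several later calls. The following is a minimal sketch using only the standard library. `first_and_steady` is a hypothetical helper name, and `fake_jitted` is a stand-in that merely simulates a slow first call; it does not invoke numba.

```python
import time

def first_and_steady(fn, *args, repeats=5):
    """Return (first_call_seconds, best_later_call_seconds) for fn(*args)."""
    t0 = time.perf_counter()
    fn(*args)                      # first call: would include any JIT compilation
    first = time.perf_counter() - t0

    best = float("inf")
    for _ in range(repeats):       # later calls: already-compiled code is reused
        t0 = time.perf_counter()
        fn(*args)
        best = min(best, time.perf_counter() - t0)
    return first, best


# Stand-in for a JIT-compiled function: it sleeps on the first call only.
_state = {"warm": False}

def fake_jitted(x):
    if not _state["warm"]:
        time.sleep(0.05)           # pretend this is compilation time
        _state["warm"] = True
    return x * 2

first, steady = first_and_steady(fake_jitted, 21)
print(f"first call: {first:.4f} s, steady state: {steady:.6f} s")
```

Passing the real jitted function and its arguments in place of `fake_jitted` should give the same kind of split as the two timings shown above.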

If you need speed at the first call of the function you can then pre-compile the code in a module using pycc. Store the function in a src.py like:

from numba import jit
from numba.pycc import CC
cc = CC('nbspatial')


@cc.export('ray_tracing',  'b1(f8, f8, f8[:,:])')
@jit(nopython=True)
def ray_tracing(x,y,poly):
    n = len(poly)
    inside = False
    p2x = 0.0
    p2y = 0.0
    xints = 0.0
    p1x,p1y = poly[0]
    for i in range(n+1):
        p2x,p2y = poly[i % n]
        if y > min(p1y,p2y):
            if y <= max(p1y,p2y):
                if x <= max(p1x,p2x):
                    if p1y != p2y:
                        xints = (y-p1y)*(p2x-p1x)/(p2y-p1y)+p1x
                    if p1x == p2x or x <= xints:
                        inside = not inside
        p1x,p1y = p2x,p2y

    return inside


if __name__ == "__main__":
    cc.compile()

Build it with python src.py and run:

import nbspatial

import numpy as np
lenpoly = 100
polygon = [[np.sin(x)+0.5,np.cos(x)+0.5] for x in 
np.linspace(0,2*np.pi,lenpoly)[:-1]]

# random points set of points to test 
N = 10000
# making a list instead of a generator to help debug
points = list(zip(np.random.random(N),np.random.random(N)))

polygon = np.array(polygon)

%%time
result = [nbspatial.ray_tracing(point[0], point[1], polygon) for point in points]

CPU times: user 20.7 ms, sys: 64 µs, total: 20.8 ms
Wall time: 19.9 ms

In the numba code I used: 'b1(f8, f8, f8[:,:])'

To compile with nopython=True, each variable needs to be declared (given an initial value) before the for loop.

In the pre-built source code, the line:

@cc.export('ray_tracing' , 'b1(f8, f8, f8[:,:])')

is used to declare the function name and its input/output types: a boolean output b1, and as input two floats f8 and a two-dimensional array of floats f8[:,:].
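Because a function compiled against a fixed signature may reject inputs of other types, it can be useful to check the arguments before the call. The following is a minimal sketch, assuming NumPy is available. `matches_signature` is a hypothetical helper and not part of numba; it only mirrors the 'b1(f8, f8, f8[:,:])' signature described above, where f8 corresponds to NumPy's float64.

```python
import numpy as np

def matches_signature(x, y, poly):
    """Check inputs against 'b1(f8, f8, f8[:,:])': two floats and a 2-D float64 array."""
    scalars_ok = isinstance(x, float) and isinstance(y, float)
    array_ok = (
        isinstance(poly, np.ndarray)
        and poly.dtype == np.dtype("f8")   # f8 -> 64-bit float
        and poly.ndim == 2                 # [:,:] -> two dimensions
    )
    return scalars_ok and array_ok

square = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, 0.0]])
print(matches_signature(0.5, 0.5, square))            # 2-D float64 array
print(matches_signature(0.5, 0.5, square.tolist()))   # plain list of lists
```

A list of lists fails the check, which is why the examples convert the polygon with np.array before calling the compiled function.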

Edit Jan/4/2021

For my use case, I need to check if multiple points are inside a single polygon - In such a context, it is useful to take advantage of numba parallel capabilities to loop over a series of points. The example above can be changed to:

from numba import jit, njit
import numba
import numpy as np 

@jit(nopython=True)
def pointinpolygon(x,y,poly):
    n = len(poly)
    inside = False
    p2x = 0.0
    p2y = 0.0
    xints = 0.0
    p1x,p1y = poly[0]
    for i in range(n+1):  # plain range: each iteration depends on the previous vertex
        p2x,p2y = poly[i % n]
        if y > min(p1y,p2y):
            if y <= max(p1y,p2y):
                if x <= max(p1x,p2x):
                    if p1y != p2y:
                        xints = (y-p1y)*(p2x-p1x)/(p2y-p1y)+p1x
                    if p1x == p2x or x <= xints:
                        inside = not inside
        p1x,p1y = p2x,p2y

    return inside


@njit(parallel=True)
def parallelpointinpolygon(points, polygon):
    D = np.empty(len(points), dtype=numba.boolean) 
    for i in numba.prange(0, len(D)):
        D[i] = pointinpolygon(points[i,0], points[i,1], polygon)
    return D

Note: pre-compiling the above code will not enable the parallel capabilities of numba (parallel CPU target is not supported by pycc/AOT compilation) see: https://github.com/numba/numba/issues/3336

Test:


import numpy as np
lenpoly = 100
polygon = [[np.sin(x)+0.5,np.cos(x)+0.5] for x in np.linspace(0,2*np.pi,lenpoly)[:-1]]
polygon = np.array(polygon)
N = 10000
points = np.random.uniform(-1.5, 1.5, size=(N, 2))

For N=10000 on a 72 core machine, returns:

%%timeit
parallelpointinpolygon(points, polygon)
# 480 µs ± 8.19 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
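Timings alone do not show that an implementation is correct. Since the test polygon is a regular polygon inscribed in a unit circle centred at (0.5, 0.5), the expected answer is known analytically for every point that is not very close to the boundary. The following is a minimal pure-Python sketch of such a sanity check. It re-implements the same ray tracing logic without numba, and the names `inradius`, `checked` and `mismatches` are illustrative only.

```python
import math
import random

def ray_tracing(x, y, poly):
    """Pure-Python even-odd ray casting, same logic as the code above."""
    n = len(poly)
    inside = False
    xints = 0.0
    p1x, p1y = poly[0]
    for i in range(n + 1):
        p2x, p2y = poly[i % n]
        if min(p1y, p2y) < y <= max(p1y, p2y) and x <= max(p1x, p2x):
            if p1y != p2y:
                xints = (y - p1y) * (p2x - p1x) / (p2y - p1y) + p1x
            if p1x == p2x or x <= xints:
                inside = not inside
        p1x, p1y = p2x, p2y
    return inside

# Regular polygon on a unit circle centred at (0.5, 0.5), as in the test above.
n_vertices = 99
polygon = [(math.sin(2 * math.pi * k / n_vertices) + 0.5,
            math.cos(2 * math.pi * k / n_vertices) + 0.5)
           for k in range(n_vertices)]
inradius = math.cos(math.pi / n_vertices)  # this circle lies fully inside the polygon
eps = 1e-9

random.seed(0)
checked = 0
mismatches = 0
for _ in range(2000):
    x, y = random.uniform(-1.5, 1.5), random.uniform(-1.5, 1.5)
    r = math.hypot(x - 0.5, y - 0.5)
    if inradius - eps <= r <= 1.0 + eps:
        continue                   # too close to the boundary for the analytic test
    checked += 1
    if ray_tracing(x, y, polygon) != (r < inradius):
        mismatches += 1
print(checked, mismatches)
```

The same loop can be pointed at any of the compiled variants by swapping the function that is called.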

Edit 17 Feb '21:

fixing loop to start from 0 instead of 1 (thanks @mehdi):

for i in numba.prange(0, len(D))

Edit 20 Feb '21:

Following up on the comparison made by @mehdi, I am adding a GPU-based method below. It uses the point_in_polygon method from the cuspatial library:

import numpy as np
import cudf
import cuspatial

N = 100000002
lenpoly = 1000
polygon = [[np.sin(x)+0.5,np.cos(x)+0.5] for x in np.linspace(0,2*np.pi,lenpoly)]
polygon = np.array(polygon)
points = np.random.uniform(-1.5, 1.5, size=(N, 2))


x_pnt = points[:,0]
y_pnt = points[:,1]
x_poly = polygon[:,0]
y_poly = polygon[:,1]
result = cuspatial.point_in_polygon(
    x_pnt,
    y_pnt,
    cudf.Series([0], index=['geom']),
    cudf.Series([0], name='r_pos', dtype='int32'),
    x_poly,
    y_poly,
)

Following @Mehdi's comparison, for N=100000002 and lenpoly=1000 I got the following results:

 time_parallelpointinpolygon:         161.54760098457336 
 time_mpltPath:                       307.1664695739746 
 time_ray_tracing_numpy_numba:        353.07356882095337 
 time_is_inside_sm_parallel:          37.45389246940613 
 time_is_inside_postgis_parallel:     127.13793849945068 
 time_is_inside_rapids:               4.246025562286377

hardware specs:

CPU Intel xeon E1240
GPU Nvidia GTX 1070

Notes:

The cuspatial.point_in_polygon method is quite robust and powerful; it offers the ability to work with multiple and complex polygons (I guess at the expense of performance).
The numba methods can also be 'ported' to the GPU. It will be interesting to see a comparison that includes a CUDA port of the fastest method mentioned by @Mehdi (is_inside_sm).

User · Answer

Comparison of different methods

I found other methods to check if a point is inside a polygon (here). I tested only two of them (is_inside_sm and is_inside_postgis), and the results were the same as for the other methods.

Thanks to @epifanio, I parallelized the codes and compared them with the methods of @epifanio and @user3274748 (ray_tracing_numpy). Note that both methods had a bug, so I fixed them as shown in their codes below.

One more thing I found is that the code provided for creating a polygon does not generate a closed path: np.linspace(0,2*np.pi,lenpoly)[:-1]. As a result, the codes provided in the above GitHub repository may not work properly. So it's better to create a closed path (first and last points should be the same).

Codes

Method 1: parallelpointinpolygon

from numba import jit, njit
import numba
import numpy as np

@jit(nopython=True)
def pointinpolygon(x,y,poly):
    n = len(poly)
    inside = False
    p2x = 0.0
    p2y = 0.0
    xints = 0.0
    p1x,p1y = poly[0]
    for i in range(n+1):  # plain range: each iteration depends on the previous vertex
        p2x,p2y = poly[i % n]
        if y > min(p1y,p2y):
            if y <= max(p1y,p2y):
                if x <= max(p1x,p2x):
                    if p1y != p2y:
                        xints = (y-p1y)*(p2x-p1x)/(p2y-p1y)+p1x
                    if p1x == p2x or x <= xints:
                        inside = not inside
        p1x,p1y = p2x,p2y

    return inside


@njit(parallel=True)
def parallelpointinpolygon(points, polygon):
    D = np.empty(len(points), dtype=numba.boolean)
    for i in numba.prange(0, len(D)):   # <-- Fixed here, must start from zero
        D[i] = pointinpolygon(points[i,0], points[i,1], polygon)
    return D

Method 2: ray_tracing_numpy_numba

@jit(nopython=True)
def ray_tracing_numpy_numba(points,poly):
    x,y = points[:,0], points[:,1]
    n = len(poly)
    inside = np.zeros(len(x),np.bool_)
    p2x = 0.0
    p2y = 0.0
    p1x,p1y = poly[0]
    for i in range(n+1):
        p2x,p2y = poly[i % n]
        idx = np.nonzero((y > min(p1y,p2y)) & (y <= max(p1y,p2y)) & (x <= max(p1x,p2x)))[0]
        if len(idx):    # <-- Fixed here. If idx is null skip comparisons below.
            if p1y != p2y:
                xints = (y[idx]-p1y)*(p2x-p1x)/(p2y-p1y)+p1x
            if p1x == p2x:
                inside[idx] = ~inside[idx]
            else:
                idxx = idx[x[idx] <= xints]
                inside[idxx] = ~inside[idxx]

        p1x,p1y = p2x,p2y
    return inside

Method 3: Matplotlib contains_points

path = mpltPath.Path(polygon,closed=True)  # <-- Very important to mention that the path
                                           #     is closed (default is false)

Method 4: is_inside_sm (got it from here)

@jit(nopython=True)
def is_inside_sm(polygon, point):
    length = len(polygon)-1
    dy2 = point[1] - polygon[0][1]
    intersections = 0
    ii = 0
    jj = 1

    while ii<length:
        dy  = dy2
        dy2 = point[1] - polygon[jj][1]

        # consider only lines which are not completely above/below/right from the point
        if dy*dy2 <= 0.0 and (point[0] >= polygon[ii][0] or point[0] >= polygon[jj][0]):

            # non-horizontal line
            if dy<0 or dy2<0:
                F = dy*(polygon[jj][0] - polygon[ii][0])/(dy-dy2) + polygon[ii][0]

                if point[0] > F: # if line is left from the point - the ray moving towards left, will intersect it
                    intersections += 1
                elif point[0] == F: # point on line
                    return 2

            # point on upper peak (dy2=dx2=0) or horizontal line (dy=dy2=0 and dx*dx2<=0)
            elif dy2==0 and (point[0]==polygon[jj][0] or (dy==0 and (point[0]-polygon[ii][0])*(point[0]-polygon[jj][0])<=0)):
                return 2

        ii = jj
        jj += 1

    #print 'intersections =', intersections
    return intersections & 1


@njit(parallel=True)
def is_inside_sm_parallel(points, polygon):
    ln = len(points)
    D = np.empty(ln, dtype=numba.boolean)
    for i in numba.prange(ln):
        D[i] = is_inside_sm(polygon,points[i])
    return D

Method 5: is_inside_postgis (got it from here)

@jit(nopython=True)
def is_inside_postgis(polygon, point):
    length = len(polygon)
    intersections = 0

    dx2 = point[0] - polygon[0][0]
    dy2 = point[1] - polygon[0][1]
    ii = 0
    jj = 1

    while jj<length:
        dx  = dx2
        dy  = dy2
        dx2 = point[0] - polygon[jj][0]
        dy2 = point[1] - polygon[jj][1]

        F = (dx-dx2)*dy - dx*(dy-dy2)
        if 0.0==F and dx*dx2<=0 and dy*dy2<=0:
            return 2

        if (dy>=0 and dy2<0) or (dy2>=0 and dy<0):
            if F > 0:
                intersections += 1
            elif F < 0:
                intersections -= 1

        ii = jj
        jj += 1

    #print 'intersections =', intersections
    return intersections != 0


@njit(parallel=True)
def is_inside_postgis_parallel(points, polygon):
    ln = len(points)
    D = np.empty(ln, dtype=numba.boolean)
    for i in numba.prange(ln):
        D[i] = is_inside_postgis(polygon,points[i])
    return D

Benchmark

Timing for 10 million points:

parallelpointinpolygon Elapsed time:      4.0122294425964355
Matplotlib contains_points Elapsed time: 14.117807388305664
ray_tracing_numpy_numba Elapsed time:     7.908452272415161
sm_parallel Elapsed time:                 0.7710440158843994
is_inside_postgis_parallel Elapsed time:  2.131121873855591

Here is the code.

import matplotlib.pyplot as plt
import matplotlib.path as mpltPath
from time import time
import numpy as np

np.random.seed(2)

time_parallelpointinpolygon=[]
time_mpltPath=[]
time_ray_tracing_numpy_numba=[]
time_is_inside_sm_parallel=[]
time_is_inside_postgis_parallel=[]
n_points=[]

for i in range(1, 10000002, 1000000):
    n_points.append(i)

    lenpoly = 100
    polygon = [[np.sin(x)+0.5,np.cos(x)+0.5] for x in np.linspace(0,2*np.pi,lenpoly)]
    polygon = np.array(polygon)
    N = i
    points = np.random.uniform(-1.5, 1.5, size=(N, 2))

    # Method 1
    start_time = time()
    inside1=parallelpointinpolygon(points, polygon)
    time_parallelpointinpolygon.append(time()-start_time)

    # Method 2
    start_time = time()
    path = mpltPath.Path(polygon,closed=True)
    inside2 = path.contains_points(points)
    time_mpltPath.append(time()-start_time)

    # Method 3
    start_time = time()
    inside3=ray_tracing_numpy_numba(points,polygon)
    time_ray_tracing_numpy_numba.append(time()-start_time)

    # Method 4
    start_time = time()
    inside4=is_inside_sm_parallel(points,polygon)
    time_is_inside_sm_parallel.append(time()-start_time)

    # Method 5
    start_time = time()
    inside5=is_inside_postgis_parallel(points,polygon)
    time_is_inside_postgis_parallel.append(time()-start_time)


plt.plot(n_points,time_parallelpointinpolygon,label='parallelpointinpolygon')
plt.plot(n_points,time_mpltPath,label='mpltPath')
plt.plot(n_points,time_ray_tracing_numpy_numba,label='ray_tracing_numpy_numba')
plt.plot(n_points,time_is_inside_sm_parallel,label='is_inside_sm_parallel')
plt.plot(n_points,time_is_inside_postgis_parallel,label='is_inside_postgis_parallel')
plt.xlabel("N points")
plt.ylabel("time (sec)")
plt.legend(loc = 'best')
plt.show()

CONCLUSION

The fastest algorithms are:

1- is_inside_sm_parallel
2- is_inside_postgis_parallel
3- parallelpointinpolygon (@epifanio)
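The advice about closed paths can be applied with a small helper that repeats the first vertex at the end when it is missing. The following is a minimal sketch, assuming NumPy; `close_polygon` is a hypothetical name. It compares with np.allclose rather than exact equality because a polygon built from np.linspace(0, 2*np.pi, lenpoly) ends at sin(2π), which differs from sin(0) by rounding error.

```python
import numpy as np

def close_polygon(polygon):
    """Return the vertices as an (n, 2) float array whose last row repeats the first."""
    poly = np.asarray(polygon, dtype=float)
    if not np.allclose(poly[0], poly[-1]):
        poly = np.vstack([poly, poly[:1]])   # append a copy of the first vertex
    return poly

open_square = [(0, 0), (0, 1), (1, 1), (1, 0)]
closed = close_polygon(open_square)
print(closed.shape)                           # (5, 2)
print(np.array_equal(closed[0], closed[-1]))  # True
```

Calling the helper on a path that is already closed leaves it unchanged, so it is safe to apply before any of the methods above.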

User · Answer

You can consider shapely:

from shapely.geometry import Point
from shapely.geometry.polygon import Polygon

point = Point(0.5, 0.5)
polygon = Polygon([(0, 0), (0, 1), (1, 1), (1, 0)])
print(polygon.contains(point))

Of the methods you've mentioned I've only used the second, path.contains_points, and it works fine. In any case, depending on the precision you need for your test, I would suggest creating a numpy bool grid with all nodes inside the polygon set to True (False if not). If you are going to test a lot of points this might be faster (although note that this relies on testing within a "pixel" tolerance):

from matplotlib import path
import matplotlib.pyplot as plt
import numpy as np

first = -3
size  = (3-first)/100
xv,yv = np.meshgrid(np.linspace(-3,3,100),np.linspace(-3,3,100))
p = path.Path([(0,0), (0, 1), (1, 1), (1, 0)])  # square with legs length 1 and bottom left corner at the origin
flags = p.contains_points(np.hstack((xv.flatten()[:,np.newaxis],yv.flatten()[:,np.newaxis])))
grid = np.zeros((101,101),dtype='bool')
grid[((xv.flatten()-first)/size).astype('int'),((yv.flatten()-first)/size).astype('int')] = flags

xi,yi = np.random.randint(-300,300,100)/100,np.random.randint(-300,300,100)/100
vflag = grid[((xi-first)/size).astype('int'),((yi-first)/size).astype('int')]
plt.imshow(grid.T,origin='lower',interpolation='nearest',cmap='binary')
plt.scatter(((xi-first)/size).astype('int'),((yi-first)/size).astype('int'),c=vflag,cmap='Greens',s=90)
plt.show()

The result is a plot of the grid, with the test points colored by whether they fall inside the square.

User · Answer

I will just leave it here. I rewrote the code above using numpy; maybe somebody finds it useful:

def ray_tracing_numpy(x,y,poly):
    n = len(poly)
    inside = np.zeros(len(x),np.bool_)
    p2x = 0.0
    p2y = 0.0
    xints = 0.0
    p1x,p1y = poly[0]
    for i in range(n+1):
        p2x,p2y = poly[i % n]
        idx = np.nonzero((y > min(p1y,p2y)) & (y <= max(p1y,p2y)) & (x <= max(p1x,p2x)))[0]
        if p1y != p2y:
            xints = (y[idx]-p1y)*(p2x-p1x)/(p2y-p1y)+p1x
        if p1x == p2x:
            inside[idx] = ~inside[idx]
        else:
            idxx = idx[x[idx] <= xints]
            inside[idxx] = ~inside[idxx]

        p1x,p1y = p2x,p2y
    return inside

Wrapped ray_tracing into:

def ray_tracing_mult(x,y,poly):
    return [ray_tracing(xi, yi, poly[:-1,:]) for xi,yi in zip(x,y)]

Tested on 100000 points, results:

ray_tracing_mult 0:00:00.850656
ray_tracing_numpy 0:00:00.003769
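For comparison, the same even-odd rule can be written more compactly by looping over the edges and updating all points at once. The following is a minimal sketch, assuming NumPy. It is a different formulation (the common crossing-number test), not the code of this answer, and `points_in_polygon` is a hypothetical name.

```python
import numpy as np

def points_in_polygon(points, poly):
    """Vectorized crossing-number test: one pass over the edges, all points at once."""
    points = np.asarray(points, dtype=float)
    poly = np.asarray(poly, dtype=float)
    x, y = points[:, 0], points[:, 1]
    inside = np.zeros(len(points), dtype=bool)
    x1, y1 = poly[-1]
    for x2, y2 in poly:
        # True where the edge crosses the horizontal line through the point
        straddles = (y1 > y) != (y2 > y)
        # x coordinate of that crossing; horizontal edges never straddle,
        # so the division-by-zero values they produce are masked out below
        with np.errstate(divide="ignore", invalid="ignore"):
            x_cross = (x2 - x1) * (y - y1) / (y2 - y1) + x1
        inside ^= straddles & (x < x_cross)
        x1, y1 = x2, y2
    return inside

square = [(0, 0), (0, 1), (1, 1), (1, 0)]
test_points = [(0.5, 0.5), (1.5, 0.5), (-0.5, 0.5), (0.25, 0.75)]
print(points_in_polygon(test_points, square))  # [ True False False  True]
```

Points that lie exactly on an edge are not treated specially here, so their result depends on which side of the half-open comparison they fall.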

User · Answer

Your test is good, but it measures only one specific situation: we have one polygon with many vertices, and a long array of points to check against that polygon.

Moreover, I suppose that you're measuring not matplotlib-inside-polygon-method vs ray-method, but matplotlib-somehow-optimized-iteration vs simple-list-iteration.

Let's make N independent comparisons (N pairs of point and polygon):

# ... your code...
lenpoly = 100
polygon = [[np.sin(x)+0.5,np.cos(x)+0.5] for x in np.linspace(0,2*np.pi,lenpoly)[:-1]]

M = 10000
start_time = time()
# Ray tracing
for i in range(M):
    x,y = np.random.random(), np.random.random()
    inside1 = ray_tracing_method(x,y, polygon)
print("Ray Tracing Elapsed time: " + str(time()-start_time))

# Matplotlib mplPath
start_time = time()
for i in range(M):
    x,y = np.random.random(), np.random.random()
    inside2 = path.contains_points([[x,y]])
print("Matplotlib contains_points Elapsed time: " + str(time()-start_time))

Result:

Ray Tracing Elapsed time: 0.548588991165
Matplotlib contains_points Elapsed time: 0.103765010834

Matplotlib is still much better, but not 100 times better. Now let's try a much simpler polygon:

lenpoly = 5
# ... same code

Result:

Ray Tracing Elapsed time: 0.0727779865265
Matplotlib contains_points Elapsed time: 0.105288982391
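The effect described here, where per-call overhead dominates once points are passed one at a time, can be reproduced without any polygon library. The following is a minimal sketch, assuming NumPy. `in_unit_square` is a hypothetical stand-in for a vectorized inside test; it is not matplotlib's contains_points.

```python
import time
import numpy as np

def in_unit_square(points):
    """Vectorized stand-in test: True where 0 <= x <= 1 and 0 <= y <= 1."""
    x, y = points[:, 0], points[:, 1]
    return (x >= 0) & (x <= 1) & (y >= 0) & (y <= 1)

rng = np.random.default_rng(0)
points = rng.uniform(-1.5, 1.5, size=(2000, 2))

t0 = time.perf_counter()
batched = in_unit_square(points)                 # one call for all points
t_batched = time.perf_counter() - t0

t0 = time.perf_counter()
one_by_one = np.array([in_unit_square(p[np.newaxis, :])[0] for p in points])
t_single = time.perf_counter() - t0              # one call per point

print(np.array_equal(batched, one_by_one))       # same answers either way
print(f"batched: {t_batched:.6f} s, one call per point: {t_single:.6f} s")
```

Both variants do the same arithmetic, so any difference between the two timings comes from the cost of making many small calls rather than from the test itself.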
