Dropping infinite values from DataFrames in pandas

Question

What is the quickest/simplest way to drop nan and inf/-inf values from a pandas DataFrame without resetting mode.use_inf_as_null? I'd like to be able to use the subset and how arguments of dropna, except with inf values considered missing, like:

    df.dropna(subset=["col1", "col2"], how="all", with_inf=True)

Is this possible? Is there a way to tell dropna to include inf in its definition of missing values?

User · Answer

Here is another method, using .loc to replace inf with nan on a Series:

    s.loc[(~np.isfinite(s)) & s.notnull()] = np.nan

So, in response to the original question:

    df = pd.DataFrame(np.ones((3, 3)), columns=list('ABC'))

    for i in range(3):
        df.iat[i, i] = np.inf

    df
              A         B         C
    0       inf  1.000000  1.000000
    1  1.000000       inf  1.000000
    2  1.000000  1.000000       inf

    df.sum()
    A    inf
    B    inf
    C    inf
    dtype: float64

    df.apply(lambda s: s[np.isfinite(s)].dropna()).sum()
    A    2
    B    2
    C    2
    dtype: float64

User · Answer

You can use pd.DataFrame.mask with np.isinf. You should first ensure that all the series in your DataFrame are of type float. Then use dropna with your existing logic.

    print(df)

           col1      col2
    0 -0.441406       inf
    1 -0.321105      -inf
    2 -0.412857  2.223047
    3 -0.356610  2.513048

    df = df.mask(np.isinf(df))

    print(df)

           col1      col2
    0 -0.441406       NaN
    1 -0.321105       NaN
    2 -0.412857  2.223047
    3 -0.356610  2.513048
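As a minimal end-to-end sketch of this approach, the masked frame can be passed straight to dropna with subset and how. The column names follow the question; the values are made up for illustration.

```python
import numpy as np
import pandas as pd

# Illustrative data: column names follow the question, values are made up.
df = pd.DataFrame({
    "col1": [1.0, np.inf, np.nan, 4.0],
    "col2": [np.inf, -np.inf, 2.0, 5.0],
})

# mask() sets every cell where the condition is True to NaN.
masked = df.mask(np.isinf(df))

# Row 1 is dropped because both col1 and col2 are now missing there.
result = masked.dropna(subset=["col1", "col2"], how="all")
print(result)
```

Note that this masks inf in every column of the frame, not only in the columns named in subset.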

User · Answer

With option_context, this is possible without permanently setting use_inf_as_na. For example:

    with pd.option_context('mode.use_inf_as_na', True):
        df = df.dropna(subset=['col1', 'col2'], how='all')

Of course, it can be set to treat inf as NaN permanently with:

    pd.set_option('use_inf_as_na', True)

For older versions, replace use_inf_as_na with use_inf_as_null. Note that pandas 2.1 deprecates this option in favor of converting inf to NaN explicitly beforehand.

User · Answer

Replacing inf across the whole DataFrame will also modify the infs that are not in the target columns. To remedy that, restrict the replacement to those columns:

    lst = [np.inf, -np.inf]
    to_replace = {v: lst for v in ['col1', 'col2']}
    df.replace(to_replace, np.nan)
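A small sketch of the same idea, written with plain column selection instead of the dict form. The "other" column and all values are made up here to show that a non-target column keeps its inf.

```python
import numpy as np
import pandas as pd

# Illustrative frame; "other" stands in for any non-target column.
df = pd.DataFrame({
    "col1": [np.inf, 1.0, 2.0],
    "col2": [-np.inf, np.inf, 3.0],
    "other": [np.inf, 0.0, 0.0],
})

cols = ["col1", "col2"]
cleaned = df.copy()
# Replace inf only in the target columns; "other" is left as it was.
cleaned[cols] = cleaned[cols].replace([np.inf, -np.inf], np.nan)

# Row 0 is dropped because both target columns are missing there.
result = cleaned.dropna(subset=cols, how="all")
print(result)
```

Working on a copy keeps the original df intact, which may or may not matter depending on the use case.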

User · Answer

The simplest way would be to first replace infs with NaN:

    df.replace([np.inf, -np.inf], np.nan)

and then use dropna:

    df.replace([np.inf, -np.inf], np.nan).dropna(subset=["col1", "col2"], how="all")

For example:

    In [11]: df = pd.DataFrame([1, 2, np.inf, -np.inf])

    In [12]: df.replace([np.inf, -np.inf], np.nan)
    Out[12]:
        0
    0   1
    1   2
    2 NaN
    3 NaN

The same method would work for a Series.

User · Answer

Yet another solution would be to use the isin method. Use it to determine whether each value is infinite or missing, and then chain the all method to determine whether all the values in each row are infinite or missing.

Finally, use the negation of that result to select, via boolean indexing, the rows that don't have all infinite or missing values.

    all_inf_or_nan = df.isin([np.inf, -np.inf, np.nan]).all(axis='columns')
    df[~all_inf_or_nan]
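To mimic the subset argument from the question, the same check can be limited to the target columns. This is only a sketch under that assumption; the frame below, including col3, is made up for illustration.

```python
import numpy as np
import pandas as pd

# Illustrative data; col3 is an extra column that the check ignores.
df = pd.DataFrame({
    "col1": [np.inf, 1.0, np.nan],
    "col2": [-np.inf, np.inf, 3.0],
    "col3": [7.0, 8.0, 9.0],
})

subset = ["col1", "col2"]
# True for rows where every target column is inf, -inf, or NaN.
all_inf_or_nan = df[subset].isin([np.inf, -np.inf, np.nan]).all(axis="columns")

# Keep the remaining rows; surviving values are left untouched.
result = df[~all_inf_or_nan]
print(result)
```

Because nothing is replaced, an inf that survives in a kept row (col2 in row 1 here) stays an inf.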

User · Answer

Use (fast and simple):

    df = df[np.isfinite(df).all(axis=1)]

This answer is based on DougR's answer in another question. Here is some example code:

    import pandas as pd
    import numpy as np

    df = pd.DataFrame([1, 2, 3, np.nan, 4, np.inf, 5, -np.inf, 6])
    print('Input:\n', df, sep='')
    df = df[np.isfinite(df).all(axis=1)]
    print('\nDropped:\n', df, sep='')

Result:

    Input:
            0
    0  1.0000
    1  2.0000
    2  3.0000
    3     NaN
    4  4.0000
    5     inf
    6  5.0000
    7    -inf
    8  6.0000

    Dropped:
         0
    0  1.0
    1  2.0
    2  3.0
    4  4.0
    6  5.0
    8  6.0
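The all(axis=1) filter drops a row if any of its values is non-finite, which corresponds to how="any". A hedged sketch of how the question's how="all" behavior might be approximated with the same np.isfinite check, on made-up data:

```python
import numpy as np
import pandas as pd

# Illustrative data; column names follow the question, values are made up.
df = pd.DataFrame({
    "col1": [1.0, np.inf, np.nan, 4.0],
    "col2": [2.0, -np.inf, 3.0, np.inf],
})

# True where a value is neither NaN nor +/-inf.
finite = np.isfinite(df[["col1", "col2"]])

# Like how="any": keep rows where every target value is finite.
strict = df[finite.all(axis=1)]

# Like how="all": keep rows where at least one target value is finite.
lenient = df[finite.any(axis=1)]

print(strict)
print(lenient)
```

As with the original one-liner, this assumes the selected columns are numeric, since np.isfinite is not defined for object or string data.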
