Convert pandas.Series from dtype object to float, and errors to nans

Question

Consider the following situation:

    In [2]: a = pd.Series([1,2,3,4,'.'])

    In [3]: a
    Out[3]:
    0    1
    1    2
    2    3
    3    4
    4    .
    dtype: object

    In [8]: a.astype('float64', raise_on_error = False)
    Out[8]:
    0    1
    1    2
    2    3
    3    4
    4    .
    dtype: object

I would have expected an option that allows conversion while turning erroneous values (such as that '.') to NaNs. Is there a way to achieve this?

User · Answer

    In [30]: pd.Series([1,2,3,4,'.']).convert_objects(convert_numeric=True)
    Out[30]:
    0     1
    1     2
    2     3
    3     4
    4   NaN
    dtype: float64
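Note that convert_objects was later deprecated and is no longer available in pandas 1.0 and newer. As a minimal sketch (assuming a pandas version that provides pd.to_numeric), the same conversion on the Series from the question might look like this:

```python
import numpy as np
import pandas as pd

# Same data as in the question: integers plus one stray '.' string.
a = pd.Series([1, 2, 3, 4, '.'])

# errors='coerce' replaces anything that cannot be parsed with NaN,
# so the result can be stored as a float column.
converted = pd.to_numeric(a, errors='coerce')
print(converted)
```

The result has dtype float64, with NaN in the position of the '.' entry.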

User · Answer

Use pd.to_numeric with errors='coerce'

    # Setup
    s = pd.Series(['1', '2', '3', '4', '.'])
    s

    0    1
    1    2
    2    3
    3    4
    4    .
    dtype: object

    pd.to_numeric(s, errors='coerce')

    0    1.0
    1    2.0
    2    3.0
    3    4.0
    4    NaN
    dtype: float64

If you need the NaNs filled in, use Series.fillna.

    pd.to_numeric(s, errors='coerce').fillna(0, downcast='infer')

    0    1
    1    2
    2    3
    3    4
    4    0
    dtype: int64

Note, downcast='infer' will attempt to downcast floats to integers where possible. Remove the argument if you don't want that.

From v0.24+, pandas introduces a Nullable Integer type, which allows integers to coexist with NaNs. If you have integers in your column, you can use

    pd.__version__
    # '0.24.1'

    pd.to_numeric(s, errors='coerce').astype('Int32')

    0      1
    1      2
    2      3
    3      4
    4    NaN
    dtype: Int32

There are other options to choose from as well, read the docs for more.

Extension for DataFrames

If you need to extend this to DataFrames, you will need to apply it to each column. You can do this using DataFrame.apply.

    # Setup.
    np.random.seed(0)
    df = pd.DataFrame({
        'A' : np.random.choice(10, 5),
        'C' : np.random.choice(10, 5),
        'B' : ['1', '###', '...', 50, '234'],
        'D' : ['23', '1', '...', '268', '$$']}
    )[list('ABCD')]
    df

       A    B  C    D
    0  5    1  9   23
    1  0  ###  3    1
    2  3  ...  5  ...
    3  3   50  2  268
    4  7  234  4   $$

    df.dtypes

    A     int64
    B    object
    C     int64
    D    object
    dtype: object

    df2 = df.apply(pd.to_numeric, errors='coerce')
    df2

       A      B  C      D
    0  5    1.0  9   23.0
    1  0    NaN  3    1.0
    2  3    NaN  5    NaN
    3  3   50.0  2  268.0
    4  7  234.0  4    NaN

    df2.dtypes

    A      int64
    B    float64
    C      int64
    D    float64
    dtype: object

You can also do this with DataFrame.transform, although my tests indicate this is marginally slower:

    df.transform(pd.to_numeric, errors='coerce')

       A      B  C      D
    0  5    1.0  9   23.0
    1  0    NaN  3    1.0
    2  3    NaN  5    NaN
    3  3   50.0  2  268.0
    4  7  234.0  4    NaN

If you have many columns (numeric and non-numeric), you can make this a little more performant by applying pd.to_numeric on the non-numeric columns only.

    df.dtypes.eq(object)

    A    False
    B     True
    C    False
    D     True
    dtype: bool

    cols = df.columns[df.dtypes.eq(object)]
    # Actually, `cols` can be any list of columns you need to convert.
    cols
    # Index(['B', 'D'], dtype='object')

    df[cols] = df[cols].apply(pd.to_numeric, errors='coerce')
    # Alternatively,
    # for c in cols:
    #     df[c] = pd.to_numeric(df[c], errors='coerce')

    df

       A      B  C      D
    0  5    1.0  9   23.0
    1  0    NaN  3    1.0
    2  3    NaN  5    NaN
    3  3   50.0  2  268.0
    4  7  234.0  4    NaN

Applying pd.to_numeric along the columns (i.e., axis=0, the default) should be slightly faster for long DataFrames.
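Because errors='coerce' silently replaces bad entries, it can help to record which values were turned into NaN before overwriting the columns. The following is only a sketch of one way to do that, not part of the answer above: the DataFrame contents and the names coerced and bad are made up for illustration, and select_dtypes(exclude='number') is used here in place of dtypes.eq(object) on the assumption that it also picks up string-typed columns in newer pandas versions.

```python
import pandas as pd

# Hypothetical data: column names and values are illustrative only.
df = pd.DataFrame({
    'A': [5, 0, 3],
    'B': ['1', 'n/a', 50],
    'D': ['23', 1, '--'],
})

# Pick the non-numeric columns and convert only those.
cols = df.select_dtypes(exclude='number').columns
converted = df[cols].apply(pd.to_numeric, errors='coerce')

# A cell was coerced if it is NaN after conversion but was not missing before.
coerced = converted.isna() & df[cols].notna()

# Collect the original offending values per column for later inspection.
bad = {c: df.loc[coerced[c], c].tolist() for c in cols}
print(bad)  # {'B': ['n/a'], 'D': ['--']}

# Finally, write the converted columns back.
df[cols] = converted
```

Comparing against notna() on the original data keeps values that were already missing from being reported as conversion errors.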
