Pandas: ValueError: cannot convert float NaN to integer

Question

I get "ValueError: cannot convert float NaN to integer" for the following:

    df = pandas.read_csv('zoom11.csv')
    df[['x']] = df[['x']].astype(int)

The "x" is obviously a column in the csv file, but I cannot spot any float NaN in the file, and I don't understand what the message means.

When I read the column as String, it has values like -1, 0, 1, ... 2000, which all look like perfectly good int numbers to me.

When I read the column as float, it can be loaded. It then shows values such as -1.0, 0.0 etc., and there are still no NaNs.

I tried error_bad_lines = False and the dtype parameter in read_csv, to no avail. It just cancels loading with the same exception.

The file is not small (10+ M rows), so I cannot inspect it manually. When I extract a small part from the top of the file there is no error, but it happens with the full file. So it is something in the file, but I cannot detect what.

Logically the csv should not have missing values, but even if there is some garbage I would be OK with skipping those rows, or at least identifying them. I do not see a way to scan through the file and report conversion errors.

Update: Using the hints in the comments/answers I got my data clean with this:

    # x contained NaN
    df = df[~df['x'].isnull()]

    # y contained some other garbage, so the null check was not enough
    df = df[df['y'].str.isnumeric()]

    # final conversion now worked
    df[['x']] = df[['x']].astype(int)
    df[['y']] = df[['y']].astype(int)

User · Accepted Answer

To identify the NaN values, use boolean indexing:

    print(df[df['x'].isnull()])

Then, to get rid of all non-numeric values, use to_numeric with the parameter errors='coerce', which replaces non-numeric values with NaN:

    df['x'] = pd.to_numeric(df['x'], errors='coerce')

To remove all rows with NaN in column x, use dropna:

    df = df.dropna(subset=['x'])

Finally, convert the values to ints:

    df['x'] = df['x'].astype(int)
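The question also asks how to report which rows fail to convert. A minimal sketch of that, assuming a tiny in-memory CSV (the csv_text sample and the names converted, bad_rows and clean are made up for illustration and stand in for the real file):

```python
import io

import pandas as pd

# Hypothetical stand-in for the real file: one empty cell and one garbage cell in x.
csv_text = "x,y\n1,10\n,20\nabc,30\n2000,40\n"
df = pd.read_csv(io.StringIO(csv_text))

# Coerce anything non-numeric to NaN, but keep the raw column for reporting.
converted = pd.to_numeric(df["x"], errors="coerce")

# Rows whose x could not be converted (missing or garbage), with original values.
bad_rows = df[converted.isnull()]
print(bad_rows)

# Keep only the convertible rows, then cast to int.
clean = df[converted.notnull()].copy()
clean["x"] = converted[converted.notnull()].astype(int)
print(clean)
```

Keeping the coerced values in a separate Series means the offending raw text is still visible in bad_rows, which is probably what you want when hunting for garbage in a large file.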

User · Answer

Also, even in the latest versions of pandas, if the column is of object type you have to convert it to float first, something like:

    df['column_name'].astype(float).astype("Int32")

NB: You have to go through float first and then to the nullable Int32, for some reason. Use the built-in float here; the old np.float alias was removed in NumPy 1.24. Whether the int should be 32 or 64 bits depends on your variable. Be aware that you may lose some precision if your numbers are too big for the format.
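A small sketch of the two-step cast, assuming an object column that holds numeric strings plus one missing value (the sample data is invented):

```python
import numpy as np
import pandas as pd

# Object column, similar to what a CSV with a blank cell can produce.
s = pd.Series(["1", "2", np.nan, "4"], dtype=object)

as_float = s.astype(float)          # strings are parsed, the gap stays NaN
as_int = as_float.astype("Int32")   # nullable integer, the gap becomes <NA>

print(as_int)
print(as_int.dtype)
```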

User · Answer

If the column has null values, you will get this error when doing the conversion. To resolve it without changing your original dataset, filter out the nulls and cast the result in one expression:

    df[~df['x'].isnull()][['x']].astype(int)
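To illustrate that this leaves the original frame alone, here is a short sketch with made-up data (the name ints is hypothetical):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"x": [1.0, np.nan, 3.0]})

# Filter and cast in a single expression; the result is a new DataFrame.
ints = df[~df["x"].isnull()][["x"]].astype(int)

print(ints)         # two rows, integer dtype
print(df["x"])      # still three rows, still float64 with the NaN in place
```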

User · Answer

"ValueError: cannot convert float NaN to integer"

From v0.24, you actually can. Pandas introduces Nullable Integer Data Types, which allow integers to coexist with NaNs.

Given a series of whole float numbers with missing data,

    s = pd.Series([1.0, 2.0, np.nan, 4.0])
    s

    0    1.0
    1    2.0
    2    NaN
    3    4.0
    dtype: float64

    s.dtype
    # dtype('float64')

you can convert it to a nullable int type (choose from one of Int16, Int32, or Int64) with:

    s2 = s.astype('Int32')  # note the 'I' is uppercase
    s2

    0      1
    1      2
    2    NaN
    3      4
    dtype: Int32

    s2.dtype
    # Int32Dtype()

Your column needs to have whole numbers for the cast to happen. Anything else will raise a TypeError:

    s = pd.Series([1.1, 2.0, np.nan, 4.0])

    s.astype('Int32')
    # TypeError: cannot safely cast non-equivalent float64 to int32
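If you are not sure the column holds only whole numbers, it may help to look for fractional values before casting. A rough sketch, where rounding is just one possible policy and an assumption on my part (the names fractional and s_int are made up):

```python
import numpy as np
import pandas as pd

s = pd.Series([1.1, 2.0, np.nan, 4.0])

# True where a value is present but is not a whole number.
fractional = s.notna() & (s % 1 != 0)
print(s[fractional])

# Assumed policy: round to the nearest whole number, then cast.
# NaN survives the rounding and becomes <NA> in the nullable type.
s_int = s.round().astype("Int64")
print(s_int)
```

Whether to round, truncate, or drop such rows depends on what the fractional values mean in your data.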

User · Answer

I know this has been answered, but I wanted to provide an alternate solution for anyone in the future.

You can use .loc to subset the dataframe to only the values that are notnull(), and then select only the 'x' column. Take that same vector and apply(int) to it.

If column x is float:

    df.loc[df['x'].notnull(), 'x'] = df.loc[df['x'].notnull(), 'x'].apply(int)
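A quick sketch of what this does on a toy frame (data and the name non_null_ints are invented). One thing worth checking in your own data: as far as I can tell, while NaN rows remain in the column, writing the ints back leaves the column dtype as float64, so the integer Series is only available on the non-null subset.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"x": [1.0, np.nan, 3.0]})
mask = df["x"].notnull()

# Integer values for the non-null rows only.
non_null_ints = df.loc[mask, "x"].apply(int)
print(non_null_ints)

# Writing them back keeps the NaN row, so the column stays float64.
df.loc[mask, "x"] = non_null_ints
print(df["x"].dtype)
```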
