pandas: to_numeric for multiple columns

Question

I'm working with the following df:

    c.sort_values('2005', ascending=False).head(3)

           GeoName      ComponentName  IndustryId  IndustryClassification
    37926  Alabama  Real GDP by state           9                     213
    37951  Alabama  Real GDP by state          34                      42
    37932  Alabama  Real GDP by state          15                     327

                                          Description  2004   2005   2006   2007
    37926               Support activities for mining    99     98    117    117
    37951                             Wholesale trade  9898  10613  10952  11034
    37932  Nonmetallic mineral products manufacturing   980    968    940   1084

            2008  2009  2010  2011  2012  2013   2014
    37926    115    87    96    95   103   102   (NA)
    37951  11075  9722  9765  9703  9600  9884  10199
    37932    861   724   714   701   589   641   (NA)

I want to force numeric on all of the years:

    c['2014'] = pd.to_numeric(c['2014'], errors='coerce')

Is there an easy way to do this, or do I have to type them all out?

User · Answer

    df[cols] = pd.to_numeric(df[cols].stack(), errors='coerce').unstack()
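As a minimal sketch of the one-liner above: the frame below is made-up sample data (not the asker's file), and `cols` is assumed to list the year columns in their existing order. `stack()` flattens the selected columns into one long Series so that a single `to_numeric` call handles every value, and `unstack()` restores the column layout.

```python
import pandas as pd

# Hypothetical sample: year columns stored as strings, "(NA)" marks missing values.
df = pd.DataFrame(
    {
        "GeoName": ["Alabama", "Alabama", "Alabama"],
        "2013": ["102", "9884", "641"],
        "2014": ["(NA)", "10199", "(NA)"],
    },
    index=[37926, 37951, 37932],
)

cols = ["2013", "2014"]  # assumed: the columns to convert, in frame order

# One long Series -> one to_numeric call -> back to columns.
df[cols] = pd.to_numeric(df[cols].stack(), errors="coerce").unstack()

print(df.dtypes)
print(df)
```

Because all stacked values pass through one Series, the converted columns are likely to share a dtype, so a column without missing values may still come back as float.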

User · Answer

You can use:

    print(df.columns[5:])
    Index([u'2004', u'2005', u'2006', u'2007', u'2008', u'2009', u'2010', u'2011',
           u'2012', u'2013', u'2014'],
          dtype='object')

    for col in df.columns[5:]:
        df[col] = pd.to_numeric(df[col], errors='coerce')

    print(df)
           GeoName      ComponentName  IndustryId  IndustryClassification  \
    37926  Alabama  Real GDP by state           9                     213
    37951  Alabama  Real GDP by state          34                      42
    37932  Alabama  Real GDP by state          15                     327

                                          Description  2004   2005   2006   2007  \
    37926               Support activities for mining    99     98    117    117
    37951                            Wholesale  trade  9898  10613  10952  11034
    37932  Nonmetallic mineral products manufacturing   980    968    940   1084

            2008  2009  2010  2011  2012  2013     2014
    37926    115    87    96    95   103   102      NaN
    37951  11075  9722  9765  9703  9600  9884  10199.0
    37932    861   724   714   701   589   641      NaN

Another solution with filter:

    print(df.filter(like='20'))
           2004   2005   2006   2007   2008  2009  2010  2011  2012  2013   2014
    37926    99     98    117    117    115    87    96    95   103   102   (NA)
    37951  9898  10613  10952  11034  11075  9722  9765  9703  9600  9884  10199
    37932   980    968    940   1084    861   724   714   701   589   641   (NA)

    for col in df.filter(like='20').columns:
        df[col] = pd.to_numeric(df[col], errors='coerce')

    print(df)
           GeoName      ComponentName  IndustryId  IndustryClassification  \
    37926  Alabama  Real GDP by state           9                     213
    37951  Alabama  Real GDP by state          34                      42
    37932  Alabama  Real GDP by state          15                     327

                                          Description  2004   2005   2006   2007  \
    37926               Support activities for mining    99     98    117    117
    37951                            Wholesale  trade  9898  10613  10952  11034
    37932  Nonmetallic mineral products manufacturing   980    968    940   1084

            2008  2009  2010  2011  2012  2013     2014
    37926    115    87    96    95   103   102      NaN
    37951  11075  9722  9765  9703  9600  9884  10199.0
    37932    861   724   714   701   589   641      NaN
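One thing to keep in mind with `filter(like='20')` is that it matches any label containing "20", not only year columns. The sketch below uses a made-up frame with a hypothetical `Code20` column to illustrate this, and shows `filter(regex=...)` with a four-digit pattern as one possible stricter selection; the pattern is an assumption about how the year columns are named.

```python
import pandas as pd

# Hypothetical sample: "Code20" is a non-year column whose label contains "20".
df = pd.DataFrame({
    "GeoName": ["Alabama", "Alabama"],
    "Code20": ["x", "y"],
    "2013": ["102", "9884"],
    "2014": ["(NA)", "10199"],
})

# Substring match: picks up "Code20" as well as the years.
like_cols = list(df.filter(like="20").columns)
print(like_cols)

# Regex anchored to exactly four digits: only the year columns match.
year_cols = list(df.filter(regex=r"^\d{4}$").columns)
print(year_cols)

for col in year_cols:
    df[col] = pd.to_numeric(df[col], errors="coerce")

print(df.dtypes)
```

With the stricter selection, `Code20` is left untouched instead of being coerced to NaN.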

User · Answer

UPDATE: you don't need to convert your values afterwards, you can do it on-the-fly when reading your CSV:

    In [165]: df = pd.read_csv(url, index_col=0, na_values=['(NA)']).fillna(0)

    In [166]: df.dtypes
    Out[166]:
    GeoName                    object
    ComponentName              object
    IndustryId                  int64
    IndustryClassification     object
    Description                object
    2004                        int64
    2005                        int64
    2006                        int64
    2007                        int64
    2008                        int64
    2009                        int64
    2010                        int64
    2011                        int64
    2012                        int64
    2013                        int64
    2014                      float64
    dtype: object

If you need to convert multiple columns to numeric dtypes - use the following technique:

Sample source DF:

    In [271]: df
    Out[271]:
         id    a  b  c  d  e    f
    0  id_3  AAA  6  3  5  8    1
    1  id_9    3  7  5  7  3  BBB
    2  id_7    4  2  3  5  4    2
    3  id_0    7  3  5  7  9    4
    4  id_0    2  4  6  4  0    2

    In [272]: df.dtypes
    Out[272]:
    id    object
    a     object
    b      int64
    c      int64
    d      int64
    e      int64
    f     object
    dtype: object

Converting selected columns to numeric dtypes:

    In [273]: cols = df.columns.drop('id')

    In [274]: df[cols] = df[cols].apply(pd.to_numeric, errors='coerce')

    In [275]: df
    Out[275]:
         id    a  b  c  d  e    f
    0  id_3  NaN  6  3  5  8  1.0
    1  id_9  3.0  7  5  7  3  NaN
    2  id_7  4.0  2  3  5  4  2.0
    3  id_0  7.0  3  5  7  9  4.0
    4  id_0  2.0  4  6  4  0  2.0

    In [276]: df.dtypes
    Out[276]:
    id     object
    a     float64
    b       int64
    c       int64
    d       int64
    e       int64
    f     float64
    dtype: object

PS: if you want to select all string (object) columns, use the following simple trick:

    cols = df.columns[df.dtypes.eq('object')]
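The read-time approach can be tried without the original file. The sketch below substitutes a small inline CSV (made-up rows, read through `io.StringIO`) for the real URL and leaves out the `fillna(0)` step, so the missing values stay visible as NaN; whether zeros are an appropriate replacement depends on the analysis.

```python
import io

import pandas as pd

# Hypothetical inline CSV standing in for the real file or URL.
csv_text = """id,GeoName,2013,2014
37926,Alabama,102,(NA)
37951,Alabama,9884,10199
37932,Alabama,641,(NA)
"""

# na_values tells the parser to treat "(NA)" as missing while reading,
# so the affected columns are parsed as numbers straight away.
df = pd.read_csv(io.StringIO(csv_text), index_col=0, na_values=["(NA)"])

print(df.dtypes)
print(df)
```

A column with no missing values can stay integer, while a column containing NaN is read as float.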

User · Answer

If you are looking for a range of columns, you can try this:

    df[df.columns[7:]] = df.iloc[:, 7:].astype(float)

The example above converts every column from position 7 (zero-based) to the end to float. In `iloc`, the `:` before the comma selects all rows and the slice after it selects the columns. You can of course use a different type or a different range.

I think this is useful when you have a big range of columns to convert and a lot of rows. It doesn't make you go over each row yourself; I believe numpy does it more efficiently.

This is useful only if you know that all the required columns contain numbers only: it will not change "bad values" (like strings) to NaN for you.
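To make the caveat about bad values concrete, here is a small sketch on made-up data. It assumes a frame whose columns from position 1 onward should be numeric, with one deliberately bad string. `astype(float)` raises on that string, whereas `pd.to_numeric` with `errors='coerce'` replaces it with NaN.

```python
import pandas as pd

# Hypothetical sample: "oops" is a deliberately non-numeric value.
df = pd.DataFrame({
    "name": ["a", "b", "c"],
    "x": ["1", "2", "3"],
    "y": ["4.5", "oops", "6"],
})

# A clean column range converts without trouble.
clean = df.iloc[:, 1:2].astype(float)

# The full range includes the bad value, so astype(float) raises ValueError.
try:
    df.iloc[:, 1:].astype(float)
    raised = False
except ValueError:
    raised = True

# to_numeric with errors='coerce' converts what it can and leaves NaN elsewhere.
coerced = df.iloc[:, 1:].apply(pd.to_numeric, errors="coerce")

print(raised)
print(coerced)
```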

User · Answer

    df.loc[:, 'col'] = df.loc[:, 'col'].apply(pd.to_numeric, errors='coerce')

User · Answer

Another way is using apply, as a one-liner:

    cols = ['col1', 'col2', 'col3']
    data[cols] = data[cols].apply(pd.to_numeric, errors='coerce', axis=1)
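For comparison, the sketch below runs the same `apply` call with and without `axis=1` on a made-up frame (the `label` column and the values are hypothetical). With `axis=1`, `pd.to_numeric` is called once per row; with the default axis it is called once per column. On this sample both produce the same values, so the default is usually the simpler choice for frames with many rows.

```python
import pandas as pd

# Hypothetical sample with two values that cannot be parsed as numbers.
data = pd.DataFrame({
    "col1": ["1", "2", "x"],
    "col2": ["4", "5", "6"],
    "col3": ["7.5", "bad", "9"],
    "label": ["a", "b", "c"],
})
cols = ["col1", "col2", "col3"]

# Default axis: one to_numeric call per column.
by_column = data[cols].apply(pd.to_numeric, errors="coerce")

# axis=1: one to_numeric call per row.
by_row = data[cols].apply(pd.to_numeric, errors="coerce", axis=1)

# Compare values after casting both results to float.
same_values = by_column.astype(float).equals(by_row.astype(float))
print(same_values)

data[cols] = by_column
print(data.dtypes)
```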
