Writing a pandas DataFrame to CSV file

Question

I have a DataFrame in pandas which I would like to write to a CSV file. I am doing this using:

    df.to_csv('out.csv')

And getting the error:

    UnicodeEncodeError: 'ascii' codec can't encode character u'\u03b1' in position 20: ordinal not in range(128)

Is there any way to get around this easily (i.e. I have unicode characters in my data frame)? And is there a way to write to a tab-delimited file instead of a CSV using e.g. a 'to-tab' method (that I don't think exists)?

User · Answer

Sometimes you run into these problems even when you specify UTF-8 encoding. I recommend specifying the encoding when reading the file and using the same encoding when writing it. This might solve your problem.
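A minimal sketch of that advice, assuming UTF-8 input. The file names `input.csv` and `output.csv` and the sample data are hypothetical and created here only for illustration; the point is that the same `encoding` value is passed to both `pandas.read_csv` and `DataFrame.to_csv`.

```python
import os
import tempfile

import pandas as pd

ENCODING = "utf-8"  # assumption: the source file is UTF-8 encoded

# Create a small sample input file (hypothetical data, for illustration only).
workdir = tempfile.mkdtemp()
in_path = os.path.join(workdir, "input.csv")
out_path = os.path.join(workdir, "output.csv")
with open(in_path, "w", encoding=ENCODING, newline="") as f:
    f.write("symbol,name\n\u03b1,alpha\n\u03b2,beta\n")

# Read and write with the same explicit encoding.
df = pd.read_csv(in_path, encoding=ENCODING)
df.to_csv(out_path, index=False, encoding=ENCODING)

# Reading the output back with that encoding recovers the original values.
round_trip = pd.read_csv(out_path, encoding=ENCODING)
```

If the input is not actually UTF-8, `ENCODING` would need to be changed to match the real encoding of the file.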

User · Answer

Example of exporting to a file with a full path on Windows, in case your file has headers:

    df.to_csv(r'C:\Users\John\Desktop\export_dataframe.csv', index=False, header=True)

For example, if you want to store the file in a directory next to your script, with utf-8 encoding and tab as the separator:

    df.to_csv(r'./export/dftocsv.csv', sep='\t', encoding='utf-8', header=True)

User · Answer

To write a pandas DataFrame to a CSV file, you will need DataFrame.to_csv. This function offers many arguments with reasonable defaults that you will more often than not need to override to suit your specific use case. For example, you might want to use a different separator, change the datetime format, or drop the index when writing. to_csv has arguments you can pass to address these requirements.

Here's a table listing some common scenarios of writing to CSV files and the corresponding arguments you can use for them:

[Table not preserved.]

Footnotes:

1. (sep) The default separator is assumed to be a comma (','). Don't change this unless you know you need to.
2. (index) By default, the index of df is written as the first column. If your DataFrame does not have an index (IOW, the df.index is the default RangeIndex), then you will want to set index=False when writing. To explain this in a different way, if your data DOES have an index, you can (and should) use index=True or just leave it out completely (as the default is True).
3. (encoding) It would be wise to set this parameter if you are writing string data so that other applications know how to read your data. This will also avoid any potential UnicodeEncodeErrors you might encounter while saving.
4. (compression) Compression is recommended if you are writing large DataFrames (>100K rows) to disk as it will result in much smaller output files. OTOH, it will mean the write time will increase (and consequently, the read time, since the file will need to be decompressed).
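The arguments discussed in this answer can be sketched together as follows. The DataFrame contents and the output file name `data.csv.gz` are made up for illustration, and the particular separator and date format are arbitrary choices, not recommendations.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample data, for illustration only.
df = pd.DataFrame(
    {
        "name": ["\u03b1", "\u03b2"],
        "when": pd.to_datetime(["2020-01-31", "2020-02-29"]),
        "value": [1.5, 2.5],
    }
)

# Different separator, no index column, custom datetime format.
# With no path argument, to_csv returns the CSV text as a string.
text = df.to_csv(sep=";", index=False, date_format="%d/%m/%Y")
lines = text.splitlines()

# Explicit encoding plus gzip compression when writing to disk.
path = os.path.join(tempfile.mkdtemp(), "data.csv.gz")
df.to_csv(path, index=False, encoding="utf-8", compression="gzip")

# The reader needs the same encoding and compression settings.
back = pd.read_csv(path, encoding="utf-8", compression="gzip")
```

Passing `compression` explicitly is a choice made here for clarity; pandas can also infer it from a file extension such as `.gz`.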

User · Answer

This may not be the answer for this case, but I had the same error message with .to_csv, so I tried .toCSV('name.csv') and got a different error message: 'SparseDataFrame' object has no attribute 'toCSV'. So the problem was solved by converting the dataframe to a dense dataframe:

    df.to_dense().to_csv("submission.csv", index=False, sep=',', encoding='utf-8')
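SparseDataFrame was removed in pandas 1.0, so the call above applies only to older versions. As a rough sketch of the same idea on newer versions, assuming a regular DataFrame whose columns all hold sparse data (the sample values are hypothetical), the `.sparse` accessor provides the conversion to dense:

```python
import pandas as pd

# A DataFrame whose columns are all sparse (hypothetical data, for illustration).
df = pd.DataFrame(
    {
        "a": pd.arrays.SparseArray([0, 0, 1]),
        "b": pd.arrays.SparseArray([0.0, 2.0, 0.0]),
    }
)

# Convert to ordinary dense columns before writing.
dense = df.sparse.to_dense()

# With no path argument, to_csv returns the CSV text as a string.
csv_text = dense.to_csv(index=False, sep=",")
```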

User · Answer

When you are storing a DataFrame object into a CSV file using the to_csv method, you probably won't need to store the index label in front of each row of the DataFrame object.

You can avoid that by passing a False boolean value to the index parameter.

Somewhat like:

    df.to_csv(file_name, encoding='utf-8', index=False)

So if your DataFrame object is something like:

      Color  Number
    0   red      22
    1  blue      10

The CSV file will store:

    Color,Number
    red,22
    blue,10

instead of (the case when the default value True was passed):

    ,Color,Number
    0,red,22
    1,blue,10
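The difference can be checked directly, since `to_csv` returns the CSV text as a string when no path is given. This small sketch reuses the Color/Number data from the answer:

```python
import pandas as pd

df = pd.DataFrame({"Color": ["red", "blue"], "Number": [22, 10]})

# index=False drops the row labels from the output.
without_index = df.to_csv(index=False)

# index=True is the default, so the row labels become the first column.
with_index = df.to_csv()

print(without_index)
print(with_index)
```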

User · Answer

Something else you can try, if you are having issues encoding to 'utf-8' and want to go cell by cell, is the following.

Python 2

(Where "df" is your DataFrame object.)

    for column in df.columns:
        for idx in df[column].index:
            x = df.get_value(idx, column)
            try:
                x = unicode(x.encode('utf-8', 'ignore'), errors='ignore') if type(x) == unicode else unicode(str(x), errors='ignore')
                df.set_value(idx, column, x)
            except Exception:
                print 'encoding error: {0} {1}'.format(idx, column)
                df.set_value(idx, column, '')
                continue

Then try:

    df.to_csv(file_name)

You can check the encoding of the columns by:

    for column in df.columns:
        print '{0} {1}'.format(str(type(df[column][0])), str(column))

Warning: errors='ignore' will just omit the character, e.g.

    IN: unicode('Regenexx\xae', errors='ignore')
    OUT: u'Regenexx'

Python 3

    # df.at is used here because get_value/set_value were removed in pandas 1.0
    for column in df.columns:
        for idx in df[column].index:
            x = df.at[idx, column]
            try:
                x = x if type(x) == str else str(x).encode('utf-8', 'ignore').decode('utf-8', 'ignore')
                df.at[idx, column] = x
            except Exception:
                print('encoding error: {0} {1}'.format(idx, column))
                df.at[idx, column] = ''
                continue
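As an alternative to the cell-by-cell loops, here is a column-wise sketch for Python 3. The helper name `drop_unencodable` and the sample data are hypothetical, and this is a different technique from the answer's code: it cleans string values only and leaves everything else untouched. As with the warning above, characters that cannot be encoded are silently dropped.

```python
import pandas as pd


def drop_unencodable(value, encoding="utf-8"):
    """Drop characters that cannot be encoded; non-string values pass through."""
    if isinstance(value, str):
        return value.encode(encoding, "ignore").decode(encoding, "ignore")
    return value


# Hypothetical data: "\udcae" is a lone surrogate, which UTF-8 cannot encode.
df = pd.DataFrame(
    {
        "name": pd.Series(["Regenexx\udcae", "\u03b1"], dtype=object),
        "n": [1, 2],
    }
)

# Series.map applies the helper to every cell of each column.
for column in df.columns:
    df[column] = df[column].map(drop_unencodable)

csv_text = df.to_csv(index=False)
```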

User · Answer

To delimit by a tab you can use the sep argument of to_csv:

    df.to_csv(file_name, sep='\t')

To use a specific encoding (e.g. 'utf-8') use the encoding argument:

    df.to_csv(file_name, sep='\t', encoding='utf-8')
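A short runnable sketch combining both arguments, using a character like the one in the question's error message. The file name `out.tsv` and the sample data are hypothetical, and `index=False` is added only to keep the output compact.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample data containing non-ASCII characters.
df = pd.DataFrame({"symbol": ["\u03b1", "\u03b2"], "value": [1, 2]})

# Write a tab-delimited, UTF-8 encoded file.
path = os.path.join(tempfile.mkdtemp(), "out.tsv")
df.to_csv(path, sep="\t", encoding="utf-8", index=False)

# Read the raw text back to inspect what was written.
with open(path, encoding="utf-8") as f:
    lines = f.read().splitlines()
```

Reading the file back into pandas would use the same two arguments: `pd.read_csv(path, sep="\t", encoding="utf-8")`.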
