Python CSV error line contains NULL byte

Question

I m working with some CSV files  with the following code   reader   csv reader open filepath   rU    try      for row in reader          print  Row read successfully    row except csv Error  e      sys exit  file  s  line  d   s     filename  reader line num  e     And one file is throwing this error    file my csv  line 1  line contains NULL byte   What can I do  Google seems to suggest that it may be an Excel file that s been saved as a  csv improperly  Is there any way I can get round this problem in Python      UPDATE     Following  JohnMachin s comment below  I tried adding these lines to my script    print repr open filepath   rb   read 200     dump 1st 200 bytes of file data   open filepath   rb   read   print data find   x00   print data count   x00     And this is the output I got      xd0 xcf x11 xe0 xa1 xb1 x1a xe1 x00 x00 x00 x00 x00 x00 x00 x00        lt snip gt  8 13834   So the file does indeed contain NUL bytes

User · Answer

Instead of csv reader I use read file and split function for string   lines   open input file  rb     for line all in lines       line line all replace   x00       split

User · Answer

appparently it s a XLS file and not a CSV file as http   www garykessler net library file sigs html confirm

User · Answer

As  S Lott says  you should be opening your files in  rb  mode  not  rU  mode  However that may NOT be causing your current problem  As far as I know  using  rU  mode would mess you up if there are embedded  r in the data  but not cause any other dramas  I also note that you have several files  all opened with  rU      but only one causing a problem   If the csv module says that you have a  NULL   silly message  should be  NUL   byte in your file  then you need to check out what is in your file  I would suggest that you do this even if using  rb  makes the problem go away   repr   is  or wants to be  your debugging friend  It will show unambiguously what you ve got  in a platform independant fashion  which is helpful to helpers who are unaware what od is or does   Do this   print repr open  my csv    rb   read 200     dump 1st 200 bytes of file   and carefully copy paste  don t retype  the result into an edit of your question  not into a comment    Also note that if the file is really dodgy e g  no  r or  n within reasonable distance from the start of the file  the line number reported by reader line num will be  unhelpfully  1  Find where the first  x00 is  if any  by doing  data   open  my csv    rb   read   print data find   x00     and make sure that you dump at least that many bytes with repr or od   What does data count   x00   tell you  If there are many  you may want to do something like  for i  c in enumerate data       if c      x00           print i  repr data i-30 i        NUL      repr data i 1 i 31     so that you can see the NUL bytes in context   If you can see  x00 in the output  or  0 in your od -c output   then you definitely have NUL byte s  in the file  and you will need to do something like this   fi   open  my csv    rb   data   fi read   fi close   fo   open  mynew csv    wb   fo write data replace   x00        fo close     By the way  have you looked at the file  including the last few lines  with a text editor  Does it actually look like a reasonable CSV file like the other  no  NULL byte  exception  files

User · Answer

I bumped into this problem as well   Using the Python csv module  I was trying to read an XLS file created in MS Excel and running into the NULL byte error you were getting   I looked around and found the xlrd Python module for reading and formatting data from MS Excel spreadsheet files   With the xlrd module  I am not only able to read the file properly  but I can also access many different parts of the file in a way I couldn t before   I thought it might help you

User · Answer

data initial   open  staff csv    rb   data   csv reader  line replace   0      for line in data initial   delimiter        This works for me

User · Answer

For all those  rU  filemode haters  I just tried opening a CSV file from a Windows machine on a Mac with the  rb  filemode and I got this error from the csv module   Error  new-line character seen in unquoted field - do you need to  open the file in universal-newline mode    Opening the file in  rU  mode works fine  I love universal-newline mode -- it saves me so much hassle

User · Answer

Why are you doing this    reader   csv reader open filepath   rU      The docs are pretty clear that you must do this   with open filepath   rb   as src      reader  csv reader  src     The mode must be  rb  to read   http   docs python org library csv html csv reader     If csvfile is a file object  it must be opened with the    b    flag on platforms where that makes a difference

User · Answer

One case is that - If the CSV file contains empty rows this error may show up  Check for row is necessary before we proceed to write or read   for row in csvreader          if  row                      do something   I solved my issue by adding this check in the code

User · Answer

Have you tried using gzip open   with gzip open  my csv    rb   as data file    I was trying to open a file that had been compressed but had the extension   csv  instead of  csv gz   This error kept showing up until I used gzip open

User · Answer

I encountered this when using scrapy and fetching a zipped csvfile without having a correct middleware to unzip the response body before handing it to the csvreader  Hence the file was not really a csv file and threw the line contains NULL byte error accordingly

User · Answer

I had the same problem opening a CSV produced from a webservice which inserted NULL bytes in empty headers  I did the following to clean the file   with codecs open   my csv    rb    utf-8   as myfile      data   myfile read         clean file first if dirty     if data count    x00             print  Cleaning             with codecs open  my csv tmp    w    utf-8   as of              for line in data                  of write line replace   x00                 shutil move   my csv tmp    my csv     with codecs open   my csv    rb    utf-8   as myfile      myreader   csv reader myfile  delimiter            Continue with your business logic here      Disclaimer  Be aware that this overwrites your original data  Make sure you have a backup copy of it  You have been warned

User · Answer

Reading it as UTF-16 was also my problem   Here s my code that ended up working   f codecs open location  rb   utf-16   csvread csv reader f delimiter   t   csvread next   for row in csvread      print row   Where location is the directory of your csv file

User · Answer

I got the same error  Saved the file in UTF-8 and it worked

User · Answer

This happened to me when I created a CSV file with OpenOffice Calc  It didn t happen when I created the CSV file in my text editor  even if I later edited it with Calc   I solved my problem by copy-pasting in my text editor the data  from my Calc-created file to a new editor-created file

User · Answer

Converting the encoding of the source file from UTF-16 to UTF-8 solve my problem   How to convert a file to utf-8 in Python   import codecs BLOCKSIZE   1048576   or some other  desired size in bytes with codecs open sourceFileName   r    utf-16   as sourceFile      with codecs open targetFileName   w    utf-8   as targetFile          while True              contents   sourceFile read BLOCKSIZE              if not contents                  break             targetFile write contents

User · Answer

You could just inline a generator to filter out the null values if you want to pretend they don t exist   Of course this is assuming the null bytes are not really part of the encoding and really are some kind of erroneous artifact or bug   with open filepath   rb   as f      reader   csv reader   line replace   0      for line in f         try          for row in reader              print  Row read successfully    row     except csv Error  e          sys exit  file  s  line  d   s     filename  reader line num  e

[python] Python CSV error: line contains NULL byte

Examples related to python

Examples related to csv