[python] Reading tab-delimited file with Pandas - works on Windows, but not on Mac

I've been reading a tab-delimited data file in Windows with Pandas/Python without any problems. The data file contains notes in first three lines and then follows with a header.

df = pd.read_csv(myfile,sep='\t',skiprows=(0,1,2),header=(0))

I'm now trying to read this file with my Mac. (My first time using Python on Mac.) I get the following error.

pandas.parser.CParserError: Error tokenizing data. C error: Expected 1
fields in line 8, saw 39

If set the error_bad_lines argument for read_csv to False, I get the following information, which continues until the end of the last row.

Skipping line 8: expected 1 fields, saw 39
Skipping line 9: expected 1 fields, saw 125
Skipping line 10: expected 1 fields, saw 125
Skipping line 11: expected 1 fields, saw 125
Skipping line 12: expected 1 fields, saw 125
Skipping line 13: expected 1 fields, saw 125
Skipping line 14: expected 1 fields, saw 125
Skipping line 15: expected 1 fields, saw 125
Skipping line 16: expected 1 fields, saw 125
Skipping line 17: expected 1 fields, saw 125
...

Do I need to specify a value for the encoding argument? It seems as though I shouldn't have to because reading the file works fine on Windows.

This question is related to python macos pandas import tab-delimited

The answer is


The biggest clue is the rows are all being returned on one line. This indicates line terminators are being ignored or are not present.

You can specify the line terminator for csv_reader. If you are on a mac the lines created will end with \rrather than the linux standard \n or better still the suspenders and belt approach of windows with \r\n.

pandas.read_csv(filename, sep='\t', lineterminator='\r')

You could also open all your data using the codecs package. This may increase robustness at the expense of document loading speed.

import codecs

doc = codecs.open('document','rU','UTF-16') #open for reading with "universal" type set

df = pandas.read_csv(doc, sep='\t')

Another option would be to add engine='python' to the command pandas.read_csv(filename, sep='\t', engine='python')


Examples related to python

programming a servo thru a barometer Is there a way to view two blocks of code from the same file simultaneously in Sublime Text? python variable NameError Why my regexp for hyphenated words doesn't work? Comparing a variable with a string python not working when redirecting from bash script is it possible to add colors to python output? Get Public URL for File - Google Cloud Storage - App Engine (Python) Real time face detection OpenCV, Python xlrd.biffh.XLRDError: Excel xlsx file; not supported Could not load dynamic library 'cudart64_101.dll' on tensorflow CPU-only installation

Examples related to macos

Problems with installation of Google App Engine SDK for php in OS X dyld: Library not loaded: /usr/local/opt/openssl/lib/libssl.1.0.0.dylib dyld: Library not loaded: /usr/local/opt/icu4c/lib/libicui18n.62.dylib error running php after installing node with brew on Mac Could not install packages due to an EnvironmentError: [Errno 13] How do I install Java on Mac OSX allowing version switching? Git is not working after macOS Update (xcrun: error: invalid active developer path (/Library/Developer/CommandLineTools) Can't compile C program on a Mac after upgrade to Mojave You don't have write permissions for the /Library/Ruby/Gems/2.3.0 directory. (mac user) How can I install a previous version of Python 3 in macOS using homebrew? Could not install packages due to a "Environment error :[error 13]: permission denied : 'usr/local/bin/f2py'"

Examples related to pandas

xlrd.biffh.XLRDError: Excel xlsx file; not supported Pandas Merging 101 How to increase image size of pandas.DataFrame.plot in jupyter notebook? Trying to merge 2 dataframes but get ValueError Python Pandas User Warning: Sorting because non-concatenation axis is not aligned How to show all of columns name on pandas dataframe? Pandas/Python: Set value of one column based on value in another column Python Pandas - Find difference between two data frames Pandas get the most frequent values of a column Python convert object to float

Examples related to import

Import functions from another js file. Javascript The difference between "require(x)" and "import x" pytest cannot import module while python can How to import an Excel file into SQL Server? When should I use curly braces for ES6 import? How to import a JSON file in ECMAScript 6? Python: Importing urllib.quote importing external ".txt" file in python beyond top level package error in relative import Reading tab-delimited file with Pandas - works on Windows, but not on Mac

Examples related to tab-delimited

Reading tab-delimited file with Pandas - works on Windows, but not on Mac String parsing in Java with delimiter tab "\t" using split Sorting a tab delimited file