Skip rows during csv import pandas

Question

I'm trying to import a .csv file using pandas.read_csv(); however, I don't want to import the 2nd row of the data file (the row with index = 1 for 0-indexing).

I can't see how not to import it because the arguments used with the command seem ambiguous. From the pandas website:

"skiprows : list-like or integer. Row numbers to skip (0-indexed) or number of rows to skip (int) at the start of the file."

If I put skiprows=1 in the arguments, how does it know whether to skip the first row or skip the row with index 1?

User · Answer

All of these answers miss one important point -- the n'th line is the n'th line in the file, and not the n'th row in the dataset. I have a situation where I download some antiquated stream gauge data from the USGS. The head of the dataset is commented with '#', the first line after that holds the column labels, next comes a line that describes the data types, and last comes the data itself. I never know how many comment lines there are, but I know what the first couple of rows are. Example:

# ----------------------------- WARNING ----------------------------------
# Some of the data that you have obtained from this U.S. Geological Survey database
# may not have received Director's approval. ...
agency_cd    site_no    datetime    tz_cd    139719_00065    139719_00065_cd
5s    15s    20d    6s    14n    10s
USGS    08041780    2018-05-06 00:00    CDT    1.98    A

It would be nice if there was a way to automatically skip the n'th row as well as the n'th line.

As a note, I was able to fix my issue with:

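import pandas as pd

# comment='#' ignores the commented header lines; header=0 then takes the
# first remaining line (the column labels) as the header.
ds = pd.read_csv(fname, comment='#', sep='\t', header=0, parse_dates=True)
# Row 0 of the dataset is the format-code line (5s, 15s, ...), so drop it.
ds.drop(0, inplace=True)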
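
To make the approach above easy to try without a download, here is a minimal sketch that runs it on an in-memory sample. The sample text is a hypothetical stand-in modeled on the excerpt above (the second data row is invented for illustration), and reading everything as strings with dtype=str before converting the gauge column is an assumption, not part of the original fix.

```python
import io

import pandas as pd

# Hypothetical stand-in for a USGS tab-separated download; the second
# data row is made up so the result has more than one row.
sample = (
    "# ----- WARNING -----\n"
    "# Some of the data may not have received Director's approval.\n"
    "agency_cd\tsite_no\tdatetime\ttz_cd\t139719_00065\t139719_00065_cd\n"
    "5s\t15s\t20d\t6s\t14n\t10s\n"
    "USGS\t08041780\t2018-05-06 00:00\tCDT\t1.98\tA\n"
    "USGS\t08041780\t2018-05-06 00:15\tCDT\t1.97\tA\n"
)

# Fully commented lines are ignored, so header=0 lands on the label line
# no matter how many '#' lines precede it.
ds = pd.read_csv(io.StringIO(sample), comment="#", sep="\t", header=0, dtype=str)

# Dataset row 0 is the format-code line (5s, 15s, ...); drop it and renumber.
ds = ds.drop(index=0).reset_index(drop=True)

# The gauge readings were read as strings, so convert them explicitly.
ds["139719_00065"] = pd.to_numeric(ds["139719_00065"])
```

Because the format-code line forces every column to be read as text, any numeric or date columns need an explicit conversion after the drop.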

User · Answer

You can try it yourself:

>>> import pandas as pd
>>> from io import StringIO
>>> s = """1, 2
... 3, 4
... 5, 6"""
>>> pd.read_csv(StringIO(s), skiprows=[1], header=None)
   0  1
0  1  2
1  5  6
>>> pd.read_csv(StringIO(s), skiprows=1, header=None)
   0  1
0  3  4
1  5  6

User · Answer

I don't have the reputation to comment yet, but I want to add to alko's answer for further reference. From the docs:

"skiprows: A collection of numbers for rows in the file to skip. Can also be an integer to skip the first n rows."
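
The distinction in that docs passage matters most when the file has a header line. The following is a small sketch on made-up data (the column names a and b are assumptions for illustration): an integer skips lines from the top of the file, so the header itself is lost, while a list skips only the listed line indices.

```python
import io

import pandas as pd

# Made-up CSV text: line 0 is the header, lines 1-3 are data.
csv_text = "a,b\n1,2\n3,4\n5,6\n"

# Integer: skip the first 1 line of the file. The header line is gone,
# so the next line ("1,2") is taken as the column labels.
df_int = pd.read_csv(io.StringIO(csv_text), skiprows=1)

# List: skip only the line with index 1. The header survives and the
# first data line ("1,2") is dropped.
df_list = pd.read_csv(io.StringIO(csv_text), skiprows=[1])
```

Both frames hold the same two data rows, but only the list form keeps a and b as the column names.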

User · Answer

skiprows=[1] will skip the second line, not the first one.

User · Answer

Also be sure that your file is actually a CSV file. For example, if you had an .xls file and simply changed the file extension to .csv, the file won't import and will give the error above.

To check whether this is your problem, open the file in Excel and it will likely say: "The file format and extension of 'Filename.csv' don't match. The file could be corrupted or unsafe. Unless you trust its source, don't open it. Do you want to open it anyway?"

To fix the file, open it in Excel, click "Save As", choose .csv as the file format, then replace the existing file.

This was my problem, and this fixed the error for me.
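
The same check can be sketched in code without opening Excel. The helper below is hypothetical (looks_like_excel is not a pandas function); it only compares the first bytes of the file against the well-known container signatures for legacy .xls (OLE2) and .xlsx (ZIP). A ZIP signature also matches other ZIP-based files, so treat a positive result as a hint rather than proof.

```python
# File signatures ("magic bytes") of the two Excel container formats.
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")  # legacy .xls (OLE2 compound file)
ZIP_MAGIC = b"PK\x03\x04"  # .xlsx, which is a ZIP archive


def looks_like_excel(path):
    """Hypothetical helper: True if the file starts with an Excel container signature."""
    with open(path, "rb") as fh:
        head = fh.read(8)
    return head.startswith(OLE2_MAGIC) or head.startswith(ZIP_MAGIC)
```

If the helper returns True for a file named .csv, the file was probably renamed rather than exported, and re-saving it as CSV as described above is the likely fix.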

User · Answer

I ran into the same issue with skiprows while reading a csv file. I was using skip_rows=1, which will not work. A simple example gives an idea of how to use skiprows while reading a csv file:

import pandas as pd

# skiprows=1 will skip the first line and start reading from the second line
df = pd.read_csv('my_csv_file.csv', skiprows=1)

# print the data frame
df
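
Beyond an integer or a list, skiprows in current pandas versions also accepts a callable that receives each 0-indexed line number and returns True for lines to skip. The sketch below uses made-up data and an arbitrary rule (skip every even-numbered line except the header) purely for illustration.

```python
import io

import pandas as pd

# Made-up CSV text: line 0 is the header, lines 1-4 are data.
csv_text = "a,b\n1,2\n3,4\n5,6\n7,8\n"

# The callable is evaluated against each line index; returning True skips
# that line. Here lines 2 and 4 are skipped and the header (line 0) is kept.
df = pd.read_csv(io.StringIO(csv_text), skiprows=lambda i: i > 0 and i % 2 == 0)
```

As with the list form, the indices refer to lines in the file, not to rows of the resulting dataset.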
