Way to read first few lines for pandas dataframe

Question

Is there a built-in way to use read_csv to read only the first n lines of a file without knowing the length of the lines ahead of time? I have a large file that takes a long time to read, and occasionally only want to use the first, say, 20 lines to get a sample of it (and prefer not to load the full thing and take the head of it).

If I knew the total number of lines I could do something like footer_lines = total_lines - n and pass this to the skipfooter keyword arg. My current solution is to manually grab the first n lines with Python and StringIO it to pandas:

    import pandas as pd
    from io import StringIO          # Python 3 location of StringIO
    from itertools import islice

    n = 20
    with open('big_file.csv', 'r') as f:
        # islice takes exactly n lines; readlines(n) would treat n as a byte-size hint
        head = ''.join(islice(f, n))

    df = pd.read_csv(StringIO(head))

It's not that bad, but is there a more concise, "pandasic" (?) way to do it with keywords or something?

User · Accepted Answer

I think you can use the nrows parameter. From the docs:

    nrows : int, default None
        Number of rows of file to read. Useful for reading pieces of large files

which seems to work. Using one of the standard large test files (988504479 bytes, 5344499 lines):

    In [1]: import pandas as pd

    In [2]: time z = pd.read_csv("P00000001-ALL.csv", nrows=20)
    CPU times: user 0.00 s, sys: 0.00 s, total: 0.00 s
    Wall time: 0.00 s

    In [3]: len(z)
    Out[3]: 20

    In [4]: time z = pd.read_csv("P00000001-ALL.csv")
    CPU times: user 27.63 s, sys: 1.92 s, total: 29.55 s
    Wall time: 30.23 s
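For anyone without a large test file at hand, here is a minimal self-contained sketch of the same nrows idea. The temporary file, its name, and the column names (id, value) are made up for illustration and are not from the question; only pd.read_csv and its nrows parameter come from the answer above.

```python
import os
import tempfile

import pandas as pd

# Build a throwaway CSV so the example runs on its own.
# The file name and the columns "id" and "value" are hypothetical.
tmp_dir = tempfile.mkdtemp()
csv_path = os.path.join(tmp_dir, "big_file.csv")
pd.DataFrame({"id": range(1000), "value": range(1000, 2000)}).to_csv(
    csv_path, index=False
)

n = 20
# nrows counts data rows, so the header line is read in addition to these n rows.
sample = pd.read_csv(csv_path, nrows=n)

print(len(sample))
print(sample.head())
```

Because nrows counts data rows rather than physical lines, the header is still parsed and the resulting frame has n rows.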
