What's the fastest way to read a text file line-by-line?

Question

I want to read a text file line by line. I wanted to know if I'm doing it as efficiently as possible within the .NET C# scope of things.

This is what I'm trying so far:

    var filestream = new System.IO.FileStream(textFilePath,
                                              System.IO.FileMode.Open,
                                              System.IO.FileAccess.Read,
                                              System.IO.FileShare.ReadWrite);
    var file = new System.IO.StreamReader(filestream, System.Text.Encoding.UTF8, true, 128);

    string lineOfText;
    while ((lineOfText = file.ReadLine()) != null)
    {
        // Do something with the lineOfText
    }

User · Answer

There's a good topic about this in the Stack Overflow question "Is 'yield return' slower than 'old school' return?". It says:

> ReadAllLines loads all of the lines into memory and returns a string[]. All well and good if the file is small. If the file is larger than will fit in memory, you'll run out of memory.
>
> ReadLines, on the other hand, uses yield return to return one line at a time. With it, you can read any size file. It doesn't load the whole file into memory.
>
> Say you wanted to find the first line that contains the word "foo", and then exit. Using ReadAllLines, you'd have to read the entire file into memory, even if "foo" occurs on the first line. With ReadLines, you only read one line. Which one would be faster?
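The "find the first line containing foo" case from the quote can be sketched as follows. This is a minimal illustration, not a benchmark: the helper name `FirstLineNumberContaining` and the temporary sample file are assumptions made up for the example, and it is written as a C# 9+ top-level program.

```csharp
using System;
using System.IO;

// Returns the 1-based number of the first line containing the term, or -1.
// (Hypothetical helper, named only for this sketch.)
static int FirstLineNumberContaining(string fileName, string term)
{
    int number = 0;
    // File.ReadLines is lazy, so nothing past the matching line is requested.
    foreach (var line in File.ReadLines(fileName))
    {
        number++;
        if (line.Contains(term))
            return number; // leaving the loop disposes the underlying reader
    }
    return -1;
}

// Sample file created here only so the sketch runs on its own.
string path = Path.Combine(Path.GetTempPath(), "readlines-demo.txt");
File.WriteAllLines(path, new[] { "alpha", "a foo here", "omega" });

Console.WriteLine(FirstLineNumberContaining(path, "foo"));
```

With File.ReadAllLines the whole file would be read into a string[] before the search could start.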

User · Answer

While File.ReadAllLines() is one of the simplest ways to read a file, it is also one of the slowest.

If you're just wanting to read lines in a file without doing much, according to these benchmarks, the fastest way to read a file is the age-old method of:

    using (StreamReader sr = File.OpenText(fileName))
    {
        string s = String.Empty;
        while ((s = sr.ReadLine()) != null)
        {
            // do minimal amount of work here
        }
    }

However, if you have to do a lot with each line, then this article concludes that the best way is the following (and it's faster to pre-allocate a string[] if you know how many lines you're going to read):

    AllLines = new string[MAX]; // only allocate memory here

    using (StreamReader sr = File.OpenText(fileName))
    {
        int x = 0;
        while (!sr.EndOfStream)
        {
            AllLines[x] = sr.ReadLine();
            x += 1;
        }
    } // Finished. Close the file

    // Now parallel process each line in the file
    Parallel.For(0, AllLines.Length, x =>
    {
        DoYourStuff(AllLines[x]); // do your work here
    });

User · Answer

If the file size is not big, then it is faster to read the entire file and split it afterwards:

    var filestreams = sr.ReadToEnd().Split(new[] { Environment.NewLine },
                                  StringSplitOptions.RemoveEmptyEntries);

User · Answer

You can't get any faster if you want to use an existing API to read the lines. But reading larger chunks and manually finding each new line in the read buffer would probably be faster.
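A rough sketch of that idea, for illustration only: read large char blocks and split on '\n' by hand. The name `ReadLinesChunked` and the 64 K default are assumptions made up for this example, it only understands "\n" and "\r\n" line endings (not a lone '\r'), and whether it actually beats ReadLine has to be measured.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

// Hypothetical helper: reads the file in large char blocks and finds line breaks manually.
static IEnumerable<string> ReadLinesChunked(string fileName, int chunkChars = 64 * 1024)
{
    using var reader = new StreamReader(fileName, Encoding.UTF8);
    var buffer = new char[chunkChars];
    var pending = new StringBuilder(); // holds a line that spans chunk boundaries
    int read;
    while ((read = reader.Read(buffer, 0, buffer.Length)) > 0)
    {
        int start = 0;
        for (int i = 0; i < read; i++)
        {
            if (buffer[i] != '\n') continue;
            pending.Append(buffer, start, i - start);
            // Drop the '\r' of a "\r\n" pair, if present.
            if (pending.Length > 0 && pending[pending.Length - 1] == '\r')
                pending.Length--;
            yield return pending.ToString();
            pending.Clear();
            start = i + 1;
        }
        // Keep the unfinished tail of this chunk for the next round.
        pending.Append(buffer, start, read - start);
    }
    if (pending.Length > 0)
        yield return pending.ToString(); // last line without a trailing newline
}

// Sample file created here only so the sketch runs on its own.
string path = Path.Combine(Path.GetTempPath(), "chunked-demo.txt");
File.WriteAllText(path, "one\r\ntwo\nthree");

// A tiny chunk size forces lines to span chunk boundaries.
foreach (var line in ReadLinesChunked(path, 4))
    Console.WriteLine(line);
```

The StringBuilder copy keeps the sketch short; a tuned version would try to avoid it for lines that sit entirely inside one chunk.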

User · Answer

If you're using .NET 4, simply use File.ReadLines, which does it all for you. I suspect it's much the same as yours, except it may also use FileOptions.SequentialScan and a larger buffer (128 seems very small).
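If the question's FileShare.ReadWrite setting has to stay, the two tweaks mentioned above can be applied by hand. A minimal sketch, assuming a 4,096-character buffer as a starting point to benchmark from; the helper name `ReadLinesSequential` is made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

// Hypothetical helper: the question's loop with a larger buffer and a sequential-access hint.
static IEnumerable<string> ReadLinesSequential(string fileName, int bufferSize = 4096)
{
    using var stream = new FileStream(fileName, FileMode.Open, FileAccess.Read,
                                      FileShare.ReadWrite, bufferSize,
                                      FileOptions.SequentialScan);
    using var reader = new StreamReader(stream, Encoding.UTF8, true, bufferSize);
    var line = reader.ReadLine();
    while (line != null)
    {
        yield return line;
        line = reader.ReadLine();
    }
}

// Sample file created here only so the sketch runs on its own.
string path = Path.Combine(Path.GetTempPath(), "sequential-demo.txt");
File.WriteAllLines(path, new[] { "first", "second", "third" });

foreach (var line in ReadLinesSequential(path))
    Console.WriteLine(line);
```

FileOptions.SequentialScan is only a hint to the operating system's cache, so any gain depends on the file size and the platform.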

User · Answer

To find the fastest way to read a file line by line you will have to do some benchmarking. I have done some small tests on my computer but you cannot expect that my results apply to your environment.

Using StreamReader.ReadLine

This is basically your method. For some reason you set the buffer size to the smallest possible value (128). Increasing this will in general increase performance. The default size is 1,024 and other good choices are 512 (the sector size in Windows) or 4,096 (the cluster size in NTFS). You will have to run a benchmark to determine an optimal buffer size. A bigger buffer is, if not faster, at least not slower than a smaller buffer.

    const Int32 BufferSize = 128; // the value from the question; try larger sizes
    using (var fileStream = File.OpenRead(fileName))
    using (var streamReader = new StreamReader(fileStream, Encoding.UTF8, true, BufferSize))
    {
        String line;
        while ((line = streamReader.ReadLine()) != null)
        {
            // Process line
        }
    }

The FileStream constructor allows you to specify FileOptions. For example, if you are reading a large file sequentially from beginning to end, you may benefit from FileOptions.SequentialScan. Again, benchmarking is the best thing you can do.

Using File.ReadLines

This is very much like your own solution except that it is implemented using a StreamReader with a fixed buffer size of 1,024. On my computer this results in slightly better performance compared to your code with the buffer size of 128. However, you can get the same performance increase by using a larger buffer size. This method is implemented using an iterator block and does not consume memory for all lines.

    var lines = File.ReadLines(fileName);
    foreach (var line in lines)
    {
        // Process line
    }

Using File.ReadAllLines

This is very much like the previous method except that this method grows a list of strings used to create the returned array of lines, so the memory requirements are higher. However, it returns String[] and not an IEnumerable<String>, allowing you to randomly access the lines.

    var lines = File.ReadAllLines(fileName);
    for (var i = 0; i < lines.Length; i += 1)
    {
        var line = lines[i];
        // Process line
    }

Using String.Split

This method is considerably slower, at least on big files (tested on a 511 KB file), probably due to how String.Split is implemented. It also allocates an array for all the lines, increasing the memory required compared to your solution.

    using (var streamReader = File.OpenText(fileName))
    {
        var lines = streamReader.ReadToEnd().Split("\r\n".ToCharArray(), StringSplitOptions.RemoveEmptyEntries);
        foreach (var line in lines)
        {
            // Process line
        }
    }

My suggestion is to use File.ReadLines because it is clean and efficient. If you require special sharing options (for example, you use FileShare.ReadWrite), you can use your own code but you should increase the buffer size.

User · Answer

If you have enough memory, I've found some performance gains by reading the entire file into a memory stream, and then opening a stream reader on that to read the lines. As long as you actually plan on reading the whole file anyway, this can yield some improvements.
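One way this could look, as a sketch under assumptions: the file is UTF-8 and fits comfortably in memory, and File.ReadAllBytes stands in for whatever bulk read the answer had in mind. The helper name `ReadLinesViaMemoryStream` is made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper: one bulk read, then line splitting happens purely in memory.
static List<string> ReadLinesViaMemoryStream(string fileName)
{
    byte[] bytes = File.ReadAllBytes(fileName); // whole file in a single call
    var lines = new List<string>();
    using var memory = new MemoryStream(bytes, writable: false);
    using var reader = new StreamReader(memory); // UTF-8 by default, BOM detected
    var line = reader.ReadLine();
    while (line != null)
    {
        lines.Add(line);
        line = reader.ReadLine();
    }
    return lines;
}

// Sample file created here only so the sketch runs on its own.
string path = Path.Combine(Path.GetTempPath(), "memorystream-demo.txt");
File.WriteAllLines(path, new[] { "first", "second", "third" });

Console.WriteLine(ReadLinesViaMemoryStream(path).Count);
```

The trade-off is the same as with File.ReadAllLines: the raw bytes and the decoded strings are both held in memory at once.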

User · Answer

Use the following code:

    foreach (string line in File.ReadAllLines(fileName))

This was a HUGE difference in reading performance. It comes at the cost of memory consumption, but it was totally worth it.
