[linux] Count lines in large files

Based on my test, spark-shell (Scala-based) came out faster than all of the other tools (grep, sed, awk, perl, wc). Here are the results of the test I ran on a file with 23,782,409 lines:

time grep -c '$' my_file.txt

real 0m44.96s user 0m41.59s sys 0m3.09s

time wc -l my_file.txt

real 0m37.57s user 0m33.48s sys 0m3.97s

time sed -n '$=' my_file.txt

real 0m38.22s user 0m28.05s sys 0m10.14s

time perl -ne 'END { $_ = $.; if (!/^[0-9]+$/) { $_ = 0; } print "$_" }' my_file.txt

real 0m23.38s user 0m20.19s sys 0m3.11s

time awk 'END { print NR }' my_file.txt

real 0m19.90s user 0m16.76s sys 0m3.12s
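
One caveat: back-to-back runs like these are skewed by the Linux page cache, because after the first pass the file is served from memory, which favors whichever tool happens to run later. To get cold-cache numbers, you can drop the cache between runs (requires root; this writes to the standard procfs knob):

sync; echo 3 | sudo tee /proc/sys/vm/drop_caches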

spark-shell
import org.joda.time._
val t_start = DateTime.now()
// Local-file URIs need three slashes plus an absolute path
// (replace /path/to with the file's actual directory).
sc.textFile("file:///path/to/my_file.txt").count()
val t_end = DateTime.now()
new Period(t_start, t_end).toStandardSeconds()

res1: org.joda.time.Seconds = PT15S
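
Spark's edge here comes mostly from parallelism: count() splits the file into partitions and counts them across all cores at once, while the classic tools read the file in a single thread. A rough single-machine equivalent of that idea, assuming GNU parallel is installed (the 100M block size is an arbitrary choice, not something tuned for this file):

time parallel --pipe --block 100M wc -l < my_file.txt | awk '{s += $1} END { print s }'

parallel --pipe splits stdin into blocks on line boundaries and runs wc -l on each, and awk sums the partial counts, so the total matches a plain wc -l.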