[linux] Quick unix command to display specific lines in the middle of a file?

I'm trying to debug an issue with a server, and my only log is a 20GB file (with no timestamps, even! Why do people use System.out.println() as logging? In production?!)

Using grep, I've found an area of the file that I'd like to take a look at, line 347340107.

Other than doing something like

head -<$LINENUM + 10> filename | tail -20 

... which would require head to read through the first 347 million lines of the log file, is there a quick and easy command that would dump lines 347340100 - 347340200 (for example) to the console?

Update: I totally forgot that grep can print the context around a match ... this works well. Thanks!

Tags: linux, bash, unix, text

The answers are below.


If the line number you want to read is 100:

head -100 filename | tail -1

(head emits the first 100 lines; tail -1 keeps only the last of them, i.e. line 100.)

No there isn't, files are not line-addressable.

There is no constant-time way to find the start of line n in a text file. You must stream through the file and count newlines.

Use the simplest/fastest tool you have to do the job. To me, using head makes much more sense than grep, since the latter is way more complicated. I'm not saying "grep is slow", it really isn't, but I would be surprised if it's faster than head for this case. That'd be a bug in head, basically.
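
For example, sticking with the head/tail approach from the question, the range the asker mentions could be printed in one streaming pass (the arithmetic: keep the last 101 of the first 347340200 lines):

head -n 347340200 filename | tail -n 101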


Get ack

Ubuntu/Debian install:

$ sudo apt-get install ack-grep

(on some Debian/Ubuntu versions the binary is installed as ack-grep rather than ack)

Then run:

$ ack --lines=$START-$END filename

Example:

$ ack --lines=10-20 filename

From $ man ack:

--lines=NUM
    Only print line NUM of each file. Multiple lines can be given with multiple --lines options or as a comma separated list (--lines=3,5,7). --lines=4-7 also works. 
    The lines are always output in ascending order, no matter the order given on the command line.

Building on Sklivvz's answer, here's a handy function you can put in a .bash_aliases file. Because it exits as soon as it passes the end of the requested range, it is efficient on huge files when the lines you want are near the front.

function middle()
{
    local startidx=$1
    local len=$2
    local endidx=$((startidx + len))
    local filename=$3

    # Print the range with line numbers, then stop reading as soon as it ends.
    awk -v s="$startidx" -v e="$endidx" \
        'FNR >= s && FNR <= e { print NR " " $0 }
         FNR > e { print "END HERE"; exit }' "$filename"
}
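
For example, to print lines 347340100 through 347340200 of the asker's file (a hypothetical invocation; the file name is a placeholder):

middle 347340100 100 filename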

Use

x=$(cat -n <file> | grep <match> | awk '{print $1}')

This gives you the line number where the match occurred (if the pattern matches more than one line, x will contain several numbers).

Now you can use the following command to print 100 lines:

awk -v var="$x" 'NR>=var && NR<=var+100{print}' <file>

or you can use sed as well:

sed -n "${x},$((x+100))p" <file>

sed will need to read the data too, to count the lines. A shortcut would only be possible if the file had some exploitable structure or ordering. For example, if the log lines were prefixed with a fixed-width date/time, you could use the look unix utility to binary-search the file for particular dates/times.
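
As a sketch: if (hypothetically) every line began with a sorted, fixed-width timestamp, something like this would binary-search straight to a time of interest (the file name and timestamp here are made up):

look "2009-06-24 13:1" server.log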


I prefer just going into less and

  • typing 50% to jump halfway through the file,
  • 43210G to go to line 43210,
  • :43210 to do the same,

and stuff like that.

Even better: hit v to start editing (in vim, of course!) at that location. Note that vim has the same key bindings!
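
If I remember correctly, less also accepts an initial command given with +, so you can open the file already positioned at the line from the question:

less +347340107g filename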


You can use the ex command, a standard Unix editor (part of Vim now), e.g.

  • display a single line (e.g. 2nd one):

    ex +2p -scq file.txt
    

    corresponding sed syntax: sed -n '2p' file.txt

  • range of lines (e.g. 2-5 lines):

    ex +2,5p -scq file.txt
    

    sed syntax: sed -n '2,5p' file.txt

  • from the given line till the end (e.g. 5th to the end of the file):

    ex +5,p -scq file.txt
    

    sed syntax: sed -n '5,$p' file.txt

  • multiple line ranges (e.g. 2-4 and 6-8 lines):

    ex +2,4p +6,8p -scq file.txt
    

    sed syntax: sed -n '2,4p;6,8p' file.txt

Above commands can be tested with the following test file:

seq 1 20 > file.txt

Explanation:

  • + or -c followed by a command - execute the (vi/ex) command after the file has been read,
  • -s - silent mode; it also uses the current terminal as the default output,
  • the q at the end of the -c command quits the editor (add ! to force quit, e.g. -scq!).

Easy with perl! If you want to get lines 1, 3 and 5 from a file, say /etc/passwd:

perl -e 'while(<>){if(++$l~~[1,3,5]){print}}' < /etc/passwd
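
Note that the smartmatch operator ~~ is experimental and warns on modern Perls, so a more portable spelling of the same idea might be:

perl -ne 'print if grep { $_ == $. } 1, 3, 5' /etc/passwd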

# print line number 52
sed -n '52p'    # method 1
sed '52!d'      # method 2
sed '52q;d'     # method 3, efficient on large files

Method 3 is the fastest way to display a specific line in a large file, because sed quits as soon as it has printed line 52 instead of reading the rest of the file.


I am surprised only one other answer (by Ramana Reddy) suggested adding line numbers to the output. The following searches for the required line number and colours the output.

file=FILE
lineno=LINENO
wb="107"; bf="30;1"; rb="101"; yb="103"
cat -n "${file}" | { GREP_COLORS="se=${wb};${bf}:cx=${wb};${bf}:ms=${rb};${bf}:sl=${yb};${bf}" grep --color -C 10 "^[[:space:]]\\+${lineno}[[:space:]]"; }

To display a line from a <textfile> by its <line#>, just do this:

perl -wne 'print if $. == <line#>' <textfile>

If you want a more powerful way to show a range of lines with regular expressions -- I won't say why grep is a bad idea for doing this; it should be fairly obvious -- this simple expression will show your range in a single pass, which is what you want when dealing with ~20GB text files:

perl -wne 'print if m/<regex1>/ .. m/<regex2>/' <filename>

(tip: if your regex has / in it, use something like m!<regex>! instead)

This would print out <filename> starting with the line that matches <regex1> up until (and including) the line that matches <regex2>.

It doesn't take a wizard to see how a few tweaks can make it even more powerful.

One last thing: Perl is a mature language with many hidden enhancements that favor speed and performance. That makes it an obvious choice for this kind of operation, since it was originally developed for handling large log files, text, databases, etc.
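
As a sketch of the same single-pass idea using plain line numbers from the question instead of regexes (the exit stops perl from reading the remaining gigabytes once the range is printed):

perl -wne 'print if $. >= 347340100; exit if $. >= 347340200' filename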


What about:

tail -n +347340107 filename | head -n 100

I didn't test it, but I think that would work.


I found two other solutions if you know the line number but nothing else (no grep possible):

Assuming you need lines 20 to 40,

sed -n '20,40p;41q' file_name

or

awk 'FNR>=20 && FNR<=40' file_name
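
Applied to the range from the question, that would be (the trailing q again makes sed stop reading right after the last wanted line):

sed -n '347340100,347340200p;347340201q' filename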

With sed -e '1,N d; M q' you'll print lines N+1 through M. This is probably a bit better than grep -C, as it doesn't try to match lines against a pattern.
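
For instance, a small concrete instance of that pattern, which prints lines 100 through 200 (q prints line 200 before quitting):

sed -e '1,99d;200q' filename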


You could try this command:

egrep -n ".*" <filename> | egrep "^<line number>:"

The first egrep numbers every line; anchoring the second pattern to the "number:" prefix avoids matching lines that merely contain the number. Note that this still scans the entire file.


I'd first split the file into a few smaller ones, like this:

$ split --lines=50000 /path/to/large/file /path/to/output/file/prefix

and then grep the resulting files.
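
With 50000-line chunks, a bit of shell arithmetic (using the line number from the question) tells you which chunk the line landed in and where:

$ echo $(( (347340107 - 1) / 50000 + 1 ))   # 1-based index of the chunk containing the line
6947
$ echo $(( 347340107 - 6946 * 50000 ))      # line number within that chunk
40107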

