How can I extract a predetermined range of lines from a text file on Unix?

Question

I have a ~23000-line SQL dump containing several databases' worth of data. I need to extract a certain section of this file (i.e. the data for a single database) and place it in a new file. I know both the start and end line numbers of the data that I want.

Does anyone know a Unix command (or series of commands) to extract all lines from a file between, say, line 16224 and 16482 and then redirect them into a new file?

User · Answer

sed -n '16224,16482p' < dump.sql

User · Answer

Just benchmarking the 3 solutions given above that work for me:

awk
sed
"head+tail"

Credit for the 3 solutions goes to:

@boxxar
@avandeursen
@wds
@manveru
@sibaz
@SOFe
@fedorqui 'SO stop harming'
@Robin A. Meade

I'm using a huge file I found on my server:

# wc fo2debug.1.log
   10421186    19448208 38795491134 fo2debug.1.log

38 GB in 10.4 million lines.

And yes, I have a logrotate problem. : ))

Make your bets!

Getting 256 lines from the beginning of the file.

# time sed -n '1001,1256p;1256q' fo2debug.1.log | wc -l
256

real    0m0,003s
user    0m0,000s
sys     0m0,004s

# time head -1256 fo2debug.1.log | tail -n +1001 | wc -l
256

real    0m0,003s
user    0m0,006s
sys     0m0,000s

# time awk 'NR==1001, NR==1256; NR==1256 {exit}' fo2debug.1.log | wc -l
256

real    0m0,002s
user    0m0,004s
sys     0m0,000s

Awk won. Technical tie in second place between sed and "head+tail".

Getting 256 lines at the end of the first third of the file.

# time sed -n '3473001,3473256p;3473256q' fo2debug.1.log | wc -l
256

real    0m0,265s
user    0m0,242s
sys     0m0,024s

# time head -3473256 fo2debug.1.log | tail -n +3473001 | wc -l
256

real    0m0,308s
user    0m0,313s
sys     0m0,145s

# time awk 'NR==3473001, NR==3473256; NR==3473256 {exit}' fo2debug.1.log | wc -l
256

real    0m0,393s
user    0m0,326s
sys     0m0,068s

Sed won. Followed by "head+tail" and, finally, awk.

Getting 256 lines at the end of the second third of the file.

# time sed -n '6947001,6947256p;6947256q' fo2debug.1.log | wc -l
256

real    0m0,525s
user    0m0,462s
sys     0m0,064s

# time head -6947256 fo2debug.1.log | tail -n +6947001 | wc -l
256

real    0m0,615s
user    0m0,488s
sys     0m0,423s

# time awk 'NR==6947001, NR==6947256; NR==6947256 {exit}' fo2debug.1.log | wc -l
256

real    0m0,779s
user    0m0,650s
sys     0m0,130s

Same results.

Sed won. Followed by "head+tail" and, finally, awk.

Getting 256 lines near the end of the file.

# time sed -n '10420001,10420256p;10420256q' fo2debug.1.log | wc -l
256

real    1m50,017s
user    0m12,735s
sys     0m22,926s

# time head -10420256 fo2debug.1.log | tail -n +10420001 | wc -l
256

real    1m48,269s
user    0m42,404s
sys     0m51,015s

# time awk 'NR==10420001, NR==10420256; NR==10420256 {exit}' fo2debug.1.log | wc -l
256

real    1m49,106s
user    0m12,322s
sys     0m18,576s

And suddenly, a twist!

"Head+tail" won. Followed by awk and, finally, sed.

(some hours later...)

Sorry guys!

My analysis above turns out to be an example of a basic flaw in doing an analysis.

The flaw is not knowing in depth the resources used for the analysis.

In this case, I used a log file to measure how long it takes to extract a given number of lines from it.

Using 3 different techniques, I extracted lines at different points in the file, comparing the performance of the techniques at each point and checking whether the results varied depending on where in the file the lines were taken from.

My mistake was to assume that the content of the log file was reasonably homogeneous.

The reality is that long lines appear more frequently at the end of the file.

Thus, the apparent conclusion that one technique is better for extractions closer to the end of the file may be biased. In fact, that technique may simply be better at dealing with longer lines. This remains to be confirmed.
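
One way to take line length out of the comparison would be to rerun the three commands on a synthetic file whose lines all have the same length. The sketch below is only an illustration under that assumption: the file name uniform.log, the line count and the function names are made up for the example, and the timings it prints depend on the machine and on whether the file is already cached.

```shell
#!/bin/bash
# Hypothetical rerun on a synthetic file with uniform line length.
# uniform.log, the line count and the function names are assumptions.

make_uniform_file() {
    # $1 = output file, $2 = number of lines. Every line is a zero-padded
    # 10-digit number, so all lines have exactly the same length.
    awk -v n="$2" 'BEGIN { for (i = 1; i <= n; i++) printf "%010d\n", i }' > "$1"
}

# Each helper takes: FILE FIRST LAST (inclusive, 1-indexed).
with_sed()      { sed -n "${2},${3}p;${3}q" "$1"; }
with_headtail() { head -n "$3" "$1" | tail -n "+$2"; }
with_awk()      { awk -v s="$2" -v e="$3" 'NR >= s; NR >= e { exit }' "$1"; }

make_uniform_file uniform.log 200000
for method in with_sed with_headtail with_awk; do
    echo "== $method"
    time "$method" uniform.log 150001 150256 | wc -l
done
```

With equal-length lines, any remaining difference between positions in the file can no longer be blamed on line length.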

User · Answer

sed -n '16224,16482 p' orig-data-file > new-file

Where 16224,16482 are the start line number and end line number, inclusive. This is 1-indexed. -n suppresses echoing the input as output, which you clearly don't want; the numbers indicate the range of lines to make the following command operate on; the command p prints out the relevant lines.

User · Answer

# print section of file based on line numbers
sed -n '16224,16482p'               # method 1
sed '16224,16482!d'                 # method 2

User · Answer

Quite simple using head/tail:

head -16482 in.sql | tail -259 > out.sql

(259 = 16482 - 16224 + 1, because both end points are included.)

Using sed:

sed -n '16224,16482p' in.sql > out.sql

Using awk:

awk 'NR>=16224&&NR<=16482' in.sql > out.sql

User · Answer

You could use 'vi' and then the following command:

:16224,16482w!/tmp/some-file

Alternatively:

cat file | head -n 16482 | tail -n 259

EDIT: Just to add explanation: you use head -n 16482 to display the first 16482 lines, then use tail -n 259 to get the last 259 lines out of the first output, which are lines 16224 through 16482 inclusive.

User · Answer

cat dump.txt | head -16482 | tail -259

should do the trick. The downside of this approach is that you need to do the arithmetic to determine the argument for tail and to account for whether you want the "between" to include the ending line or not.
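
The arithmetic can be left to the shell. A minimal sketch, assuming a hypothetical helper named lines_between that takes the file, the first line and the last line (both inclusive), and assuming the file has at least that many lines:

```shell
#!/bin/sh
# lines_between FILE FIRST LAST
# Hypothetical helper: prints lines FIRST..LAST (inclusive, 1-indexed).
# Assumes FILE has at least LAST lines.
lines_between() {
    file=$1 first=$2 last=$3
    # An inclusive range FIRST..LAST holds LAST - FIRST + 1 lines.
    count=$(( last - first + 1 ))
    head -n "$last" "$file" | tail -n "$count"
}

# Example: lines 16224..16482 are 259 lines.
# lines_between dump.txt 16224 16482 > new.txt
```

Using tail -n +FIRST instead of a count, as in the benchmark commands, avoids the subtraction altogether.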

User · Answer

I was looking for an answer to this but ended up writing my own code, which worked. None of the answers above were satisfactory. Suppose you have a very large file and want to print certain line numbers, but the numbers are not in order. You can do the following:

My relatively large file:

for letter in {a..k} ; do echo $letter; done | cat -n > myfile.txt

Specific line numbers I want:

shuf -i 1-11 -n 4 > line_numbers_I_want.txt

To print these line numbers, do the following:

awk '{system("head myfile.txt -n " $0 " | tail -n 1")}' line_numbers_I_want.txt

For each number n, the above takes the first n lines with head, then keeps only the last of them with tail.

If you want your line numbers in order, sort them first (-n is numeric sort), then get the lines:

cat line_numbers_I_want.txt | sort -n | awk '{system("head myfile.txt -n " $0 " | tail -n 1")}'
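
Calling head and tail once per requested line rereads the file each time. For comparison, here is a single-pass sketch in awk that still prints the lines in the order they were requested. It assumes the list of wanted lines is not empty and that the selected lines fit in memory; the function name print_lines_in_order is made up for the example.

```shell
#!/bin/sh
# print_lines_in_order NUMBERS_FILE DATA_FILE
# Hypothetical helper: reads DATA_FILE once and prints the lines whose
# numbers are listed in NUMBERS_FILE, in the order they are listed.
print_lines_in_order() {
    awk '
        # First file: remember each requested line number and its position.
        NR == FNR { order[++n] = $1; want[$1]; next }
        # Second file: keep only the requested lines.
        FNR in want { line[FNR] = $0 }
        # Print in the requested order, not in file order.
        END { for (i = 1; i <= n; i++) print line[order[i]] }
    ' "$1" "$2"
}

# Same kind of sample data as above, with a fixed list of line numbers.
for letter in a b c d e f g h i j k; do echo "$letter"; done | cat -n > myfile.txt
printf '%s\n' 9 2 11 4 > line_numbers_I_want.txt
print_lines_in_order line_numbers_I_want.txt myfile.txt
```

A requested number beyond the end of the file produces an empty line in this sketch.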

User · Answer

Using ed:

ed -s infile <<<'16224,16482p'

-s suppresses diagnostic output; the actual commands are in a here-string. Specifically, 16224,16482p runs the p (print) command on the desired line address range.

User · Answer

perl -ne 'print if 16224..16482' file.txt > new_file.txt

User · Answer

We can even do this at the command line:

cat filename | sed 'n1,n2!d' > abc.txt

For example:

cat foo.pl | sed '100,200!d' > abc.txt

User · Answer

I was about to post the head/tail trick, but actually I'd probably just fire up emacs. ;-)

esc-x goto-line ret 16224
mark (ctrl-space)
esc-x goto-line ret 16482
esc-w

Open the new output file, ctl-y, save.

Lets me see what's happening.

User · Answer

I wanted to do the same thing from a script using a variable and achieved it by putting quotes around the $variable to separate the variable name from the p:

sed -n "$first","$count"p imagelist.txt >"$imageblock"

I wanted to split a list into separate folders and found the initial question and answer a useful step. (The split command is not an option on the old OS I have to port code to.)
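
The same quoting carries over to a loop. A rough sketch of splitting a list into consecutive blocks with sed alone, for systems without split. The function name split_blocks, the block size and the output file naming are assumptions for illustration, not the script I actually used:

```shell
#!/bin/sh
# split_blocks LIST SIZE PREFIX
# Hypothetical helper: writes lines 1..SIZE of LIST to PREFIX1.txt,
# the next SIZE lines to PREFIX2.txt, and so on.
split_blocks() {
    list=$1 size=$2 prefix=$3
    # The arithmetic expansion strips any padding some wc versions add.
    total=$(( $(wc -l < "$list") ))
    first=1
    block=1
    while [ "$first" -le "$total" ]; do
        last=$(( first + size - 1 ))
        # Braces keep the variable names apart from the sed commands p and q.
        sed -n "${first},${last}p;${last}q" "$list" > "${prefix}${block}.txt"
        first=$(( last + 1 ))
        block=$(( block + 1 ))
    done
}

# Example: split_blocks imagelist.txt 100 block_
```

The last block simply ends at the end of the list when the line count is not a multiple of the block size.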

User · Answer

There is another approach with awk:

awk 'NR==16224, NR==16482' file

If the file is huge, it can be good to exit after reading the last desired line. This way, it won't read the following lines unnecessarily:

awk 'NR==16224, NR==16482-1; NR==16482 {print; exit}' file

awk 'NR==16224, NR==16482; NR==16482 {exit}' file
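
One way to see that the exit really stops the read is to feed awk an endless input: the pipeline below returns only because awk exits after the last wanted line. The awk_range wrapper and its first/last variable names are hypothetical, chosen just for this sketch.

```shell
#!/bin/sh
# awk_range FIRST LAST [FILE]
# Hypothetical wrapper: prints lines FIRST..LAST and stops reading there.
awk_range() {
    first=$1 last=$2
    shift 2
    awk -v first="$first" -v last="$last" \
        'NR >= first; NR >= last { exit }' "$@"
}

# "yes" never stops writing, so this only terminates thanks to the exit.
yes "some line" | awk_range 3 5
```

Without the exit rule the same pipeline would keep reading forever.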

User · Answer

Quick and dirty:

head -16482 < file.in | tail -259 > file.out

Probably not the best way to do it but it should work.

BTW: 259 = 16482-16224+1.

User · Answer

I wrote a small bash script that you can run from your command line, so long as you update your PATH to include its directory (or you can place it in a directory that is already contained in the PATH).

Usage: $ pinch filename start-line end-line

#!/bin/bash
# Display line number ranges of a file to the terminal.
# Usage: $ pinch filename start-line end-line
# By Evan J. Coon

FILENAME=$1
START=$2
END=$3

ERROR="[PINCH ERROR]"

# Check that the number of arguments is 3
if [ $# -lt 3 ]; then
    echo "$ERROR Need three arguments: Filename Start-line End-line"
    exit 1
fi

# Check that the file exists.
if [ ! -f "$FILENAME" ]; then
    echo -e "$ERROR File does not exist. \n\t$FILENAME"
    exit 1
fi

# Check that start-line is not greater than end-line
if [ "$START" -gt "$END" ]; then
    echo -e "$ERROR Start line is greater than End line."
    exit 1
fi

# Check that start-line is at least 1 (line numbers are 1-indexed).
if [ "$START" -lt 1 ]; then
    echo -e "$ERROR Start line is less than 1."
    exit 1
fi

# Check that end-line is at least 1.
if [ "$END" -lt 1 ]; then
    echo -e "$ERROR End line is less than 1."
    exit 1
fi

NUMOFLINES=$(wc -l < "$FILENAME")

# Check that end-line is not greater than the number of lines in the file.
if [ "$END" -gt "$NUMOFLINES" ]; then
    echo -e "$ERROR End line is greater than number of lines in file."
    exit 1
fi

# The distance from the end of the file to end-line
ENDDIFF=$(( NUMOFLINES - END ))

# For larger files, this will run more quickly. If the distance from the
# end of the file to the end-line is less than the distance from the
# start of the file to the start-line, then start pinching from the
# bottom as opposed to the top.
if [ "$START" -lt "$ENDDIFF" ]; then
    < "$FILENAME" head -n $END | tail -n +$START
else
    < "$FILENAME" tail -n +$START | head -n $(( END-START+1 ))
fi

# Success
exit 0

User · Answer

Standing on the shoulders of boxxar, I like this:

sed -n '<first line>,$p;<last line>q' input

e.g.

sed -n '16224,$p;16482q' input

The $ means "last line", so the first command makes sed print all lines starting with line 16224 and the second command makes sed quit after printing line 16482. (Adding 1 for the q-range in boxxar's solution does not seem to be necessary.)

I like this variant because I don't need to specify the ending line number twice. And I measured that using $ does not have detrimental effects on performance.
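
When the line numbers come from shell variables, the sed script has to go in double quotes, and then the $ address needs a backslash so the shell does not try to expand it. A small sketch, with a hypothetical function name sed_range:

```shell
#!/bin/sh
# sed_range FIRST LAST [FILE]
# Hypothetical wrapper around the command above, with variable line numbers.
sed_range() {
    first=$1 last=$2
    shift 2
    # \$ reaches sed as a literal $, meaning "last line of input".
    sed -n "${first},\$p;${last}q" "$@"
}

# Example: sed_range 16224 16482 input > output
```

If the input is shorter than the requested last line, the q command is never reached and sed just prints through to the end.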

User · Answer

sed -n '16224,16482p;16483q' filename > newfile

From the sed manual:

p - Print out the pattern space (to the standard output). This command is usually only used in conjunction with the -n command-line option.

n - If auto-print is not disabled, print the pattern space, then, regardless, replace the pattern space with the next line of input. If there is no more input then sed exits without processing any more commands.

q - Exit sed without processing any more commands or input. Note that the current pattern space is printed if auto-print is not disabled with the -n option.

and

Addresses in a sed script can be in any of the following forms:

number - Specifying a line number will match only that line in the input.

An address range can be specified by specifying two addresses separated by a comma (,). An address range matches lines starting from where the first address matches, and continues until the second address matches (inclusively).

User · Answer

Using ruby:

ruby -ne 'puts "#{$.}: #{$_}" if $. >= 32613500 && $. <= 32614500' < GND.rdf > GND.extract.rdf

User · Answer

The -n in the accepted answers works. Here's another way in case you're inclined:

cat $filename | sed "${linenum}p;d";

This does the following:

1. Pipe in the contents of a file (or feed in the text however you want).
2. sed selects the given line and prints it.
3. d is required to delete lines, otherwise sed will assume all lines will eventually be printed. I.e., without the d, you will get all lines printed, with the selected line printed twice because you have the ${linenum}p part asking for it to be printed. I'm pretty sure the -n is basically doing the same thing as the d here.

User · Answer

Since we are talking about extracting lines of text from a text file, I will give a special case where the range is delimited by a pattern match rather than by a line number.

myfile content:
=====================
line1 not needed
line2 also discarded
[Data]
first data line
second data line
=====================
sed -n '/Data/,$p' myfile

This prints the [Data] line and everything after it. If you want the text from line 1 up to the pattern, type: sed -n '1,/Data/p' myfile. Furthermore, if you know two patterns (preferably unique in your text), both the beginning and the end of the range can be specified with matches:

sed -n '/BEGIN_MARK/,/END_MARK/p' myfile
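
Pattern addressing can also be applied to the original problem of pulling one database out of a multi-database dump. The sketch below uses awk rather than sed, and it assumes that every database section starts with a comment line of the form "-- Current Database: `name`". That marker format and the function name db_section are assumptions, so check what your dump actually contains before relying on it.

```shell
#!/bin/sh
# db_section DUMP DBNAME   (hypothetical helper)
# Prints the section of DUMP that belongs to DBNAME.
# Assumption: each section begins with "-- Current Database: `name`".
db_section() {
    dump=$1
    db=$2
    awk -v db="$db" '
        /^-- Current Database: / {
            # A new section starts here; keep printing only if it is ours.
            inside = ($0 == "-- Current Database: `" db "`")
        }
        inside   # default action: print the line while inside the section
    ' "$dump"
}
```

Usage would look like db_section dump.sql mydb > mydb.sql. Unlike the sed range above, this needs no end marker: the section ends when the next marker line (or the end of the file) is reached.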

User · Answer

I wrote a Haskell program called splitter that does exactly this: have a read through my release blog post.

You can use the program as follows:

$ cat somefile | splitter 16224-16482

And that is all that there is to it. You will need Haskell to install it. Just:

$ cabal install splitter

And you are done. I hope that you find this program useful.

User · Answer

Standing on the shoulders of boxxar, I like this:

sed -n '<first line>,$p;<last line>q' input

e.g.

sed -n '16224,$p;16482q' input

The $ means "last line", so the first command makes sed print all lines starting with line 16224 and the second command makes sed quit after printing line 16482. (Adding 1 for the q-range in boxxar's solution does not seem to be necessary.)

I like this variant because I don't need to specify the ending line number twice. And I measured that using $ does not have detrimental effects on performance.

User · Answer

We can also do this at the command line:

cat filename | sed 'n1,n2!d' > abc.txt

For example:

cat foo.pl | sed '100,200!d' > abc.txt

User · Answer

There is another approach with awk:

awk 'NR==16224, NR==16482' file

If the file is huge, it can be good to exit after reading the last desired line. This way, it won't read the following lines unnecessarily:

awk 'NR==16224, NR==16482-1; NR==16482 {print; exit}' file

awk 'NR==16224, NR==16482; NR==16482 {exit}' file
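
When the line numbers come from shell variables, passing them in with -v avoids splicing them into the awk program text. A minimal sketch, where the function name awk_range and the variable names first and last are assumptions:

```shell
#!/bin/sh
# awk_range START END FILE   (hypothetical helper)
# Prints lines START..END inclusive and exits right after END.
awk_range() {
    awk -v first="$1" -v last="$2" '
        NR >= first          # from START on, the default action prints the line
        NR == last { exit }  # stop reading once END has been printed
    ' "$3"
}
```

It would be called as awk_range 16224 16482 file > newfile.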

User · Answer

Using ruby:

ruby -ne 'puts "#{$.}: #{$_}" if $. >= 32613500 && $. <= 32614500' < GND.rdf > GND.extract.rdf

User · Answer

I was looking for an answer to this but ended up writing my own code, which worked. None of the answers above were satisfactory. Suppose you have a very large file and want to print certain line numbers, but the numbers are not in order. You can do the following.

My relatively large file:

for letter in {a..k} ; do echo $letter; done | cat -n > myfile.txt

Specific line numbers I want:

shuf -i 1-11 -n 4 > line_numbers_I_want.txt

To print these line numbers, do the following:

awk '{system("head -n " $0 " myfile.txt | tail -n 1")}' line_numbers_I_want.txt

For each requested number n, this takes the first n lines with head and then keeps only the last of them with tail.

If you want your line numbers in order, sort them first (-n is a numeric sort), then get the lines:

cat line_numbers_I_want.txt | sort -n | awk '{system("head -n " $0 " myfile.txt | tail -n 1")}'
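
The approach above starts a new head and tail for every requested line, so the file is re-read once per number. A possible single-pass alternative is sketched below. The function name print_lines is an assumption, and the sketch assumes the numbers file is non-empty and holds one number per line.

```shell
#!/bin/sh
# print_lines NUMBERS_FILE DATA_FILE   (hypothetical helper)
# Prints the lines of DATA_FILE whose numbers are listed in NUMBERS_FILE,
# in the order in which they are listed.
print_lines() {
    awk '
        # First file: remember the wanted numbers and their order.
        NR == FNR { order[++n] = $1; want[$1] = 1; next }
        # Second file: keep only the wanted lines in memory.
        FNR in want { line[FNR] = $0 }
        # Emit the kept lines in the requested order.
        END { for (i = 1; i <= n; i++) print line[order[i]] }
    ' "$1" "$2"
}
```

With the files from this answer it would be called as print_lines line_numbers_I_want.txt myfile.txt. Only the requested lines are held in memory, so this also works for very large inputs as long as the list of numbers is modest.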

User · Answer

Using ed:

ed -s infile <<<'16224,16482p'

-s suppresses diagnostic output; the actual commands are in a here-string. Specifically, 16224,16482p runs the p (print) command on the desired line address range.

User · Answer

The -n in the accepted answer works. Here's another way in case you're inclined:

cat $filename | sed "${linenum}p;d"

This does the following:

1. Pipe in the contents of a file (or feed in the text however you want).
2. sed selects the given line and prints it.
3. d is required to delete lines, otherwise sed will assume all lines will eventually be printed; i.e., without the d, you will get all lines printed, with the selected line printed twice, because you have the ${linenum}p part asking for it to be printed. I'm pretty sure the -n is basically doing the same thing as the d here.

User · Answer

I wrote a small bash script that you can run from your command line, so long as you update your PATH to include its directory (or you can place it in a directory that is already contained in the PATH).

Usage: $ pinch filename start-line end-line

#!/bin/bash
# Display line number ranges of a file to the terminal.
# Usage: $ pinch filename start-line end-line
# By Evan J. Coon

FILENAME=$1
START=$2
END=$3

ERROR="[PINCH ERROR]"

# Check that the number of arguments is 3
if [ $# -lt 3 ]; then
    echo "$ERROR Need three arguments: Filename Start-line End-line"
    exit 1
fi

# Check that the file exists.
if [ ! -f "$FILENAME" ]; then
    echo -e "$ERROR File does not exist. \n\t$FILENAME"
    exit 1
fi

# Check that start-line is not greater than end-line
if [ "$START" -gt "$END" ]; then
    echo -e "$ERROR Start line is greater than End line."
    exit 1
fi

# Check that start-line is at least 1 (line numbers are 1-indexed).
if [ "$START" -lt 1 ]; then
    echo -e "$ERROR Start line is less than 1."
    exit 1
fi

# Check that end-line is at least 1.
if [ "$END" -lt 1 ]; then
    echo -e "$ERROR End line is less than 1."
    exit 1
fi

NUMOFLINES=$(wc -l < "$FILENAME")

# Check that end-line is not greater than the number of lines in the file.
if [ "$END" -gt "$NUMOFLINES" ]; then
    echo -e "$ERROR End line is greater than number of lines in file."
    exit 1
fi

# The distance from the end of the file to end-line
ENDDIFF=$(( NUMOFLINES - END ))

# For larger files, this will run more quickly. If the distance from the
# end of the file to the end-line is less than the distance from the
# start of the file to the start-line, then start pinching from the
# bottom as opposed to the top.
if [ "$START" -lt "$ENDDIFF" ]; then
    < "$FILENAME" head -n $END | tail -n +$START
else
    < "$FILENAME" tail -n +$START | head -n $(( END-START+1 ))
fi

# Success
exit 0

User · Answer

I would use:

awk 'FNR >= 16224 && FNR <= 16482' my_file > extracted.txt

FNR contains the record (line) number of the line being read from the file.

User · Answer

cat dump.txt | head -16482 | tail -259

should do the trick. The downside of this approach is that you need to do the arithmetic to determine the argument for tail (here 16482 - 16224 + 1 = 259) and to account for whether you want the "between" to include the ending line or not.
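
To avoid doing that arithmetic by hand, the shell can compute the count. A minimal sketch, where the function name head_tail_range is an assumption:

```shell
#!/bin/sh
# head_tail_range START END FILE   (hypothetical helper)
# Prints lines START..END inclusive using tail and head.
head_tail_range() {
    start=$1
    end=$2
    file=$3
    count=$(( end - start + 1 ))   # inclusive range, hence the +1
    # tail -n +START begins output at line START; head keeps COUNT lines.
    tail -n "+${start}" "$file" | head -n "$count"
}
```

For the question's numbers, count comes out as 16482 - 16224 + 1 = 259. The reverse order, head -n "$end" "$file" | tail -n "+${start}", needs no subtraction at all because tail -n +N counts from the top of its input.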

User · Answer

perl -ne 'print if 16224..16482' file.txt > new_file.txt

User · Answer

# print section of file based on line numbers
sed -n '16224,16482p'               # method 1
sed '16224,16482!d'                 # method 2

User · Answer

I was about to post the head/tail trick, but actually I'd probably just fire up emacs. ;-)

1. esc-x goto-line ret 16224
2. mark (ctrl-space)
3. esc-x goto-line ret 16482
4. esc-w

Open the new output file, ctrl-y, save.

Lets me see what's happening.

User · Answer

Quick and dirty:

head -16482 < file.in | tail -259 > file.out

Probably not the best way to do it but it should work.

BTW: 259 = 16482-16224+1.

User · Answer

Quite simple using:

head/tail:

head -16482 in.sql | tail -259 > out.sql

sed:

sed -n '16224,16482p' in.sql > out.sql

awk:

awk 'NR>=16224&&NR<=16482' in.sql > out.sql

User · Answer

You could use 'vi' and then the following command:

:16224,16482w!/tmp/some-file

Alternatively:

cat file | head -n 16482 | tail -n 259

EDIT: Just to add explanation, you use head -n 16482 to display the first 16482 lines, then use tail -n 259 to get the last 259 lines out of the first output.

User · Answer

This might work for you (GNU sed):

sed -ne '16224,16482w newfile' -e '16482q' file

or taking advantage of bash:

sed -n $'16224,16482w newfile\n16482q' file
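
Because w takes its own output file, several ranges can go to several files in one pass over the input. A small sketch on generated data, where the directory, file names and ranges are placeholders chosen for the demo:

```shell
#!/bin/sh
# Write two line ranges of one input to two separate files in a single pass.
dir=$(mktemp -d)
seq 1 20 > "$dir/input.txt"

# Each w command names its own output file; the file name runs to the end
# of its -e expression. The final q stops sed after the last wanted line.
sed -n \
    -e "3,5w $dir/first.txt" \
    -e "10,12w $dir/second.txt" \
    -e '12q' \
    "$dir/input.txt"
```

Afterwards first.txt holds lines 3 to 5 and second.txt holds lines 10 to 12. Nothing is written to standard output because of -n.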

User · Answer

Just benchmarking 3 of the solutions given above that work for me:

awk
sed
"head+tail"

Credit for the 3 solutions goes to:

@boxxar
@avandeursen
@wds
@manveru
@sibaz
@SOFe
@fedorqui 'SO stop harming'
@Robin A. Meade

I'm using a huge file I found on my server:

# wc fo2debug.1.log
   10421186    19448208 38795491134 fo2debug.1.log

38 GB in 10.4 million lines.

And yes, I have a logrotate problem. : ))

Make your bets!

Getting 256 lines from the beginning of the file.

# time sed -n '1001,1256p;1256q' fo2debug.1.log | wc -l
256

real    0m0,003s
user    0m0,000s
sys     0m0,004s

# time head -1256 fo2debug.1.log | tail -n +1001 | wc -l
256

real    0m0,003s
user    0m0,006s
sys     0m0,000s

# time awk 'NR==1001, NR==1256; NR==1256 {exit}' fo2debug.1.log | wc -l
256

real    0m0,002s
user    0m0,004s
sys     0m0,000s

Awk won. Technical tie in second place between sed and "head+tail".

Getting 256 lines at the end of the first third of the file.

# time sed -n '3473001,3473256p;3473256q' fo2debug.1.log | wc -l
256

real    0m0,265s
user    0m0,242s
sys     0m0,024s

# time head -3473256 fo2debug.1.log | tail -n +3473001 | wc -l
256

real    0m0,308s
user    0m0,313s
sys     0m0,145s

# time awk 'NR==3473001, NR==3473256; NR==3473256 {exit}' fo2debug.1.log | wc -l
256

real    0m0,393s
user    0m0,326s
sys     0m0,068s

Sed won. Followed by "head+tail" and, finally, awk.

Getting 256 lines at the end of the second third of the file.

# time sed -n '6947001,6947256p;6947256q' fo2debug.1.log | wc -l
256

real    0m0,525s
user    0m0,462s
sys     0m0,064s

# time head -6947256 fo2debug.1.log | tail -n +6947001 | wc -l
256

real    0m0,615s
user    0m0,488s
sys     0m0,423s

# time awk 'NR==6947001, NR==6947256; NR==6947256 {exit}' fo2debug.1.log | wc -l
256

real    0m0,779s
user    0m0,650s
sys     0m0,130s

Same results.

Sed won. Followed by "head+tail" and, finally, awk.

Getting 256 lines near the end of the file.

# time sed -n '10420001,10420256p;10420256q' fo2debug.1.log | wc -l
256

real    1m50,017s
user    0m12,735s
sys     0m22,926s

# time head -10420256 fo2debug.1.log | tail -n +10420001 | wc -l
256

real    1m48,269s
user    0m42,404s
sys     0m51,015s

# time awk 'NR==10420001, NR==10420256; NR==10420256 {exit}' fo2debug.1.log | wc -l
256

real    1m49,106s
user    0m12,322s
sys     0m18,576s

And suddenly, a twist!

"Head+tail" won. Followed by awk and, finally, sed.

(some hours later...)

Sorry guys!

My analysis above ends up being an example of a basic flaw in doing an analysis.

The flaw is not knowing in depth the resources used for the analysis.

In this case, I used a log file to analyze the performance of a search for a certain number of lines within it.

Using 3 different techniques, searches were made at different points in the file, comparing the performance of the techniques at each point and checking whether the results varied depending on the point in the file where the search was made.

My mistake was to assume that there was a certain homogeneity of content in the log file.

The reality is that long lines appear more frequently at the end of the file.

Thus, the apparent conclusion that longer searches (closer to the end of the file) are better with a given technique may be biased. In fact, this technique may simply be better when dealing with longer lines. That remains to be confirmed.
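
One way to check the suspicion about line lengths would be to look at the average line length per region of the file. The sketch below is not something I ran for the numbers above; the function name avg_len_by_bucket and the bucket size are assumptions.

```shell
#!/bin/sh
# avg_len_by_bucket FILE BUCKET   (hypothetical helper)
# For each consecutive group of BUCKET lines, prints the number of the
# group's first line and the average line length in characters.
avg_len_by_bucket() {
    awk -v size="$2" '
        { sum += length($0); n++ }
        # A full bucket: report it and start the next one.
        n == size { printf "%d %.1f\n", NR - n + 1, sum / n; sum = 0; n = 0 }
        # Report the last, possibly partial, bucket.
        END { if (n > 0) printf "%d %.1f\n", NR - n + 1, sum / n }
    ' "$1"
}
```

Something like avg_len_by_bucket fo2debug.1.log 1000000 would give about eleven rows for this file. If the averages grow toward the end, the timings for the last test reflect bytes read more than lines skipped.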
