[bash] How can I remove the first line of a text file using bash/sed script?

I need to repeatedly remove the first line from a huge text file using a bash script.

Right now I am using sed -i -e "1d" $FILE - but it takes around a minute to do the deletion.

Is there a more efficient way to accomplish this?

This question is related to: bash, scripting, sed

The answers are below.


This should show all lines except the first one:

tail -n +2 textfile.txt

You could use vim to do this:

vim -u NONE +'1d' +'wq!' /tmp/test.txt

This should be faster, since vim won't read the whole file while processing.


You can use -i to update the file in place, without using the '>' redirection operator. The following command will delete the first line from the file and save the result back to the same file:

sed -i '1d' filename
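
Note that BSD/macOS sed requires an argument to -i; giving a backup suffix works with both GNU and BSD sed:

sed -i.bak '1d' filename    # original kept as filename.bak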

How about using csplit?

man csplit
csplit -k file 1 '{1}'
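
If all you need is the file minus its first line, a simpler csplit invocation might be (a rough sketch, assuming the default xx00/xx01 output names and a file with more than one line):

csplit -s file 2            # xx00 = line 1, xx01 = everything after it
mv xx01 file && rm -f xx00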

You can edit the file in place: just use Perl's -i flag, like this:

perl -ni -e 'print unless $. == 1' filename.txt

This makes the first line disappear, as you ask. Perl will need to read and copy the entire file, but it arranges for the output to be saved under the name of the original file.
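
If you'd like to keep a copy of the original while doing this, Perl's -i flag also accepts a backup suffix (same idea as sed's):

perl -ni.bak -e 'print unless $. == 1' filename.txt    # original kept as filename.txt.bak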


This one-liner will do it:

echo "$(tail -n +2 "$FILE")" > "$FILE"

It works because the command substitution runs tail and captures its output (in memory) before the redirection truncates the file, hence no need for a temp file.
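
If you'd rather not hold the whole file in a shell variable, a more conservative variant of the same idea is the classic temp-file-and-rename approach (a sketch; the .tmp name is arbitrary):

tail -n +2 "$FILE" > "$FILE.tmp" && mv "$FILE.tmp" "$FILE"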


If you want to modify the file in place, you could always use the original ed instead of its streaming successor sed:

ed "$FILE" <<<$'1d\nwq\n'

The ed command was the original UNIX text editor, before there were even full-screen terminals, much less graphical workstations. The ex editor, best known as what you're using when typing at the colon prompt in vi, is an extended version of ed, so many of the same commands work. While ed is meant to be used interactively, it can also be used in batch mode by sending a string of commands to it, which is what this solution does.

The sequence <<<$'1d\nwq\n' takes advantage of Bash's support for here-strings (<<<) and ANSI-C quoting ($'...') to feed input to the ed command consisting of two lines: 1d, which deletes line 1, and then wq, which writes the file back out to disk and then quits the editing session.
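
If you want to avoid the Bash-specific here-string, the same two commands can be piped in from printf, which should work in any POSIX shell (-s keeps ed from printing byte counts):

printf '%s\n' 1d wq | ed -s "$FILE"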


The sponge utility (from moreutils) avoids the need for juggling a temp file:

tail -n +2 "$FILE" | sponge "$FILE"

As Pax said, you probably aren't going to get any faster than this. The reason is that almost no filesystems support truncating from the beginning of a file, so this is going to be an O(n) operation, where n is the size of the file. What you can do much faster, though, is overwrite the first line with the same number of bytes (maybe with spaces or a comment), which might work for you depending on exactly what you are trying to do (what is that, by the way?).
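
As a rough sketch of that overwrite idea (assuming a single-byte locale, and dd with conv=notrunc so the rest of the file is left alone):

len=$(head -n 1 "$FILE" | wc -c)              # bytes in the first line, incl. newline
printf '%*s' "$((len - 1))" '' |              # that many spaces...
    dd of="$FILE" conv=notrunc 2>/dev/null    # ...written over the start of the file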


You can easily do this with:

sed 1d filename > filename_without_first_line

on the command line; or to remove the first line of a file permanently, use the in-place mode of sed with the -i flag:

sed -i 1d filename


If what you are looking to do is recover after failure, you could just build up a file that has what you've done so far.

if [[ -f $tmpf ]] ; then
    rm -f "$tmpf"
fi
while IFS= read -r line ; do
    # process line
    echo "$line" >> "$tmpf"
done < "$srcf"
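
To actually resume after a failure, a rough sketch (assuming $tmpf holds exactly the lines already processed) is to count them and skip that many lines of the source:

done_count=$(wc -l < "$tmpf" 2>/dev/null || echo 0)
tail -n +"$((done_count + 1))" "$srcf" | while IFS= read -r line ; do
    # process line
    echo "$line" >> "$tmpf"
done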


Since it sounds like I can't speed up the deletion, I think a good approach might be to process the file in batches like this:

while [ -s file1 ] ; do
    head -n 1000 file1 > file2
    process file2              # placeholder for whatever handles each batch
    sed -i -e '1,1000d' file1  # drop the lines that were just processed
done

The drawback of this is that if the program gets killed in the middle (or if there's some bad SQL in there, causing the "process" part to die or lock up), there will be lines that are either skipped or processed twice.

(file1 contains lines of SQL code)
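
An alternative sketch that avoids rewriting file1 after every batch is to split it up front (split is standard; process_batch is a hypothetical placeholder for the real work):

split -l 1000 file1 batch_
for f in batch_* ; do
    process_batch "$f" && rm -f "$f"   # remove each chunk once it has been handled
done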


Would using tail to write the last N-1 lines to a new file, then removing the old file and renaming the new file to the old name, do the job?

If I were doing this programmatically, I would read through the file and remember the file offset after reading each line, so I could seek back to that position to read the file with one less line in it.
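
A rough shell sketch of that offset idea (assuming single-byte characters so ${#line} matches the byte count, with offset starting at 0):

line=$(tail -c +"$((offset + 1))" "$FILE" | head -n 1)
offset=$((offset + ${#line} + 1))   # +1 for the trailing newline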


No, that's about as efficient as you're going to get. You could write a C program which could do the job a little faster (less startup time and argument processing), but it will probably tend towards the same speed as sed as files get large (and I assume they're large if it's taking a minute).

But your question suffers from the same problem as so many others, in that it presupposes the solution. If you were to tell us in detail what you're trying to do rather than how, we may be able to suggest a better option.

For example, if this is a file A that some other program B processes, one solution would be to not strip off the first line, but modify program B to process it differently.

Let's say all your programs append to this file A and program B currently reads and processes the first line before deleting it.

You could re-engineer program B so that it didn't try to delete the first line but maintains a persistent (probably file-based) offset into the file A so that, next time it runs, it could seek to that offset, process the line there, and update the offset.

Then, at a quiet time (midnight?), it could do special processing of file A to delete all lines currently processed and set the offset back to 0.

It will certainly be faster for a program to open and seek a file rather than open and rewrite. This discussion assumes you have control over program B, of course. I don't know if that's the case but there may be other possible solutions if you provide further information.
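
A minimal sketch of that persistent-offset scheme (OFFSET_FILE and process_line are hypothetical names; assumes single-byte characters so ${#line} matches the byte count):

offset=$(cat "$OFFSET_FILE" 2>/dev/null || echo 0)
while IFS= read -r line ; do
    process_line "$line"
    offset=$((offset + ${#line} + 1))   # +1 for the newline
    echo "$offset" > "$OFFSET_FILE"     # persist progress after each line
done < <(tail -c +"$((offset + 1))" "$FILE")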


For those who are on SunOS, which is non-GNU, the following code will help:

sed '1d' test.dat > tmp.dat 
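
This writes everything except the first line to tmp.dat; to replace the original, you could then (using the filenames above) follow it with:

mv tmp.dat test.dat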
