[shell] How to delete from a text file, all lines that contain a specific string?

How would I use sed to delete all lines in a text file that contain a specific string?


There are many other ways to delete lines containing a specific string besides sed:

AWK

awk '!/pattern/' file > temp && mv temp file

Ruby (1.9+)

ruby -i.bak -ne 'print if not /pattern/' file

Perl

perl -ni.bak -e "print unless /pattern/" file

Shell (bash 3.2 and later)

while IFS= read -r line   # IFS= keeps leading and trailing whitespace
do
  [[ ! $line =~ pattern ]] && printf '%s\n' "$line"
done <file > o
mv o file

GNU grep

grep -v "pattern" file > temp && mv temp file

And of course sed (printing the inverse is faster than actual deletion):

sed -n '/pattern/!p' file
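A quick way to convince yourself that the inverse-print form and the delete form select the same lines is to run both on a throwaway file. This is only a sketch: the temp directory and the three demo lines are made up for illustration.

```shell
# Hypothetical demo data; the file name and contents are assumptions.
tmpdir=$(mktemp -d)
printf '%s\n' 'keep one' 'drop pattern here' 'keep two' > "$tmpdir/file"

# Print every line that does NOT match (inverse print).
out_print=$(sed -n '/pattern/!p' "$tmpdir/file")

# Delete every line that matches and print the rest.
out_delete=$(sed '/pattern/d' "$tmpdir/file")

printf '%s\n' "$out_print"
```

Neither command touches the file itself; both only write the surviving lines to standard output.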

To show the filtered text in the console:

cat filename | sed '/text to remove/d'

To save the filtered text to a new file:

cat filename | sed '/text to remove/d' > newfile

To append the filtered text to an existing file:

cat filename | sed '/text to remove/d' >> newfile

To filter text that has already been filtered, in this case removing further lines from what remains:

cat filename | sed '/text to remove/d' | sed '/remove this too/d' | more

The | more shows the text one page at a time.


Delete the matching lines from every file that contains the match:

grep -rl 'text_to_search' . | xargs sed -i '/text_to_search/d'
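If some file names may contain spaces, a NUL-delimited variant of the same pipeline is one option. The sketch below assumes GNU grep, GNU xargs and GNU sed (for -Z, -r and -i respectively); the demo directory and file names are invented for illustration.

```shell
# Hypothetical demo tree; names and contents are assumptions.
tmpdir=$(mktemp -d)
printf '%s\n' 'alpha' 'text_to_search here' 'omega' > "$tmpdir/with space.txt"
printf '%s\n' 'nothing to remove' > "$tmpdir/clean.txt"

# -Z (GNU grep) ends each file name with a NUL byte; -0 makes xargs split on NUL,
# so names containing spaces survive the pipe intact.
# -r (GNU xargs) skips running sed when grep lists no files.
grep -rlZ 'text_to_search' "$tmpdir" | xargs -0 -r sed -i '/text_to_search/d'

cat "$tmpdir/with space.txt"
```

Files that never contained the string are not passed to sed at all, so they are left untouched.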

If you want to match whole words only, you can use grep's -w flag (w for whole word). For example, to delete the lines that contain the number 11 but keep the lines that contain 111:

-bash-4.1$ head file
1
11
111

-bash-4.1$ grep -v "11" file
1

-bash-4.1$ grep -w -v "11" file
1
111

It also works with the -f flag if you want to exclude several exact patterns at once. If "blacklist" is a file with one pattern per line, each of which you want to delete from "file":

grep -w -v -f blacklist file
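Here is a small sketch of the blacklist idea on throwaway data. The file contents are invented for the demo; only the grep flags come from the answer above.

```shell
# Hypothetical demo files; contents are assumptions for illustration.
tmpdir=$(mktemp -d)
printf '%s\n' 1 11 111 22 222 > "$tmpdir/file"
printf '%s\n' 11 22 > "$tmpdir/blacklist"

# -f reads one pattern per line, -w requires whole-word matches, -v inverts,
# so only lines containing 11 or 22 as a whole word are dropped.
result=$(grep -w -v -f "$tmpdir/blacklist" "$tmpdir/file")

printf '%s\n' "$result"
```

Without -w, the same blacklist would also remove 111 and 222, because 11 and 22 occur inside them.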

You may consider using ex (which is a standard Unix command-based editor):

ex +g/match/d -cwq file

where:

  • + executes given Ex command (man ex), same as -c which executes wq (write and quit)
  • g/match/d - Ex command to delete lines with given match, see: Power of g

The above example is a POSIX-compliant method for editing a file in place, as per this post at Unix.SE and the POSIX specifications for ex.


The difference with sed is that:

sed is a Stream EDitor, not a file editor. (BashFAQ)

Unless you enjoy unportable code, I/O overhead and some other bad side effects. So basically some options (such as in-place/-i) are non-standard extensions (found in GNU and FreeBSD sed, for example) and may not be available on other operating systems.


You can use sed to delete lines in place in a file. However, it seems to be much slower than using grep to write the inverse match to a second file and then moving the second file over the original.

e.g.

sed -i '/pattern/d' filename      

or

grep -v "pattern" filename > filename2; mv filename2 filename

The first command takes 3 times longer on my machine anyway.
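If you go the grep-and-move route in a script, it may be worth wrapping it so that a real grep error does not clobber the original. The sketch below is one way to do that; delete_lines is a hypothetical helper name, not a standard command, and the demo file is made up.

```shell
# delete_lines PATTERN FILE: hypothetical helper, not a standard command.
delete_lines() {
    pattern=$1
    file=$2
    tmp=$(mktemp "${file}.XXXXXX") || return 1

    # grep exits 0 when it printed lines, 1 when it printed none,
    # and 2 or more on a real error; only the last case should abort.
    status=0
    grep -v -e "$pattern" "$file" > "$tmp" || status=$?
    if [ "$status" -gt 1 ]; then
        rm -f "$tmp"
        return "$status"
    fi

    # Replace the original only after grep has finished successfully.
    mv -- "$tmp" "$file"
}

# Demo on a throwaway file (contents are assumptions).
demo=$(mktemp)
printf '%s\n' 'one' 'pattern two' 'three' > "$demo"
delete_lines 'pattern' "$demo"
cat "$demo"
```

One caveat with this sketch: the file that ends up in place is the one mktemp created, so it carries that file's owner and permissions rather than the original's.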


echo -e "/thing_to_delete\ndd\033:x\n" | vim file_to_edit.txt


I was struggling with this on Mac. Plus, I needed to do it using variable replacement.

So I used:

sed -i '' "/$pattern/d" "$file"

where $file is the file where deletion is needed and $pattern is the pattern to be matched for deletion.

I picked the '' from this comment.

The thing to note here is the use of double quotes in "/$pattern/d". The variable is not expanded if we use single quotes.
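The quoting difference is easy to see on a throwaway file. The sketch below leaves out -i so that it behaves the same with GNU and BSD sed, and it uses sed's \|...| address form so that a pattern containing slashes needs no escaping. The file name and its contents are assumptions for the demo.

```shell
# Hypothetical demo file; name and contents are assumptions.
tmpdir=$(mktemp -d)
file="$tmpdir/paths.txt"
printf '%s\n' '/usr/local/bin' '/opt/tool' '/usr/bin' > "$file"

pattern='/usr/local'   # contains the default / delimiter

# Single quotes: the shell passes the text $pattern through unexpanded, so sed
# never sees the variable's value and nothing is deleted here.
single=$(sed '\|$pattern|d' "$file")

# Double quotes: the shell expands $pattern first. \|...| makes | the address
# delimiter, so the slashes in the value are ordinary characters.
double=$(sed "\|$pattern|d" "$file")

printf '%s\n' "$double"
```

The value of $pattern is still interpreted as a regular expression, so characters such as . or * in it keep their special meaning.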


Curiously enough, the accepted answer does not actually answer the question directly. The question asks about using sed to delete lines containing a string, but the answer seems to presuppose knowledge of how to convert an arbitrary string into a regex.

Many programming language libraries have a function to perform such a transformation, e.g.

python: re.escape(STRING)
ruby: Regexp.escape(STRING)
java:  Pattern.quote(STRING)

But how to do it on the command line?

Since this is a sed-oriented question, one approach would be to use sed itself. sed addresses are basic regular expressions by default, where ( { + ? are already literal, so only the characters that are special there (plus the / delimiter) need a backslash:

sed 's/[]\/$*.^[]/\\&/g'

So given an arbitrary string $STRING we could write something like:

re=$(sed 's/[]\/$*.^[]/\\&/g' <<< "$STRING")
sed "/$re/d" FILE

or as a one-liner:

sed "/$(sed 's/[]\/$*.^[]/\\&/g' <<< "$STRING")/d"

with variations as described elsewhere on this page.
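If the goal is simply to drop lines containing a literal string, another option is to skip the regex conversion entirely and use a tool that matches fixed strings. The sketch below shows two such variants; the demo file and the value of STRING are made up for illustration.

```shell
# Hypothetical demo data; contents are assumptions.
tmpdir=$(mktemp -d)
printf '%s\n' 'price is $5.00 (approx)' 'price is $5x00' 'no price' > "$tmpdir/file"

STRING='$5.00 (approx)'

# -F makes grep treat the pattern as a fixed string, so . $ ( ) are ordinary.
with_grep=$(grep -F -v -e "$STRING" "$tmpdir/file")

# index() returns 0 when the substring is absent; no regex matching is involved.
# The string travels via the environment to avoid awk's -v escape processing.
with_awk=$(STRING="$STRING" awk 'index($0, ENVIRON["STRING"]) == 0' "$tmpdir/file")

printf '%s\n' "$with_grep"
```

Both variants only print the surviving lines; writing them back to the file works the same way as in the other grep and awk answers on this page.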


You can also delete a range of lines in a file, for example to delete stored procedures from a SQL file:

sed '/CREATE PROCEDURE.*/,/END ;/d' sqllines.sql

This removes every line from a CREATE PROCEDURE line through the next END ; line, inclusive.

I have cleaned up many SQL files with this sed command.
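A tiny worked example of the range form, on an invented SQL file (the procedure name and statements are assumptions for the demo):

```shell
# Hypothetical demo file; contents are assumptions.
tmpdir=$(mktemp -d)
cat > "$tmpdir/sqllines.sql" <<'EOF'
SELECT 1;
CREATE PROCEDURE demo_proc()
BEGIN
  SELECT 2;
END ;
SELECT 3;
EOF

# Delete from each CREATE PROCEDURE line through the next END ; line.
result=$(sed '/CREATE PROCEDURE.*/,/END ;/d' "$tmpdir/sqllines.sql")

printf '%s\n' "$result"
```

One thing to watch for: if a CREATE PROCEDURE line is never followed by a matching END ; line, the range runs to the end of the file and everything after it is deleted.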


perl -i    -nle'/regexp/||print' file1 file2 file3
perl -i.bk -nle'/regexp/||print' file1 file2 file3

The first command edits the file(s) in place (-i).

The second command does the same thing but keeps a copy or backup of the original file(s) by adding .bk to the file names (.bk can be changed to anything).


The easy way to do it, with GNU sed:

sed --in-place '/some string here/d' yourfile


You can use good old ed to edit a file in a similar fashion to the answer that uses ex. The big difference in this case is that ed takes its commands via standard input, not as command line arguments like ex can. When using it in a script, the usual way to accommodate this is to use printf to pipe commands to it:

printf "%s\n" "g/pattern/d" w | ed -s filename

or with a heredoc:

ed -s filename <<EOF
g/pattern/d
w
EOF

You can also use this:

 grep -v 'pattern' filename

Here -v inverts the match, so grep prints only the lines that do not match your pattern.


To get an in-place-like result with grep you can do this:

echo "$(grep -v "pattern" filename)" >filename

cat filename | grep -v "pattern" > filename.1
mv filename.1 filename

I have made a small benchmark with a file which contains approximately 345 000 lines. The way with grep seems to be around 15 times faster than the sed method in this case.

I have tried both with and without the setting LC_ALL=C; it does not seem to change the timings significantly. The search string (CDGA_00004.pdbqt.gz.tar) is somewhere in the middle of the file.

Here are the commands and the timings:

time sed -i "/CDGA_00004.pdbqt.gz.tar/d" /tmp/input.txt

real    0m0.711s
user    0m0.179s
sys     0m0.530s

time perl -ni -e 'print unless /CDGA_00004.pdbqt.gz.tar/' /tmp/input.txt

real    0m0.105s
user    0m0.088s
sys     0m0.016s

time (grep -v CDGA_00004.pdbqt.gz.tar /tmp/input.txt > /tmp/input.tmp; mv /tmp/input.tmp /tmp/input.txt )

real    0m0.046s
user    0m0.014s
sys     0m0.019s
