Remove lines that contain certain string

Question

I'm trying to read text from a text file, go through its lines, and delete the lines that contain specific strings (in this case 'bad' and 'naughty'). The code I wrote goes like this:

    infile = file('oldfile.txt')
    newopen = open('newfile.txt', 'w')

    for line in infile:
        if 'bad' in line:
            line = line.replace(...)
        if 'naughty' in line:
            line = line.replace(...)
        else:
            newopen.write(line)

    newopen.close()

I wrote it like this, but it doesn't work.

One important thing: if the content of the text file was like this:

    good baby
    bad boy
    good boy
    normal boy

I don't want the output to have empty lines, so not like this:

    good baby

    good boy
    normal boy

but like this:

    good baby
    good boy
    normal boy

What should I change in the code above?

User · Answer

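    bad_words = ['doc', 'strickland']

    with open('linetest.txt') as oldfile, open('linetestnew.txt', 'w') as newfile:
        for line in oldfile:
            # Skip blank lines and lines that contain any of the bad words.
            if line != '\n' and not any(bad_word in line for bad_word in bad_words):
                newfile.write(line)

The '\n' is the escape sequence for a newline, so a line equal to '\n' is an empty line. It is tested separately rather than listed in bad_words, because every line read from a file ends with '\n' and listing it there would drop almost every line.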

User · Answer

Today I needed to accomplish a similar task, so I wrote up a gist based on some research I did. I hope someone finds it useful.

    import os

    # Clear the terminal on Windows ('cls') or elsewhere ('clear').
    os.system('cls' if os.name == 'nt' else 'clear')

    oldfile = input('Enter the file (with extension) you would like to strip domains from: ')
    newfile = input('Enter the name of the file (with extension) you would like me to save: ')

    emailDomains = ['windstream.net', 'mail.com', 'google.com', 'web.de', 'email',
                    'yandex.ru', 'ymail', 'mail.eu', 'mail.bg', 'comcast.net',
                    'yahoo', 'Yahoo', 'gmail', 'Gmail', 'GMAIL', 'hotmail',
                    'comcast', 'bellsouth.net', 'verizon.net', 'att.net',
                    'roadrunner.com', 'charter.net', 'mail.ru', 'live', 'icloud',
                    'aol', 'facebook', 'outlook', 'myspace', 'rocketmail']

    print('\nThis script will remove records that contain the following strings:\n\n', emailDomains)
    input('\nPress Enter to start...\n')

    linecounter = 0

    with open(oldfile) as oFile, open(newfile, 'w') as nFile:
        for line in oFile:
            # Keep only the records that mention none of the listed domains.
            if not any(domain in line for domain in emailDomains):
                nFile.write(line)
                linecounter = linecounter + 1
                print('- %s Writing verified record to %s --- %s' % (linecounter, newfile, line))

    print('=== COMPLETE ===')
    print('%s was saved' % newfile)
    print('There are %s records in your saved file.' % linecounter)

Link to Gist: emailStripper.py

Best, Az

User · Answer

I have used this to remove unwanted words from text files:

    bad_words = ['abc', 'def', 'ghi', 'jkl']

    with open('List of words.txt') as badfile, open('Clean list of words.txt', 'w') as cleanfile:
        for line in badfile:
            clean = True
            for word in bad_words:
                if word in line:
                    clean = False
            if clean:
                cleanfile.write(line)

Or to do the same for all files in a directory:

    import os

    bad_words = ['abc', 'def', 'ghi', 'jkl']

    for root, dirs, files in os.walk(".", topdown=True):
        for file in files:
            # Only process .txt files, and skip output from an earlier run.
            if file.endswith('.txt') and not file.startswith('clean '):
                # Join with root so that files in subdirectories are found.
                source = os.path.join(root, file)
                target = os.path.join(root, 'clean ' + file)
                with open(source) as filename, open(target, 'w') as cleanfile:
                    for line in filename:
                        clean = True
                        for word in bad_words:
                            if word in line:
                                clean = False
                        if clean:
                            cleanfile.write(line)

I'm sure there must be a more elegant way to do it, but this did what I wanted it to.

User · Answer

Use the python-textops package:

    from textops import *

    'oldfile.txt' | cat() | grepv('bad') | tofile('newfile.txt')

User · Answer

You could simply leave the line out of the new file instead of calling replace:

    for line in infile:
        if 'bad' not in line and 'naughty' not in line:
            newopen.write(line)

User · Answer

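    to_skip = ("bad", "naughty")
    out_handle = open("testout", "w")

    with open("testin", "r") as handle:
        for line in handle:
            # Skip the line if any of its whitespace-separated words is in to_skip.
            if set(line.split()).intersection(to_skip):
                continue
            out_handle.write(line)

    out_handle.close()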
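The set-intersection test above matches whole whitespace-separated words, which is not the same as the substring test (`'bad' in line`) used in the other answers. The sketch below contrasts the two; the helper names `has_bad_substring` and `has_bad_word` and the sample lines are made up for illustration.

```python
to_skip = {"bad", "naughty"}


def has_bad_substring(line):
    """Substring test: true if a skip word appears anywhere in the line."""
    return any(word in line for word in to_skip)


def has_bad_word(line):
    """Whole-word test: true if a whitespace-separated token equals a skip word."""
    return bool(to_skip.intersection(line.split()))


# Print each sample line with the result of both tests.
for line in ["bad boy\n", "a badge for the boy\n", "so bad!\n"]:
    print(repr(line), has_bad_substring(line), has_bad_word(line))
```

The whole-word test avoids false hits such as "badge", but it also misses a word with punctuation attached, such as "bad!". Which behaviour is preferable depends on the data.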

User · Answer

The else is only connected to the last if. You want elif:

    if 'bad' in line:
        pass
    elif 'naughty' in line:
        pass
    else:
        newopen.write(line)

Also note that I removed the line substitution, as you don't write those lines anyway.
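To make the difference concrete, here is a minimal sketch that runs both control flows on a list of lines in memory instead of on files. The function names `filter_original` and `filter_elif` are hypothetical, and the sample adds a 'naughty boy' line that is not in the question's text so that both words are exercised.

```python
def filter_original(lines):
    """Mimic the question's logic: the else belongs only to the second if."""
    kept = []
    for line in lines:
        if 'bad' in line:
            pass  # nothing here stops the line from reaching the next if
        if 'naughty' in line:
            pass
        else:
            kept.append(line)  # runs for every line without 'naughty'
    return kept


def filter_elif(lines):
    """Chain the tests with elif so the else covers both of them."""
    kept = []
    for line in lines:
        if 'bad' in line:
            pass
        elif 'naughty' in line:
            pass
        else:
            kept.append(line)
    return kept


sample = ['good baby\n', 'bad boy\n', 'good boy\n', 'naughty boy\n', 'normal boy\n']
print(filter_original(sample))  # 'bad boy\n' is still in the result
print(filter_elif(sample))      # neither 'bad' nor 'naughty' lines remain
```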

User · Answer

You can make your code simpler and more readable like this:

    bad_words = ['bad', 'naughty']

    with open('oldfile.txt') as oldfile, open('newfile.txt', 'w') as newfile:
        for line in oldfile:
            if not any(bad_word in line for bad_word in bad_words):
                newfile.write(line)

This uses a context manager and any().
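The same idea can be wrapped in a small function and tried on the sample text from the question. This is only a sketch: the function name `remove_lines` is made up, and temporary files stand in for the real oldfile.txt and newfile.txt.

```python
import os
import tempfile


def remove_lines(src_path, dst_path, bad_words):
    """Copy src_path to dst_path, skipping lines that contain any bad word."""
    with open(src_path) as src, open(dst_path, 'w') as dst:
        for line in src:
            if not any(word in line for word in bad_words):
                dst.write(line)


# Small demonstration using temporary files and the question's sample text.
with tempfile.TemporaryDirectory() as tmp:
    src_path = os.path.join(tmp, 'oldfile.txt')
    dst_path = os.path.join(tmp, 'newfile.txt')
    with open(src_path, 'w') as f:
        f.write('good baby\nbad boy\ngood boy\nnormal boy\n')
    remove_lines(src_path, dst_path, ['bad', 'naughty'])
    with open(dst_path) as f:
        output = f.read()

print(output)
```

Because a rejected line is never written, no empty line is left behind in the output.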

User · Answer

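Regex is a little quicker than the accepted answer on the 23 MB test file I used, but there isn't a lot in it.

    import re

    bad_words = ['bad', 'naughty']

    # Match any whole line that contains one of the bad words, including its newline.
    regex = rf"^.*(?:{'|'.join(bad_words)}).*\n"
    subst = ""

    with open('oldfile.txt') as oldfile:
        lines = oldfile.read()

    # Pass flags by keyword: the fourth positional argument of re.sub is count.
    result = re.sub(regex, subst, lines, flags=re.MULTILINE)

    with open('newfile.txt', 'w') as newfile:
        newfile.write(result)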
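One detail worth checking in any regex version is how the flag reaches `re.sub`: its signature is `re.sub(pattern, repl, string, count=0, flags=0)`, so a flag passed as the fourth positional argument is read as `count`. The sketch below sidesteps that by compiling the pattern with `flags=re.MULTILINE`. The function name `strip_bad_lines` is hypothetical, and the use of `re.escape` and the optional trailing `\n?` are additions of this sketch, not part of the answer.

```python
import re

bad_words = ['bad', 'naughty']

# re.escape guards against regex metacharacters in the words (an extra precaution).
# The trailing \n? also removes a matching last line that has no final newline.
pattern = re.compile(
    rf"^.*(?:{'|'.join(map(re.escape, bad_words))}).*\n?",
    flags=re.MULTILINE,
)


def strip_bad_lines(text):
    """Return text with every line containing a bad word removed."""
    return pattern.sub('', text)


text = 'good baby\nbad boy\ngood boy\nnaughty boy\nnormal boy\n'
print(strip_bad_lines(text))
```

Reading the whole file into memory, as the regex approach does, is fine for files of tens of megabytes. For much larger inputs, the line-by-line loop keeps memory use flat.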
