How to do sed-like text replace with Python?

Question

I would like to enable all apt repositories in this file:

    cat /etc/apt/sources.list
    ## Note, this file is written by cloud-init on first boot of an instance
    ## modifications made here will not survive a re-bundle.
    ## if you wish to make changes you can:
    ## a.) add 'apt_preserve_sources_list: true' to /etc/cloud/cloud.cfg
    ##     or do the same in user-data
    ## b.) add sources in /etc/apt/sources.list.d
    #

    # See http://help.ubuntu.com/community/UpgradeNotes for how to upgrade to
    # newer versions of the distribution.
    deb http://us-east-1.ec2.archive.ubuntu.com/ubuntu/ maverick main
    deb-src http://us-east-1.ec2.archive.ubuntu.com/ubuntu/ maverick main

    ## Major bug fix updates produced after the final release of the
    ## distribution.
    deb http://us-east-1.ec2.archive.ubuntu.com/ubuntu/ maverick-updates main
    deb-src http://us-east-1.ec2.archive.ubuntu.com/ubuntu/ maverick-updates main

    ## N.B. software from this repository is ENTIRELY UNSUPPORTED by the Ubuntu
    ## team. Also, please note that software in universe WILL NOT receive any
    ## review or updates from the Ubuntu security team.
    deb http://us-east-1.ec2.archive.ubuntu.com/ubuntu/ maverick universe
    deb-src http://us-east-1.ec2.archive.ubuntu.com/ubuntu/ maverick universe
    deb http://us-east-1.ec2.archive.ubuntu.com/ubuntu/ maverick-updates universe
    deb-src http://us-east-1.ec2.archive.ubuntu.com/ubuntu/ maverick-updates universe

    ## N.B. software from this repository is ENTIRELY UNSUPPORTED by the Ubuntu
    ## team, and may not be under a free licence. Please satisfy yourself as to
    ## your rights to use the software. Also, please note that software in
    ## multiverse WILL NOT receive any review or updates from the Ubuntu
    ## security team.
    # deb http://us-east-1.ec2.archive.ubuntu.com/ubuntu/ maverick multiverse
    # deb-src http://us-east-1.ec2.archive.ubuntu.com/ubuntu/ maverick multiverse
    # deb http://us-east-1.ec2.archive.ubuntu.com/ubuntu/ maverick-updates multiverse
    # deb-src http://us-east-1.ec2.archive.ubuntu.com/ubuntu/ maverick-updates multiverse

    ## Uncomment the following two lines to add software from the 'backports'
    ## repository.
    ## N.B. software from this repository may not have been tested as
    ## extensively as that contained in the main release, although it includes
    ## newer versions of some applications which may provide useful features.
    ## Also, please note that software in backports WILL NOT receive any review
    ## or updates from the Ubuntu security team.
    # deb http://us-east-1.ec2.archive.ubuntu.com/ubuntu/ maverick-backports main restricted universe multiverse
    # deb-src http://us-east-1.ec2.archive.ubuntu.com/ubuntu/ maverick-backports main restricted universe multiverse

    ## Uncomment the following two lines to add software from Canonical's
    ## 'partner' repository.
    ## This software is not part of Ubuntu, but is offered by Canonical and the
    ## respective vendors as a service to Ubuntu users.
    # deb http://archive.canonical.com/ubuntu maverick partner
    # deb-src http://archive.canonical.com/ubuntu maverick partner

    deb http://security.ubuntu.com/ubuntu maverick-security main
    deb-src http://security.ubuntu.com/ubuntu maverick-security main
    deb http://security.ubuntu.com/ubuntu maverick-security universe
    deb-src http://security.ubuntu.com/ubuntu maverick-security universe
    # deb http://security.ubuntu.com/ubuntu maverick-security multiverse
    # deb-src http://security.ubuntu.com/ubuntu maverick-security multiverse

With sed this is a simple

    sed -i 's/^# deb/deb/' /etc/apt/sources.list

What's the most elegant ("pythonic") way to do this?

User · Answer

Cecil Curry has a great answer; however, because it processes the file one line at a time, it only works for single-line regular expressions. Multiline regular expressions are more rarely used, but they are handy sometimes.

Here is an improvement upon his sed_inplace function that allows it to work with multiline regular expressions if asked to do so.

WARNING: In multiline mode, it will read the entire file in and then perform the regular expression substitution, so you'll only want to use this mode on small-ish files. Don't try to run this on gigabyte-sized files when running in multiline mode.

    import re, shutil, tempfile

    def sed_inplace(filename, pattern, repl, multiline=False):
        '''
        Perform the pure-Python equivalent of in-place `sed` substitution: e.g.,
        `sed -i -e 's/'${pattern}'/'${repl}'/' "${filename}"`.
        '''
        re_flags = 0
        if multiline:
            re_flags = re.M

        # For efficiency, precompile the passed regular expression.
        pattern_compiled = re.compile(pattern, re_flags)

        # For portability, NamedTemporaryFile() defaults to mode "w+b" (i.e., binary
        # writing with updating). This is usually a good thing. In this case,
        # however, binary writing imposes non-trivial encoding constraints trivially
        # resolved by switching to text writing. Let's do that.
        with tempfile.NamedTemporaryFile(mode='w', delete=False) as tmp_file:
            with open(filename) as src_file:
                if multiline:
                    content = src_file.read()
                    tmp_file.write(pattern_compiled.sub(repl, content))
                else:
                    for line in src_file:
                        tmp_file.write(pattern_compiled.sub(repl, line))

        # Overwrite the original file with the munged temporary file in a
        # manner preserving file attributes (e.g., permissions).
        shutil.copystat(filename, tmp_file.name)
        shutil.move(tmp_file.name, filename)

    from os.path import expanduser
    sed_inplace('%s/.gitconfig' % expanduser("~"),
                r'^(\[user\]$\n[ \t]*name = ).*$(\n[ \t]*email = ).*',
                r'\1John Doe\2jdoe@example.com', multiline=True)
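To make the difference between the two modes concrete, here is a small sketch that works on strings instead of files. The helper names `sub_per_line` and `sub_whole_text` and the sample text are invented for illustration; they are not part of the answer's function.

```python
import re

# Hypothetical sample: a pattern that spans two lines of a config file.
SAMPLE = "[user]\n\tname = old\n\temail = old@example.com\n"
PATTERN = r"^(\[user\]\n\tname = ).*$"


def sub_per_line(pattern, repl, text):
    """Apply the substitution to each line separately (non-multiline mode)."""
    compiled = re.compile(pattern, re.M)
    return "".join(compiled.sub(repl, line)
                   for line in text.splitlines(keepends=True))


def sub_whole_text(pattern, repl, text):
    """Apply the substitution to the whole text at once (multiline mode)."""
    return re.sub(pattern, repl, text, flags=re.M)
```

Because each line is matched in isolation, `sub_per_line` can never see `[user]` and `name = ...` together, so the text comes back unchanged; `sub_whole_text` sees both lines and rewrites the name.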

User · Answer

You could do something like:

    import re

    p = re.compile("^# deb", re.MULTILINE)
    text = open("sources.list", "r").read()
    f = open("sources.list", "w")
    f.write(p.sub("deb", text))
    f.close()

Alternatively (imho, this is better from an organizational standpoint) you could split your sources.list into pieces (one entry per repository) and place them under /etc/apt/sources.list.d/
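The role of re.MULTILINE here is easy to miss, so here is a minimal sketch on an in-memory string. The function name `uncomment_deb` is made up for this example.

```python
import re


def uncomment_deb(text):
    """Strip a leading '# ' from every line that starts with '# deb'."""
    # With re.MULTILINE, '^' matches at the start of every line,
    # not only at the start of the whole string.
    return re.sub(r"^# deb", "deb", text, flags=re.MULTILINE)


def uncomment_first_only(text):
    """Same substitution without the flag: only the very first line can match."""
    return re.sub(r"^# deb", "deb", text)
```

Lines such as `## note # deb` are left alone because the pattern is anchored and requires a single `#` followed by a space at the start of the line.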

User · Answer

Try pysed:

    pysed -r '# deb' 'deb' /etc/apt/sources.list

User · Answer

If I want something like sed, then I usually just call sed itself using the sh library.

    from sh import sed

    sed(['-i', 's/^# deb/deb/', '/etc/apt/sources.list'])

Sure, there are downsides. For example, the locally installed version of sed might not be the same as the one you tested with. In my cases, this kind of thing can be easily handled at another layer, like by examining the target environment beforehand, or deploying in a docker image with a known version of sed.

User · Answer

This is such a different approach that I don't want to edit my other answer. The with statements are nested since I don't use 3.1 (where with A() as a, B() as b: works).

Might be a bit overkill to change sources.list, but I want to put it out there for future searches.

    #!/usr/bin/env python
    from shutil import move
    from tempfile import NamedTemporaryFile

    # mode="w" opens the temporary file in text mode so str lines can be written
    with NamedTemporaryFile(mode="w", delete=False) as tmp_sources:
        with open("sources.list") as sources_file:
            for line in sources_file:
                if line.startswith("# deb"):
                    tmp_sources.write(line[2:])
                else:
                    tmp_sources.write(line)

    move(tmp_sources.name, sources_file.name)

This should ensure no race conditions with other people reading the file. Oh, and I prefer str.startswith(...) when you can do without a regexp.
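One caveat worth sketching: shutil.move only amounts to an atomic rename when the temporary file lives on the same filesystem as the target; otherwise it falls back to copying. The variant below is a sketch under that assumption, creating the temporary file next to the target and swapping it in with os.replace (Python 3.3+). The function name `uncomment_atomically` is hypothetical.

```python
import os
import tempfile


def uncomment_atomically(path, prefix="# deb"):
    """Rewrite path with the leading '# ' removed from lines starting with prefix."""
    directory = os.path.dirname(os.path.abspath(path))
    # Same directory as the target, so the final rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, text=True)
    try:
        with os.fdopen(fd, "w") as tmp, open(path) as src:
            for line in src:
                # line[2:] drops the leading "# " of a commented-out entry.
                tmp.write(line[2:] if line.startswith(prefix) else line)
        # Readers see either the old file or the new one, never a partial write.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temporary file if anything failed before the swap.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

Note that this sketch does not copy permissions from the original file; the file created by mkstemp is readable and writable only by its owner.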

User · Answer

I wanted to be able to find and replace text but also include matched groups in the content I insert. I wrote this short script to do that:

https://gist.github.com/turtlemonvh/0743a1c63d1d27df3f17

The key component of that is something that looks like this:

    print(re.sub(pattern, template, text).rstrip("\n"))

Here's an example of how that works:

    # Find everything that looks like 'dog' or 'cat' followed by a space and a number
    pattern = "((cat|dog) (\d+))"

    # Replace with 'turtle' and the number. '3' because the number is the 3rd matched group.
    # The double '\' is needed because you need to escape '\' when running this in a python shell
    template = "turtle \\3"

    # The text to operate on
    text = "cat 976 is my favorite"

Calling the above function with this yields:

    turtle 976 is my favorite
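The same idea as a self-contained sketch, using raw strings so the backslashes need no doubling. The function names and the named-group variant are additions for illustration, not taken from the gist.

```python
import re

# Group 1 is the whole match, group 2 the animal, group 3 the number.
PATTERN = r"((cat|dog) (\d+))"
TEMPLATE = r"turtle \3"


def to_turtle(text):
    """Replace 'cat N' or 'dog N' with 'turtle N' using a numbered backreference."""
    return re.sub(PATTERN, TEMPLATE, text)


def to_turtle_named(text):
    """Same substitution with a named group, which avoids counting parentheses."""
    return re.sub(r"(?:cat|dog) (?P<num>\d+)", r"turtle \g<num>", text)
```

Named groups tend to be easier to maintain once a pattern has more than two or three groups.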

User · Answer

Authoring a homegrown sed replacement in pure Python with no external commands or additional dependencies is a noble task laden with noble landmines. Who would have thought?

Nonetheless, it is feasible. It's also desirable. We've all been there, people: "I need to munge some plaintext files, but I only have Python, two plastic shoelaces, and a moldy can of bunker-grade Maraschino cherries. Help."

In this answer, we offer a best-of-breed solution cobbling together the awesomeness of prior answers without all of that unpleasant not-awesomeness. As plundra notes, David Miller's otherwise top-notch answer writes the desired file non-atomically and hence invites race conditions (e.g., from other threads and/or processes attempting to concurrently read that file). That's bad. Plundra's otherwise excellent answer solves that issue while introducing yet more, including numerous fatal encoding errors, a critical security vulnerability (failing to preserve the permissions and other metadata of the original file), and premature optimization replacing regular expressions with low-level character indexing. That's also bad.

Awesomeness, unite!

    import re, shutil, tempfile

    def sed_inplace(filename, pattern, repl):
        '''
        Perform the pure-Python equivalent of in-place `sed` substitution: e.g.,
        `sed -i -e 's/'${pattern}'/'${repl}'/' "${filename}"`.
        '''
        # For efficiency, precompile the passed regular expression.
        pattern_compiled = re.compile(pattern)

        # For portability, NamedTemporaryFile() defaults to mode "w+b" (i.e., binary
        # writing with updating). This is usually a good thing. In this case,
        # however, binary writing imposes non-trivial encoding constraints trivially
        # resolved by switching to text writing. Let's do that.
        with tempfile.NamedTemporaryFile(mode='w', delete=False) as tmp_file:
            with open(filename) as src_file:
                for line in src_file:
                    tmp_file.write(pattern_compiled.sub(repl, line))

        # Overwrite the original file with the munged temporary file in a
        # manner preserving file attributes (e.g., permissions).
        shutil.copystat(filename, tmp_file.name)
        shutil.move(tmp_file.name, filename)

    # Do it for Johnny.
    sed_inplace('/etc/apt/sources.list', r'^# deb', 'deb')
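A quick way to check the metadata claim is to run the same sequence of calls against a scratch file with unusual permissions. This is a sketch for experimentation on a POSIX system; `sed_inplace_keep_mode` simply repeats the approach above and `file_mode` is a helper invented here.

```python
import os
import re
import shutil
import stat
import tempfile


def sed_inplace_keep_mode(filename, pattern, repl):
    """Line-by-line substitution via a temporary file, copying file metadata."""
    compiled = re.compile(pattern)
    with tempfile.NamedTemporaryFile(mode="w", delete=False) as tmp_file:
        with open(filename) as src_file:
            for line in src_file:
                tmp_file.write(compiled.sub(repl, line))
    # Copy permission bits and timestamps onto the temporary file
    # before it takes the place of the original.
    shutil.copystat(filename, tmp_file.name)
    shutil.move(tmp_file.name, filename)


def file_mode(path):
    """Return only the permission bits of path (e.g. 0o640)."""
    return stat.S_IMODE(os.stat(path).st_mode)
```

Without the copystat call, the replaced file would keep the temporary file's owner-only permissions.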

User · Answer

Not sure about elegant, but this ought to be pretty readable at least. For a sources.list it's fine to read all the lines beforehand; for something larger you might want to change it "in place" while looping through it.

    #!/usr/bin/env python

    # Open file for reading and writing
    with open("sources.list", "r+") as sources_file:
        # Read all the lines
        lines = sources_file.readlines()

        # Rewind and truncate
        sources_file.seek(0)
        sources_file.truncate()

        # Loop through the lines, adding them back to the file.
        for line in lines:
            if line.startswith("# deb"):
                sources_file.write(line[2:])
            else:
                sources_file.write(line)

EDIT: Use with-statement for better file-handling. Also forgot to rewind before truncating.

User · Answer

You can do that like this:

    import re

    with open("/etc/apt/sources.list", "r") as sources:
        lines = sources.readlines()
    with open("/etc/apt/sources.list", "w") as sources:
        for line in lines:
            sources.write(re.sub(r'^# deb', 'deb', line))

The with statement ensures that the file is closed correctly, and re-opening the file in "w" mode empties the file before you write to it. re.sub(pattern, replace, string) is the equivalent of s/pattern/replace/ in sed/perl.

Edit: fixed syntax in example
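A different standard-library route, not the one used in this answer, is the fileinput module: with inplace=True it redirects print output into the file being read, which is about as close to `sed -i` as the standard library gets. A minimal sketch, with `uncomment_in_place` as a made-up name:

```python
import fileinput
import re


def uncomment_in_place(path, pattern=r"^# deb", repl="deb"):
    """Rewrite path line by line, applying the substitution to each line."""
    compiled = re.compile(pattern)
    # While inplace=True is active, standard output is redirected into the file.
    with fileinput.input(files=[path], inplace=True) as lines:
        for line in lines:
            # Lines already end with a newline, so suppress print's own.
            print(compiled.sub(repl, line), end="")
```

Passing `backup=".bak"` to `fileinput.input` keeps a copy of the original file, much like `sed -i.bak`.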

User · Answer

If you really want to use a sed command without installing a new Python module, you could simply do the following, passing the command and its arguments as a list:

    import subprocess

    subprocess.call(["sed", "-i", "s/^# deb/deb/", "/etc/apt/sources.list"])
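When the pattern or replacement comes from a variable, the `/` delimiter has to be escaped before it is spliced into the `s///` expression. The sketch below separates building the argument list from running it; both function names are hypothetical, and it assumes the inputs contain only unescaped slashes and that the installed sed accepts `-i` without a suffix (GNU sed does; BSD/macOS sed expects a suffix argument).

```python
import subprocess


def build_sed_command(pattern, repl, path):
    """Return the argv list for an in-place sed substitution."""
    def escape(fragment):
        # Assumption: slashes in the input are not already escaped.
        return fragment.replace("/", r"\/")

    expression = "s/{}/{}/".format(escape(pattern), escape(repl))
    return ["sed", "-i", expression, path]


def sed_inplace_external(pattern, repl, path):
    """Run sed and raise CalledProcessError if it exits with an error."""
    subprocess.run(build_sed_command(pattern, repl, path), check=True)
```

Passing a list rather than a single string avoids the shell entirely, so no shell quoting of the pattern is needed.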

User · Answer

Here's a one-module Python replacement for perl -p:

    # Provide compatibility with `perl -p`

    # Usage:
    #
    #     python -mloop_over_stdin_lines '<program>'

    # In `<program>`, use the variable `line` to read and change the current line.

    # Example:
    #
    #     python -mloop_over_stdin_lines 'line = re.sub("pattern", "replacement", line)'

    # From the perlrun documentation:
    #
    #        -p   causes Perl to assume the following loop around your
    #             program, which makes it iterate over filename arguments
    #             somewhat like sed:
    #
    #               LINE:
    #                 while (<>) {
    #                     ...             # your program goes here
    #                 } continue {
    #                     print or die "-p destination: $!\n";
    #                 }
    #
    #             If a file named by an argument cannot be opened for some
    #             reason, Perl warns you about it, and moves on to the next
    #             file. Note that the lines are printed automatically. An
    #             error occurring during printing is treated as fatal. To
    #             suppress printing use the -n switch. A -p overrides a -n
    #             switch.
    #
    #             "BEGIN" and "END" blocks may be used to capture control
    #             before or after the implicit loop, just as in awk.
    #

    import re
    import sys

    for line in sys.stdin:
        # Run the user's program; it may reassign the module-level name `line`.
        exec(sys.argv[1], globals(), locals())
        try:
            # Lines keep their own newline, so write them back unchanged.
            sys.stdout.write(line)
        except Exception:
            sys.exit('-p destination: $!\n')

User · Answer

massedit.py (http://github.com/elmotec/massedit) does the scaffolding for you, leaving just the regex to write. It's still in beta but we are looking for feedback.

    python -m massedit -e "re.sub(r'^# deb', 'deb', line)" /etc/apt/sources.list

will show the differences (before/after) in diff format.

Add the -w option to write the changes to the original file:

    python -m massedit -e "re.sub(r'^# deb', 'deb', line)" -w /etc/apt/sources.list

Alternatively, you can now use the api:

    >>> import massedit
    >>> filenames = ['/etc/apt/sources.list']
    >>> massedit.edit_files(filenames, ["re.sub(r'^# deb', 'deb', line)"], dry_run=True)

User · Answer

If you are using Python3 the following module will help you: https://github.com/mahmoudadel2/pysed

    wget https://raw.githubusercontent.com/mahmoudadel2/pysed/master/pysed.py

Place the module file into your Python3 modules path, then:

    import pysed
    pysed.replace(<Old string>, <Replacement String>, <Text File>)
    pysed.rmlinematch(<Unwanted string>, <Text File>)
    pysed.rmlinenumber(<Unwanted Line Number>, <Text File>)

User · Answer

None of the answers above worked properly for my case.

I have a case of multiple key-value replacements in one file of around 1000 lines, and after replacement the file structure should stay the same. For example:

    key1 value tobe replaced1
    key2 value tobe replaced1
    ...
    key1000 value tobe replaced1000

I've tried:

- the voted answer from @elmotec for massedit
- the answer from @Cecil Curry
- the answer from @Keithel

The three answers definitely helped me a lot, but after testing I found that the first and second take nearly 40-50 s, and the third is not suitable for multiple replacements, so I fixed it. Note: refer to those answers before going on. Here's my code.

Line replacement mode:

    import datetime, re, shutil, tempfile

    start_time = datetime.datetime.now()
    with tempfile.NamedTemporaryFile(mode='w', delete=False) as tmp_file:
        with open(abs_keypair_file) as kf:
            for line in kf:
                line_to_write = ''
                match_flag = False
                for (key, value) in tuple_list:
                    # print("%s, %r" % (key, value))
                    # patten: the regex for the current key (built from key with str.format)
                    if not re.search(patten, line, flags=re.I):
                        continue
                    line_to_write = re.sub(patten, value, line, flags=re.I)
                    match_flag = True

                if not match_flag:
                    line_to_write = line
                tmp_file.write(line_to_write)

    shutil.copystat(abs_keypair_file, tmp_file.name)
    shutil.move(tmp_file.name, abs_keypair_file)

    time_costs = datetime.datetime.now() - start_time
    print("time costs: %s" % time_costs)

    time costs: 0:00:42.533879

File replacement mode:

    start_time = datetime.datetime.now()
    with tempfile.NamedTemporaryFile(mode='w', delete=False) as tmp_file:
        with open(abs_keypair_file) as kf:
            text = kf.read()
            for (key, value) in tuple_list:
                text = re.sub(patten, value, text, flags=re.M|re.I)
            tmp_file.write(text)

    shutil.copystat(abs_keypair_file, tmp_file.name)
    shutil.move(tmp_file.name, abs_keypair_file)

    time_costs = datetime.datetime.now() - start_time
    print("time costs: %s" % time_costs)

    time costs: 0:00:00.348458

So if your case matches mine and your file is not too large, I suggest you follow the file replacement mode. How to do the replacement if the file is huge? I have no idea. Hope this helps.
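For the open question about huge files, one possible direction is to avoid the inner loop over all keys: compile a single alternation of every key once, then stream the file line by line and look the replacement up in a dict. This is a sketch under assumptions not stated in the answer: lines are taken to have the form `key=value`, matching is case-sensitive, and the names `build_line_replacer` and `replace_stream` are invented.

```python
import re


def build_line_replacer(new_values):
    """Return a function that rewrites 'key=old' lines using new_values."""
    if not new_values:
        return lambda line: line
    # One compiled pattern for all keys, so each line is scanned once
    # instead of once per key. Assumed line format: key=value.
    alternation = "|".join(re.escape(key) for key in new_values)
    pattern = re.compile(r"^(%s)=.*$" % alternation)

    def replace_line(line):
        return pattern.sub(
            lambda m: "%s=%s" % (m.group(1), new_values[m.group(1)]), line)

    return replace_line


def replace_stream(src_lines, new_values):
    """Lazily yield rewritten lines; works on any iterable of lines, e.g. an open file."""
    replace_line = build_line_replacer(new_values)
    for line in src_lines:
        yield replace_line(line)
```

Because `replace_stream` is a generator, it can be fed an open file object and written out to a temporary file as it goes, so memory use stays flat regardless of file size.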
