[python] Replace and overwrite instead of appending

I have the following code:

import re
#open the xml file for reading:
file = open('path/test.xml','r+')
#convert to string:
data = file.read()
file.write(re.sub(r"<string>ABC</string>(\s+)<string>(.*)</string>",r"<xyz>ABC</xyz>\1<xyz>\2</xyz>",data))
file.close()

where I'd like to replace the old content that's in the file with the new content. However, when I execute my code, the file "test.xml" is appended, i.e. I have the old content follwed by the new "replaced" content. What can I do in order to delete the old stuff and only keep the new?

This question is related to python replace

The answer is


Using truncate(), the solution could be

import re
#open the xml file for reading:
with open('path/test.xml','r+') as f:
    #convert to string:
    data = f.read()
    f.seek(0)
    f.write(re.sub(r"<string>ABC</string>(\s+)<string>(.*)</string>",r"<xyz>ABC</xyz>\1<xyz>\2</xyz>",data))
    f.truncate()

See from How to Replace String in File works in a simple way and is an answer that works with replace

fin = open("data.txt", "rt")
fout = open("out.txt", "wt")

for line in fin:
    fout.write(line.replace('pyton', 'python'))

fin.close()
fout.close()

file='path/test.xml' 
with open(file, 'w') as filetowrite:
    filetowrite.write('new content')

Open the file in 'w' mode, you will be able to replace its current text save the file with new contents.


Using python3 pathlib library:

import re
from pathlib import Path
import shutil

shutil.copy2("/tmp/test.xml", "/tmp/test.xml.bak") # create backup
filepath = Path("/tmp/test.xml")
content = filepath.read_text()
filepath.write_text(re.sub(r"<string>ABC</string>(\s+)<string>(.*)</string>",r"<xyz>ABC</xyz>\1<xyz>\2</xyz>", content))

Similar method using different approach to backups:

from pathlib import Path

filepath = Path("/tmp/test.xml")
filepath.rename(filepath.with_suffix('.bak')) # different approach to backups
content = filepath.read_text()
filepath.write_text(re.sub(r"<string>ABC</string>(\s+)<string>(.*)</string>",r"<xyz>ABC</xyz>\1<xyz>\2</xyz>", content))

import os#must import this library
if os.path.exists('TwitterDB.csv'):
        os.remove('TwitterDB.csv') #this deletes the file
else:
        print("The file does not exist")#add this to prevent errors

I had a similar problem, and instead of overwriting my existing file using the different 'modes', I just deleted the file before using it again, so that it would be as if I was appending to a new file on each run of my code.