[python] How to remove specific substrings from a set of strings in Python?

I have a set of strings, set1, and every string in set1 contains one of two specific substrings which I don't need and want to remove.
Sample Input: set1={'Apple.good','Orange.good','Pear.bad','Pear.good','Banana.bad','Potato.bad'}
So basically I want the .good and .bad substrings removed from all the strings.
What I tried:

for x in set1:
    x.replace('.good','')
    x.replace('.bad','')

But this doesn't seem to work at all. There is absolutely no change in the output and it is the same as the input. I tried using for x in list(set1) instead of the original one but that doesn't change anything.

This question is related to python python-3.x

The answer is


>>> x = 'Pear.good'
>>> y = x.replace('.good','')
>>> y
'Pear'
>>> x
'Pear.good'

.replace doesn't change the string; it returns a copy of the string with the replacement applied. You can't change the string in place because strings are immutable.

You need to take the return values from x.replace and put them in a new set.
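As a minimal sketch of that advice, using the sample set from the question: a set comprehension keeps each value returned by replace and collects it into a new set. The name set2 is only illustrative, and sorted is used purely to get a stable display order.

```python
set1 = {'Apple.good', 'Orange.good', 'Pear.bad', 'Pear.good', 'Banana.bad', 'Potato.bad'}

# Keep the value returned by replace() and collect it in a new set.
set2 = {x.replace('.good', '').replace('.bad', '') for x in set1}

print(sorted(set2))  # sorted only to get a stable display order
# ['Apple', 'Banana', 'Orange', 'Pear', 'Potato']
```

Note that 'Pear.bad' and 'Pear.good' both collapse to 'Pear', so the new set has five elements rather than six.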


I ran a test (not with your example data): with a set comprehension, the result neither keeps the original order nor keeps duplicates

>>> ind = ['p5','p1','p8','p4','p2','p8']
>>> newind = {x.replace('p','') for x in ind}
>>> newind
{'1', '2', '8', '5', '4'}

I confirmed that a list comprehension works and keeps both the order and the duplicates:

>>> ind = ['p5','p1','p8','p4','p2','p8']
>>> newind = [x.replace('p','') for x in ind]
>>> newind
['5', '1', '8', '4', '2', '8']

or

>>> newind = []
>>> ind = ['p5','p1','p8','p4','p2','p8']
>>> for x in ind:
...     newind.append(x.replace('p',''))
>>> newind
['5', '1', '8', '4', '2', '8']
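If you want unique values like a set but in the original order, one possible sketch (not part of the test above) is to pass the replaced strings through dict.fromkeys, since dict keys are unique and keep insertion order in Python 3.7 and later:

```python
ind = ['p5', 'p1', 'p8', 'p4', 'p2', 'p8']

# dict keys are unique and keep insertion order (guaranteed since Python 3.7)
newind = list(dict.fromkeys(x.replace('p', '') for x in ind))
print(newind)  # ['5', '1', '8', '4', '2']
```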

If you want to remove something from the strings in a list, you can do it this way (note that re.sub is case-sensitive by default):

import re

new_list = []
old_list = ["ABCDEFG", "HKLMNOP", "QRSTUV"]

for data in old_list:
    # replace each match of "AB", "M" or "TV" with a single space
    new_list.append(re.sub("AB|M|TV", " ", data))

print(new_list)  # output: [' CDEFG', 'HKL NOP', 'QRSTUV']
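Since re.sub is case-sensitive by default, a hedged variation is to pass flags=re.IGNORECASE when the case of the input is not known. The lowercase test string below is made up for illustration, and this sketch uses an empty replacement so the matches are removed rather than replaced with a space.

```python
import re

old_list = ["ABCDEFG", "hklmnop", "QRSTUV"]  # made-up data with mixed case

# flags=re.IGNORECASE makes "AB|M|TV" also match "ab", "m", "tv", ...
new_list = [re.sub("AB|M|TV", "", data, flags=re.IGNORECASE) for data in old_list]
print(new_list)  # ['CDEFG', 'hklnop', 'QRSTUV']
```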

You could do this:

import re
set1 = {'Apple.good', 'Orange.good', 'Pear.bad', 'Pear.good', 'Banana.bad', 'Potato.bad'}

for x in set1:
    # the result of each substitution must be assigned back to x;
    # the $ anchor removes the substring only at the end of the string
    x = re.sub(r'\.good$', '', x)
    x = re.sub(r'\.bad$', '', x)
    print(x)

If you have a list of strings and want to remove every line that contains a certain substring, you can do this:

import re
def RemoveInList(sub, LinSplitUnOr):
    # keep only the lines in which the pattern sub is not found
    return [x for x in LinSplitUnOr if not re.search(sub, x)]

where sub is a pattern that you do not wish to have in the list of lines LinSplitUnOr

for example

A=['Apple.good','Orange.good','Pear.bad','Pear.good','Banana.bad','Potato.bad']
sub = 'good'
A=RemoveInList(sub,A)

Then A will be

['Pear.bad', 'Banana.bad', 'Potato.bad']


All you need is a bit of black magic!

>>> a = ["cherry.bad","pear.good", "apple.good"]
>>> a = list(map(lambda x: x.replace('.good','').replace('.bad',''),a))
>>> a
['cherry', 'pear', 'apple']

When there are multiple substrings to remove, one simple and effective option is to use re.sub with a compiled pattern that involves joining all the substrings-to-remove using the regex OR (|) pipe.

import re

to_remove = ['.good', '.bad']
strings = ['Apple.good','Orange.good','Pear.bad']

p = re.compile('|'.join(map(re.escape, to_remove))) # escape to handle metachars
[p.sub('', s) for s in strings]
# ['Apple', 'Orange', 'Pear']
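One caveat with the joined pattern: Python's regex alternation tries the alternatives from left to right and takes the first one that matches, not the longest. If one substring is a prefix of another, sorting longest-first avoids leftovers. The overlapping substrings below are hypothetical and only serve to illustrate this.

```python
import re

to_remove = ['.bad', '.badly']  # hypothetical overlapping substrings
s = 'Pear.badly'

# Alternatives are tried left to right, so '.bad' wins and 'ly' is left behind.
naive = re.compile('|'.join(map(re.escape, to_remove)))
print(naive.sub('', s))  # Pearly

# Sorting longest-first lets the longer substring match before its prefix.
ordered = sorted(to_remove, key=len, reverse=True)
p = re.compile('|'.join(map(re.escape, ordered)))
print(p.sub('', s))  # Pear
```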

# practice 2
text = "Amin Is A Good Programmer"  # avoid naming a variable str, which shadows the built-in
new_text = text.replace('Good', '')
print(new_text)

Output: Amin Is A  Programmer

Update for Python 3.9

In Python 3.9 and later you can remove a suffix using str.removesuffix('suffix')

From the docs,

If the string ends with the suffix string and that suffix is not empty, return string[:-len(suffix)]. Otherwise, return a copy of the original string:

set1 = {'Apple.good', 'Orange.good', 'Pear.bad', 'Pear.good', 'Banana.bad', 'Potato.bad'}

set2 = set()

for s in set1:
    set2.add(s.removesuffix(".good").removesuffix(".bad"))

or using a set comprehension:

set2 = {s.removesuffix(".good").removesuffix(".bad") for s in set1}
   
print(set2)


Output:
{'Orange', 'Pear', 'Apple', 'Banana', 'Potato'}
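For Python versions before 3.9, the behaviour quoted from the docs above can be approximated with a small helper. This is only a sketch: remove_suffix is a hypothetical name, not a standard library function, and sorted is used just to get a stable display order.

```python
def remove_suffix(s: str, suffix: str) -> str:
    """Hypothetical fallback mirroring str.removesuffix for Python < 3.9."""
    # Guard against an empty suffix: s[:-0] would return an empty string.
    if suffix and s.endswith(suffix):
        return s[:-len(suffix)]
    return s

set1 = {'Apple.good', 'Orange.good', 'Pear.bad', 'Pear.good', 'Banana.bad', 'Potato.bad'}
set2 = {remove_suffix(remove_suffix(s, '.good'), '.bad') for s in set1}
print(sorted(set2))  # ['Apple', 'Banana', 'Orange', 'Pear', 'Potato']
```

Unlike replace, this only strips the substring when it appears at the very end of the string.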