Remove specific characters from a string in Python

Question

I'm trying to remove specific characters from a string using Python. This is the code I'm using right now. Unfortunately, it appears to do nothing to the string.

    for char in line:
        if char in " ?.!/;:":
            line.replace(char, '')

How do I do this properly?

User · Answer

My method probably isn't the most efficient, but it is very simple. I can remove multiple characters at different positions all at once, using slicing and formatting. Here's an example:

    words = "things"
    removed = "%s%s" % (words[:3], words[-1:])

This will result in 'removed' holding the word 'this'.

Formatting can be very helpful for printing variables midway through a print string. It can insert any data type using a % followed by the variable's data type; all data types can use %s, and floats (aka decimals) and integers can use %d (floats are truncated by %d; use %f to keep the decimals).

Slicing can be used for intricate control over strings. When I put words[:3], it selects all the characters from the beginning of the string (the colon is before the number, which means 'from the beginning to') up to, but not including, the character at index 3. Because Python starts counting at 0, that is the first three characters; the 4th character is left out. Then, when I put words[-1:], it means from the last character to the end (the colon is behind the number). A negative number makes Python count from the end rather than the start, and that count starts at -1, not 0. So words[-1:] is just the last character, and words[-2:] is the last two.

So, by cutting off the characters before the character I want to remove and the characters after it and sandwiching them together, I can remove the unwanted character. Think of it like a sausage. In the middle it's dirty, so I want to get rid of it. I simply cut off the two ends I want, then put them together without the unwanted part in the middle.

If I want to remove multiple consecutive characters, I simply shift the numbers around in the [] (slicing part). Or if I want to remove multiple characters from different positions, I can simply sandwich together multiple slices at once.

Examples:

    words = "control"
    removed = "%s%s" % (words[:2], words[-2:])

removed equals 'cool'.

    words = "impacts"
    removed = "%s%s%s" % (words[1], words[3:5], words[-1])

removed equals 'macs'.

In this case, [3:5] means the character at position 3 up to the character at position 5, excluding the character at the final position.

Remember, Python starts counting at 0, so you will need to as well.
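The cut-and-sandwich idea can be wrapped in two small helpers. This is only a minimal sketch: `remove_at` and `remove_range` are hypothetical names chosen for illustration, and plain `+` concatenation is used in place of `%s` formatting.

```python
def remove_at(text, index):
    """Return a copy of text without the character at index (0-based)."""
    if not 0 <= index < len(text):
        raise IndexError("index out of range")
    # Everything before the index, plus everything after it.
    return text[:index] + text[index + 1:]


def remove_range(text, start, stop):
    """Return a copy of text without the characters text[start:stop]."""
    # stop is exclusive, exactly like an ordinary slice.
    return text[:start] + text[stop:]


print(remove_at("hello", 1))          # hllo
print(remove_range("things", 3, 5))   # this
print(remove_range("control", 2, 5))  # cool
```

Keeping `stop` exclusive means the helper follows the same counting rule as the slices it is built from.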

User · Answer

Here's my Python 2/3 compatible version, since the translate API has changed.

    def remove(str_, chars):
        """Removes each char in `chars` from `str_`.

        Args:
            str_: String to remove characters from
            chars: String of to-be removed characters

        Returns:
            A copy of str_ with `chars` removed

        Example:
                remove("What?!?: darn;", " ?.!:;") => 'Whatdarn'
        """
        try:
            # Python 2.x
            return str_.translate(None, chars)
        except TypeError:
            # Python 3.x
            table = {ord(char): None for char in chars}
            return str_.translate(table)

User · Answer

I was surprised that no one had yet recommended using the builtin filter function.

    import operator
    import string  # only for the example; you could use a custom string

    s = "1212edjaq"

Say we want to filter out everything that isn't a number. Using the filter builtin method "...is equivalent to the generator expression (item for item in iterable if function(item))" (Python 3 Builtins: Filter).

    sList = list(s)
    intsList = list(string.digits)
    obj = filter(lambda x: operator.contains(intsList, x), sList)

In Python 3 this returns

    >>  <filter object @ hex>

To get a printed string:

    nums = "".join(list(obj))
    print(nums)
    >> "1212"

I am not sure how filter ranks in terms of efficiency, but it is a good thing to know how to use when doing list comprehensions and such.

UPDATE

Logically, since filter works, you could also use a list comprehension, and from what I have read it is supposed to be more efficient because lambdas are the Wall Street hedge fund managers of the programming function world. Another plus is that it is a one-liner that doesn't require any imports. For example, using the same string 's' defined above:

    num = "".join([i for i in s if i.isdigit()])

That's it. The return will be a string of all the characters that are digits in the original string.

If you have a specific list of acceptable/unacceptable characters, you need only adjust the 'if' part of the list comprehension:

    target_chars = "".join([i for i in s if i in some_list])

or alternatively:

    target_chars = "".join([i for i in s if i not in some_list])

User · Answer

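    #!/usr/bin/python
    import re

    strs = "how^ much for{} the maple syrup? $20.99? That's[] ridiculous!!!"
    print(strs)
    # I have picked some special characters to remove, but any character can be added here.
    # Inside [...] no | separators are needed; a | there would be matched literally.
    nstr = re.sub(r'[?$.!ab]', r'', strs)
    print(nstr)
    # For removing the remaining special characters
    nestr = re.sub(r'[^a-zA-Z0-9 ]', r'', nstr)
    print(nestr)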

User · Answer

Using filter, you'd just need one line:

    line = filter(lambda char: char not in " ?.!/;:", line)

This treats the string as an iterable and keeps every character for which the lambda returns True:

    >>> help(filter)
    Help on built-in function filter in module __builtin__:

    filter(...)
        filter(function or None, sequence) -> list, tuple, or string

        Return those items of sequence for which function(item) is true.  If
        function is None, return the items that are true.  If sequence is a tuple
        or string, return the same type, else return a list.

(This help text is from Python 2, where filter applied to a string returns a string.)
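In Python 3, filter returns a lazy iterator rather than a string, so the one-liner needs a join to produce text again. A minimal sketch, assuming the character set from the question; `remove_chars` is an illustrative name, not something from the answer.

```python
def remove_chars(text, unwanted):
    """Return text without any character that appears in unwanted."""
    # filter() yields the characters for which the lambda is true;
    # "".join() turns that iterator back into a string.
    return "".join(filter(lambda ch: ch not in unwanted, text))


print(remove_chars("Q: Do I write? No!", " ?.!/;:"))  # QDoIwriteNo
```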

User · Answer

You could use the re module's regular expression replacement. Using the ^ expression allows you to pick exactly what you want from your string.

    import re
    text = "This is absurd!"
    text = re.sub("[^a-zA-Z]", "", text)  # Keeps only alphabetic characters
    print(text)

Output to this would be "Thisisabsurd". Only the characters specified after the ^ symbol will remain.

User · Answer

Here are some possible ways to achieve this task:

    def attempt1(string):
        return "".join([v for v in string if v not in ("a", "e", "i", "o", "u")])


    def attempt2(string):
        for v in ("a", "e", "i", "o", "u"):
            string = string.replace(v, "")
        return string


    def attempt3(string):
        import re
        for v in ("a", "e", "i", "o", "u"):
            string = re.sub(v, "", string)
        return string


    def attempt4(string):
        return string.replace("a", "").replace("e", "").replace("i", "").replace("o", "").replace("u", "")


    for attempt in [attempt1, attempt2, attempt3, attempt4]:
        print(attempt("murcielago"))

PS: Instead of using " ?.!/;:" the examples use the vowels... and yeah, "murcielago" is the Spanish word for bat... a funny word, as it contains all the vowels :)

PS2: If you're interested in performance, you could measure these attempts with simple code like:

    import timeit


    K = 1000000
    for i in range(1, 5):
        t = timeit.Timer(
            f"attempt{i}('murcielago')",
            setup=f"from __main__ import attempt{i}"
        ).repeat(1, K)
        print(f"attempt{i}", min(t))

On my box you'd get:

    attempt1 2.2334518376057244
    attempt2 1.8806643818474513
    attempt3 7.214925774955572
    attempt4 1.7271184513757465

So it seems attempt4 is the fastest one for this particular input.
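One variant that is not among the four attempts is str.translate with a prebuilt table. The sketch below shows how it could be timed next to a join-based version using the same timeit idea. The function names are illustrative, the repeat count is arbitrary, and no timing result is claimed here.

```python
import timeit

VOWELS = "aeiou"
# Build the deletion table once, outside the timed function.
TABLE = str.maketrans("", "", VOWELS)


def with_translate(text):
    return text.translate(TABLE)


def with_join(text):
    return "".join(ch for ch in text if ch not in VOWELS)


if __name__ == "__main__":
    for func in (with_translate, with_join):
        seconds = timeit.timeit(lambda: func("murcielago"), number=100000)
        print(func.__name__, seconds)
```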

User · Answer

    >>> s = 'a1b2c3'
    >>> ''.join(c for c in s if c not in '123')
    'abc'

User · Answer

If you want your string to contain only allowed characters, selected by their ASCII codes, you can use this piece of code:

    for char in s:
        # ord('a') is 97 and ord('z') is 122
        if ord(char) < 97 or ord(char) > 122:
            s = s.replace(char, "")

It will remove every character outside a....z, including upper-case letters.
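The same whitelist can be written without numeric codes, which avoids off-by-one mistakes at the ends of the range. A minimal sketch, assuming only lower-case ASCII letters should survive; `ALLOWED` and `keep_lowercase` are illustrative names.

```python
import string

# 'a' through 'z'; a set gives fast membership tests.
ALLOWED = set(string.ascii_lowercase)


def keep_lowercase(text):
    """Return only the characters of text that are in ALLOWED."""
    return "".join(ch for ch in text if ch in ALLOWED)


print(keep_lowercase("Hello, World! `{"))  # elloorld
```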

User · Answer

Easy peasy with a re.sub regular expression, as of Python 3.5:

    re.sub(r'\ |\?|\.|\!|\/|\;|\:', '', line)

Example

    >>> import re
    >>> line = 'Q: Do I write ;/.??? No!!!'
    >>> re.sub(r'\ |\?|\.|\!|\/|\;|\:', '', line)
    'QDoIwriteNo'

Explanation

In regular expressions (regex), | is a logical OR and \ escapes spaces and special characters that might be actual regex commands. The r prefix makes the pattern a raw string, so Python itself leaves the backslashes alone. sub stands for substitution, in this case with the empty string ''.
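Because every alternative in that pattern is a single character, the same set can also be written as a character class, where none of these characters needs escaping. A minimal sketch using the same seven characters; `PUNCT` and `clean` are illustrative names.

```python
import re

# One character class instead of seven escaped alternatives.
PUNCT = re.compile(r"[ ?.!/;:]")


def clean(line):
    """Remove spaces and ? . ! / ; : from line."""
    return PUNCT.sub("", line)


print(clean("Q: Do I write ;/.??? No!!!"))  # QDoIwriteNo
```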

User · Answer

Try this one:

    def rm_char(original_str, need2rm):
        ''' Remove characters in "need2rm" from "original_str" '''
        return original_str.translate(str.maketrans('', '', need2rm))

This method works well in Python 3.5.2.

User · Answer

Strings are immutable in Python. The replace method returns a new string after the replacement. Try:

    for char in line:
        if char in " ?.!/;:":
            line = line.replace(char, '')

This is identical to your original code, with the addition of an assignment to line inside the loop.

Note that the string replace() method replaces all of the occurrences of the character in the string, so you can do better by using replace() for each character you want to remove, instead of looping over each character in your string.
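A short sketch makes the immutability point concrete: replace hands back a new string and leaves the original object untouched, which is why the result has to be assigned. The variable names are illustrative.

```python
line = "a.b.c"

# replace() returns a new string; `line` itself is not modified.
result = line.replace(".", "")

print(line)    # a.b.c
print(result)  # abc
```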

User · Answer

    line = line.translate(None, " ?.!/;:")

(This two-argument form of translate works in Python 2 only.)

User · Answer

Am I missing the point here, or is it just the following:

    string = "ab1cd1ef"
    string = string.replace("1", "")

    print(string)
    # result: "abcdef"

Put it in a loop:

    a = "a!b@c#d$"
    b = "!@#$"
    for char in b:
        a = a.replace(char, "")

    print(a)
    # result: "abcd"

User · Answer

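    import os
    import re

    # For each file in a directory, rename it to the same name without digits
    file_list = os.listdir(r"D:\Dev\Python")

    for file_name in file_list:
        os.rename(file_name, re.sub(r'\d+', '', file_name))

Note that os.rename resolves bare file names against the current working directory, so run this from inside that directory.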
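A slightly more defensive version of the rename loop is sketched below. It joins the directory onto both names so it does not depend on the current working directory, and it skips names that would not change or would become empty. `strip_digits` and `rename_without_digits` are hypothetical helper names.

```python
import os
import re


def strip_digits(name):
    """Return name with every run of digits removed."""
    return re.sub(r"\d+", "", name)


def rename_without_digits(directory):
    """Rename each entry in directory to its digit-free name."""
    for file_name in os.listdir(directory):
        new_name = strip_digits(file_name)
        # Skip names that would not change or would become empty.
        if not new_name or new_name == file_name:
            continue
        os.rename(
            os.path.join(directory, file_name),
            os.path.join(directory, new_name),
        )


print(strip_digits("track01_take2.txt"))  # track_take.txt
```

Digits in the extension are removed too, and two files that map to the same digit-free name would collide, so a real script would probably want to check for that before renaming.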

User · Answer

The one below works without using regular expressions:

    ipstring = "text with symbols!@#$^&*( ends here"
    opstring = ''
    for i in ipstring:
        # keep letters, digits and spaces
        if i.isalnum() or i == ' ':
            opstring += i
    print(opstring)

User · Answer

Even the below approach works:

    line = "a,b,c,d,e"
    alpha = list(line)
    while ',' in alpha:
        alpha.remove(',')
    finalString = ''.join(alpha)
    print(finalString)

output: abcde

User · Answer

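Recursive split: s = string; chars = chars to remove

    def strip(s, chars):
        # <= 1 also covers the empty string, which would otherwise recurse forever
        if len(s) <= 1:
            return "" if s in chars else s
        mid = len(s) // 2
        return strip(s[:mid], chars) + strip(s[mid:], chars)

example:

    print(strip("Hello!", "lo"))  # He!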

User · Answer

Strings in Python are immutable (can't be changed). Because of this, the effect of line.replace(...) is just to create a new string, rather than changing the old one. You need to rebind (assign) it to line in order to have that variable take the new value, with those characters removed.

Also, the way you are doing it is going to be kind of slow, relatively. It's also likely to be a bit confusing to experienced pythonators, who will see a doubly-nested structure and think for a moment that something more complicated is going on.

Starting in Python 2.6 and newer Python 2.x versions *, you can instead use str.translate (see the Python 3 answer below):

    line = line.translate(None, '!@#$')

or regular expression replacement with re.sub:

    import re
    line = re.sub('[!@#$]', '', line)

The characters enclosed in brackets constitute a character class. Any characters in line which are in that class are replaced with the second parameter to sub: an empty string.

In Python 3, strings are Unicode. You'll have to translate a little differently. kevpie mentions this in a comment on one of the answers, and it's noted in the documentation for str.translate.

When calling the translate method of a Unicode string, you cannot pass the second parameter that we used above. You also can't pass None as the first parameter. Instead, you pass a translation table (usually a dictionary) as the only parameter. This table maps the ordinal values of characters (i.e. the result of calling ord on them) to the ordinal values of the characters which should replace them, or, usefully to us, None to indicate that they should be deleted.

So to do the above dance with a Unicode string you would call something like:

    translation_table = dict.fromkeys(map(ord, '!@#$'), None)
    unicode_line = unicode_line.translate(translation_table)

Here dict.fromkeys and map are used to succinctly generate a dictionary containing

    {ord('!'): None, ord('@'): None, ...}

Even simpler, as another answer puts it, create the translation table in place:

    unicode_line = unicode_line.translate({ord(c): None for c in '!@#$'})

Or create the same translation table with str.maketrans:

    unicode_line = unicode_line.translate(str.maketrans('', '', '!@#$'))

* For compatibility with earlier Pythons, you can create a "null" translation table to pass in place of None:

    import string
    line = line.translate(string.maketrans('', ''), '!@#$')

Here string.maketrans is used to create a translation table, which is just a string containing the characters with ordinal values 0 to 255.
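For Python 3, the three ways of building a deletion table described above produce the same dictionary, which a short sketch can confirm. `delete_chars` and the `table_*` names are illustrative, and the '!@#$' set is taken from the examples in this answer.

```python
def delete_chars(text, unwanted):
    """Return text with every character in unwanted removed (Python 3)."""
    # The third argument of str.maketrans lists characters to delete.
    return text.translate(str.maketrans("", "", unwanted))


# Three equivalent ways to build the same deletion table.
table_maketrans = str.maketrans("", "", "!@#$")
table_comprehension = {ord(c): None for c in "!@#$"}
table_fromkeys = dict.fromkeys(map(ord, "!@#$"), None)

print(table_maketrans == table_comprehension == table_fromkeys)  # True
print(delete_chars("a!b@c#d$", "!@#$"))  # abcd
```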

User · Answer

How about this:

    def text_cleanup(text):
        new = ""
        for i in text:
            if i not in " ?.!/;:":
                new += i
        return new

User · Answer

    >>> line = "abc#@!?efg12;:?"
    >>> ''.join( c for c in line if  c not in '?:!/;' )
    'abc#@efg12'

User · Answer

The asker almost had it. Like most things in Python, the answer is simpler than you think.

    >>> line = "H E?.LL!/;O:: "
    >>> for char in ' ?.!/;:':
    ...     line = line.replace(char, '')
    ...
    >>> print(line)
    HELLO

You don't have to do the nested if/for loop thing, but you DO need to check each character individually.

User · Answer

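You can use set. This fragment belongs inside a function that defines passlen, and it needs import random and import string:

    charlist = list(set(string.digits + string.ascii_uppercase) - set('10IO'))
    return ''.join([random.SystemRandom().choice(charlist) for _ in range(passlen)])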

User · Answer

For the inverse requirement of only allowing certain characters in a string, you can use regular expressions with a set complement operator [^ABCabc]. For example, to remove everything except ASCII letters, digits, and the hyphen:

    >>> import string
    >>> import re
    >>>
    >>> phrase = '  There were "nine" (9) chick-peas in my pocket!!!      '
    >>> allow = string.ascii_letters + string.digits + '-'
    >>> re.sub('[^%s]' % allow, '', phrase)
    'Therewerenine9chick-peasinmypocket'

(string.ascii_letters is used here because string.letters exists only in Python 2.)

From the Python regular expression documentation:

    Characters that are not within a range can be matched by complementing
    the set. If the first character of the set is '^', all the characters
    that are not in the set will be matched. For example, [^5] will match
    any character except '5', and [^^] will match any character except
    '^'. ^ has no special meaning if it's not the first character in the
    set.
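When the allowed characters may themselves include ones that are special inside a character class (such as ^, ] or -), escaping them first keeps the pattern valid. A minimal sketch, assuming a non-empty set of allowed characters; `keep_only` is an illustrative name.

```python
import re
import string


def keep_only(text, allowed):
    """Return text with every character not in allowed removed.

    Assumes allowed is non-empty; an empty set would give an invalid pattern.
    """
    # re.escape neutralises characters such as ^, ] and - inside [...].
    pattern = "[^" + re.escape(allowed) + "]"
    return re.sub(pattern, "", text)


allow = string.ascii_letters + string.digits + "-"
print(keep_only('There were "nine" (9) chick-peas!', allow))
print(keep_only("a^b]c-d", "^]-"))
```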

User · Answer

In Python 3.5, e.g.:

    os.rename(file_name, file_name.translate({ord(c): None for c in '0123456789'}))

This removes all the digits from the string.

User · Answer

You can also use a function to substitute different kinds of regular expressions or other patterns by passing them in a list. With that, you can mix regular expressions, character classes, and really basic text patterns. It's really useful when you need to substitute a lot of elements, like HTML ones.

NB: works with Python 3.x

    import re  # Regular expression library


    def string_cleanup(x, notwanted):
        for item in notwanted:
            x = re.sub(item, '', x)
        return x

    line = "<title>My example: <strong>A text %very% $clean!!</strong></title>"
    print("Uncleaned: ", line)

    # Get rid of html elements
    html_elements = ["<title>", "</title>", "<strong>", "</strong>"]
    line = string_cleanup(line, html_elements)
    print("1st clean: ", line)

    # Get rid of special characters
    special_chars = ["[!@#$]", "%"]
    line = string_cleanup(line, special_chars)
    print("2nd clean: ", line)

The function string_cleanup takes your string x and your list notwanted as arguments. For each item in that list of elements or patterns, a substitution is made wherever it matches.

The output:

    Uncleaned:  <title>My example: <strong>A text %very% $clean!!</strong></title>
    1st clean:  My example: A text %very% $clean!!
    2nd clean:  My example: A text very clean

User · Answer

The string method replace does not modify the original string. It leaves the original alone and returns a modified copy.

What you want is something like: line = line.replace(char, '')

    def replace_all(line):
        for char in line:
            if char in " ?.!/;:":
                line = line.replace(char, '')
        return line

However, creating a new string each and every time that a character is removed is very inefficient. I recommend the following instead:

    def replace_all(line, baddies):
        """
        The following is documentation on how to use the function,
        without reference to the implementation details:

        For implementation notes, please see comments beginning with `#`
        in the source file.

        [*crickets chirp*]
        """
        is_bad = lambda ch, baddies=baddies: ch in baddies
        filter_baddies = lambda ch, is_bad=is_bad: "" if is_bad(ch) else ch
        mahp = replace_all.map(filter_baddies, line)
        return replace_all.join('', mahp)

        # -------------------------------------------------
        # WHY `baddies=baddies`?!?
        #     `is_bad=is_bad`
        # -------------------------------------------------
        # Default arguments to a lambda function are evaluated
        # at the same time as when a lambda function is
        # **defined**.
        #
        # Global variables of a lambda function
        # are evaluated when the lambda function is
        # **called**.
        #
        # The following prints "as yellow as snow"
        #
        #     fleece_color = "white"
        #     little_lamb = lambda end: "as " + fleece_color + end
        #
        #     # sometime later...
        #
        #     fleece_color = "yellow"
        #     print(little_lamb(" as snow"))
        # --------------------------------------------------

    replace_all.map = map
    replace_all.join = str.join
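The difference between a default argument and a global lookup can be seen side by side in a short sketch. The names `late` and `early` are illustrative and not taken from the answer.

```python
suffix = "white"

# Looks up the global `suffix` each time it is called.
late = lambda: suffix

# Captures the value of `suffix` once, when the lambda is defined.
early = lambda suffix=suffix: suffix

suffix = "yellow"

print(late())   # yellow
print(early())  # white
```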
