Substitute multiple whitespace with single whitespace in Python

Question

I have this string:

    mystring = 'Here is  some   text   I      wrote   '

How can I substitute the double, triple (...) whitespace characters with a single space, so that I get:

    mystring = 'Here is some text I wrote'

User · Accepted Answer

A simple possibility (if you'd rather avoid REs) is

    ' '.join(mystring.split())

The split and join perform the task you're explicitly asking about. Plus, they also do the extra one that you don't talk about but is seen in your example: removing trailing spaces ;-)
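As a minimal sketch of the split/join approach applied to the string from the question (the helper name `collapse_whitespace` is only illustrative and not part of the original answer):

```python
def collapse_whitespace(text: str) -> str:
    """Collapse each run of whitespace to one space and trim both ends."""
    # str.split() with no argument splits on any run of whitespace and
    # drops empty strings, so leading and trailing whitespace disappears.
    return " ".join(text.split())


mystring = "Here is  some   text   I      wrote   "
result = collapse_whitespace(mystring)
print(repr(result))  # 'Here is some text I wrote'
```

Note that this treats tabs and newlines as whitespace too, so they are also replaced by a single space.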

User · Answer

A regular expression can be used to offer more control over the whitespace characters that are combined.

To match Unicode whitespace:

    import re

    _RE_COMBINE_WHITESPACE = re.compile(r"\s+")

    my_str = _RE_COMBINE_WHITESPACE.sub(" ", my_str).strip()

To match ASCII whitespace only:

    import re

    _RE_COMBINE_WHITESPACE = re.compile(r"(?a:\s+)")
    _RE_STRIP_WHITESPACE = re.compile(r"(?a:^\s+|\s+$)")

    my_str = _RE_COMBINE_WHITESPACE.sub(" ", my_str)
    my_str = _RE_STRIP_WHITESPACE.sub("", my_str)

Matching only ASCII whitespace is sometimes essential for keeping control characters such as \x1c, \x1d, \x1e and \x1f, which Unicode \s treats as whitespace. (Note that \x0b and \x0c are \v and \f, so they are matched in either mode.)

Reference:

About \s:

    For Unicode (str) patterns: Matches Unicode whitespace characters (which includes [ \t\n\r\f\v], and also many other characters, for example the non-breaking spaces mandated by typography rules in many languages). If the ASCII flag is used, only [ \t\n\r\f\v] is matched.

About re.ASCII:

    Make \w, \W, \b, \B, \d, \D, \s and \S perform ASCII-only matching instead of full Unicode matching. This is only meaningful for Unicode patterns, and is ignored for byte patterns. Corresponds to the inline flag (?a).

strip() will remove any leading and trailing whitespace. Because str.strip() also removes Unicode whitespace, the ASCII-only variant strips with a regular expression instead.
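To make the difference between the two modes concrete, here is a small sketch. The sample string and the names `UNICODE_WS` and `ASCII_WS` are made up for illustration; `re.ASCII` is used as the flag equivalent of the inline `(?a)`.

```python
import re

# Unicode-aware pattern (the default for str patterns).
UNICODE_WS = re.compile(r"\s+")
# ASCII-only pattern: \s matches just space, \t, \n, \r, \f and \v.
ASCII_WS = re.compile(r"\s+", re.ASCII)

# Two non-breaking spaces, two ordinary spaces, and the \x1c control character.
sample = "a\u00a0\u00a0b  c\x1cd"

unicode_result = UNICODE_WS.sub(" ", sample)
ascii_result = ASCII_WS.sub(" ", sample)

print(repr(unicode_result))  # 'a b c d'
print(repr(ascii_result))    # 'a\xa0\xa0b c\x1cd'
```

In Unicode mode the non-breaking spaces and `\x1c` are collapsed along with the ordinary spaces. In ASCII mode only the run of ordinary spaces is touched.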

User · Answer

For completeness, you can also use:

    mystring = mystring.strip()  # the while loop will leave a trailing space,
                                 # so the trailing whitespace must be dealt with
                                 # before or after the while loop
    while '  ' in mystring:
        mystring = mystring.replace('  ', ' ')

which will work quickly on strings with relatively few spaces (faster than re in these situations).

In any scenario, Alex Martelli's split/join solution performs at least as quickly (usually significantly more so).

In your example, using the default values of timeit.Timer.repeat(), I get the following times:

    str.replace: [1.4317800167340238, 1.4174888149192384, 1.4163512401715934]
    re.sub:      [3.741931446594549, 3.8389395858970374, 3.973777672860706]
    split/join:  [0.6530919432498195, 0.6252146571700905, 0.6346594329726258]

EDIT: Just came across this post which provides a rather long comparison of the speeds of these methods.
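A comparison along these lines could be set up roughly as follows. This is only a sketch: the function names, the `benchmark` helper, and the reduced `number` and `repeat` values are assumptions chosen so it runs quickly, not the exact script behind the timings above, and absolute numbers will differ by machine and Python version.

```python
import re
import timeit

_WS = re.compile(r"\s+")


def with_replace(s):
    # Only collapses runs of ordinary spaces, not tabs or newlines.
    s = s.strip()
    while "  " in s:
        s = s.replace("  ", " ")
    return s


def with_re(s):
    return _WS.sub(" ", s).strip()


def with_split_join(s):
    return " ".join(s.split())


mystring = "Here is  some   text   I      wrote   "


def benchmark(number=10_000, repeat=3):
    """Return {function name: list of timings in seconds}."""
    results = {}
    for func in (with_replace, with_re, with_split_join):
        # Bind func as a default argument so each timer calls its own function.
        timer = timeit.Timer(lambda f=func: f(mystring))
        results[func.__name__] = timer.repeat(repeat=repeat, number=number)
    return results


if __name__ == "__main__":
    for name, times in benchmark().items():
        print(f"{name:16s} best of {len(times)}: {min(times):.4f}s")
```

On the example string all three functions return the same result; they differ once tabs, newlines or Unicode whitespace are involved, which is worth keeping in mind when choosing by speed alone.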
