Is there a simple way to remove multiple spaces in a string?

Question

Suppose this string:

    The   fox jumped   over    the log

Turning into:

    The fox jumped over the log

What is the simplest way (1-2 lines) to achieve this, without splitting and going into lists?

User · Answer

In some cases it's desirable to replace consecutive occurrences of every whitespace character with a single instance of that character. You'd use a regular expression with backreferences to do that.

(\s)\1{1,} matches any whitespace character, followed by one or more occurrences of that character. Now, all you need to do is specify the first group (\1) as the replacement for the match.

Wrapping this in a function:

    import re

    def normalize_whitespace(string):
        return re.sub(r'(\s)\1{1,}', r'\1', string)

    >>> normalize_whitespace('The   fox jumped   over    the log')
    'The fox jumped over the log'
    >>> normalize_whitespace('First    line\t\t\t \n\n\nSecond    line')
    'First line\t \nSecond line'

User · Answer

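Quite surprising: no one posted a simple function, which will be much faster than ALL other posted solutions. Here it goes:

    def compactSpaces(s):
        os = ""
        for c in s:
            # Keep non-space characters, and a space only if the output
            # is non-empty and does not already end with a space.
            if c != " " or (os and os[-1] != " "):
                os += c
        return os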

User · Answer

    >>> import re
    >>> re.sub(' +', ' ', 'The     quick brown    fox')
    'The quick brown fox'

User · Answer

If it's whitespace you're dealing with, splitting on None will not include an empty string in the returned value.

5.6.1 String Methods, str.split()
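A small sketch of what "splitting on None" means in practice, using the question's sentence as sample data. Calling split() with no argument is the same as passing None; the variable names are only illustrative.

```python
text = "The   fox jumped   over    the log"

# No argument (sep=None): a run of whitespace counts as one separator,
# so no empty strings appear in the result.
parts_default = text.split()

# Explicit " " separator: every single space splits, so runs of spaces
# leave empty strings behind.
parts_explicit = text.split(" ")

collapsed = " ".join(parts_default)

print(parts_default)
print(parts_explicit)
print(collapsed)
```

Joining the default split back together with a single space gives the collapsed sentence; the explicit-separator split would need its empty strings filtered out first.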

User · Answer

    import re
    string = re.sub('[ \t\n]+', ' ', 'The     quick brown                \n\n             \t        fox')

This will replace every run of tabs, newlines and multiple spaces with a single space.

User · Answer

    import re
    s = "The   fox jumped   over    the log"
    re.sub(r"\s\s+" , " ", s)

or

    re.sub(r"\s\s+", " ", s)

since the space before the comma is listed as a pet peeve in PEP 8, as mentioned by user Martin Thoma in the comments.

User · Answer

This also seems to work:

    while "  " in s:
        s = s.replace("  ", " ")

Where the variable s represents your string.

User · Answer

I haven't read a lot into the other examples, but I have just created this method for consolidating multiple consecutive space characters.

It does not use any libraries, and whilst it is relatively long in terms of script length, it is not a complex implementation:

    def spaceMatcher(command):
        """
        Function defined to consolidate multiple whitespace characters in
        strings to a single space
        """
        new_command = ""
        # Flag whether the previous character was a space
        previous_was_space = False
        for char in command:
            if char == " ":
                # Keep only the first space of each run
                if not previous_was_space:
                    new_command += char
                previous_was_space = True
            else:
                new_command += char
                previous_was_space = False
        return new_command

    command = str(input("Please enter a command ->"))
    print(spaceMatcher(command))
    print(list(spaceMatcher(command)))

User · Answer

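I have my simple method which I have used in college:

    line = "I     have            a       nice    day"

    end = 1000
    while end != 0:
        # str.replace returns a new string, so the result must be assigned
        line = line.replace("  ", " ")
        end -= 1

This replaces every double space with a single space and does so 1000 times. Each pass roughly halves every run of spaces, so 1000 passes are far more than any realistic string needs.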
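To get a feel for how many passes a fixed-count loop like this really needs, here is a minimal sketch that counts them. The helper name passes_needed and the "a ... b" test string are assumptions made for illustration only.

```python
def passes_needed(run_length):
    """Count the replace passes needed to collapse one run of spaces."""
    line = "a" + " " * run_length + "b"
    passes = 0
    while "  " in line:
        # Each pass turns a run of n spaces into ceil(n / 2) spaces.
        line = line.replace("  ", " ")
        passes += 1
    return passes

for n in (2, 4, 1000, 2000):
    print(n, passes_needed(n))
```

Because the number of passes grows only logarithmically with the run length, looping until no double space remains is usually simpler than picking a fixed count.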

User · Answer

    def unPretty(S):
        # Given a dictionary, JSON, list, float, int, or even a string,
        # return a string stripped of CR, LF replaced by space, with multiple spaces reduced to one.
        return ' '.join(str(S).replace('\n', ' ').replace('\r', '').split())

User · Answer

The fastest you can get for user-generated strings is:

    if '  ' in text:
        while '  ' in text:
            text = text.replace('  ', ' ')

The short-circuiting makes it slightly faster than pythonlarry's comprehensive answer. Go for this if you're after efficiency and are strictly looking to weed out extra whitespace of the single-space variety.

User · Answer

Because @pythonlarry asked, here are the missing generator-based versions.

The groupby join is easy. groupby groups consecutive elements that share the same key and returns pairs of a key and an iterator over that group's elements. So when the key is a space, a single space is returned; otherwise the entire group is.

    from itertools import groupby

    def group_join(string):
        return ''.join(' ' if chr == ' ' else ''.join(times) for chr, times in groupby(string))

The groupby variant is simple but very slow. So now for the generator variant. Here we consume an iterator (the string) and yield all chars except spaces that follow a space.

    def generator_join_generator(string):
        last = False
        for c in string:
            if c == ' ':
                if not last:
                    last = True
                    yield ' '
            else:
                last = False
                yield c

    def generator_join(string):
        return ''.join(generator_join_generator(string))

So I measured the timings with some other lorem ipsum:

    while_replace    0.015868543065153062
    re_replace       0.22579886706080288
    proper_join      0.40058281796518713
    group_join       5.53206754301209
    generator_join   1.6673167790286243

With Hello and World separated by 64KB of spaces:

    while_replace    2.991308711003512
    re_replace       0.08232860406860709
    proper_join      6.294375243945979
    group_join       2.4320066600339487
    generator_join   6.329648651066236

Not to forget the original sentence:

    while_replace    0.002160938922315836
    re_replace       0.008620491018518806
    proper_join      0.005650000995956361
    group_join       0.028368217987008393
    generator_join   0.009435956948436797

Interestingly, for strings that are nearly all spaces, group_join is not that much worse. Timings always show the median from seven runs of a thousand iterations each.

User · Answer

One line of code to remove all extra spaces before, after, and within a sentence:

    sentence = "  The   fox jumped   over    the log  "
    sentence = ' '.join(filter(None, sentence.split(' ')))

Explanation:

1. Split the entire string into a list.
2. Filter empty elements from the list.
3. Rejoin the remaining elements with a single space.

The remaining elements should be words or words with punctuation, etc. I did not test this extensively, but this should be a good starting point. All the best!

User · Answer

    sentence = "The   fox jumped   over    the log"
    word = sentence.split()
    result = ""
    for string in word:
        result += string + " "
    print(result)

User · Answer

foo is your string:

    " ".join(foo.split())

Be warned, though, that this removes "all whitespace characters (space, tab, newline, return, formfeed)" (thanks to hhsaffar, see comments). I.e., "this is \t a test\n" will effectively end up as "this is a test".
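The warning can be checked with a short sketch. It contrasts the no-argument split with a space-only variant that leaves tabs and newlines alone; the space-only variant is an illustrative alternative, not part of this answer.

```python
foo = "this is  \t a test\n"

# split() with no argument treats tabs and newlines as separators too,
# so they disappear from the result.
all_whitespace = " ".join(foo.split())

# Splitting on " " only, then dropping the empty strings, collapses
# runs of spaces but keeps tabs and newlines.
spaces_only = " ".join(part for part in foo.split(" ") if part)

print(repr(all_whitespace))
print(repr(spaces_only))
```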

User · Answer

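This does and will do: :)

    # python... 3.x
    import operator

    # collapse_spaces is an illustrative wrapper name
    # line: line of text
    def collapse_spaces(line):
        # Drop the empty strings produced by runs of spaces, then rejoin
        return " ".join(filter(lambda a: operator.ne(a, ""), line.strip().split(" ")))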

User · Answer

" ".join(foo.split()) is not quite correct with respect to the question asked, because it also entirely removes single leading and/or trailing white spaces. So, if they shall also be replaced by 1 blank, you should do something like the following:

    " ".join(('*' + foo + '*').split())[1:-1]

Of course, it's less elegant.
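A minimal sketch of the sentinel trick, wrapped in a function so it can be tried on a few inputs. The function name collapse_keep_edges is hypothetical, and '*' is an arbitrary sentinel; any non-whitespace character works.

```python
def collapse_keep_edges(foo):
    # The sentinels guarantee that leading/trailing whitespace sits
    # *between* tokens, so split/join turns it into one space.
    # The slice then removes the two sentinel characters again.
    return " ".join(("*" + foo + "*").split())[1:-1]

print(repr(collapse_keep_edges("  a   b  ")))
print(repr(collapse_keep_edges("a   b")))
print(repr(collapse_keep_edges("   ")))
```

Note that, like the plain split/join, this still collapses tabs and newlines as well as spaces.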

User · Answer

Using regexes with "\s" and doing simple string.split()'s will also remove other whitespace, like newlines, carriage returns and tabs. Unless this is desired, to only do multiple spaces, I present these examples.

I used 11 paragraphs, 1000 words, 6665 bytes of Lorem Ipsum to get realistic time tests and used random-length extra spaces throughout:

    original_string = ''.join(word + (' ' * random.randint(1, 10)) for word in lorem_ipsum.split(' '))

The one-liner would essentially strip any leading/trailing spaces; the functions below preserve a leading/trailing space (but only ONE ;-).

    # setup = '''

    import re

    def while_replace(string):
        while '  ' in string:
            string = string.replace('  ', ' ')

        return string

    def re_replace(string):
        return re.sub(r' {2,}', ' ', string)

    def proper_join(string):
        split_string = string.split(' ')

        # To account for leading/trailing spaces that would simply be removed
        beg = ' ' if not split_string[0] else ''
        end = ' ' if not split_string[-1] else ''

        # versus simply ' '.join(item for item in string.split(' ') if item)
        return beg + ' '.join(item for item in split_string if item) + end

    original_string = """Lorem    ipsum        ... no, really, it kept going...          malesuada enim feugiat.         Integer imperdiet    erat."""

    assert while_replace(original_string) == re_replace(original_string) == proper_join(original_string)

    # '''

    # while_replace_test
    new_string = original_string[:]

    new_string = while_replace(new_string)

    assert new_string != original_string

    # re_replace_test
    new_string = original_string[:]

    new_string = re_replace(new_string)

    assert new_string != original_string

    # proper_join_test
    new_string = original_string[:]

    new_string = proper_join(new_string)

    assert new_string != original_string

NOTE: The "while version" made a copy of the original_string, as I believe that once it was modified on the first run, successive runs would be faster (if only by a bit). As this adds time, I added this string copy to the other two so that the times showed the difference only in the logic. Keep in mind that the main stmt on timeit instances will only be executed once; the original way I did this, the while loop worked on the same label, original_string, thus on the second run there would be nothing to do. The way it's set up now, calling a function, using two different labels, that isn't a problem. I've added assert statements to all the workers to verify we change something every iteration (for those who may be dubious). E.g., change to this and it breaks:

    # while_replace_test
    new_string = original_string[:]

    new_string = while_replace(new_string)

    assert new_string != original_string  # will break the 2nd iteration

    while '  ' in original_string:
        original_string = original_string.replace('  ', ' ')

Tests run on a laptop with an i5 processor running Windows 7 (64-bit).

    timeit.Timer(stmt = test, setup = setup).repeat(7, 1000)

    test_string = 'The   fox jumped   over\n\t    the log.'  # trivial

    Python 2.7.3, 32-bit, Windows
                    test |    minimum |    maximum |    average |     median
    ---------------------+------------+------------+------------+-----------
      while_replace_test |   0.001066 |   0.001260 |   0.001128 |   0.001092
         re_replace_test |   0.003074 |   0.003941 |   0.003357 |   0.003349
        proper_join_test |   0.002783 |   0.004829 |   0.003554 |   0.003035

    Python 2.7.3, 64-bit, Windows
                    test |    minimum |    maximum |    average |     median
    ---------------------+------------+------------+------------+-----------
      while_replace_test |   0.001025 |   0.001079 |   0.001052 |   0.001051
         re_replace_test |   0.003213 |   0.004512 |   0.003656 |   0.003504
        proper_join_test |   0.002760 |   0.006361 |   0.004626 |   0.004600

    Python 3.2.3, 32-bit, Windows
                    test |    minimum |    maximum |    average |     median
    ---------------------+------------+------------+------------+-----------
      while_replace_test |   0.001350 |   0.002302 |   0.001639 |   0.001357
         re_replace_test |   0.006797 |   0.008107 |   0.007319 |   0.007440
        proper_join_test |   0.002863 |   0.003356 |   0.003026 |   0.002975

    Python 3.3.3, 64-bit, Windows
                    test |    minimum |    maximum |    average |     median
    ---------------------+------------+------------+------------+-----------
      while_replace_test |   0.001444 |   0.001490 |   0.001460 |   0.001459
         re_replace_test |   0.011771 |   0.012598 |   0.012082 |   0.011910
        proper_join_test |   0.003741 |   0.005933 |   0.004341 |   0.004009

    test_string = lorem_ipsum
    # Thanks to http://www.lipsum.com/
    # "Generated 11 paragraphs, 1000 words, 6665 bytes of Lorem Ipsum"

    Python 2.7.3, 32-bit
                    test |    minimum |    maximum |    average |     median
    ---------------------+------------+------------+------------+-----------
      while_replace_test |   0.342602 |   0.387803 |   0.359319 |   0.356284
         re_replace_test |   0.337571 |   0.359821 |   0.348876 |   0.348006
        proper_join_test |   0.381654 |   0.395349 |   0.388304 |   0.388193

    Python 2.7.3, 64-bit
                    test |    minimum |    maximum |    average |     median
    ---------------------+------------+------------+------------+-----------
      while_replace_test |   0.227471 |   0.268340 |   0.240884 |   0.236776
         re_replace_test |   0.301516 |   0.325730 |   0.308626 |   0.307852
        proper_join_test |   0.358766 |   0.383736 |   0.370958 |   0.371866

    Python 3.2.3, 32-bit
                    test |    minimum |    maximum |    average |     median
    ---------------------+------------+------------+------------+-----------
      while_replace_test |   0.438480 |   0.463380 |   0.447953 |   0.446646
         re_replace_test |   0.463729 |   0.490947 |   0.472496 |   0.468778
        proper_join_test |   0.397022 |   0.427817 |   0.406612 |   0.402053

    Python 3.3.3, 64-bit
                    test |    minimum |    maximum |    average |     median
    ---------------------+------------+------------+------------+-----------
      while_replace_test |   0.284495 |   0.294025 |   0.288735 |   0.289153
         re_replace_test |   0.501351 |   0.525673 |   0.511347 |   0.508467
        proper_join_test |   0.422011 |   0.448736 |   0.436196 |   0.440318

For the trivial string, it would seem that a while-loop is the fastest, followed by the Pythonic string-split/join, with regex pulling up the rear.

For non-trivial strings, it seems there's a bit more to consider. 32-bit 2.7? It's regex to the rescue! 2.7 64-bit? A while loop is best, by a decent margin. 32-bit 3.2, go with the "proper" join. 64-bit 3.3, go for a while loop. Again.

In the end, one can improve performance if/where/when needed, but it's always best to remember the mantra:

1. Make It Work
2. Make It Right
3. Make It Fast

IANAL, YMMV, Caveat Emptor!

User · Answer

I've got a simple method without splitting:

    a = "Lorem   Ipsum Darum     Diesrum"
    while True:
        # find() returns -1 when no double space is left; index 0 is a valid match
        count = a.find("  ")
        if count >= 0:
            a = a.replace("  ", " ")
            continue
        else:
            break

    print(a)

User · Answer

To remove white space, considering leading, trailing and extra white space in between words, use:

    (?<=\s) +|^ +(?=\s)| (?= +[\n\0])

The first alternative deals with leading white space, the second deals with start-of-string leading white space, and the last one deals with trailing white space.

For proof of use, this link will provide you with a test:

https://regex101.com/r/meBYli/4

This is to be used with the re.split function.

User · Answer

I have to agree with Paul McGuire's comment. To me,

    ' '.join(the_string.split())

is vastly preferable to whipping out a regex.

My measurements (Linux and Python 2.5) show the split-then-join to be almost five times faster than doing the "re.sub(...)", and still three times faster if you precompile the regex once and do the operation multiple times. And it is by any measure easier to understand; much more Pythonic.
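For anyone who wants to repeat this kind of measurement, here is a rough timeit sketch. The sample string, the `\s+` pattern and the iteration count are assumptions (the answer does not say exactly what was measured), and the ratios will differ across Python versions and machines.

```python
import re
import timeit

the_string = "The   fox jumped   over    the log"
pattern = re.compile(r"\s+")  # precompiled once, reused on every call

def split_join():
    return " ".join(the_string.split())

def re_sub():
    return re.sub(r"\s+", " ", the_string)

def re_sub_compiled():
    return pattern.sub(" ", the_string)

for func in (split_join, re_sub, re_sub_compiled):
    seconds = timeit.timeit(func, number=10_000)
    print(f"{func.__name__:16s} {seconds:.4f} s")
```

All three produce the same result for this input; they differ for strings with leading or trailing whitespace, which split() drops and the regex turns into a single space.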

User · Answer

Solution for Python developers:

    import re

    text1 = 'Python      Exercises    Are   Challenging Exercises'
    print("Original string: ", text1)
    print("Without extra spaces: ", re.sub(' +', ' ', text1))

Output:

    Original string:  Python      Exercises    Are   Challenging Exercises
    Without extra spaces:  Python Exercises Are Challenging Exercises

User · Answer

    import re

    Text = " You can select below trims for removing white space!!   BR Aliakbar     "
    # trims all white spaces
    print('Remove all space:', re.sub(r"\s+", "", Text), sep='')
    # trims left space
    print('Remove leading space:', re.sub(r"^\s+", "", Text), sep='')
    # trims right space
    print('Remove trailing spaces:', re.sub(r"\s+$", "", Text), sep='')
    # trims both
    print('Remove leading and trailing spaces:', re.sub(r"^\s+|\s+$", "", Text), sep='')
    # replace more than one white space in the string with one white space
    print('Remove more than one space:', re.sub(' +', ' ', Text), sep='')

Result:

    Remove all space:Youcanselectbelowtrimsforremovingwhitespace!!BRAliakbar
    Remove leading space:You can select below trims for removing white space!!   BR Aliakbar
    Remove trailing spaces: You can select below trims for removing white space!!   BR Aliakbar
    Remove leading and trailing spaces:You can select below trims for removing white space!!   BR Aliakbar
    Remove more than one space: You can select below trims for removing white space!! BR Aliakbar

User · Answer

Another alternative:

    >>> import re
    >>> str = 'this is a            string with    multiple spaces and    tabs'
    >>> str = re.sub('[ \t]+', ' ', str)
    >>> print(str)
    this is a string with multiple spaces and tabs

User · Answer

I have tried the following method, and it even works with an extreme case like:

    str1 = '          I   live    on    earth           '

    ' '.join(str1.split())

But if you prefer a regular expression, it can be done as:

    re.sub(r'\s+', ' ', str1)

although some extra processing has to be done in order to remove the leading and trailing space.

User · Answer

You can also use the string splitting technique in a Pandas DataFrame without needing to use .apply(..), which is useful if you need to perform the operation quickly on a large number of strings. Here it is on one line:

    df['message'] = (df['message'].str.split()).str.join(' ')

User · Answer

A simple solution:

    >>> import re
    >>> s = "The   fox jumped   over    the log"
    >>> print(re.sub(r'\s+', ' ', s))
    The fox jumped over the log

User · Answer

    string = 'This is a             string full of spaces          and taps'
    string = string.split(' ')
    while '' in string:
        string.remove('')
    string = ' '.join(string)
    print(string)

Results:

    This is a string full of spaces and taps

User · Answer

Similar to the previous solutions, but more specific: replace two or more spaces with one:

    >>> import re
    >>> s = "The   fox jumped   over    the log"
    >>> re.sub(r'\s{2,}', ' ', s)
    'The fox jumped over the log'
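A short sketch of how the "two or more" quantifier differs from a plain `\s+`. The sample string with a tab and a double newline is made up for illustration.

```python
import re

s = "The   fox\tjumped\n\nover the log"

# \s{2,} only touches runs of at least two whitespace characters,
# so the single tab survives while the double newline becomes a space.
two_or_more = re.sub(r"\s{2,}", " ", s)

# \s+ also matches single whitespace characters, so the tab is
# replaced by a space as well.
one_or_more = re.sub(r"\s+", " ", s)

print(repr(two_or_more))
print(repr(one_or_more))
```

If only literal spaces should be collapsed, a pattern such as ` {2,}` (a space followed by the quantifier) leaves tabs and newlines untouched entirely.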
