How to replace multiple white spaces with one white space

Question

Let's say I have a string such as:

    "Hello     how are   you           doing?"

I would like a function that turns multiple spaces into one space, so I would get:

    "Hello how are you doing?"

I know I could use a regex or call

    string s = "Hello     how are   you           doing?".Replace("  ", " ");

but I would have to call it multiple times to make sure all sequential whitespace is replaced with only one space. Is there already a built-in method for this?

User · Answer

There is no built-in way to do this. You can try this:

    private static readonly char[] whitespace = new char[] { ' ', '\n', '\t', '\r', '\f', '\v' };

    public static string Normalize(string source)
    {
        return String.Join(" ", source.Split(whitespace, StringSplitOptions.RemoveEmptyEntries));
    }

This will remove leading and trailing whitespace as well as collapse any internal whitespace to a single whitespace character. If you really only want to collapse spaces, then the solutions using a regular expression are better; otherwise this solution is better. (See the analysis done by Jon Skeet.)

User · Answer

While the existing answers are fine, I'd like to point out one approach which doesn't work:

    public static string DontUseThisToCollapseSpaces(string text)
    {
        while (text.IndexOf("  ") != -1)
        {
            text = text.Replace("  ", " ");
        }
        return text;
    }

This can loop forever. Anyone care to guess why? (I only came across this when it was asked as a newsgroup question a few years ago... someone actually ran into it as a problem.)
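One possible explanation, offered here as an assumption because the answer leaves it as a puzzle: `text.IndexOf("  ")` performs a culture-sensitive search, which can skip zero-width characters such as U+200C, whereas `text.Replace("  ", " ")` compares ordinally. A string like space, U+200C, space could then be "found" by `IndexOf` but never changed by `Replace`. The sketch below (class and method names are hypothetical) forces `IndexOf` to be ordinal so that both calls agree and the loop terminates:

```csharp
using System;

public static class SpaceCollapser
{
    // Hypothetical helper: the same loop, but IndexOf is ordinal so it agrees
    // with string.Replace(string, string), which always compares ordinally.
    public static string CollapseSpacesOrdinal(string text)
    {
        while (text.IndexOf("  ", StringComparison.Ordinal) != -1)
        {
            // Every pass shortens the string, so the loop must end.
            text = text.Replace("  ", " ");
        }
        return text;
    }
}
```

With ordinal matching, a string that contains no two adjacent space characters is returned unchanged.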

User · Answer

Smallest solution (JavaScript):

    var regExp = /\s+/g,
        newString = oldString.replace(regExp, ' ');

User · Answer

Here is the solution I work with, without Regex and String.Split:

    public static string TrimWhiteSpace(this string Value)
    {
        StringBuilder sbOut = new StringBuilder();
        if (!string.IsNullOrEmpty(Value))
        {
            bool IsWhiteSpace = false;
            for (int i = 0; i < Value.Length; i++)
            {
                if (char.IsWhiteSpace(Value[i])) // Comparison with whitespace
                {
                    if (!IsWhiteSpace) // Comparison with previous char
                    {
                        sbOut.Append(Value[i]);
                        IsWhiteSpace = true;
                    }
                }
                else
                {
                    IsWhiteSpace = false;
                    sbOut.Append(Value[i]);
                }
            }
        }
        return sbOut.ToString();
    }

So you can write:

    string cleanedString = dirtyString.TrimWhiteSpace();

User · Answer

    Regex regex = new Regex(@"\W+");
    string outputString = regex.Replace(inputString, " ");
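Note that `\W` matches any non-word character, so the pattern above also replaces punctuation, not just whitespace. A minimal sketch comparing it with `\s+` (the class and method names are hypothetical):

```csharp
using System.Text.RegularExpressions;

public static class PatternComparison
{
    // \W+ replaces every run of non-word characters, punctuation included.
    public static string CollapseNonWord(string input) =>
        Regex.Replace(input, @"\W+", " ");

    // \s+ replaces only runs of whitespace and leaves punctuation alone.
    public static string CollapseWhitespace(string input) =>
        Regex.Replace(input, @"\s+", " ");
}
```

For the input `"Hello,   world!"`, `CollapseNonWord` yields `"Hello world "` while `CollapseWhitespace` yields `"Hello, world!"`.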

User · Answer

Using the test program that Jon Skeet posted, I tried to see if I could get a hand-written loop to run faster. I can beat NormalizeWithSplitAndJoin every time, but only beat NormalizeWithRegex with inputs of 1000, 5.

    static string NormalizeWithLoop(string input)
    {
        StringBuilder output = new StringBuilder(input.Length);

        char lastChar = '*';  // anything other than space
        for (int i = 0; i < input.Length; i++)
        {
            char thisChar = input[i];
            if (!(lastChar == ' ' && thisChar == ' '))
                output.Append(thisChar);

            lastChar = thisChar;
        }

        return output.ToString();
    }

I have not looked at the machine code the jitter produces; however, I expect the problem is the time taken by the call to StringBuilder.Append(), and doing much better would need the use of unsafe code.

So Regex.Replace() is very fast and hard to beat!

User · Answer

Replacement groups provide a simpler approach: they replace a run of identical whitespace characters with a single instance of that same character.

    public static void WhiteSpaceReduce()
    {
        string t1 = "a b   c d";
        string t2 = "a b\n\nc\nd";

        Regex whiteReduce = new Regex(@"(?<firstWS>\s)(?<repeatedWS>\k<firstWS>+)");
        Console.WriteLine("{0}", t1);
        Console.WriteLine("{0}", whiteReduce.Replace(t1, x => x.Value.Substring(0, 1)));
        Console.WriteLine("{0}", whiteReduce.Replace(t1, @"${firstWS}"));
        Console.WriteLine("\nNext example ---------");
        Console.WriteLine("{0}", t2);
        Console.WriteLine("{0}", whiteReduce.Replace(t2, @"${firstWS}"));
        Console.WriteLine();
    }

Please notice that the second example keeps a single \n, while the accepted answer would replace the end of line with a space.

If you need to replace any combination of whitespace characters with the first one, just remove the back-reference \k from the pattern.
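The two variants can be put side by side. This is a sketch under one assumption: "removing the back-reference" is read here as replacing `\k<firstWS>+` with `\s+`, which the answer does not spell out. The class and method names are hypothetical.

```csharp
using System.Text.RegularExpressions;

public static class WhiteSpaceReducer
{
    // Collapses a run of the SAME whitespace character to one instance of it.
    public static string ReduceIdentical(string input) =>
        Regex.Replace(input, @"(?<firstWS>\s)\k<firstWS>+", "${firstWS}");

    // Assumed reading of "remove the back-reference": any run of mixed
    // whitespace is replaced with the first character of the run.
    public static string ReduceToFirst(string input) =>
        Regex.Replace(input, @"(?<firstWS>\s)\s+", "${firstWS}");
}
```

With `ReduceIdentical`, a mixed run such as space, newline, space is left alone because no character repeats; `ReduceToFirst` reduces that same run to its leading space.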

User · Answer

I'm sharing what I use, because it appears I've come up with something different. I've been using this for a while and it is fast enough for me. I'm not sure how it stacks up against the others. I use it in a delimited file writer and run large DataTables through it one field at a time.

    public static string NormalizeWhiteSpace(string S)
    {
        string s = S.Trim();
        bool iswhite = false;
        int sLength = s.Length;
        StringBuilder sb = new StringBuilder(sLength);
        foreach (char c in s.ToCharArray())
        {
            if (Char.IsWhiteSpace(c))
            {
                if (iswhite)
                {
                    // Continuing whitespace: ignore it.
                    continue;
                }
                else
                {
                    // New whitespace: replace it with a single space.
                    sb.Append(" ");
                    // Set iswhite to true so any following whitespace is ignored.
                    iswhite = true;
                }
            }
            else
            {
                sb.Append(c.ToString());
                // Reset iswhite to false.
                iswhite = false;
            }
        }
        return sb.ToString();
    }

User · Answer

A fast extra-whitespace remover by Felipe Machado, modified by RW for multi-space removal:

    static string DuplicateWhiteSpaceRemover(string str)
    {
        var len = str.Length;
        var src = str.ToCharArray();
        int dstIdx = 0;
        bool lastWasWS = false; // Added line
        for (int i = 0; i < len; i++)
        {
            var ch = src[i];
            switch (ch)
            {
                case '\u0020': // SPACE
                case '\u00A0': // NO-BREAK SPACE
                case '\u1680': // OGHAM SPACE MARK
                case '\u2000': // EN QUAD
                case '\u2001': // EM QUAD
                case '\u2002': // EN SPACE
                case '\u2003': // EM SPACE
                case '\u2004': // THREE-PER-EM SPACE
                case '\u2005': // FOUR-PER-EM SPACE
                case '\u2006': // SIX-PER-EM SPACE
                case '\u2007': // FIGURE SPACE
                case '\u2008': // PUNCTUATION SPACE
                case '\u2009': // THIN SPACE
                case '\u200A': // HAIR SPACE
                case '\u202F': // NARROW NO-BREAK SPACE
                case '\u205F': // MEDIUM MATHEMATICAL SPACE
                case '\u3000': // IDEOGRAPHIC SPACE
                case '\u2028': // LINE SEPARATOR
                case '\u2029': // PARAGRAPH SEPARATOR
                case '\u0009': // [ASCII Tab]
                case '\u000A': // [ASCII Line Feed]
                case '\u000B': // [ASCII Vertical Tab]
                case '\u000C': // [ASCII Form Feed]
                case '\u000D': // [ASCII Carriage Return]
                case '\u0085': // NEXT LINE
                    if (lastWasWS == false) // Added line
                    {
                        src[dstIdx++] = ' '; // Updated by Ryan
                        lastWasWS = true; // Added line
                    }
                    continue;
                default:
                    lastWasWS = false; // Added line
                    src[dstIdx++] = ch;
                    break;
            }
        }
        return new string(src, 0, dstIdx);
    }

The benchmarks:

    |                            | Time    | TEST 1      | TEST 2      | TEST 3       | TEST 4     | TEST 5     |
    | Function Name              | (ticks) | dup. spaces | spaces+tabs | spaces+CR/LF | " " -> " " | " " -> " " |
    |----------------------------|---------|-------------|-------------|--------------|------------|------------|
    | SwitchStmtBuildSpaceOnly   |   5.2   | PASS        | FAIL        | FAIL         | PASS       | PASS       |
    | InPlaceCharArraySpaceOnly  |   5.6   | PASS        | FAIL        | FAIL         | PASS       | PASS       |
    | DuplicateWhiteSpaceRemover |   7.0   | PASS        | PASS        | PASS         | PASS       | PASS       |
    | SingleSpacedTrim           |  11.8   | PASS        | PASS        | PASS         | FAIL       | FAIL       |
    | Fubo(StringBuilder)        |  13     | PASS        | FAIL        | FAIL         | PASS       | PASS       |
    | User214147                 |  19     | PASS        | PASS        | PASS         | FAIL       | FAIL       |
    | RegExWithCompile           |  28     | PASS        | FAIL        | FAIL         | PASS       | PASS       |
    | SwitchStmtBuild            |  34     | PASS        | FAIL        | FAIL         | PASS       | PASS       |
    | SplitAndJoinOnSpace        |  55     | PASS        | FAIL        | FAIL         | FAIL       | FAIL       |
    | RegExNoCompile             | 120     | PASS        | PASS        | PASS         | PASS       | PASS       |
    | RegExBrandon               | 137     | PASS        | FAIL        | PASS         | PASS       | PASS       |

Benchmark notes: Release mode, no debugger attached, i7 processor, average of 4 runs, only short strings tested.

    SwitchStmtBuildSpaceOnly  by Felipe Machado 2015 and modified by Sunsetquest
    InPlaceCharArraySpaceOnly by Felipe Machado 2015 and modified by Sunsetquest
    SwitchStmtBuild           by Felipe Machado 2015 and modified by Sunsetquest
    SwitchStmtBuild2          by Felipe Machado 2015 and modified by Sunsetquest
    SingleSpacedTrim          by David S 2013
    Fubo(StringBuilder)       by fubo 2014
    SplitAndJoinOnSpace       by Jon Skeet 2009
    RegExWithCompile          by Jon Skeet 2009
    User214147                by user214147
    RegExBrandon              by Brandon
    RegExNoCompile            by Tim Hoolihan

Benchmark code is on GitHub.

User · Answer

    string cleanedString = System.Text.RegularExpressions.Regex.Replace(dirtyString, @"\s+", " ");

User · Answer

A regular expression would be the easiest way. If you write the regex the correct way, you won't need multiple calls.

Change it to this:

    string s = System.Text.RegularExpressions.Regex.Replace(s, @"\s{2,}", " ");
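The quantifier matters here: `\s{2,}` only touches runs of two or more whitespace characters, so a lone tab or newline survives, while `\s+` turns every whitespace character into a space. A small sketch of the difference (class and method names are hypothetical):

```csharp
using System.Text.RegularExpressions;

public static class QuantifierComparison
{
    // Only runs of two or more whitespace characters are replaced.
    public static string CollapseRuns(string input) =>
        Regex.Replace(input, @"\s{2,}", " ");

    // Every run, including a single tab or newline, becomes one space.
    public static string NormalizeAll(string input) =>
        Regex.Replace(input, @"\s+", " ");
}
```

For `"a\tb   c"`, `CollapseRuns` keeps the tab and returns `"a\tb c"`, whereas `NormalizeAll` returns `"a b c"`.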

User · Answer

You can try this:

    /// <summary>
    /// Remove all extra spaces and tabs between words in the specified string!
    /// </summary>
    /// <param name="str">The specified string.</param>
    public static string RemoveExtraSpaces(string str)
    {
        str = str.Trim();
        StringBuilder sb = new StringBuilder();
        bool space = false;
        foreach (char c in str)
        {
            if (char.IsWhiteSpace(c) || c == (char)9)
            {
                space = true;
            }
            else
            {
                if (space) { sb.Append(' '); }
                sb.Append(c);
                space = false;
            }
        }
        return sb.ToString();
    }

User · Answer

This question isn't as simple as other posters have made it out to be (and as I originally believed it to be) - because the question isn't quite as precise as it needs to be.

There's a difference between "space" and "whitespace". If you only mean spaces, then you should use a regex of " {2,}". If you mean any whitespace, that's a different matter. Should all whitespace be converted to spaces? What should happen to space at the start and end?

For the benchmark below, I've assumed that you only care about spaces, and you don't want to do anything to single spaces, even at the start and end.

Note that correctness is almost always more important than performance. The fact that the Split/Join solution removes any leading/trailing whitespace (even just single spaces) is incorrect as far as your specified requirements (which may be incomplete, of course).

The benchmark uses MiniBench.

    using System;
    using System.Text.RegularExpressions;
    using MiniBench;

    internal class Program
    {
        public static void Main(string[] args)
        {
            int size = int.Parse(args[0]);
            int gapBetweenExtraSpaces = int.Parse(args[1]);

            char[] chars = new char[size];
            for (int i = 0; i < size / 2; i += 2)
            {
                // Make sure there actually *is* something to do
                chars[i * 2] = (i % gapBetweenExtraSpaces == 1) ? ' ' : 'x';
                chars[i * 2 + 1] = ' ';
            }
            // Just to make sure we don't have a \0 at the end
            // for odd sizes
            chars[chars.Length - 1] = 'y';

            string bigString = new string(chars);
            // Assume that one form works :)
            string normalized = NormalizeWithSplitAndJoin(bigString);

            var suite = new TestSuite<string, string>("Normalize")
                .Plus(NormalizeWithSplitAndJoin)
                .Plus(NormalizeWithRegex)
                .RunTests(bigString, normalized);

            suite.Display(ResultColumns.All, suite.FindBest());
        }

        private static readonly Regex MultipleSpaces =
            new Regex(@" {2,}", RegexOptions.Compiled);

        static string NormalizeWithRegex(string input)
        {
            return MultipleSpaces.Replace(input, " ");
        }

        // Guessing as the post doesn't specify what to use
        private static readonly char[] Whitespace =
            new char[] { ' ' };

        static string NormalizeWithSplitAndJoin(string input)
        {
            string[] split = input.Split
                (Whitespace, StringSplitOptions.RemoveEmptyEntries);
            return string.Join(" ", split);
        }
    }

A few test runs:

    c:\Users\Jon\Test>test 1000 50
    ============ Normalize ============
    NormalizeWithSplitAndJoin  1159091 0:30.258 22.93
    NormalizeWithRegex        26378882 0:30.025  1.00

    c:\Users\Jon\Test>test 1000 5
    ============ Normalize ============
    NormalizeWithSplitAndJoin  947540 0:30.013 1.07
    NormalizeWithRegex        1003862 0:29.610 1.00

    c:\Users\Jon\Test>test 1000 1001
    ============ Normalize ============
    NormalizeWithSplitAndJoin  1156299 0:29.898 21.99
    NormalizeWithRegex        23243802 0:27.335  1.00

Here the first number is the number of iterations, the second is the time taken, and the third is a scaled score with 1.0 being the best.

That shows that in at least some cases (including this one) a regular expression can outperform the Split/Join solution, sometimes by a very significant margin.

However, if you change to an "all whitespace" requirement, then Split/Join does appear to win. As is so often the case, the devil is in the detail...
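The correctness point about leading and trailing spaces can be seen without the benchmark harness. The following is a minimal sketch that reuses the two approaches from the answer; the class name and the public method names are hypothetical, and MiniBench is not needed:

```csharp
using System;
using System.Text.RegularExpressions;

public static class SpaceVersusWhitespace
{
    private static readonly Regex MultipleSpaces = new Regex(" {2,}");

    // Collapses runs of spaces only; single spaces at either end are kept.
    public static string WithRegex(string input) =>
        MultipleSpaces.Replace(input, " ");

    // Splits on spaces and rejoins; leading and trailing spaces disappear.
    public static string WithSplitAndJoin(string input) =>
        string.Join(" ", input.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries));
}
```

For the input `" a  b\tc "`, `WithRegex` returns `" a b\tc "` while `WithSplitAndJoin` returns `"a b\tc"`. Neither touches the tab, since both only look at the space character.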

User · Answer

VB.NET:

    Linha.Split(" ").ToList().Where(Function(x) x <> "").ToArray()

C#:

    Linha.Split(' ').ToList().Where(x => x != "").ToArray();

Enjoy the power of LINQ =D
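The LINQ expression produces an array of words rather than a string, so one more step is needed to get the collapsed text back. A minimal sketch, assuming the goal is a single space between words (the class and method names are hypothetical):

```csharp
using System.Linq;

public static class LinqCollapse
{
    // Split on spaces, drop the empty entries that consecutive spaces
    // produce, then join the remaining words with one space each.
    public static string CollapseSpaces(string line) =>
        string.Join(" ", line.Split(' ').Where(x => x != ""));
}
```

Like the other Split-based answers, this also drops leading and trailing spaces, and it only treats the space character as a separator.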

User · Answer

As already pointed out, this is easily done with a regular expression. I'll just add that you might want to add a .Trim() to that to get rid of leading/trailing whitespace.
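Combining the two steps might look like the sketch below. The `\s+` pattern is taken from the other answers in this thread, and the class and method names are hypothetical:

```csharp
using System.Text.RegularExpressions;

public static class TrimmedCollapse
{
    // Collapse every whitespace run to one space, then strip both ends.
    public static string Normalize(string input) =>
        Regex.Replace(input, @"\s+", " ").Trim();
}
```

For example, `TrimmedCollapse.Normalize("  a \t b\n")` returns `"a b"`.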
