How can I trim leading and trailing white space?

Question

I am having some trouble with leading and trailing white space in a data.frame. For example, I would like to look at a specific row in a data.frame based on a certain condition:

    > myDummy[myDummy$country == c("Austria"), c(1,2,3:7,19)]

    [1] codeHelper     country        dummyLI        dummyLMI       dummyUMI
    [6] dummyHInonOECD dummyHIOECD    dummyOECD
    <0 rows> (or 0-length row.names)

I was wondering why I didn't get the expected output, since the country Austria obviously existed in my data.frame. After looking through my code history and trying to figure out what went wrong, I tried:

    > myDummy[myDummy$country == c("Austria "), c(1,2,3:7,19)]
       codeHelper  country dummyLI dummyLMI dummyUMI dummyHInonOECD dummyHIOECD
    18        AUT Austria        0        0        0              0           1
       dummyOECD
    18         1

All I have changed in the command is an additional white space after Austria.

Further annoying problems obviously arise, for example when I want to merge two frames based on the country column. One data.frame uses "Austria " while the other frame has "Austria". The matching doesn't work.

Is there a nice way to 'show' the white space on my screen so that I am aware of the problem? And can I remove the leading and trailing white space in R?

So far I have used a simple Perl script that removes the white space, but it would be nice if I could somehow do it inside R.

User · Answer

As of R 3.2.0 a new function was introduced for removing leading/trailing white space: trimws(). See the R documentation page "Remove Leading/Trailing Whitespace" (?trimws).
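A minimal sketch of how trimws() behaves, assuming base R 3.2.0 or later. The vector x is a made-up example, not data from the question; the which argument selects the side to trim:

```r
# Hypothetical example vector; not taken from the question's data
x <- c("  Austria", "Austria  ", "  Austria  ")

both  <- trimws(x)                   # default: which = "both"
left  <- trimws(x, which = "left")   # leading whitespace only
right <- trimws(x, which = "right")  # trailing whitespace only

print(both, quote = TRUE)
```

Because trimws() is vectorised, it can be applied directly to a column such as myDummy$country.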

User · Answer

Use grep or grepl to find observations with trailing white space, and sub to get rid of it.

    names <- c("Ganga Din\t", "Shyam Lal", "Bulbul ")
    grep("[[:space:]]+$", names)
    [1] 1 3
    grepl("[[:space:]]+$", names)
    [1]  TRUE FALSE  TRUE
    sub("[[:space:]]+$", "", names)
    [1] "Ganga Din" "Shyam Lal" "Bulbul"

User · Answer

I created a trim.strings() function to trim leading and/or trailing whitespace:

    # Arguments:    x - character vector
    #            side - side(s) on which to remove whitespace
    #                   default: "both"
    #                   possible values: c("both", "leading", "trailing")

    trim.strings <- function(x, side = "both") {
      # Fall back to "both" when side is not a recognised value
      if (is.na(match(side, c("both", "leading", "trailing")))) {
        side <- "both"
      }
      if (side == "leading") {
        sub("^\\s+", "", x)
      } else {
        if (side == "trailing") {
          sub("\\s+$", "", x)
        } else gsub("^\\s+|\\s+$", "", x)
      }
    }

For illustration:

    a <- c("   ABC123 456    ", " ABC123DEF          ")

    # returns string without leading and trailing whitespace
    trim.strings(a)
    # [1] "ABC123 456" "ABC123DEF"

    # returns string without leading whitespace
    trim.strings(a, side = "leading")
    # [1] "ABC123 456    "      "ABC123DEF          "

    # returns string without trailing whitespace
    trim.strings(a, side = "trailing")
    # [1] "   ABC123 456" " ABC123DEF"

User · Answer

A simple function to remove leading and trailing whitespace:

    trim <- function(x) {
      gsub("(^[[:space:]]+|[[:space:]]+$)", "", x)
    }

Usage:

    > text = "   foo bar  baz 3 "
    > trim(text)
    [1] "foo bar  baz 3"

User · Answer

Ad 1) To see white space you can call print.data.frame directly with modified arguments:

    print(head(iris), quote=TRUE)
      Sepal.Length Sepal.Width Petal.Length Petal.Width  Species
    1        "5.1"       "3.5"        "1.4"       "0.2" "setosa"
    2        "4.9"       "3.0"        "1.4"       "0.2" "setosa"
    3        "4.7"       "3.2"        "1.3"       "0.2" "setosa"
    4        "4.6"       "3.1"        "1.5"       "0.2" "setosa"
    5        "5.0"       "3.6"        "1.4"       "0.2" "setosa"
    6        "5.4"       "3.9"        "1.7"       "0.4" "setosa"

See also ?print.data.frame for other options.
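Quoting makes stray whitespace visible; to locate the affected values programmatically, one possible sketch is shown here. The vector country is a hypothetical stand-in for a column such as myDummy$country, and the regular expression flags values that start or end with whitespace:

```r
# Hypothetical stand-in for a column like myDummy$country
country <- c("Austria", "Austria ", " Spain", "USA")

# TRUE where a value begins or ends with a whitespace character
has_ws <- grepl("^\\s|\\s$", country)

# Positions of the affected values, then the values themselves in quotes
which(has_ws)
print(country[has_ws], quote = TRUE)
```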

User · Answer

Another related problem occurs if you have multiple spaces in between inputs:

    > a <- "  a string         with lots   of starting, inter   mediate and trailing   whitespace     "

You can then easily split this string into "real" tokens by passing a regular expression to the split argument:

    > strsplit(a, split=" +")
    [[1]]
     [1] ""           "a"          "string"     "with"       "lots"
     [6] "of"         "starting,"  "inter"      "mediate"    "and"
    [11] "trailing"   "whitespace"

Note that if there is a match at the beginning of a (non-empty) string, the first element of the output is "", but if there is a match at the end of the string, the output is the same as with the match removed.
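As a small sketch of how to avoid the empty first token, the string can be trimmed before splitting. The string a below is a shortened, made-up variant of the one in this answer, and trimws() assumes R 3.2.0 or later:

```r
# Shortened, hypothetical variant of the string used in the answer
a <- "  a string   with lots   of   whitespace   "

# Trim the ends first so strsplit() yields no empty first token
tokens <- strsplit(trimws(a), " +")[[1]]

# Alternatively, collapse internal runs of whitespace into single spaces
squeezed <- gsub("\\s+", " ", trimws(a))

print(tokens)
print(squeezed)
```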

User · Answer

Leading and trailing blanks can also be removed with the trim() function from the gdata package:

    require(gdata)
    example(trim)

Usage example:

    > trim("   Remove leading and trailing blanks    ")
    [1] "Remove leading and trailing blanks"

I'd prefer to add this as a comment on user56's answer, but I am not yet able to, so I am writing it as an independent answer.

User · Answer

Another option is to use the stri_trim function from the stringi package, which defaults to removing leading and trailing whitespace:

    > x <- c("  leading space","trailing space   ")
    > stri_trim(x)
    [1] "leading space"  "trailing space"

For only removing leading whitespace, use stri_trim_left. For only removing trailing whitespace, use stri_trim_right. When you want to remove other leading or trailing characters, you have to specify that with the pattern argument.

See also ?stri_trim for more info.

User · Answer

I tried trim(). It works well with white space as well as with '\n'.

    x = '\n              Harden, J.\n              '

    trim(x)

User · Answer

To manipulate the white space, use str_trim() in the stringr package. The package manual is dated Feb 15, 2013, and the package is on CRAN. The function can also handle string vectors.

    install.packages("stringr", dependencies=TRUE)
    require(stringr)
    example(str_trim)
    d4$clean2 <- str_trim(d4$V2)

(Credit goes to commenter: R. Cotton)

User · Answer

The best method is trimws(). The following code will apply this function to the entire data frame:

    mydataframe <- data.frame(lapply(mydataframe, trimws), stringsAsFactors = FALSE)
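One caveat worth sketching: trimws() returns character vectors, so applying it to every column turns numeric columns into text. A possible variant, shown here on a hypothetical data frame df, restricts the call to character columns:

```r
# Hypothetical data frame with one character and one numeric column
df <- data.frame(country = c(" Austria", "Spain "),
                 value   = c(1.5, 2),
                 stringsAsFactors = FALSE)

# Identify the character columns and trim only those
is_chr <- vapply(df, is.character, logical(1))
df[is_chr] <- lapply(df[is_chr], trimws)

str(df)
```

Factor columns are left untouched by this sketch; they would need their levels trimmed separately.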

User · Answer

Probably the best way is to handle the trailing white space when you read your data file. If you use read.csv or read.table you can set the parameter strip.white=TRUE.

If you want to clean strings afterwards you could use one of these functions:

    # Returns string without leading white space
    trim.leading <- function (x)  sub("^\\s+", "", x)

    # Returns string without trailing white space
    trim.trailing <- function (x) sub("\\s+$", "", x)

    # Returns string without leading or trailing white space
    trim <- function (x) gsub("^\\s+|\\s+$", "", x)

To use one of these functions on myDummy$country:

    myDummy$country <- trim(myDummy$country)

To 'show' the white space you could use:

    paste(myDummy$country)

which will show you the strings surrounded by quotation marks ("), making white space easier to spot.
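A minimal sketch of the strip.white option, using an inline CSV string in place of a real file. The contents of csv_text are invented for illustration, and text = is used only so the example is self-contained:

```r
# Hypothetical CSV content with stray whitespace around the country names
csv_text <- "country,code\nAustria ,AUT\n  Spain,ESP\n"

# Default behaviour keeps the whitespace in character fields
raw_df <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# strip.white = TRUE removes it from unquoted character fields on read
clean_df <- read.csv(text = csv_text, strip.white = TRUE,
                     stringsAsFactors = FALSE)

print(raw_df$country, quote = TRUE)
print(clean_df$country, quote = TRUE)
```

Note that strip.white applies to unquoted fields, so whitespace inside quoted fields would still need one of the trim functions afterwards.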

User · Answer

Replace the offending value directly:

    myDummy$country[myDummy$country == "Austria "] <- "Austria"

If country is a factor that has no "Austria" level yet, convert it with as.character() first; otherwise the assignment produces NA. After this, you'll need to force R not to recognize "Austria " as a level. Let's pretend you also have "USA" and "Spain" as levels:

    myDummy$country = factor(myDummy$country, levels=c("Austria", "USA", "Spain"))

It is a little less intimidating than the highest voted response, but it should still work.
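When the column is a factor, a different technique from the one in this answer is to trim the level labels themselves. The sketch below uses a hypothetical factor f and relies on the fact that assigning duplicated labels to levels() merges the corresponding levels:

```r
# Hypothetical factor in which "Austria " and "Austria" are separate levels
f <- factor(c("Austria ", "Austria", "USA", "Spain"))

# Trimming the labels makes the two Austria levels identical, so they merge
levels(f) <- trimws(levels(f))

print(levels(f))
print(as.character(f))
```

This avoids listing every level by hand, which may matter when the column holds many countries.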
