Combine two or more columns in a dataframe into a new column with a new name

Question

For example, if I have this:

    n = c(2, 3, 5)
    s = c("aa", "bb", "cc")
    b = c(TRUE, FALSE, TRUE)
    df = data.frame(n, s, b)

      n  s     b
    1 2 aa  TRUE
    2 3 bb FALSE
    3 5 cc  TRUE

Then how do I combine the two columns n and s into a new column named x such that it looks like this?

      n  s     b    x
    1 2 aa  TRUE 2 aa
    2 3 bb FALSE 3 bb
    3 5 cc  TRUE 5 cc

User · Answer

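Some examples with NAs and their removal using apply:

    n = c(2, NA, NA)
    s = c("aa", "bb", NA)
    b = c(TRUE, FALSE, NA)
    c = c(2, 3, 5)
    d = c("aa", NA, "cc")
    e = c(TRUE, NA, TRUE)
    df = data.frame(n, s, b, c, d, e)

    # Drop NA, empty strings and the literal "NA", then join what is left with sep
    paste_noNA <- function(x, sep = ", ") {
      gsub(", ", sep, toString(x[!is.na(x) & x != "" & x != "NA"]))
    }

    sep = " "
    df$x <- apply(df[, c(1:6)], 1, paste_noNA, sep = sep)
    df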

User · Answer

Use paste:

    df$x <- paste(df$n, df$s)
    df
    #   n  s     b    x
    # 1 2 aa  TRUE 2 aa
    # 2 3 bb FALSE 3 bb
    # 3 5 cc  TRUE 5 cc

User · Answer

For inserting a separator:

    df$x <- paste(df$n, "-", df$s)
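Passing "-" as an extra value still leaves the default space of paste on both sides of it. The sketch below, which rebuilds the data frame from the question, contrasts that with passing the dash through the sep argument; the variable names spaced and tight are only illustrative.

```r
# Data frame from the question
df <- data.frame(n = c(2, 3, 5),
                 s = c("aa", "bb", "cc"),
                 b = c(TRUE, FALSE, TRUE))

# "-" passed as a value: the default sep = " " is inserted around it
spaced <- paste(df$n, "-", df$s)

# "-" passed as the separator itself: no surrounding spaces
tight <- paste(df$n, df$s, sep = "-")

print(spaced)  # "2 - aa" "3 - bb" "5 - cc"
print(tight)   # "2-aa" "3-bb" "5-cc"
```

Which form is preferable depends on whether the spaces around the separator are wanted in the new column.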

User · Answer

We can use paste0:

    df$combField <- paste0(df$x, df$y)

Use this if you do not want any padding space introduced in the concatenated field. It is most useful when you plan to use the combined field as a unique id that represents combinations of two fields. (Here x and y stand for whichever two columns you want to combine.)
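As a rough illustration with the columns from the question, the sketch below builds such an id with paste0. It also shows a caveat that is easy to overlook: with no separator at all, two different pairs of values can produce the same string, so an explicit separator may be safer for keys. The column name id and the sample vectors a and b are assumptions made for the example.

```r
# Data frame from the question
df <- data.frame(n = c(2, 3, 5),
                 s = c("aa", "bb", "cc"),
                 b = c(TRUE, FALSE, TRUE))

# Combined field with no padding space
df$id <- paste0(df$n, df$s)
print(df$id)  # "2aa" "3bb" "5cc"

# Caveat: without a separator, distinct pairs can collide
a <- c(1, 11)
b <- c(12, 2)
collide <- paste0(a, b)           # both elements become "112"
distinct <- paste(a, b, sep = "_")  # "1_12" "11_2"
print(collide)
print(distinct)
```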

User · Answer

Using dplyr::mutate:

    library(dplyr)
    df <- mutate(df, x = paste(n, s))

    df
    #   n  s     b    x
    # 1 2 aa  TRUE 2 aa
    # 2 3 bb FALSE 3 bb
    # 3 5 cc  TRUE 5 cc

User · Answer

There are other great answers, but in the case where you don't know the column names or the number of columns you want to concatenate beforehand, the following is useful:

    df = data.frame(x = letters[1:5], y = letters[6:10], z = letters[11:15])
    colNames = colnames(df)  # could be any number of column names here
    df$newColumn = apply(df[, colNames, drop = F], MARGIN = 1,
                         FUN = function(i) paste(i, collapse = ""))
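A possible alternative for the same situation, not taken from the answer above, is to hand the selected columns to paste through do.call, which avoids the row-by-row apply. This is a minimal sketch using the same example data; the name newColumn2 is only there to keep the result separate.

```r
df <- data.frame(x = letters[1:5], y = letters[6:10], z = letters[11:15])
colNames <- colnames(df)  # could be any number of column names here

# df[colNames] is a list of columns; appending sep = "" makes the call
# equivalent to paste(df$x, df$y, df$z, sep = "")
df$newColumn2 <- do.call(paste, c(df[colNames], sep = ""))

print(df$newColumn2)  # "afk" "bgl" "chm" "din" "ejo"
```

Because paste is vectorised over its arguments, this works on whole columns at once, whereas apply first converts the selected columns to a character matrix.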

User · Answer

As already mentioned in comments by Uwe and UseR, a general solution in the tidyverse format would be to use the command unite:

    library(tidyverse)

    n = c(2, 3, 5)
    s = c("aa", "bb", "cc")
    b = c(TRUE, FALSE, TRUE)

    df = data.frame(n, s, b) %>%
      unite(x, c(n, s), sep = " ", remove = FALSE)

User · Answer

Instead of

- paste (default spaces),
- paste0 (force the inclusion of missing NA as character) or
- unite (constrained to 2 columns and 1 separator),

I'd suggest an alternative as flexible as paste0 but more careful with NA: stringr::str_c

    library(tidyverse)

    # check the missing value!!
    df <- tibble(
      n = c(2, 2, 8),
      s = c("aa", "aa", NA_character_),
      b = c(TRUE, FALSE, TRUE)
    )

    df %>%
      mutate(
        paste = paste(n, "-", s, ".", b),
        paste0 = paste0(n, "-", s, ".", b),
        str_c = str_c(n, "-", s, ".", b)
      ) %>%

      # convert missing value to ""
      mutate(
        s_2 = str_replace_na(s, replacement = "")
      ) %>%
      mutate(
        str_c_2 = str_c(n, "-", s_2, ".", b)
      )
    #> # A tibble: 3 x 8
    #>       n s     b     paste          paste0     str_c      s_2   str_c_2
    #>   <dbl> <chr> <lgl> <chr>          <chr>      <chr>      <chr> <chr>
    #> 1     2 aa    TRUE  2 - aa . TRUE  2-aa.TRUE  2-aa.TRUE  "aa"  2-aa.TRUE
    #> 2     2 aa    FALSE 2 - aa . FALSE 2-aa.FALSE 2-aa.FALSE "aa"  2-aa.FALSE
    #> 3     8 <NA>  TRUE  8 - NA . TRUE  8-NA.TRUE  <NA>       ""    8-.TRUE

Created on 2020-04-10 by the reprex package (v0.3.0)

Extra note from the str_c documentation:

    Like most other R functions, missing values are "infectious": whenever a missing value is combined with another string the result will always be missing. Use str_replace_na() to convert NA to "NA".
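For readers who want the same NA behaviour without loading stringr, the following base R sketch shows the three outcomes side by side. It is an approximation under stated assumptions rather than a drop-in replacement for str_c: the vectors are a cut-down version of the example above, and the names with_na_text, propagated and blanked are made up for illustration.

```r
n <- c(2, 2, 8)
s <- c("aa", "aa", NA)

# paste0() converts the missing value to the literal text "NA"
with_na_text <- paste0(n, "-", s)

# Propagate the missing value instead, as str_c() does
propagated <- ifelse(is.na(n) | is.na(s), NA_character_, paste0(n, "-", s))

# Replace the missing value with an empty string before combining
blanked <- paste0(n, "-", ifelse(is.na(s), "", s))

print(with_na_text)  # "2-aa" "2-aa" "8-NA"
print(propagated)    # "2-aa" "2-aa" NA
print(blanked)       # "2-aa" "2-aa" "8-"
```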
