How to concatenate multiple column values into a single column in a pandas DataFrame

Question

This question is the same as one posted earlier, except that I want to concatenate three columns instead of two.

Here is the code for combining two columns:

df = DataFrame({'foo': ['a', 'b', 'c'], 'bar': [1, 2, 3], 'new': ['apple', 'banana', 'pear']})

df['combined'] = df.apply(lambda x: '%s_%s' % (x['foo'], x['bar']), axis=1)

df
   bar foo     new combined
0    1   a   apple      a_1
1    2   b  banana      b_2
2    3   c    pear      c_3

I want to combine three columns with this command, but it is not working. Any idea?

df['combined'] = df.apply(lambda x: '%s_%s' % (x['bar'], x['foo'], x['new']), axis=1)

User · Answer

If you have even more columns you want to combine, using the Series method str.cat might be handy:

df["combined"] = df["foo"].str.cat(df[["bar", "new"]].astype(str), sep="_")

Basically, you select the first column (if it is not already of type str, you need to append .astype(str)), to which you append the other columns (separated by an optional separator character).
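A minimal, self-contained sketch of the str.cat approach, assuming the three-column frame from the question (it is rebuilt here only so the snippet runs on its own):

```python
import pandas as pd

# Sample frame from the question, rebuilt so the snippet is self-contained.
df = pd.DataFrame({
    "foo": ["a", "b", "c"],
    "bar": [1, 2, 3],
    "new": ["apple", "banana", "pear"],
})

# "foo" already holds strings; "bar" is numeric, so the appended columns are cast first.
df["combined"] = df["foo"].str.cat(df[["bar", "new"]].astype(str), sep="_")

print(df["combined"].tolist())  # ['a_1_apple', 'b_2_banana', 'c_3_pear']
```

The appended columns are joined in the order they are listed, so reordering the list passed to str.cat changes the result.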

User · Answer

The answer given by @allen is reasonably generic but can be slow for larger dataframes. Reduce does a lot better:

from functools import reduce
import pandas as pd

# make data
df = pd.DataFrame(index=range(1_000_000))
df['1'] = 'CO'
df['2'] = 'BOB'
df['3'] = '01'
df['4'] = 'BILL'

def reduce_join(df, columns):
    assert len(columns) > 1
    slist = [df[x].astype(str) for x in columns]
    return reduce(lambda x, y: x + '_' + y, slist[1:], slist[0])

def apply_join(df, columns):
    assert len(columns) > 1
    return df[columns].apply(lambda row: '_'.join(row.values.astype(str)), axis=1)

# ensure outputs are equal
df1 = reduce_join(df, list('1234'))
df2 = apply_join(df, list('1234'))
assert df1.equals(df2)

# profile
%timeit df1 = reduce_join(df, list('1234'))  # 733 ms
%timeit df2 = apply_join(df, list('1234'))   # 8.84 s

User · Answer

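df['New_column_name'] = df['Column1'].map(str) + 'X' + df['Steps']

Here 'X' is any delimiter (e.g. a space) by which you want to separate the two merged columns. If the second column does not already hold strings, apply .map(str) to it as well.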

User · Answer

df = DataFrame({'foo': ['a', 'b', 'c'], 'bar': [1, 2, 3], 'new': ['apple', 'banana', 'pear']})

df['combined'] = df['foo'].astype(str) + '_' + df['bar'].astype(str)

If you concatenate with a string such as '_', first convert each column you want to combine to string; after that you can concatenate them.
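To illustrate why the conversion matters, here is a small sketch (the frame is rebuilt for the example, and the variable names are illustrative only). Adding a numeric column directly to a string column fails, while the astype(str) version works. The exact exception class may depend on the pandas version, so the sketch records its name rather than assuming it:

```python
import pandas as pd

df = pd.DataFrame({"foo": ["a", "b", "c"], "bar": [1, 2, 3]})

# Attempt the concatenation without converting the integer column.
error_name = None
try:
    df["foo"] + "_" + df["bar"]
except Exception as exc:  # typically a TypeError
    error_name = type(exc).__name__

# Casting the numeric column to str first makes the concatenation well defined.
combined = df["foo"] + "_" + df["bar"].astype(str)

print(error_name)
print(combined.tolist())
```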

User · Answer

Another solution using DataFrame.apply(), with slightly less typing and more scalable when you want to join more columns:

cols = ['foo', 'bar', 'new']
df['combined'] = df[cols].apply(lambda row: '_'.join(row.values.astype(str)), axis=1)
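As a sketch of how this scales to any number of columns, the same idea can be wrapped in a small helper. The name join_columns and its sep parameter are hypothetical, not part of pandas, and this variant casts the whole selection once instead of casting each row:

```python
import pandas as pd

def join_columns(df, cols, sep="_"):
    # Hypothetical helper: cast the selected columns to str once,
    # then join the values of each row with the separator.
    return df[cols].astype(str).apply(sep.join, axis=1)

df = pd.DataFrame({
    "foo": ["a", "b", "c"],
    "bar": [1, 2, 3],
    "new": ["apple", "banana", "pear"],
})

df["combined"] = join_columns(df, ["foo", "bar", "new"])
print(df["combined"].tolist())  # ['a_1_apple', 'b_2_banana', 'c_3_pear']
```

Row-wise apply is convenient but, as the timings elsewhere in this thread suggest, it tends to be slower than column-wise string addition on large frames.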

User · Answer

You can simply do:

In [17]: df['combined'] = df['bar'].astype(str) + '_' + df['foo'] + '_' + df['new']

In [17]: df
Out[18]:
   bar foo     new    combined
0    1   a   apple   1_a_apple
1    2   b  banana  2_b_banana
2    3   c    pear    3_c_pear

User · Answer

@derchambers I found one more solution:

import pandas as pd

# make data
df = pd.DataFrame(index=range(1_000_000))
df['1'] = 'CO'
df['2'] = 'BOB'
df['3'] = '01'
df['4'] = 'BILL'

def eval_join(df, columns):
    # build the expression "df['1']+ '_' + df['2']+ ..." and evaluate it
    sum_elements = [f"df['{col}']" for col in columns]
    to_eval = "+ '_' + ".join(sum_elements)
    return eval(to_eval)

# profile
%timeit df3 = eval_join(df, list('1234'))  # 504 ms

User · Answer

If you have a list of columns you want to concatenate, and maybe you'd like to use some separator, here's what you can do:

def concat_columns(df, cols_to_concat, new_col_name, sep=" "):
    df[new_col_name] = df[cols_to_concat[0]]
    for col in cols_to_concat[1:]:
        df[new_col_name] = df[new_col_name].astype(str) + sep + df[col].astype(str)

This should be faster than apply and takes an arbitrary number of columns to concatenate.
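A runnable sketch of this loop on the question's sample frame follows. It differs slightly from the answer's version: the result is accumulated in a local variable and assigned once at the end, and the first column is cast to str up front so a single-column list also yields strings. Both changes are choices made for this example:

```python
import pandas as pd

def concat_columns(df, cols_to_concat, new_col_name, sep=" "):
    # Start from the first column as strings, then append the rest one by one.
    result = df[cols_to_concat[0]].astype(str)
    for col in cols_to_concat[1:]:
        result = result + sep + df[col].astype(str)
    # Like the answer's version, this adds the new column to df in place.
    df[new_col_name] = result

df = pd.DataFrame({
    "foo": ["a", "b", "c"],
    "bar": [1, 2, 3],
    "new": ["apple", "banana", "pear"],
})

concat_columns(df, ["bar", "foo", "new"], "combined", sep="_")
print(df["combined"].tolist())  # ['1_a_apple', '2_b_banana', '3_c_pear']
```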

User · Answer

Just wanted to make a time comparison for both solutions (for a 30K-row DF):

In [1]: df = DataFrame({'foo': ['a', 'b', 'c'], 'bar': [1, 2, 3], 'new': ['apple', 'banana', 'pear']})

In [2]: big = pd.concat([df] * 10**4, ignore_index=True)

In [3]: big.shape
Out[3]: (30000, 3)

In [4]: %timeit big.apply(lambda x: '%s_%s_%s' % (x['bar'], x['foo'], x['new']), axis=1)
1 loop, best of 3: 881 ms per loop

In [5]: %timeit big['bar'].astype(str) + '_' + big['foo'] + '_' + big['new']
10 loops, best of 3: 44.2 ms per loop

A few more options:

In [6]: %timeit big.ix[:, :-1].astype(str).add('_').sum(axis=1).str.cat(big.new)
10 loops, best of 3: 72.2 ms per loop

In [11]: %timeit big.astype(str).add('_').sum(axis=1).str[:-1]
10 loops, best of 3: 82.3 ms per loop

Note that .ix was removed in pandas 1.0; on current versions .iloc[:, :-1] is the positional equivalent.
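For readers who want to repeat this kind of comparison outside IPython, here is a rough sketch using the standard-library timeit module. The function names with_apply and with_vectorized are made up for this example, and the measured times will vary by machine and pandas version, so no figures are shown:

```python
import timeit

import pandas as pd

df = pd.DataFrame({
    "foo": ["a", "b", "c"],
    "bar": [1, 2, 3],
    "new": ["apple", "banana", "pear"],
})
# Repeat the 3-row frame 10,000 times to get 30,000 rows.
big = pd.concat([df] * 10**4, ignore_index=True)

def with_apply():
    # Row-wise string formatting, as in the question.
    return big.apply(lambda x: "%s_%s_%s" % (x["bar"], x["foo"], x["new"]), axis=1)

def with_vectorized():
    # Column-wise string addition.
    return big["bar"].astype(str) + "_" + big["foo"] + "_" + big["new"]

# Best of three single runs for each variant.
t_apply = min(timeit.repeat(with_apply, number=1, repeat=3))
t_vectorized = min(timeit.repeat(with_vectorized, number=1, repeat=3))

print(f"apply:      {t_apply:.4f} s")
print(f"vectorized: {t_vectorized:.4f} s")
```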

User · Answer

I think you are missing one %s:

df['combined'] = df.apply(lambda x: '%s_%s_%s' % (x['bar'], x['foo'], x['new']), axis=1)
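The failure can be reproduced without pandas at all. In this short sketch a plain dict stands in for one row of the frame (an assumption made only to keep the example small): a format string with two placeholders is given three values, and then the placeholder count is corrected:

```python
# A plain dict stands in for one row of the DataFrame.
row = {"bar": 1, "foo": "a", "new": "apple"}
values = (row["bar"], row["foo"], row["new"])

# Two placeholders, three values: Python refuses to format.
message = None
try:
    "%s_%s" % values
except TypeError as exc:
    message = str(exc)

# Three placeholders for three values works.
combined = "%s_%s_%s" % values

print(message)   # not all arguments converted during string formatting
print(combined)  # 1_a_apple
```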

User · Answer

Possibly the fastest solution is to operate in plain Python:

Series(
    map(
        '_'.join,
        df.values.tolist()
        # when non-string columns are present:
        # df.values.astype(str).tolist()
    ),
    index=df.index
)

Comparison against @MaxU's answer (using the big data frame, which has both numeric and string columns):

%timeit big['bar'].astype(str) + '_' + big['foo'] + '_' + big['new']
# 29.4 ms ± 1.08 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

%timeit Series(map('_'.join, big.values.astype(str).tolist()), index=big.index)
# 27.4 ms ± 2.36 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

Comparison against @derchambers' answer (using their df data frame, where all columns are strings):

from functools import reduce

def reduce_join(df, columns):
    slist = [df[x] for x in columns]
    return reduce(lambda x, y: x + '_' + y, slist[1:], slist[0])

def list_map(df, columns):
    return Series(
        map(
            '_'.join,
            df[columns].values.tolist()
        ),
        index=df.index
    )

%timeit df1 = reduce_join(df, list('1234'))
# 602 ms ± 39 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

%timeit df2 = list_map(df, list('1234'))
# 351 ms ± 12.1 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
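A small-scale sketch of the plain-Python approach follows, mainly to show why index=df.index is passed. The frame is filtered first (an illustrative step, not part of the answer) so that its index is no longer 0..n-1, and the map result is wrapped in list() for explicitness:

```python
import pandas as pd

df = pd.DataFrame({
    "foo": ["a", "b", "c"],
    "bar": [1, 2, 3],
    "new": ["apple", "banana", "pear"],
})

# Filtering leaves a non-default index (1, 2), which the result should keep.
subset = df[df["bar"] > 1]

# Mixed dtypes: cast the underlying array to str before joining each row.
rows = subset[["bar", "foo", "new"]].values.astype(str).tolist()
combined = pd.Series(list(map("_".join, rows)), index=subset.index)

print(combined.index.tolist())  # [1, 2]
print(combined.tolist())        # ['2_b_banana', '3_c_pear']
```

Without index=subset.index the new Series would be labelled 0 and 1, and assigning it back to the filtered frame would misalign the rows.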
