Creating a new column based on if-elif-else condition

Question

I have a DataFrame df       A    B a   2    2  b   3    1 c   1    3   I want to create a new column based on the following criteria   if row A    B  0  if rowA  gt  B  1  if row A  lt  B  -1   so given the above table  it should be       A    B    C a   2    2    0 b   3    1    1 c   1    3   -1    For typical if else cases I do np where df A  gt  df B  1  -1   does pandas provide a special syntax for solving my problem with one step  without the necessity of creating 3 new columns and then combining the result

User · Answer

For this particular relationship  you could use np sign    gt  gt  gt  df  C     np sign df A - df B   gt  gt  gt  df    A  B  C a  2  2  0 b  3  1  1 c  1  3 -1

User · Answer

When you have multiple if conditions  numpy select is the way to go  In  4102   import numpy as np In  4098   conditions    df A eq df B   df A gt df B   df A lt df B   In  4096   choices    0  1  -1   In  4100   df  C     np select conditions  choices   In  4101   df Out 4101       A  B  C a  2  2  0 b  3  1  1 c  1  3 -1

User · Answer

Lets say above one is your original dataframe and you want to add a new column  old    If age greater than 50 then we consider as older yes otherwise False  step 1  Get the indexes of rows whose age greater than 50      row indexes df df  age   gt  50  index   step 2   Using  loc we can assign a new value to column      df loc row indexes  elderly    yes   same for age below less than 50      row indexes df df  age   lt 50  index      df row indexes  elderly    no

User · Answer

To formalize some of the approaches laid out above  Create a function that operates on the rows of your dataframe like so  def f row       if row  A      row  B            val   0     elif row  A    gt  row  B            val   1     else          val   -1     return val  Then apply it to your dataframe passing in the axis 1 option  In  1   df  C     df apply f  axis 1   In  2   df Out 2      A  B  C a  2  2  0 b  3  1  1 c  1  3 -1  Of course  this is not vectorized so performance may not be as good when scaled to a large number of records   Still  I think it is much more readable  Especially coming from a SAS background  Edit Here is the vectorized version df  C     np where      df  A      df  B    0  np where      df  A    gt   df  B    1  -1

User · Answer

df loc df  A      df  B     C     0 df loc df  A    gt  df  B     C     1 df loc df  A    lt  df  B     C     -1   Easy to solve using indexing  The first line of code reads like so  if column A is equal to column B then create and set column C equal to 0

[python] Creating a new column based on if-elif-else condition

Examples related to python

Examples related to pandas

Examples related to conditional