[python] Compare two columns using pandas

Using this as a starting point:

a = [['10', '1.2', '4.2'], ['15', '70', '0.03'], ['8', '5', '0']]
df = pd.DataFrame(a, columns=['one', 'two', 'three'])

Out[8]: 
  one  two three
0   10  1.2   4.2
1   15  70   0.03
2    8   5     0

I want to use something like an if statement within pandas.

if df['one'] >= df['two'] and df['one'] <= df['three']:
    df['que'] = df['one']

Basically, check each row via the if statement, create new column.

The docs say to use .all but there is no example...

This question is related to python pandas if-statement dataframe

The answer is


One way is to use a Boolean series to index the column df['one']. This gives you a new column where the True entries have the same value as the same row as df['one'] and the False values are NaN.

The Boolean series is just given by your if statement (although it is necessary to use & instead of and):

>>> df['que'] = df['one'][(df['one'] >= df['two']) & (df['one'] <= df['three'])]
>>> df
    one two three   que
0   10  1.2 4.2      10
1   15  70  0.03    NaN
2   8   5   0       NaN

If you want the NaN values to be replaced by other values, you can use the fillna method on the new column que. I've used 0 instead of the empty string here:

>>> df['que'] = df['que'].fillna(0)
>>> df
    one two three   que
0   10  1.2   4.2    10
1   15   70  0.03     0
2    8    5     0     0

Use np.select if you have multiple conditions to be checked from the dataframe and output a specific choice in a different column

conditions=[(condition1),(condition2)]
choices=["choice1","chocie2"]

df["new column"]=np.select=(condtion,choice,default=)

Note: No of conditions and no of choices should match, repeat text in choice if for two different conditions you have same choices


I think the closest to the OP's intuition is an inline if statement:

df['que'] = (df['one'] if ((df['one'] >= df['two']) and (df['one'] <= df['three'])) 

Use lambda expression:

df[df.apply(lambda x: x['col1'] != x['col2'], axis = 1)]

Wrap each individual condition in parentheses, and then use the & operator to combine the conditions:

df.loc[(df['one'] >= df['two']) & (df['one'] <= df['three']), 'que'] = df['one']

You can fill the non-matching rows by just using ~ (the "not" operator) to invert the match:

df.loc[~ ((df['one'] >= df['two']) & (df['one'] <= df['three'])), 'que'] = ''

You need to use & and ~ rather than and and not because the & and ~ operators work element-by-element.

The final result:

df
Out[8]: 
  one  two three que
0  10  1.2   4.2  10
1  15   70  0.03    
2   8    5     0  

You could use apply() and do something like this

df['que'] = df.apply(lambda x : x['one'] if x['one'] >= x['two'] and x['one'] <= x['three'] else "", axis=1)

or if you prefer not to use a lambda

def que(x):
    if x['one'] >= x['two'] and x['one'] <= x['three']:
        return x['one']
    return ''
df['que'] = df.apply(que, axis=1)

You can use .equals for columns or entire dataframes.

df['col1'].equals(df['col2'])

If they're equal, that statement will return True, else False.


Examples related to python

programming a servo thru a barometer Is there a way to view two blocks of code from the same file simultaneously in Sublime Text? python variable NameError Why my regexp for hyphenated words doesn't work? Comparing a variable with a string python not working when redirecting from bash script is it possible to add colors to python output? Get Public URL for File - Google Cloud Storage - App Engine (Python) Real time face detection OpenCV, Python xlrd.biffh.XLRDError: Excel xlsx file; not supported Could not load dynamic library 'cudart64_101.dll' on tensorflow CPU-only installation

Examples related to pandas

xlrd.biffh.XLRDError: Excel xlsx file; not supported Pandas Merging 101 How to increase image size of pandas.DataFrame.plot in jupyter notebook? Trying to merge 2 dataframes but get ValueError Python Pandas User Warning: Sorting because non-concatenation axis is not aligned How to show all of columns name on pandas dataframe? Pandas/Python: Set value of one column based on value in another column Python Pandas - Find difference between two data frames Pandas get the most frequent values of a column Python convert object to float

Examples related to if-statement

How to use *ngIf else? SQL Server IF EXISTS THEN 1 ELSE 2 What is a good practice to check if an environmental variable exists or not? Using OR operator in a jquery if statement R multiple conditions in if statement Syntax for an If statement using a boolean How to have multiple conditions for one if statement in python Ifelse statement in R with multiple conditions If strings starts with in PowerShell Multiple conditions in an IF statement in Excel VBA

Examples related to dataframe

Trying to merge 2 dataframes but get ValueError How to show all of columns name on pandas dataframe? Python Pandas - Find difference between two data frames Pandas get the most frequent values of a column Display all dataframe columns in a Jupyter Python Notebook How to convert column with string type to int form in pyspark data frame? Display/Print one column from a DataFrame of Series in Pandas Binning column with python pandas Selection with .loc in python Set value to an entire column of a pandas dataframe