[python] Pandas - replacing column values

I know there are a number of topics on this question, but none of the methods worked for me so I'm posting about my specific situation

I have a dataframe that looks like this:

data = pd.DataFrame([[1,0],[0,1],[1,0],[0,1]], columns=["sex", "split"])
data['sex'].replace(0, 'Female')
data['sex'].replace(1, 'Male')
data

What I want to do is replace all 0's in the sex column with 'Female', and all 1's with 'Male', but the values within the dataframe don't seem to change when I use the code above

Am I using replace() incorrectly? Or is there a better way to do conditional replacement of values?

This question is related to python pandas

The answer is


You can also try using apply with get method of dictionary, seems to be little faster than replace:

data['sex'] = data['sex'].apply({1:'Male', 0:'Female'}.get)

Testing with timeit:

%%timeit
data['sex'].replace([0,1],['Female','Male'],inplace=True)

Result:

The slowest run took 5.83 times longer than the fastest. This could mean that an intermediate result is being cached.
1000 loops, best of 3: 510 µs per loop

Using apply:

%%timeit
data['sex'] = data['sex'].apply({1:'Male', 0:'Female'}.get)

Result:

The slowest run took 5.92 times longer than the fastest. This could mean that an intermediate result is being cached.
1000 loops, best of 3: 331 µs per loop

Note: apply with dictionary should be used if all the possible values of the columns in the dataframe are defined in the dictionary else, it will have empty for those not defined in dictionary.


Can try this too!
Create a dictionary of replacement values.

import pandas as pd
data = pd.DataFrame([[1,0],[0,1],[1,0],[0,1]], columns=["sex", "split"])

enter image description here

replace_dict= {0:'Female',1:'Male'}
print(replace_dict)

enter image description here

Use the map function for replacing values

data['sex']=data['sex'].map(replace_dict)

Output after replacing
enter image description here