Pandas DataFrame: replace all values in a column, based on condition

Question

I have a simple DataFrame like the following:

                     Team  First Season  Total Games
    0      Dallas Cowboys          1960          894
    1       Chicago Bears          1920         1357
    2   Green Bay Packers          1921         1339
    3      Miami Dolphins          1966          792
    4    Baltimore Ravens          1996          326
    5  San Franciso 49ers          1950         1003

I want to select all values from the 'First Season' column and replace those that are over 1990 by 1. In this example, only Baltimore Ravens would have the 1996 replaced by 1, keeping the rest of the data intact.

I have used the following:

    df.loc[(df['First Season'] > 1990)] = 1

But it replaces all the values in that row by 1, and not just the values in the 'First Season' column.

How can I replace just the values from that column?

User · Answer

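    df['First Season'].loc[df['First Season'] > 1990] = 1

Strange that nobody has this answer. The only missing part of your code is the ['First Season'] right after df; then just remove the parentheses inside.

Be aware that this form is chained indexing: pandas may flag it with a warning, and it does not update df when copy-on-write is enabled. Writing df.loc[df['First Season'] > 1990, 'First Season'] = 1 avoids that.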

User · Answer

We can update the First Season column in df with the following syntax:

    df['First Season'] = expression_for_new_values

To map the values in First Season we can use the pandas .map() method with the syntax below:

    data_frame['column'].map({'initial_value_1': 'updated_value_1', 'initial_value_2': 'updated_value_2'})
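As a rough illustration of the .map() approach, here is a minimal sketch on a small, made-up subset of the question's data (the three-row frame is an assumption for the example). One thing to keep in mind is that a dict passed to Series.map turns every value that is not a key into NaN, so the sketch falls back to the original values; passing a function instead lets the condition be expressed directly.

```python
import pandas as pd

# Hypothetical three-row sample modelled on the question's data.
df = pd.DataFrame({
    "Team": ["Dallas Cowboys", "Chicago Bears", "Baltimore Ravens"],
    "First Season": [1960, 1920, 1996],
})

# Dict form: only exact matches are replaced; everything else becomes NaN,
# so the gaps are filled back in from the original column.
mapped = df["First Season"].map({1996: 1})
by_dict = mapped.fillna(df["First Season"]).astype(int)

# Function form: the condition from the question, applied element by element.
by_func = df["First Season"].map(lambda year: 1 if year > 1990 else year)

df["First Season"] = by_func
print(df)
```

The dict form only fits when the exact values to replace are known in advance; for a threshold such as "over 1990" the function form (or a boolean mask) is the closer match.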

User · Answer

You need to select that column:

    In [41]:
    df.loc[df['First Season'] > 1990, 'First Season'] = 1
    df

    Out[41]:
                     Team  First Season  Total Games
    0      Dallas Cowboys          1960          894
    1       Chicago Bears          1920         1357
    2   Green Bay Packers          1921         1339
    3      Miami Dolphins          1966          792
    4    Baltimore Ravens             1          326
    5  San Franciso 49ers          1950         1003

So the syntax here is:

    df.loc[<mask>, <optional column(s)>]

where <mask> generates the labels to index.

You can check the docs and also the 10 minutes to pandas, which shows the semantics.

EDIT

If you want to generate a boolean indicator, then you can just use the boolean condition to generate a boolean Series and cast the dtype to int. This will convert True and False to 1 and 0 respectively:

    In [43]:
    df['First Season'] = (df['First Season'] > 1990).astype(int)
    df

    Out[43]:
                     Team  First Season  Total Games
    0      Dallas Cowboys             0          894
    1       Chicago Bears             0         1357
    2   Green Bay Packers             0         1339
    3      Miami Dolphins             0          792
    4    Baltimore Ravens             1          326
    5  San Franciso 49ers             0         1003
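For anyone who wants to run this end to end, the following is a self-contained sketch that rebuilds the sample frame from the values shown in this thread and contrasts the row-only selector with the row-plus-column selector. The variable names (mask, numeric, fixed, indicator) are just illustrative choices.

```python
import pandas as pd

# Sample data reconstructed from the tables shown in this thread.
df = pd.DataFrame({
    "Team": ["Dallas Cowboys", "Chicago Bears", "Green Bay Packers",
             "Miami Dolphins", "Baltimore Ravens", "San Franciso 49ers"],
    "First Season": [1960, 1920, 1921, 1966, 1996, 1950],
    "Total Games": [894, 1357, 1339, 792, 326, 1003],
})

# Boolean Series: True only for rows whose First Season is after 1990.
mask = df["First Season"] > 1990

# Row selector only: every column of the matching rows is overwritten.
# (Shown on the numeric columns to keep the example dtype-safe.)
numeric = df[["First Season", "Total Games"]].copy()
numeric.loc[mask] = 1

# Row selector plus column label: only First Season changes.
fixed = df.copy()
fixed.loc[mask, "First Season"] = 1

# Boolean indicator variant from the EDIT: True/False cast to 1/0.
indicator = (df["First Season"] > 1990).astype(int)

print(fixed)
```

Working on copies keeps the original df intact, which makes it easier to compare the two behaviours side by side.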

User · Answer

Another option is to use a list comprehension:

    df['First Season'] = [1 if year > 1990 else year for year in df['First Season']]

User · Answer

    df["First Season"] = df["First Season"].apply(lambda x: 1 if x > 1990 else x)

User · Answer

For a single condition, i.e. employrate > 70:

           country        employrate alcconsumption
    0  Afghanistan  55.7000007629394            .03
    1      Albania  51.4000015258789           7.29
    2      Algeria              50.5            .69
    3      Andorra                            10.17
    4       Angola  75.6999969482422           5.57

use this:

    df.loc[df['employrate'] > 70, 'employrate'] = 7

           country  employrate alcconsumption
    0  Afghanistan   55.700001            .03
    1      Albania   51.400002           7.29
    2      Algeria   50.500000            .69
    3      Andorra         nan          10.17
    4       Angola    7.000000           5.57

Therefore the syntax here is:

    df.loc[<mask>, <optional column(s)>]

where <mask> generates the labels to index.

For multiple conditions, i.e. (df['employrate'] <= 55) & (df['employrate'] > 50), use this:

    df['employrate'] = np.where(
        (df['employrate'] <= 55) & (df['employrate'] > 50), 11, df['employrate']
    )

    Out[108]:
           country  employrate alcconsumption
    0  Afghanistan   55.700001            .03
    1      Albania   11.000000           7.29
    2      Algeria   11.000000            .69
    3      Andorra         nan          10.17
    4       Angola   75.699997           5.57

Therefore the syntax here is:

    df['<column_name>'] = np.where((<filter 1>) & (<filter 2>), <new value>, df['<column_name>'])
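A runnable sketch of the multiple-condition case, using employrate values rounded from the sample above (the rounding and the name in_range are assumptions for the example). Each comparison is wrapped in parentheses because & binds more tightly than <= and >, and a comparison against NaN evaluates to False, so the missing Andorra value passes through unchanged.

```python
import numpy as np
import pandas as pd

# employrate values rounded from the sample in this answer; Andorra is missing.
df = pd.DataFrame({
    "country": ["Afghanistan", "Albania", "Algeria", "Andorra", "Angola"],
    "employrate": [55.7, 51.4, 50.5, np.nan, 75.7],
})

# Combine both filters element-wise; parentheses around each one are required.
in_range = (df["employrate"] <= 55) & (df["employrate"] > 50)

# Where the mask is True use 11, otherwise keep the existing value.
df["employrate"] = np.where(in_range, 11, df["employrate"])

print(df)
```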

User · Answer

A bit late to the party but still - I prefer using numpy where:

    import numpy as np
    df['First Season'] = np.where(df['First Season'] > 1990, 1, df['First Season'])
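As a quick sanity check, the sketch below runs np.where alongside the apply and list-comprehension variants suggested in this thread on the question's First Season values and confirms they agree. The variable names are illustrative only.

```python
import numpy as np
import pandas as pd

# First Season values from the question's sample data.
seasons = pd.Series([1960, 1920, 1921, 1966, 1996, 1950], name="First Season")

# Vectorised: evaluates the condition on the whole column at once.
via_where = np.where(seasons > 1990, 1, seasons)

# Element by element: a Python-level function call per value.
via_apply = seasons.apply(lambda year: 1 if year > 1990 else year)

# Plain Python list comprehension over the column.
via_list = [1 if year > 1990 else year for year in seasons]

# All three approaches should produce the same values.
same = via_where.tolist() == via_apply.tolist() == via_list
print(same)
```

The results match here; the practical difference is mostly speed, since np.where works on the whole array in one pass while the other two loop in Python, which tends to matter on large columns.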

User · Answer

    df.loc[df['First Season'] > 1990, 'First Season'] = 1

Explanation:

df.loc takes two arguments, 'row index' and 'column index'. We check, for each row, whether the value under the "First Season" column is greater than 1990, and where it is, we replace it with 1.
