With the nice indexing methods in Pandas I have no problems extracting data in various ways. On the other hand I am still confused about how to change data in an existing DataFrame.
In the following code I have two DataFrames and my goal is to update values in a specific row in the first df from values of the second df. How can I achieve this?
import pandas as pd
df = pd.DataFrame({'filename': ['test0.dat', 'test2.dat'],
                   'm': [12, 13], 'n': [None, None]})
df2 = pd.DataFrame({'filename' : 'test2.dat', 'n':16}, index=[0])
# this overwrites the first row but we want to update the second
# df.update(df2)
# this does not update anything
df.loc[df.filename == 'test2.dat'].update(df2)
print(df)
gives
    filename   m     n
0  test0.dat  12  None
1  test2.dat  13  None
[2 rows x 3 columns]
but how can I achieve this:
    filename   m     n
0  test0.dat  12  None
1  test2.dat  13    16
[2 rows x 3 columns]
One option is DataFrame.combine_first, which updates null elements with the value in the same location in other. It combines two DataFrames by filling null values in one with non-null values from the other. The row and column indexes of the resulting DataFrame will be the union of the two.
df1 = pd.DataFrame({'A': [None, 0], 'B': [None, 4]})
df2 = pd.DataFrame({'A': [1, 1], 'B': [3, 3]})
df1.combine_first(df2)
     A    B
0  1.0  3.0
1  0.0  4.0
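Since combine_first aligns on the index, applying it to the frames from the question seems to need filename as the index on both sides first. A minimal sketch under that assumption (the names left, right and result are only for illustration); df2 is the caller so that its non-null values take priority:

```python
import pandas as pd

df = pd.DataFrame({'filename': ['test0.dat', 'test2.dat'],
                   'm': [12, 13], 'n': [None, None]})
df2 = pd.DataFrame({'filename': 'test2.dat', 'n': 16}, index=[0])

# Index both frames by filename so rows are matched by name, not by position
left = df.set_index('filename')
right = df2.set_index('filename')

# Values from right win; gaps (including the missing m column) are filled from left
result = right.combine_first(left).reset_index()
print(result)
```

Note that the column order of the result follows the union of the columns, and numeric values may come back as floats.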
There are probably a few ways to do this, but one approach would be to merge the two DataFrames on the filename and m columns, then populate column n from the right DataFrame where a match was found. The n_x and n_y columns in the output come from the left and right DataFrames in the merge, respectively.
In[100] : df = pd.merge(df1, df2, how='left', on=['filename','m'])
In[101] : df
Out[101]:
    filename   m   n_x  n_y
0  test0.dat  12  None  NaN
1  test2.dat  13  None   16
In[102] : df['n'] = df['n_y'].fillna(df['n_x'])
In[103] : df = df.drop(['n_x','n_y'], axis=1)
In[104] : df
Out[104]:
    filename   m     n
0  test0.dat  12  None
1  test2.dat  13    16
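The session above assumes both frames carry filename and m. The df2 from the question only has filename and n, so a variant that merges on filename alone might look like this (a sketch, not the answerer's exact code; the name merged is just for illustration):

```python
import pandas as pd

df = pd.DataFrame({'filename': ['test0.dat', 'test2.dat'],
                   'm': [12, 13], 'n': [None, None]})
df2 = pd.DataFrame({'filename': 'test2.dat', 'n': 16}, index=[0])

# Left merge keeps every row of df; the clashing n columns get _x/_y suffixes
merged = pd.merge(df, df2, how='left', on='filename', suffixes=('_x', '_y'))

# Prefer the value from df2 (n_y), fall back to the original (n_x)
merged['n'] = merged['n_y'].fillna(merged['n_x'])
merged = merged.drop(columns=['n_x', 'n_y'])
print(merged)
```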
I needed to add a prefix to a few rows of a DataFrame conditionally, based on the value of another column in the same DataFrame.
The df has columns Feature and Entity, and Entity needs to be updated for a specific feature type:
df.loc[df.Feature == 'dnb', 'Entity'] = 'duns_' + df.loc[df.Feature == 'dnb', 'Entity']
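To make that one-liner concrete, here is a small self-contained sketch. The sample rows are made up for illustration; only the Feature and Entity column names and the 'dnb' / 'duns_' strings come from the answer.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the answer above
df = pd.DataFrame({'Feature': ['dnb', 'other', 'dnb'],
                   'Entity': ['123', 'abc', '456']})

# Build the mask once and reuse it on both sides of the assignment
mask = df['Feature'] == 'dnb'
df.loc[mask, 'Entity'] = 'duns_' + df.loc[mask, 'Entity']
print(df)
```

Rows where Feature is not 'dnb' are left untouched.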
In SQL, I would do it in one shot as
update table1 set col1 = new_value where col1 = old_value
but in Python Pandas, we could just do this:
data = [['ram', 10], ['sam', 15], ['tam', 15]]
kids = pd.DataFrame(data, columns = ['Name', 'Age'])
kids
which will generate the following output:
  Name  Age
0  ram   10
1  sam   15
2  tam   15
Now we can run:
kids.loc[kids.Age == 15,'Age'] = 17
kids
which will show the following output:
  Name  Age
0  ram   10
1  sam   17
2  tam   17
which should be equivalent to the following SQL:
update kids set age = 17 where age = 15
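The same boolean-mask assignment appears to cover the original question as well. A minimal sketch, assuming the new value is looked up in df2 by filename (new_value is just an illustrative name):

```python
import pandas as pd

df = pd.DataFrame({'filename': ['test0.dat', 'test2.dat'],
                   'm': [12, 13], 'n': [None, None]})
df2 = pd.DataFrame({'filename': 'test2.dat', 'n': 16}, index=[0])

# Pull the replacement value out of df2 for the file we care about
new_value = df2.loc[df2['filename'] == 'test2.dat', 'n'].iloc[0]

# Assign through .loc on df itself, so the original frame is modified
df.loc[df['filename'] == 'test2.dat', 'n'] = new_value
print(df)
```

Assigning through df.loc[mask, column] writes into df directly, whereas df.loc[mask].update(...) from the question operates on a temporary object and leaves df unchanged.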
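If you have one large DataFrame and only a few values to update, I would use apply like this:
import pandas as pd
df = pd.DataFrame({'filename': ['test0.dat', 'test2.dat'],
                   'm': [12, 13], 'n': [None, None]})
data = {'filename': 'test2.dat', 'n': 16}
def update_vals(row, data=data):
    # work on a copy so the row handed in by apply is not mutated
    row = row.copy()
    # overwrite n only on the row whose filename matches
    if row['filename'] == data['filename']:
        row['n'] = data['n']
    return row
# apply returns a new DataFrame, so assign the result back
df = df.apply(update_vals, axis=1)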
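If there are several values to update at once, a dictionary lookup with map might be worth trying instead of a row-wise function. This is only a sketch: the updates dict is a hypothetical lookup table keyed by filename, not something from the answer above.

```python
import pandas as pd

df = pd.DataFrame({'filename': ['test0.dat', 'test2.dat'],
                   'm': [12, 13], 'n': [None, None]})

# Hypothetical lookup table: filename -> new value for n
updates = {'test2.dat': 16}

# map gives NaN for filenames that are not in the dict
new_n = df['filename'].map(updates)

# Only overwrite the rows that actually have an update
mask = new_n.notna()
df.loc[mask, 'n'] = new_n[mask]
print(df)
```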
Source: Stackoverflow.com