Assumed you have data frame as below
df1
A B C D
4 -1.0 -1.0 -1.0 -1.0
5 -1.0 -1.0 -1.0 -1.0
6 -1.0 -1.0 -1.0 -1.0
6 -1.0 -1.0 -1.0 -1.0
When you would like create another df2
which is identical to df1
, without copy
df2=df1
df2
A B C D
4 -1.0 -1.0 -1.0 -1.0
5 -1.0 -1.0 -1.0 -1.0
6 -1.0 -1.0 -1.0 -1.0
6 -1.0 -1.0 -1.0 -1.0
And would like modify the df2 value only as below
df2.iloc[0,0]='changed'
df2
A B C D
4 changed -1.0 -1.0 -1.0
5 -1 -1.0 -1.0 -1.0
6 -1 -1.0 -1.0 -1.0
6 -1 -1.0 -1.0 -1.0
At the same time the df1 is changed as well
df1
A B C D
4 changed -1.0 -1.0 -1.0
5 -1 -1.0 -1.0 -1.0
6 -1 -1.0 -1.0 -1.0
6 -1 -1.0 -1.0 -1.0
Since two df as same object
, we can check it by using the id
id(df1)
140367679979600
id(df2)
140367679979600
So they as same object and one change another one will pass the same value as well.
If we add the copy
, and now df1
and df2
are considered as different object
, if we do the same change to one of them the other will not change.
df2=df1.copy()
id(df1)
140367679979600
id(df2)
140367674641232
df1.iloc[0,0]='changedback'
df2
A B C D
4 changed -1.0 -1.0 -1.0
5 -1 -1.0 -1.0 -1.0
6 -1 -1.0 -1.0 -1.0
6 -1 -1.0 -1.0 -1.0
Good to mention, when you subset the original dataframe, it is safe to add the copy as well in order to avoid the SettingWithCopyWarning