[python] Setting values on a copy of a slice from a DataFrame

I have a small DataFrame, for example this one:

    Mass32      Mass44  
12  0.576703    0.496159
13  0.576658    0.495832
14  0.576703    0.495398    
15  0.576587    0.494786
16  0.576616    0.494473
...

I would like to have a rolling mean of column Mass32, so I do this:

x['Mass32s'] = pandas.rolling_mean(x.Mass32, 5).shift(-2)

It works, in that I get a new column named Mass32s containing what I expect, but I also get this warning message:

A value is trying to be set on a copy of a slice from a DataFrame. Try using .loc[row_indexer,col_indexer] = value instead

See the the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy

I'm wondering if there's a better way to do it, notably to avoid getting this warning message.

The answer is


This warning appears because your DataFrame x is a copy of a slice of another DataFrame. It is not always obvious why, but it depends on how x was created, for example by selecting rows or columns from a larger DataFrame.

One option is to turn x into a proper, independent DataFrame by doing:

x = x.copy()

This will remove the warning, but it is not the proper way.
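
A minimal sketch of how this situation can arise and what the explicit copy does. The parent frame `df`, the row filter, and the `flag` column are assumptions made up for illustration; the question does not show how x was created.

```python
import pandas as pd

# Hypothetical parent frame; the question does not show where x came from.
df = pd.DataFrame(
    {
        "Mass32": [0.576703, 0.576658, 0.576703, 0.576587, 0.576616],
        "Mass44": [0.496159, 0.495832, 0.495398, 0.494786, 0.494473],
    },
    index=[12, 13, 14, 15, 16],
)

# Selecting rows from df yields an object that pandas may treat as a
# copy of a slice; .copy() makes x an independent DataFrame.
x = df[df["Mass44"] < 0.496].copy()

# Adding a column to x now leaves df untouched.
x["flag"] = True

print(x)
print(df.columns.tolist())
```

After the copy, x owns its data, so later assignments to x are unambiguous and do not affect df.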

You should be using the DataFrame.loc indexer, as the warning suggests, like this:

x.loc[:,'Mass32s'] = pandas.rolling_mean(x.Mass32, 5).shift(-2)
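
Note that the top-level pandas.rolling_mean function was removed in later pandas releases in favor of the Series.rolling method. A sketch of the same assignment written with that API, assuming a recent pandas version and using the sample values from the question:

```python
import pandas as pd

# Sample values taken from the question's table.
x = pd.DataFrame(
    {
        "Mass32": [0.576703, 0.576658, 0.576703, 0.576587, 0.576616],
        "Mass44": [0.496159, 0.495832, 0.495398, 0.494786, 0.494473],
    },
    index=[12, 13, 14, 15, 16],
)

# 5-point rolling mean, shifted back by 2 rows so that each value is
# centered on its window (the first and last two rows become NaN).
x.loc[:, "Mass32s"] = x["Mass32"].rolling(5).mean().shift(-2)

print(x)
```

With an odd window such as 5, `x["Mass32"].rolling(5, center=True).mean()` should give the same centered result without the explicit shift.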