How to deal with SettingWithCopyWarning in Pandas

Question

Background

I just upgraded my Pandas from 0.11 to 0.13.0rc1. Now, the application is popping out many new warnings. One of them is like this:

E:\FinReporter\FM_EXT.py:449: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_index,col_indexer] = value instead
  quote_df['TVol']   = quote_df['TVol']/TVOL_SCALE

I want to know what exactly it means. Do I need to change something?

How should I suspend the warning if I insist on using quote_df['TVol'] = quote_df['TVol']/TVOL_SCALE?

The function that gives the warnings

def _decode_stock_quote(list_of_150_stk_str):
    """decode the webpage and return dataframe"""

    from cStringIO import StringIO

    str_of_all = "".join(list_of_150_stk_str)

    quote_df = pd.read_csv(StringIO(str_of_all), sep=',', names=list('ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefg')) #dtype={'A': object, 'B': object, 'C': np.float64}
    quote_df.rename(columns={'A':'STK', 'B':'TOpen', 'C':'TPCLOSE', 'D':'TPrice', 'E':'THigh', 'F':'TLow', 'I':'TVol', 'J':'TAmt', 'e':'TDate', 'f':'TTime'}, inplace=True)
    quote_df = quote_df.ix[:,[0,3,2,1,4,5,8,9,30,31]]
    quote_df['TClose'] = quote_df['TPrice']
    quote_df['RT']     = 100 * (quote_df['TPrice']/quote_df['TPCLOSE'] - 1)
    quote_df['TVol']   = quote_df['TVol']/TVOL_SCALE
    quote_df['TAmt']   = quote_df['TAmt']/TAMT_SCALE
    quote_df['STK_ID'] = quote_df['STK'].str.slice(13,19)
    quote_df['STK_Name'] = quote_df['STK'].str.slice(21,30)#.decode('gb2312')
    quote_df['TDate']  = quote_df.TDate.map(lambda x: x[0:4]+x[5:7]+x[8:10])

    return quote_df

More warning messages

E:\FinReporter\FM_EXT.py:449: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_index,col_indexer] = value instead
  quote_df['TVol']   = quote_df['TVol']/TVOL_SCALE
E:\FinReporter\FM_EXT.py:450: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_index,col_indexer] = value instead
  quote_df['TAmt']   = quote_df['TAmt']/TAMT_SCALE
E:\FinReporter\FM_EXT.py:453: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_index,col_indexer] = value instead
  quote_df['TDate']  = quote_df.TDate.map(lambda x: x[0:4]+x[5:7]+x[8:10])

User · Answer

As this question is already fully explained and discussed in existing answers, I will just provide a neat pandas approach to the context manager using pandas.option_context. There is no need to create a custom class with all the dunder methods and other bells and whistles.

First the context manager code itself:

from contextlib import contextmanager

@contextmanager
def SuppressPandasWarning():
    with pd.option_context("mode.chained_assignment", None):
        yield

Then an example:

import pandas as pd
from string import ascii_letters

a = pd.DataFrame({"A": list(ascii_letters[0:4]), "B": range(0,4)})

mask = a["A"].isin(["c", "d"])
# Even shallow copy below is enough to not raise the warning, but why is a mystery to me.
b = a.loc[mask]  # .copy(deep=False)

# Raises the `SettingWithCopyWarning`
b["B"] = b["B"] * 2

# Does not!
with SuppressPandasWarning():
    b["B"] = b["B"] * 2

Worth noticing is that neither approach modifies a, which is a bit surprising to me, and even a shallow copy of the frame with .copy(deep=False) would prevent this warning from being raised (as far as I understand, a shallow copy should at least modify a as well, but it doesn't. pandas magic.).

User · Answer

In general, the point of the SettingWithCopyWarning is to show users (and especially new users) that they may be operating on a copy and not the original as they think. There are false positives (IOW, if you know what you are doing, it could be ok). One possibility is simply to turn off the (by default warn) warning as @Garrett suggests.

Here is another option:

In [1]: df = DataFrame(np.random.randn(5, 2), columns=list('AB'))

In [2]: dfa = df.ix[:, [1, 0]]

In [3]: dfa.is_copy
Out[3]: True

In [4]: dfa['A'] /= 2
/usr/local/bin/ipython:1: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_index,col_indexer] = value instead
  #!/usr/local/bin/python

You can set the is_copy flag to False, which will effectively turn off the check, for that object:

In [5]: dfa.is_copy = False

In [6]: dfa['A'] /= 2

If you explicitly copy, then no further warning will happen:

In [7]: dfa = df.ix[:, [1, 0]].copy()

In [8]: dfa['A'] /= 2

The code the OP is showing above, while legitimate, and probably something I do as well, is technically a case for this warning, and not a false positive. Another way to not have the warning would be to do the selection operation via reindex, e.g.

quote_df = quote_df.reindex(columns=['STK', ...])

Or,

quote_df = quote_df.reindex(['STK', ...], axis=1)  # v.0.21

User · Answer

How to deal with SettingWithCopyWarning in Pandas?

This post is meant for readers who:

1. Would like to understand what this warning means
2. Would like to understand different ways of suppressing this warning
3. Would like to understand how to improve their code and follow good practices to avoid this warning in the future

Setup

np.random.seed(0)
df = pd.DataFrame(np.random.choice(10, (3, 5)), columns=list('ABCDE'))
df

   A  B  C  D  E
0  5  0  3  3  7
1  9  3  5  2  4
2  7  6  8  8  1

What is the SettingWithCopyWarning?

To know how to deal with this warning, it is important to understand what it means and why it is raised in the first place.

When filtering DataFrames, it is possible to slice/index a frame to return either a view or a copy, depending on the internal layout and various implementation details. A "view" is, as the term suggests, a view into the original data, so modifying the view may modify the original object. On the other hand, a "copy" is a replication of data from the original, and modifying the copy has no effect on the original.

As mentioned by other answers, the SettingWithCopyWarning was created to flag "chained assignment" operations. Consider df in the setup above. Suppose you would like to select all values in column "B" where the value in column "A" is > 5. Pandas allows you to do this in different ways, some more correct than others. For example,

df[df.A > 5]['B']

1    3
2    6
Name: B, dtype: int64

And,

df.loc[df.A > 5, 'B']

1    3
2    6
Name: B, dtype: int64

These return the same result, so if you are only reading these values, it makes no difference. So, what is the issue? The problem with chained assignment is that it is generally difficult to predict whether a view or a copy is returned, so this largely becomes an issue when you are attempting to assign values back. To build on the earlier example, consider how this code is executed by the interpreter:

df.loc[df.A > 5, 'B'] = 4
# becomes
df.loc.__setitem__((df.A > 5, 'B'), 4)

With a single __setitem__ call that operates on df. OTOH, consider this code:

df[df.A > 5]['B'] = 4
# becomes
df.__getitem__(df.A > 5).__setitem__('B', 4)

Now, depending on whether __getitem__ returned a view or a copy, the __setitem__ operation may not work.

In general, you should use loc for label-based assignment, and iloc for integer/positional based assignment, as the spec guarantees that they always operate on the original. Additionally, for setting a single cell, you should use at and iat.

More can be found in the documentation.

Note
All boolean indexing operations done with loc can also be done with iloc. The only difference is that iloc expects either integer positions or a numpy array of boolean values for the index, and integer positions for the columns.

For example,

df.loc[df.A > 5, 'B'] = 4

Can be written as

df.iloc[(df.A > 5).values, 1] = 4

And,

df.loc[1, 'A'] = 100

Can be written as

df.iloc[1, 0] = 100

And so on.

Just tell me how to suppress the warning!

Consider a simple operation on the "A" column of df. Selecting "A" and dividing by 2 will raise the warning, but the operation will work.

df2 = df[['A']]
df2['A'] /= 2

/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/IPython/__main__.py:1: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

df2

     A
0  2.5
1  4.5
2  3.5

There are a couple of ways of directly silencing this warning:

1. (recommended) Use loc to slice subsets:

df2 = df.loc[:, ['A']]
df2['A'] /= 2     # Does not raise

2. Change pd.options.mode.chained_assignment
Can be set to None, "warn", or "raise". "warn" is the default. None will suppress the warning entirely, and "raise" will throw a SettingWithCopyError, preventing the operation from going through.

pd.options.mode.chained_assignment = None
df2['A'] /= 2

3. Make a deepcopy

df2 = df[['A']].copy(deep=True)
df2['A'] /= 2

@Peter Cotton, in the comments, came up with a nice way of non-intrusively changing the mode (modified from this gist) using a context manager, to set the mode only as long as it is required, and then reset it back to the original state when finished.

class ChainedAssignent:
    def __init__(self, chained=None):
        acceptable = [None, 'warn', 'raise']
        assert chained in acceptable, "chained must be in " + str(acceptable)
        self.swcw = chained

    def __enter__(self):
        self.saved_swcw = pd.options.mode.chained_assignment
        pd.options.mode.chained_assignment = self.swcw
        return self

    def __exit__(self, *args):
        pd.options.mode.chained_assignment = self.saved_swcw

The usage is as follows:

# some code here
with ChainedAssignent():
    df2['A'] /= 2
# more code follows

Or, to raise the exception

with ChainedAssignent(chained='raise'):
    df2['A'] /= 2

SettingWithCopyError:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

The "XY Problem": What am I doing wrong?

A lot of the time, users attempt to look for ways of suppressing this warning without fully understanding why it was raised in the first place. This is a good example of an XY problem, where users attempt to solve a problem "Y" that is actually a symptom of a deeper rooted problem "X". Questions will be raised based on common problems that encounter this warning, and solutions will then be presented.

Question 1
I have a DataFrame

df
   A  B  C  D  E
0  5  0  3  3  7
1  9  3  5  2  4
2  7  6  8  8  1

I want to assign values in col "A" > 5 to 1000. My expected output is

      A  B  C  D  E
0     5  0  3  3  7
1  1000  3  5  2  4
2  1000  6  8  8  1

Wrong way to do this:

df.A[df.A > 5] = 1000         # works, because df.A returns a view
df[df.A > 5]['A'] = 1000      # does not work
df.loc[df.A > 5]['A'] = 1000  # does not work

Right way using loc:

df.loc[df.A > 5, 'A'] = 1000

Question 2 [1]
I am trying to set the value in cell (1, 'D') to 12345. My expected output is

   A  B  C      D  E
0  5  0  3      3  7
1  9  3  5  12345  4
2  7  6  8      8  1

I have tried different ways of accessing this cell, such as df['D'][1]. What is the best way to do this?

[1] This question isn't specifically related to the warning, but it is good to understand how to do this particular operation correctly so as to avoid situations where the warning could potentially arise in future.

You can use any of the following methods to do this.

df.loc[1, 'D'] = 12345
df.iloc[1, 3] = 12345
df.at[1, 'D'] = 12345
df.iat[1, 3] = 12345

Question 3
I am trying to subset values based on some condition. I have a DataFrame

   A  B  C  D  E
1  9  3  5  2  4
2  7  6  8  8  1

I would like to assign values in "D" to 123 such that "C" == 5. I tried

df2.loc[df2.C == 5, 'D'] = 123

Which seems fine but I am still getting the SettingWithCopyWarning! How do I fix this?

This is actually probably because of code higher up in your pipeline. Did you create df2 from something larger, like

df2 = df[df.A > 5]

? In this case, pandas keeps track of df2 as derived from df, so a later assignment to df2 triggers the warning. What you'd need to do is assign df2 to a copy:

df2 = df[df.A > 5].copy()
# Or,
# df2 = df.loc[df.A > 5, :]

Question 4
I'm trying to drop column "C" in-place from

   A  B  C  D  E
1  9  3  5  2  4
2  7  6  8  8  1

But using

df2.drop('C', axis=1, inplace=True)

Throws SettingWithCopyWarning. Why is this happening?

This is because df2 must have been created as the result of some other slicing operation, such as

df2 = df[df.A > 5]

The solution here is to either make a copy() of df, or use loc, as before.
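
As a minimal sketch of the pattern recommended in Question 1, the snippet below rebuilds the setup frame by hand (the values are typed in from the table rather than generated randomly, so the result is deterministic) and performs the assignment with a single loc call. The second frame, df_iloc, is only an illustrative name used to show the equivalent iloc spelling.

```python
import pandas as pd

# Same values as the setup frame, typed in by hand.
df = pd.DataFrame(
    [[5, 0, 3, 3, 7], [9, 3, 5, 2, 4], [7, 6, 8, 8, 1]],
    columns=list("ABCDE"),
)

# One indexing call on the original frame: row mask and column label together.
df.loc[df["A"] > 5, "A"] = 1000

# The iloc spelling needs a plain boolean array and an integer column position.
df_iloc = pd.DataFrame(
    [[5, 0, 3, 3, 7], [9, 3, 5, 2, 4], [7, 6, 8, 8, 1]],
    columns=list("ABCDE"),
)
df_iloc.iloc[(df_iloc["A"] > 5).to_numpy(), 0] = 1000

print(df)
```

Because the mask and the column are passed in one call, there is no intermediate object whose view-or-copy status matters.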

User · Answer

Followup beginner question / remark

Maybe a clarification for other beginners like me (I come from R, which seems to work a bit differently under the hood). The following harmless-looking and functional code kept producing the SettingWithCopy warning, and I couldn't figure out why. I had both read and understood the issue with "chained indexing", but my code doesn't contain any:

def plot(pdb, df, title, **kw):
    df['target'] = (df['ogg'] + df['ugg']) / 2
    # ...

But then, later, much too late, I looked at where the plot() function is called:

    df = data[data['anz_emw'] > 0]
    pixbuf = plot(pdb, df, title)

So "df" isn't a data frame but an object that somehow remembers that it was created by indexing a data frame (so is that a view?), which would make the line in plot()

 df['target'] = ...

equivalent to

 data[data['anz_emw'] > 0]['target'] = ...

which is a chained indexing. Did I get that right?

Anyway,

def plot(pdb, df, title, **kw):
    df.loc[:,'target'] = (df['ogg'] + df['ugg']) / 2

fixed it.
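
One way to sidestep the situation described here is to copy at the point of filtering and have the helper return a new frame instead of writing into its argument. The sketch below is an illustration under assumptions: add_target and the sample values are hypothetical, and only the column names come from the post.

```python
import pandas as pd


def add_target(df):
    """Return a new frame with a 'target' column; the input is left untouched."""
    return df.assign(target=(df["ogg"] + df["ugg"]) / 2)


data = pd.DataFrame(
    {"anz_emw": [0, 3, 5], "ogg": [10.0, 20.0, 30.0], "ugg": [0.0, 10.0, 20.0]}
)

# Copy at the point of filtering so the result is an independent frame.
subset = data[data["anz_emw"] > 0].copy()
result = add_target(subset)
```

With this shape, the caller decides where the new column lives, and neither data nor subset is modified as a side effect.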

User · Answer

This should work:

quote_df.loc[:,'TVol'] = quote_df['TVol']/TVOL_SCALE

User · Answer

Here I answer the question directly. How to deal with it?

Make a .copy(deep=False) after you slice. See pandas.DataFrame.copy.

Wait, doesn't a slice return a copy? After all, this is what the warning message is attempting to say? Read the long answer:

import pandas as pd
df = pd.DataFrame({'x':[1,2,3]})

This gives a warning:

df0 = df[df.x>2]
df0['foo'] = 'bar'

This does not:

df1 = df[df.x>2].copy(deep=False)
df1['foo'] = 'bar'

Both df0 and df1 are DataFrame objects, but something about them is different that enables pandas to print the warning. Let's find out what it is.

import inspect
slice = df[df.x>2]
slice_copy = df[df.x>2].copy(deep=False)
inspect.getmembers(slice)
inspect.getmembers(slice_copy)

Using your diff tool of choice, you will see that beyond a couple of addresses, the only material difference is this:

            slice     slice_copy
_is_copy    weakref   None

The method that decides whether to warn is DataFrame._check_setitem_copy, which checks _is_copy. So here you go. Make a copy so that your DataFrame is not _is_copy.

The warning is suggesting to use .loc, but if you use .loc on a frame that _is_copy, you will still get the same warning. Misleading? Yes. Annoying? You bet. Helpful? Potentially, when chained assignment is used. But it cannot correctly detect chained assignment and prints the warning indiscriminately.

User · Answer

Pandas dataframe copy warning

When you go and do something like this:

quote_df = quote_df.ix[:,[0,3,2,1,4,5,8,9,30,31]]

pandas.ix in this case returns a new, stand alone dataframe. Any values you decide to change in this dataframe will not change the original dataframe. This is what pandas tries to warn you about.

Why .ix is a bad idea

The .ix object tries to do more than one thing, and for anyone who has read anything about clean code, this is a strong smell. Given this dataframe:

df = pd.DataFrame({"a": [1,2,3,4], "b": [1,1,2,2]})

Two behaviors:

dfcopy = df.ix[:,["a"]]
dfcopy.a.ix[0] = 2

Behavior one: dfcopy is now a stand alone dataframe. Changing it will not change df.

df.ix[0, "a"] = 3

Behavior two: This changes the original dataframe.

Use .loc instead

The pandas developers recognized that the .ix object was quite smelly (speculatively) and thus created two new objects which help in the accession and assignment of data. (The other being .iloc.)

.loc is faster, because it does not try to create a copy of the data.
.loc is meant to modify your existing dataframe inplace, which is more memory efficient.
.loc is predictable, it has one behavior.

The solution

What you are doing in your code example is loading a big file with lots of columns, then modifying it to be smaller. The pd.read_csv function can help you out with a lot of this and also make the loading of the file a lot faster. So instead of doing this

quote_df = pd.read_csv(StringIO(str_of_all), sep=',', names=list('ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefg')) #dtype={'A': object, 'B': object, 'C': np.float64}
quote_df.rename(columns={'A':'STK', 'B':'TOpen', 'C':'TPCLOSE', 'D':'TPrice', 'E':'THigh', 'F':'TLow', 'I':'TVol', 'J':'TAmt', 'e':'TDate', 'f':'TTime'}, inplace=True)
quote_df = quote_df.ix[:,[0,3,2,1,4,5,8,9,30,31]]

Do this

columns = ['STK', 'TOpen', 'TPCLOSE', 'TPrice', 'THigh', 'TLow', 'TVol', 'TAmt', 'TDate', 'TTime']  # listed in file order
df = pd.read_csv(StringIO(str_of_all), sep=',', header=None, usecols=[0,1,2,3,4,5,8,9,30,31])
df.columns = columns

This will only read the columns you are interested in, and name them properly. Note that read_csv ignores the order of the entries in usecols, so the names are listed in file order, and header=None keeps the first line of the file as data. No need for using the evil .ix object to do magical stuff.
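
One detail worth checking when combining usecols with a hand-written list of names: read_csv returns the selected columns in file order, regardless of the order in which they are listed in usecols. The sketch below uses made-up rows with only four columns to show this, and reorders afterwards with an explicit column list.

```python
from io import StringIO

import pandas as pd

# Made-up rows; the columns stand for STK, TOpen, TPCLOSE, TPrice.
raw = "s1,1.0,2.0,3.0\ns2,4.0,5.0,6.0\n"

# usecols selects columns but does not reorder them: the result is in file order.
df = pd.read_csv(StringIO(raw), header=None, usecols=[0, 3, 2, 1])
df.columns = ["STK", "TOpen", "TPCLOSE", "TPrice"]  # names given in file order

# Reorder afterwards with an explicit column list if a different order is wanted.
df = df[["STK", "TPrice", "TPCLOSE", "TOpen"]]
```

Assigning names in the usecols order instead would silently swap the TOpen and TPrice labels in this example.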

User · Answer

For me this issue occurred in the following simplified example. And I was also able to solve it (hopefully with a correct solution):

old code with warning:

def update_old_dataframe(old_dataframe, new_dataframe):
    for new_index, new_row in new_dataframe.iterrows():
        old_dataframe.loc[new_index] = update_row(old_dataframe.loc[new_index], new_row)

def update_row(old_row, new_row):
    for field in [list_of_columns]:
        # line with warning because of chain indexing old_dataframe[new_index][field]
        old_row[field] = new_row[field]
    return old_row

This printed the warning for the line old_row[field] = new_row[field]

Since the rows in the update_row method are actually of type Series, I replaced the line with:

old_row.at[field] = new_row.at[field]

i.e. the method for accessing/lookups for a Series. Even though both work just fine and the result is the same, this way I don't have to disable the warnings (= keep them for other chain indexing issues somewhere else).

I hope this may help someone.
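
A small self-contained variant of this idea is sketched below. It is an assumption-laden illustration rather than the poster's actual code: the frames, the fields argument, and the decision to work on a copy of the row are all hypothetical additions.

```python
import pandas as pd


def update_row(old_row, new_row, fields):
    """Return a copy of old_row with the given fields taken from new_row."""
    updated = old_row.copy()  # work on an independent Series
    for field in fields:
        updated.at[field] = new_row.at[field]  # scalar access by label
    return updated


old = pd.DataFrame({"x": [1.0, 2.0], "y": [3.0, 4.0]}, index=["a", "b"])
new = pd.DataFrame({"x": [10.0], "y": [30.0]}, index=["b"])

for idx, new_row in new.iterrows():
    # Write the updated row back with a single loc call on the original frame.
    old.loc[idx] = update_row(old.loc[idx], new_row, ["x"])
```

Copying the row first means the helper never writes into an object derived from old, and the only write to old is the explicit loc assignment.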

User · Answer

Some may want to simply suppress the warning:

class SupressSettingWithCopyWarning:
    def __enter__(self):
        pd.options.mode.chained_assignment = None

    def __exit__(self, *args):
        pd.options.mode.chained_assignment = 'warn'

with SupressSettingWithCopyWarning():
    ...  # code that produces warning
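
Note that the class above always resets the option to 'warn' on exit, which would override a different setting chosen earlier in the program. A hedged sketch of a variant that restores whatever value was active before is shown below. The helper name temporary_option is hypothetical, and the demonstration uses display.max_rows purely as a stand-in option; on pandas versions that expose it, the same helper could be called with "mode.chained_assignment" and None. pandas.option_context offers the same behaviour out of the box.

```python
from contextlib import contextmanager

import pandas as pd


@contextmanager
def temporary_option(name, value):
    """Set a pandas option inside the with-block and restore the previous value."""
    previous = pd.get_option(name)
    pd.set_option(name, value)
    try:
        yield
    finally:
        # Runs even if the block raises, so the old setting always comes back.
        pd.set_option(name, previous)


before = pd.get_option("display.max_rows")
with temporary_option("display.max_rows", 5):
    inside = pd.get_option("display.max_rows")
after = pd.get_option("display.max_rows")
```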

User · Answer

You could avoid the whole problem like this, I believe:

return (
    pd.read_csv(StringIO(str_of_all), sep=',', names=list('ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefg'))
    # dtype={'A': object, 'B': object, 'C': np.float64}
    # no inplace=True here: rename must return the frame for the chain to continue
    .rename(columns={'A':'STK', 'B':'TOpen', 'C':'TPCLOSE', 'D':'TPrice', 'E':'THigh', 'F':'TLow', 'I':'TVol', 'J':'TAmt', 'e':'TDate', 'f':'TTime'})
    .iloc[:,[0,3,2,1,4,5,8,9,30,31]]
    .assign(
        TClose=lambda df: df['TPrice'],
        RT=lambda df: 100 * (df['TPrice']/df['TPCLOSE'] - 1),
        TVol=lambda df: df['TVol']/TVOL_SCALE,
        TAmt=lambda df: df['TAmt']/TAMT_SCALE,
        STK_ID=lambda df: df['STK'].str.slice(13,19),
        STK_Name=lambda df: df['STK'].str.slice(21,30),  # .decode('gb2312')
        TDate=lambda df: df.TDate.map(lambda x: x[0:4]+x[5:7]+x[8:10]),
    )
)

Using Assign. From the documentation: Assign new columns to a DataFrame, returning a new object (a copy) with all the original columns in addition to the new ones.

See Tom Augspurger's article on method chaining in pandas: https://tomaugspurger.github.io/method-chaining
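
To see the assign pattern in isolation, here is a reduced sketch with three of the columns and made-up numbers. TVOL_SCALE is given a hypothetical value, since the original post does not state it.

```python
import pandas as pd

TVOL_SCALE = 100  # hypothetical scale factor

quotes = pd.DataFrame(
    {"TPrice": [11.0, 9.0], "TPCLOSE": [10.0, 10.0], "TVol": [500, 1500]}
)

# Each step returns a new frame, so nothing is ever assigned into a slice.
result = (
    quotes
    .loc[:, ["TPrice", "TPCLOSE", "TVol"]]
    .assign(
        RT=lambda df: 100 * (df["TPrice"] / df["TPCLOSE"] - 1),
        TVol=lambda df: df["TVol"] / TVOL_SCALE,
    )
)
```

The lambdas receive the intermediate frame, which is why they refer to df rather than to any outer variable.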

User · Answer

To remove any doubt, my solution was to make a deep copy of the slice instead of a regular copy. This may not be applicable depending on your context (memory constraints / size of the slice, potential for performance degradation, especially if the copy occurs in a loop like it did for me, etc.).

To be clear, here is the warning I received:

/opt/anaconda3/lib/python3.6/site-packages/ipykernel/__main__.py:54:
SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame
See the caveats in the documentation:
http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy

Illustration

I had doubts that the warning was thrown because of a column I was dropping on a copy of the slice. While not technically trying to set a value in the copy of the slice, that was still a modification of the copy of the slice. Below are the (simplified) steps I have taken to confirm the suspicion. I hope it will help those of us who are trying to understand the warning.

Example 1: dropping a column on the original affects the copy

We knew that already but this is a healthy reminder. This is NOT what the warning is about.

>> data1 = {'A': [111, 112, 113], 'B':[121, 122, 123]}
>> df1 = pd.DataFrame(data1)
>> df1

     A    B
0  111  121
1  112  122
2  113  123

>> df2 = df1
>> df2

     A    B
0  111  121
1  112  122
2  113  123

# Dropping a column on df1 affects df2
>> df1.drop('A', axis=1, inplace=True)
>> df2

     B
0  121
1  122
2  123

It is possible to avoid changes made on df1 affecting df2. Note: you can avoid importing copy.deepcopy by doing df.copy() instead.

>> data1 = {'A': [111, 112, 113], 'B':[121, 122, 123]}
>> df1 = pd.DataFrame(data1)
>> df1

     A    B
0  111  121
1  112  122
2  113  123

>> import copy
>> df2 = copy.deepcopy(df1)
>> df2

     A    B
0  111  121
1  112  122
2  113  123

# Dropping a column on df1 does not affect df2
>> df1.drop('A', axis=1, inplace=True)
>> df2

     A    B
0  111  121
1  112  122
2  113  123

Example 2: dropping a column on the copy may affect the original

This actually illustrates the warning.

>> data1 = {'A': [111, 112, 113], 'B':[121, 122, 123]}
>> df1 = pd.DataFrame(data1)
>> df1

     A    B
0  111  121
1  112  122
2  113  123

>> df2 = df1
>> df2

     A    B
0  111  121
1  112  122
2  113  123

# Dropping a column on df2 can affect df1
# No slice involved here, but I believe the principle remains the same?
# Let me know if not
>> df2.drop('A', axis=1, inplace=True)
>> df1

     B
0  121
1  122
2  123

It is possible to avoid changes made on df2 affecting df1.

>> data1 = {'A': [111, 112, 113], 'B':[121, 122, 123]}
>> df1 = pd.DataFrame(data1)
>> df1

     A    B
0  111  121
1  112  122
2  113  123

>> import copy
>> df2 = copy.deepcopy(df1)
>> df2

     A    B
0  111  121
1  112  122
2  113  123

>> df2.drop('A', axis=1, inplace=True)
>> df1

     A    B
0  111  121
1  112  122
2  113  123

Cheers!
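
A note on what the two examples show: df2 = df1 does not create a copy at all. It binds a second name to the same DataFrame object, which is why a change through either name is visible through the other. The short sketch below (with the illustrative names alias and independent) makes the distinction explicit.

```python
import pandas as pd

df1 = pd.DataFrame({"A": [111, 112, 113], "B": [121, 122, 123]})

alias = df1                # same object under a second name
independent = df1.copy()   # separate object with its own data

df1.drop("A", axis=1, inplace=True)

print(alias is df1)               # the alias is literally df1
print(list(alias.columns))        # so it lost column 'A' as well
print(list(independent.columns))  # the copy still has both columns
```

This is plain Python name binding rather than the view-versus-copy question that the warning is about, although the fix (an explicit copy) is the same.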

User · Answer

I had been getting this issue with .apply() when assigning a new dataframe from a pre-existing dataframe on which I've used the .query() method. For instance:

prop_df = df.query('column == "value"')
prop_df['new_column'] = prop_df.apply(function, axis=1)

Would return this warning. The fix that seems to resolve the warning in this case is by changing this to:

prop_df = df.copy(deep=True)
prop_df = prop_df.query('column == "value"')
prop_df['new_column'] = prop_df.apply(function, axis=1)

However, this is NOT efficient, especially when using large dataframes, due to having to make a new copy.

If you're using the .apply() method in generating a new column and its values, a fix that resolves the warning and is more efficient is adding .reset_index(drop=True):

prop_df = df.query('column == "value"').reset_index(drop=True)
prop_df['new_column'] = prop_df.apply(function, axis=1)

User · Answer

This topic is really confusing with Pandas. Luckily, it has a relatively simple solution.

The problem is that it is not always clear whether data filtering operations (e.g. loc) return a copy or a view of the DataFrame. Further use of such a filtered DataFrame could therefore be confusing.

The simple solution is (unless you need to work with very large sets of data):

Whenever you need to update any values, always make sure that you explicitly copy the DataFrame before the assignment.

df  # Some DataFrame
df = df.loc[:, 0:2]  # Some filtering (unsure whether a view or copy is returned)
df = df.copy()  # Ensuring a copy is made
df[df["Name"] == "John"] = "Johny"  # Assignment can be done now (no warning)
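
A runnable sketch of the copy-then-assign advice follows. The sample frame and the name adults are made up for illustration, and the assignment here targets only the Name column through loc, which is an assumption about the intent of the snippet above.

```python
import pandas as pd

df = pd.DataFrame({"Name": ["John", "Mary", "John"], "Age": [30, 25, 41]})

# Filter, then copy explicitly so later writes clearly target a separate frame.
adults = df[df["Age"] > 26].copy()
adults.loc[adults["Name"] == "John", "Name"] = "Johny"

print(adults)
print(df)  # the original frame is unchanged
```

The cost is one extra copy of the filtered rows, which is usually negligible unless the data is very large.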

User · Answer

If you have assigned the slice to a variable and want to set using the variable, as in the following:

df2 = df[df['A'] > 2]
df2['B'] = value

And you do not want to use Jeff's solution because your condition computing df2 is too long or for some other reason, then you can use the following:

df.loc[df2.index.tolist(), 'B'] = value

df2.index.tolist() returns the indices from all entries in df2, which will then be used to set column B in the original dataframe.
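
The write-back through index labels can be sketched as follows, with made-up data. It assumes the index labels of df are unique, since duplicate labels would select more rows than the filter did.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 3, 5], "B": [0, 0, 0]})

df2 = df[df["A"] > 2]         # filtered rows keep their original index labels
df.loc[df2.index, "B"] = 99   # write to the original through those labels

print(df)
```

Passing df2.index directly works as well as df2.index.tolist(); both select the same rows by label.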

User · Answer

The SettingWithCopyWarning was created to flag potentially confusing "chained" assignments, such as the following, which does not always work as expected, particularly when the first selection returns a copy. [see GH5390 and GH5597 for background discussion.]

df[df['A'] > 2]['B'] = new_val  # new_val not set in df

The warning offers a suggestion to rewrite as follows:

df.loc[df['A'] > 2, 'B'] = new_val

However, this doesn't fit your usage, which is equivalent to:

df = df[df['A'] > 2]
df['B'] = new_val

While it's clear that you don't care about writes making it back to the original frame (since you are overwriting the reference to it), unfortunately this pattern cannot be differentiated from the first chained assignment example. Hence the (false positive) warning. The potential for false positives is addressed in the docs on indexing, if you'd like to read further. You can safely disable this new warning with the following assignment.

import pandas as pd
pd.options.mode.chained_assignment = None  # default='warn'

Other Resources

pandas User Guide: Indexing and selecting data
Python Data Science Handbook: Data Indexing and Selection
Real Python: SettingWithCopyWarning in Pandas: Views vs Copies
Dataquest: SettingwithCopyWarning: How to Fix This Warning in Pandas
Towards Data Science: Explaining the SettingWithCopyWarning in pandas
