Multiple aggregations of the same column using pandas GroupBy.agg()

Question

Is there a pandas built-in way to apply two different aggregating functions, f1 and f2, to the same column df["returns"] without having to call agg() multiple times?

Example dataframe:

import pandas as pd
import numpy as np
import datetime as dt

np.random.seed(0)
df = pd.DataFrame({
    "date": [dt.date(2012, x, 1) for x in range(1, 11)],
    "returns": 0.05 * np.random.randn(10),
    "dummy": np.repeat(1, 10),
})

The syntactically wrong, but intuitively right, way to do it would be:

# Assume `f1` and `f2` are defined for aggregating.
df.groupby("dummy").agg({"returns": f1, "returns": f2})

Obviously, a Python dict can't hold duplicate keys. Is there any other manner for expressing the input to agg()? Perhaps a list of tuples [(column, function)] would work better, to allow multiple functions applied to the same column? But agg() seems like it only accepts a dictionary.

Is there a workaround for this besides defining an auxiliary function that just applies both of the functions inside of it? (How would this work with aggregation anyway?)
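A small sketch of why the duplicate-key dictionary in the question cannot work: Python accepts the literal but silently keeps only the last value for a repeated key, so agg() would never see f1. Here f1 and f2 are hypothetical stand-ins (np.mean and np.sum) chosen only for illustration.

```python
import numpy as np

# Hypothetical stand-ins for the question's f1 and f2.
f1 = np.mean
f2 = np.sum

# A dict literal with a repeated key is legal syntax,
# but the later value overwrites the earlier one.
spec = {"returns": f1, "returns": f2}

print(len(spec))              # 1
print(spec["returns"] is f2)  # True
```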

User · Accepted Answer

You can simply pass the functions as a list:

In [20]: df.groupby("dummy").agg({"returns": [np.mean, np.sum]})
Out[20]:         
           mean       sum
dummy                    
1      0.036901  0.369012

or as a dictionary:

In [21]: df.groupby('dummy').agg({'returns':
                                  {'Mean': np.mean, 'Sum': np.sum}})
Out[21]: 
        returns          
           Mean       Sum
dummy                    
1      0.036901  0.369012
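For reference, a self-contained sketch of the list form applied to the question's dataframe. It differs from the answer above in two ways that are my own choices, not the answerer's: the column is selected before calling agg(), which gives flat column labels, and the string aliases "mean" and "sum" are used in place of np.mean and np.sum.

```python
import datetime as dt

import numpy as np
import pandas as pd

# Rebuild the question's dataframe with a fixed seed.
np.random.seed(0)
df = pd.DataFrame({
    "date": [dt.date(2012, x, 1) for x in range(1, 11)],
    "returns": 0.05 * np.random.randn(10),
    "dummy": np.repeat(1, 10),
})

# A list of functions applies each one to the selected column;
# the output columns are named after the functions.
result = df.groupby("dummy")["returns"].agg(["mean", "sum"])
print(result)
```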

User · Answer

TLDR; Pandas groupby.agg has a new, easier syntax for specifying (1) aggregations on multiple columns, and (2) multiple aggregations on a column. So, to do this for pandas >= 0.25, use

df.groupby('dummy').agg(Mean=('returns', 'mean'), Sum=('returns', 'sum'))

           Mean       Sum
dummy                    
1      0.036901  0.369012

OR

df.groupby('dummy')['returns'].agg(Mean='mean', Sum='sum')

           Mean       Sum
dummy                    
1      0.036901  0.369012

Pandas >= 0.25: Named Aggregation

Pandas has changed the behavior of GroupBy.agg in favour of a more intuitive syntax for specifying named aggregations. See the 0.25 docs section on Enhancements as well as relevant GitHub issues GH18366 and GH26512.

From the documentation:

"To support column-specific aggregation with control over the output column names, pandas accepts the special syntax in GroupBy.agg(), known as 'named aggregation', where

- The keywords are the output column names
- The values are tuples whose first element is the column to select and the second element is the aggregation to apply to that column. Pandas provides the pandas.NamedAgg namedtuple with the fields ['column', 'aggfunc'] to make it clearer what the arguments are. As usual, the aggregation can be a callable or a string alias."

You can now pass a tuple via keyword arguments. The tuples follow the format of (<colName>, <aggFunc>).

import pandas as pd

pd.__version__
# '0.25.0.dev0+840.g989f912ee'

# Setup
df = pd.DataFrame({'kind': ['cat', 'dog', 'cat', 'dog'],
                   'height': [9.1, 6.0, 9.5, 34.0],
                   'weight': [7.9, 7.5, 9.9, 198.0]})

df.groupby('kind').agg(
    max_height=('height', 'max'),
    min_weight=('weight', 'min'),
)

      max_height  min_weight
kind                        
cat          9.5         7.9
dog         34.0         7.5

Alternatively, you can use pd.NamedAgg (essentially a namedtuple), which makes things more explicit.

df.groupby('kind').agg(
    max_height=pd.NamedAgg(column='height', aggfunc='max'),
    min_weight=pd.NamedAgg(column='weight', aggfunc='min'),
)

      max_height  min_weight
kind                        
cat          9.5         7.9
dog         34.0         7.5

It is even simpler for Series: just pass the aggfunc to a keyword argument.

df.groupby('kind')['height'].agg(max_height='max', min_height='min')

      max_height  min_height
kind                        
cat          9.5         9.1
dog         34.0         6.0

Lastly, if your column names aren't valid Python identifiers, use a dictionary with unpacking:

df.groupby('kind')['height'].agg(**{'max height': 'max', ...})

Pandas < 0.25

In more recent versions of pandas leading up to 0.24, if using a dictionary for specifying column names for the aggregation output, you will get a FutureWarning:

df.groupby('dummy').agg({'returns': {'Mean': 'mean', 'Sum': 'sum'}})
# FutureWarning: using a dict with renaming is deprecated and will be removed
# in a future version

Using a dictionary for renaming columns is deprecated in v0.20. On more recent versions of pandas, this can be specified more simply by passing a list of tuples. If specifying the functions this way, all functions for that column need to be specified as tuples of (name, function) pairs.

df.groupby("dummy").agg({'returns': [('op1', 'sum'), ('op2', 'mean')]})

        returns          
            op1       op2
dummy                    
1      0.328953  0.032895

Or,

df.groupby("dummy")['returns'].agg([('op1', 'sum'), ('op2', 'mean')])

            op1       op2
dummy                    
1      0.328953  0.032895
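To tie named aggregation back to the question's f1 and f2, here is a minimal sketch that mixes a user-defined callable with a string alias on the same column. It assumes pandas >= 0.25. The function value_range and the output names spread and middle are hypothetical, chosen only for illustration, and the date column from the question is left out for brevity.

```python
import numpy as np
import pandas as pd

# The question's dataframe, minus the date column.
np.random.seed(0)
df = pd.DataFrame({
    "returns": 0.05 * np.random.randn(10),
    "dummy": np.repeat(1, 10),
})

def value_range(s):
    """Hypothetical custom aggregation: max minus min of a Series."""
    return s.max() - s.min()

# Each keyword is an output column name; each value is (column, aggfunc).
# Both aggregations read from the same "returns" column.
out = df.groupby("dummy").agg(
    spread=("returns", value_range),
    middle=("returns", "median"),
)
print(out)
```

Because the output names are keyword arguments rather than dictionary keys built from the source column, repeating the source column is not a problem.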

User · Answer

Would something like this work:

In [7]: df.groupby('dummy').returns.agg({'func1': lambda x: x.sum(), 'func2': lambda x: x.prod()})
Out[7]: 
              func2     func1
dummy                        
1     -4.263768e-16 -0.188565
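The same two lambdas can also be written with keyword arguments instead of a renaming dictionary. This is a sketch assuming pandas >= 0.25, using a tiny made-up dataframe (not the data behind the output above) so the results are easy to check by hand.

```python
import pandas as pd

# Hypothetical three-row frame, used only for illustration.
df = pd.DataFrame({
    "dummy": [1, 1, 1],
    "returns": [0.5, -0.2, 0.1],
})

# Keyword names become the output column names.
out = df.groupby("dummy")["returns"].agg(
    func1=lambda x: x.sum(),
    func2=lambda x: x.prod(),
)
print(out)
```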
