pandas groupby sort within groups

Question

I want to group my dataframe by two columns and then sort the aggregated results within the groups.

    In [167]: df
    Out[167]:
       count     job source
    0      2   sales      A
    1      4   sales      B
    2      6   sales      C
    3      3   sales      D
    4      7   sales      E
    5      5  market      A
    6      3  market      B
    7      2  market      C
    8      4  market      D
    9      1  market      E

    In [168]: df.groupby(['job','source']).agg({'count':sum})
    Out[168]:
                   count
    job    source
    market A           5
           B           3
           C           2
           D           4
           E           1
    sales  A           2
           B           4
           C           6
           D           3
           E           7

I would now like to sort the count column in descending order within each of the groups, and then take only the top three rows, to get something like:

                   count
    job    source
    market A           5
           D           4
           B           3
    sales  E           7
           C           6
           B           4

User · Answer

You can do it in one line:

    df.groupby(['job']).apply(lambda x: (x.sort_values(['count'], ascending=False)).head(3).drop('job', axis=1))

What apply() does is take each group produced by groupby and pass it to the lambda function as x.
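To make the "each group becomes x" idea concrete, here is a minimal sketch that does the same work by hand: it loops over the groups, sorts each one, keeps three rows, and glues the pieces back together. The helper name top_three is hypothetical, and the loop is used instead of apply so the sketch does not depend on how a given pandas version treats the grouping column inside apply.

```python
import pandas as pd

# Sample frame from the question.
df = pd.DataFrame({
    "count": [2, 4, 6, 3, 7, 5, 3, 2, 4, 1],
    "job": ["sales"] * 5 + ["market"] * 5,
    "source": list("ABCDE") * 2,
})


def top_three(group):
    """Hypothetical helper: the three rows of one group with the largest count."""
    return group.sort_values("count", ascending=False).head(3)


# Iterating over a groupby yields (key, sub-frame) pairs in sorted key order;
# each sub-frame plays the role of x in the lambda.
pieces = [top_three(group) for _, group in df.groupby("job")]

# Stack the per-group results back into one frame.
manual = pd.concat(pieces)
print(manual)
```

Unlike the one-liner, this keeps the job column and the original row labels, which can make the result easier to inspect.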

User · Answer

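Try this instead. It is a simple way to do a groupby and sort the summed values in descending order (shown here with the columns companyName and overallRating rather than the question's job and count):

    df.groupby(['companyName'])['overallRating'].sum().sort_values(ascending=False).head(20)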

User · Answer

What you want to do is actually again a groupby (on the result of the first groupby): sort and take the first three elements per group.

Starting from the result of the first groupby:

    In [60]: df_agg = df.groupby(['job','source']).agg({'count':sum})

We group by the first level of the index:

    In [63]: g = df_agg['count'].groupby('job', group_keys=False)

Then we want to sort ('order') each group and take the first three elements:

    In [64]: res = g.apply(lambda x: x.sort_values(ascending=False).head(3))

However, there is a shortcut function for this, nlargest:

    In [65]: g.nlargest(3)
    Out[65]:
    job     source
    market  A         5
            D         4
            B         3
    sales   E         7
            C         6
            B         4
    dtype: int64

So in one go, this looks like:

    df_agg['count'].groupby('job', group_keys=False).nlargest(3)
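For anyone who wants to run the steps above end to end, here is a self-contained sketch using the question's sample data. Two small choices are mine rather than the answer's: the aggregation is passed as the string "sum" instead of the builtin sum, and the second groupby names the index level explicitly with level="job".

```python
import pandas as pd

# Sample frame from the question.
df = pd.DataFrame({
    "count": [2, 4, 6, 3, 7, 5, 3, 2, 4, 1],
    "job": ["sales"] * 5 + ["market"] * 5,
    "source": list("ABCDE") * 2,
})

# First groupby: total count per (job, source) pair.
df_agg = df.groupby(["job", "source"]).agg({"count": "sum"})

# Second groupby on the 'job' index level, keeping the three largest totals per job.
top3 = df_agg["count"].groupby(level="job", group_keys=False).nlargest(3)
print(top3)
```

The result is a Series, and nlargest already returns each job's values in descending order, so no separate sort is needed.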

User · Answer

Here's another example of taking the top 3 in sorted order, and of sorting within the groups:

    In [43]: import pandas as pd

    In [44]: df = pd.DataFrame({"name":["Foo", "Foo", "Baar", "Foo", "Baar", "Foo", "Baar", "Baar"], "count_1":[5,10,12,15,20,25,30,35], "count_2":[100,150,100,25,250,300,400,500]})

    In [45]: df
    Out[45]:
       count_1  count_2  name
    0        5      100   Foo
    1       10      150   Foo
    2       12      100  Baar
    3       15       25   Foo
    4       20      250  Baar
    5       25      300   Foo
    6       30      400  Baar
    7       35      500  Baar

Top 3 in sorted order:

    In [46]: df.groupby(["name"])["count_1"].nlargest(3)
    Out[46]:
    name
    Baar  7    35
          6    30
          4    20
    Foo   5    25
          3    15
          1    10
    dtype: int64

Sorting within groups based on column count_1:

    In [48]: df.groupby(["name"]).apply(lambda x: x.sort_values(["count_1"], ascending=False)).reset_index(drop=True)
    Out[48]:
       count_1  count_2  name
    0       35      500  Baar
    1       30      400  Baar
    2       20      250  Baar
    3       12      100  Baar
    4       25      300   Foo
    5       15       25   Foo
    6       10      150   Foo
    7        5      100   Foo

User · Answer

You could also just do it in one go, by doing the sort first and using head to take the first 3 of each group.

    In[34]: df.sort_values(['job','count'],ascending=False).groupby('job').head(3)

    Out[35]:
       count     job source
    4      7   sales      E
    2      6   sales      C
    1      4   sales      B
    5      5  market      A
    8      4  market      D
    6      3  market      B
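A small variation on the sort-then-head approach, as a sketch: sort_values accepts one ascending flag per column, so job can stay in ascending order while count is sorted descending within each job. The variable names are only illustrative.

```python
import pandas as pd

# Sample frame from the question.
df = pd.DataFrame({
    "count": [2, 4, 6, 3, 7, 5, 3, 2, 4, 1],
    "job": ["sales"] * 5 + ["market"] * 5,
    "source": list("ABCDE") * 2,
})

# job ascending (market before sales), count descending inside each job.
ordered = df.sort_values(["job", "count"], ascending=[True, False])

# head(3) keeps the first three rows of every group, in the sorted row order.
top3 = ordered.groupby("job").head(3)
print(top3)
```

This works on the raw rows, so it fits the case where each (job, source) pair appears once. If pairs repeat and need to be summed, aggregate first and then apply the same sort and head.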

User · Answer

If you don't need to sum a column, then use @tvashtar's answer. If you do need to sum, then you can use @joris' answer or this one, which is very similar to it:

    df.groupby(['job']).apply(lambda x: (x.groupby('source')
                                          .sum()
                                          .sort_values('count', ascending=False))
                                         .head(3))
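To show the case where summing actually matters, here is a sketch with made-up data in which some (job, source) pairs occur more than once. The function top_n_summed and the frame raw are hypothetical and not part of any answer above. It sums first, then sorts and takes the head per job, without a nested apply.

```python
import pandas as pd


def top_n_summed(frame, group_col, sub_col, value_col, n=3):
    """Sum value_col per (group_col, sub_col), then keep the n largest sums per group_col."""
    # as_index=False keeps the grouping keys as ordinary columns.
    totals = frame.groupby([group_col, sub_col], as_index=False)[value_col].sum()
    # Groups in ascending order, sums in descending order inside each group.
    ordered = totals.sort_values([group_col, value_col], ascending=[True, False])
    return ordered.groupby(group_col).head(n).reset_index(drop=True)


# Illustrative data: ('sales', 'A') and ('market', 'B') each appear twice.
raw = pd.DataFrame({
    "job": ["sales", "sales", "sales", "sales", "sales", "market", "market", "market"],
    "source": ["A", "A", "B", "C", "D", "A", "B", "B"],
    "count": [1, 1, 4, 6, 3, 5, 2, 1],
})

result = top_n_summed(raw, "job", "source", "count", n=2)
print(result)
```

Here the two ('market', 'B') rows are combined into a total of 3 before ranking, which a plain sort on the raw rows would not do.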
