How to access pandas groupby dataframe by key

Question

How do I access the corresponding groupby dataframe in a groupby object by the key?

With the following groupby:

    rand = np.random.RandomState(1)
    df = pd.DataFrame({'A': ['foo', 'bar'] * 3,
                       'B': rand.randn(6),
                       'C': rand.randint(0, 20, 6)})
    gb = df.groupby(['A'])

I can iterate through it to get the keys and groups:

    In [11]: for k, gp in gb:
                 print('key=' + str(k))
                 print(gp)
    key=bar
         A         B   C
    1  bar -0.611756  18
    3  bar -1.072969  10
    5  bar -2.301539  18
    key=foo
         A         B   C
    0  foo  1.624345   5
    2  foo -0.528172  11
    4  foo  0.865408  14

I would like to be able to access a group by its key:

    In [12]: gb['foo']
    Out[12]:
         A         B   C
    0  foo  1.624345   5
    2  foo -0.528172  11
    4  foo  0.865408  14

But when I try doing that with gb[('foo',)] I get this weird pandas.core.groupby.DataFrameGroupBy object thing which doesn't seem to have any methods that correspond to the DataFrame I want.

The best I could think of is:

    In [13]: def gb_df_key(gb, key, orig_df):
                 # gb.indices maps each key to the integer positions of its rows
                 ix = gb.indices[key]
                 return orig_df.iloc[ix]

             gb_df_key(gb, 'foo', df)
    Out[13]:
         A         B   C
    0  foo  1.624345   5
    2  foo -0.528172  11
    4  foo  0.865408  14

but this is kind of nasty, considering how nice pandas usually is at these things. What's the built-in way of doing this?

User · Answer

Wes McKinney (pandas' author) in Python for Data Analysis provides the following recipe:

    groups = dict(list(gb))

which returns a dictionary whose keys are your group labels and whose values are DataFrames, i.e.

    groups['foo']

will yield what you are looking for:

         A         B   C
    0  foo  1.624345   5
    2  foo -0.528172  11
    4  foo  0.865408  14
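A minimal, self-contained sketch of the dictionary recipe follows. The column values are fixed numbers chosen for illustration (not the random data from the question), and the frame is grouped by the column name "A" rather than by a one-element list, so the dictionary keys are plain strings.

```python
import pandas as pd

# Illustrative data; the values are made up so the result is deterministic.
df = pd.DataFrame({
    "A": ["foo", "bar"] * 3,
    "B": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "C": [5, 18, 11, 10, 14, 18],
})
gb = df.groupby("A")

# Iterating a GroupBy yields (key, sub-DataFrame) pairs,
# so the pairs can be fed straight into dict().
groups = dict(list(gb))

foo = groups["foo"]
print(sorted(groups))
print(foo)
```

Note that this builds one sub-DataFrame per group up front, which is convenient when most groups are needed but wasteful when only one is.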

User · Answer

You can use the get_group method:

    In [21]: gb.get_group('foo')
    Out[21]:
         A         B   C
    0  foo  1.624345   5
    2  foo -0.528172  11
    4  foo  0.865408  14

Note: This doesn't require creating an intermediary dictionary / copy of every subdataframe for every group, so it will be much more memory-efficient than creating the naive dictionary with dict(iter(gb)). This is because it uses data structures already available in the groupby object.

You can select different columns using the groupby slicing:

    In [22]: gb[["A", "B"]].get_group("foo")
    Out[22]:
         A         B
    0  foo  1.624345
    2  foo -0.528172
    4  foo  0.865408

    In [23]: gb["C"].get_group("foo")
    Out[23]:
    0     5
    2    11
    4    14
    Name: C, dtype: int64
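As a runnable sketch of the same calls, the example below uses made-up, fixed values instead of the question's random data, and groups by the column name "A" so that the key passed to get_group is a plain string.

```python
import pandas as pd

# Illustrative data; the values are made up so the result is deterministic.
df = pd.DataFrame({
    "A": ["foo", "bar"] * 3,
    "B": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "C": [5, 18, 11, 10, 14, 18],
})
gb = df.groupby("A")

# Fetch a single group without materialising the others.
foo = gb.get_group("foo")

# Column selection on the GroupBy happens before get_group.
foo_ab = gb[["A", "B"]].get_group("foo")  # DataFrame
foo_c = gb["C"].get_group("foo")          # Series

print(foo)
print(foo_ab)
print(foo_c)
```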

User · Answer

Rather than gb.get_group('foo'), I prefer using gb.groups:

    df.loc[gb.groups['foo']]

Because in this way you can choose multiple columns as well, for example:

    df.loc[gb.groups['foo'], ['A', 'B']]
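The sketch below illustrates the gb.groups approach with made-up data and a deliberately non-default index (the letters a to f, an assumption for illustration). It also shows why the two lookup tables on a GroupBy pair with different indexers: gb.groups holds row labels and goes with .loc, while gb.indices holds integer positions and goes with .iloc.

```python
import pandas as pd

# Illustrative data with a string index so labels and positions differ.
df = pd.DataFrame(
    {
        "A": ["foo", "bar"] * 3,
        "B": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
        "C": [5, 18, 11, 10, 14, 18],
    },
    index=list("abcdef"),
)
gb = df.groupby("A")

labels = gb.groups["foo"]      # row labels of the group
positions = gb.indices["foo"]  # integer positions of the same rows

foo_by_label = df.loc[labels]
foo_by_position = df.iloc[positions]

# Rows and columns can be selected in one .loc call.
foo_ab = df.loc[labels, ["A", "B"]]

print(list(labels), list(positions))
print(foo_ab)
```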

User · Answer

I was looking for a way to sample a few members of the GroupBy obj - had to address the posted question to get this done.

    import random

    # create groupby object based on some_key column
    grouped = df.groupby('some_key')

    # pick N group keys at random (random.sample needs a sequence, not a dict)
    sampled_df_i = random.sample(list(grouped.indices), N)

    # grab the groups
    df_list = [grouped.get_group(df_i) for df_i in sampled_df_i]

    # optionally - turn it all back into a single dataframe object
    sampled_df = pd.concat(df_list, axis=0, join='outer')
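Here is a self-contained sketch of the sampling idea. The frame, the column names some_key and value, the seed, and N = 2 are all assumptions chosen for illustration.

```python
import random

import pandas as pd

# Illustrative data: four groups of two rows each.
df = pd.DataFrame({
    "some_key": list("aabbccdd"),
    "value": range(8),
})
grouped = df.groupby("some_key")

random.seed(0)  # fixed seed only to make the sketch repeatable
N = 2

# gb.groups is dict-like, so list() gives the group keys.
sampled_keys = random.sample(list(grouped.groups), N)

# Fetch each sampled group and stack them back into one DataFrame.
sampled_df = pd.concat(
    [grouped.get_group(key) for key in sampled_keys],
    axis=0,
)

print(sampled_keys)
print(sampled_df)
```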

User · Answer

    gb = df.groupby(['A'])
    gb_groups = gb.groups

If you are looking for selective groupby objects, call gb_groups.keys() to see the available keys and put the desired ones into key_list:

    gb_groups.keys()

    key_list = [key1, key2, key3]  # and so on...

    for key, values in gb_groups.items():
        if key in key_list:
            # values holds the row labels of this group
            print(df.loc[values], "\n")
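A runnable sketch of picking several groups by key is shown below. The data and the contents of key_list are made up for illustration; the result is collected into a dictionary rather than printed so that each selected group stays addressable by its key.

```python
import pandas as pd

# Illustrative data with three groups.
df = pd.DataFrame({
    "A": ["foo", "bar", "baz"] * 2,
    "C": [1, 2, 3, 4, 5, 6],
})
gb = df.groupby("A")

key_list = ["foo", "baz"]  # hypothetical keys of interest

# gb.groups maps each key to the row labels of that group.
selected = {
    key: df.loc[labels]
    for key, labels in gb.groups.items()
    if key in key_list
}

for key, frame in selected.items():
    print(key)
    print(frame, "\n")
```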
