Python Pandas: pivot table with aggfunc = count unique distinct

Question

    df2 = pd.DataFrame({'X': ['X1', 'X1', 'X1', 'X1'],
                        'Y': ['Y2', 'Y1', 'Y1', 'Y1'],
                        'Z': ['Z3', 'Z1', 'Z1', 'Z2']})

        X   Y   Z
    0  X1  Y2  Z3
    1  X1  Y1  Z1
    2  X1  Y1  Z1
    3  X1  Y1  Z2

    g = df2.groupby('X')

    pd.pivot_table(g, values='X', rows='Y', cols='Z', margins=False, aggfunc='count')

    Traceback (most recent call last):
      ...
    AttributeError: 'Index' object has no attribute 'index'

How do I get a pivot table with counts of unique values of one DataFrame column for two other columns? Is there an aggfunc for count unique? Should I be using np.bincount()?

NB: I am aware of Series.value_counts(), however I need a pivot table.

EDIT: The output should be:

    Z   Z1  Z2  Z3
    Y
    Y1   1   1 NaN
    Y2 NaN NaN   1

User · Answer

Since none of the answers are up to date with the latest version of Pandas, I am writing another solution for this problem:

    In [1]:
    import pandas as pd

    # Set example
    df2 = pd.DataFrame({'X': ['X1', 'X1', 'X1', 'X1'],
                        'Y': ['Y2', 'Y1', 'Y1', 'Y1'],
                        'Z': ['Z3', 'Z1', 'Z1', 'Z2']})

    # Pivot
    pd.crosstab(index=df2['Y'], columns=df2['Z'], values=df2['X'], aggfunc=pd.Series.nunique)

    Out[1]:
    Z    Z1   Z2   Z3
    Y
    Y1  1.0  1.0  NaN
    Y2  NaN  NaN  1.0

User · Answer

For best performance I recommend doing DataFrame.drop_duplicates followed by aggfunc='count'.

Others are correct that aggfunc=pd.Series.nunique will work. This can be slow, however, if the number of index groups you have is large (>1000).

So instead of (to quote @Javier)

    df2.pivot_table('X', 'Y', 'Z', aggfunc=pd.Series.nunique)

I suggest

    df2.drop_duplicates(['X', 'Y', 'Z']).pivot_table('X', 'Y', 'Z', aggfunc='count')

This works because it guarantees that every subgroup (each combination of ('Y', 'Z')) will have unique (non-duplicate) values of 'X'.
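To make the equivalence concrete, here is a minimal sketch on the question's data. It assumes a recent pandas (keyword arguments `index`/`columns` rather than the older `rows`/`cols`); the variable names `via_nunique`, `via_dedup` and `same` are only illustrative.

```python
import pandas as pd

df2 = pd.DataFrame({
    "X": ["X1", "X1", "X1", "X1"],
    "Y": ["Y2", "Y1", "Y1", "Y1"],
    "Z": ["Z3", "Z1", "Z1", "Z2"],
})

# Approach 1: count distinct X values per (Y, Z) cell directly.
via_nunique = df2.pivot_table(values="X", index="Y", columns="Z",
                              aggfunc=pd.Series.nunique)

# Approach 2: drop duplicate (X, Y, Z) rows first, then do a plain count.
via_dedup = (df2.drop_duplicates(["X", "Y", "Z"])
                .pivot_table(values="X", index="Y", columns="Z",
                             aggfunc="count"))

# Empty cells are NaN in both tables; fill them before comparing.
same = via_nunique.fillna(0).astype(int).equals(via_dedup.fillna(0).astype(int))
print(same)
```

Both tables agree here. The speed difference described above would only show up with many groups, so it is worth timing on your own data.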

User · Answer

aggfunc=pd.Series.nunique provides distinct count.

Full code:

    df2.pivot_table(values='X', rows='Y', cols='Z',
                    aggfunc=pd.Series.nunique)

Credit to @hume for this solution (see comment under the accepted answer). Adding as an answer here for better discoverability.

User · Answer

Do you mean something like this?

    In [39]: df2.pivot_table(values='X', rows='Y', cols='Z',
                             aggfunc=lambda x: len(x.unique()))
    Out[39]:
    Z   Z1  Z2  Z3
    Y
    Y1   1   1 NaN
    Y2 NaN NaN   1

Note that using len assumes you don't have NAs in your DataFrame. You can do x.value_counts().count() or len(x.dropna().unique()) otherwise.
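The NA caveat is easy to check on a small Series. This is only a sketch with made-up values (the Series `s` is not from the question); it shows that `unique()` keeps NaN as a value, while the two suggested alternatives skip it.

```python
import numpy as np
import pandas as pd

# Hypothetical column containing one missing value.
s = pd.Series(["X1", "X1", np.nan, "X2"])

with_na = len(s.unique())                  # 3: NaN is counted as a distinct value
without_na = len(s.dropna().unique())      # 2: NaN removed before counting
via_counts = s.value_counts().count()      # 2: value_counts() drops NaN by default
via_nunique = s.nunique()                  # 2: nunique() also drops NaN by default

print(with_na, without_na, via_counts, via_nunique)
```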

User · Answer

You can construct a pivot table for each distinct value of X. In this case,

    for xval, xgroup in g:
        ptable = pd.pivot_table(xgroup, rows='Y', cols='Z',
                                margins=False, aggfunc=numpy.size)

will construct a pivot table for each value of X. You may want to index ptable using the xvalue. With this code, I get (for X1)

         X
    Z   Z1  Z2  Z3
    Y
    Y1   2   1 NaN
    Y2 NaN NaN   1
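For newer pandas versions, the same idea might look like the sketch below. It is an adaptation, not the code above: it assumes the `index`/`columns` keywords, swaps `numpy.size` for `aggfunc="count"` on the `X` column, and stores each table in a dict (`tables`, a name chosen here) keyed by the X value.

```python
import pandas as pd

df2 = pd.DataFrame({
    "X": ["X1", "X1", "X1", "X1"],
    "Y": ["Y2", "Y1", "Y1", "Y1"],
    "Z": ["Z3", "Z1", "Z1", "Z2"],
})

# One pivot table per distinct value of X, keyed by that value.
tables = {}
for xval, xgroup in df2.groupby("X"):
    tables[xval] = pd.pivot_table(xgroup, values="X", index="Y", columns="Z",
                                  margins=False, aggfunc="count")

print(tables["X1"])
```

Note that these are row counts per cell (2 for Y1/Z1), not distinct counts.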

User · Answer

aggfunc=pd.Series.nunique will only count unique values for a series, in this case the unique values of a column. But this is not quite an alternative to aggfunc='count'.

For simple counting, it is better to use aggfunc=pd.Series.count.
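The difference shows up on the question's data, where the (Y1, Z1) cell holds two rows with the same X. A small sketch, assuming a recent pandas with `index`/`columns` keywords; `counts` and `distinct` are names chosen here.

```python
import pandas as pd

df2 = pd.DataFrame({
    "X": ["X1", "X1", "X1", "X1"],
    "Y": ["Y2", "Y1", "Y1", "Y1"],
    "Z": ["Z3", "Z1", "Z1", "Z2"],
})

# Plain count: number of non-null X entries in each (Y, Z) cell.
counts = df2.pivot_table(values="X", index="Y", columns="Z",
                         aggfunc=pd.Series.count)

# Distinct count: number of different X values in each (Y, Z) cell.
distinct = df2.pivot_table(values="X", index="Y", columns="Z",
                           aggfunc=pd.Series.nunique)

print(counts.loc["Y1", "Z1"])    # 2 rows fall in this cell
print(distinct.loc["Y1", "Z1"])  # but they share a single X value
```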

User · Answer

This is a good way of counting entries within .pivot_table:

    df2.pivot_table(values='X', index=['Y', 'Z'], columns='X', aggfunc='count')

            X1  X2
    Y   Z
    Y1  Z1   1   1
        Z2   1  NaN
    Y2  Z3   1  NaN

User · Answer

Since at least version 0.16 of pandas, it does not take the parameter "rows".

As of 0.23, the solution would be:

    df2.pivot_table(values='X', index='Y', columns='Z', aggfunc=pd.Series.nunique)

which returns:

    Z    Z1   Z2   Z3
    Y
    Y1  1.0  1.0  NaN
    Y2  NaN  NaN  1.0
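As a self-contained check, the sketch below runs that call on the question's data and compares it with a groupby-based formulation. The groupby/unstack variant is an alternative suggested here, not something taken from the answers in this thread.

```python
import pandas as pd

df2 = pd.DataFrame({
    "X": ["X1", "X1", "X1", "X1"],
    "Y": ["Y2", "Y1", "Y1", "Y1"],
    "Z": ["Z3", "Z1", "Z1", "Z2"],
})

# Distinct count of X for each (Y, Z) combination, using the current keywords.
table = df2.pivot_table(values="X", index="Y", columns="Z",
                        aggfunc=pd.Series.nunique)

# Alternative formulation: group, count distinct values, then pivot Z into columns.
alt = df2.groupby(["Y", "Z"])["X"].nunique().unstack("Z")

# Combinations that never occur are NaN in both tables; fill them before comparing.
matches = table.fillna(0).astype(int).equals(alt.fillna(0).astype(int))
print(table)
print(matches)
```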
