Placing @EdChum's very nice answer into a function, count_unique_index.
The unique method only works on a pandas Series, not on a DataFrame.
The function below reproduces the behavior of the unique function in R, which the R documentation describes as follows:
"unique returns a vector, data frame or array like x but with duplicate elements/rows removed."
It also adds a count of the occurrences, as requested by the OP.
import pandas as pd

df1 = pd.DataFrame({'A':['yes','yes','yes','yes','no','no','yes','yes','yes','no'],
                    'B':['yes','no','no','no','yes','yes','no','yes','yes','no']})

def count_unique_index(df, by):
    # One row per distinct combination of the `by` columns, plus its row count
    return df.groupby(by).size().reset_index().rename(columns={0:'count'})

count_unique_index(df1, ['A','B'])
     A    B  count
0   no   no      1
1   no  yes      2
2  yes   no      4
3  yes  yes      3
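As a quick sanity check, here is a minimal sketch comparing the result with drop_duplicates, which is probably the closest pandas analogue of R's unique on a data frame (it returns the distinct rows, but without counts). The variable names counts and distinct are mine, not part of the original answer, and this version names the count column via reset_index(name='count') instead of rename; both should give the same table.

```python
import pandas as pd

df1 = pd.DataFrame({
    'A': ['yes', 'yes', 'yes', 'yes', 'no', 'no', 'yes', 'yes', 'yes', 'no'],
    'B': ['yes', 'no', 'no', 'no', 'yes', 'yes', 'no', 'yes', 'yes', 'no'],
})

def count_unique_index(df, by):
    # Same logic as above, but the count column is named directly
    # through reset_index(name=...) rather than a separate rename
    return df.groupby(by).size().reset_index(name='count')

counts = count_unique_index(df1, ['A', 'B'])

# drop_duplicates keeps one copy of each distinct row (no counts);
# sort and reindex so the rows line up with the groupby output
distinct = (df1.drop_duplicates()
               .sort_values(['A', 'B'])
               .reset_index(drop=True))

# The distinct rows match the key columns of the counted result,
# and the counts add up to the number of rows in df1
assert distinct.equals(counts[['A', 'B']])
assert counts['count'].sum() == len(df1)
```

If only the distinct rows are needed, drop_duplicates alone is enough; the groupby version is worth it when the counts matter.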