[python] Pandas aggregate count distinct

Just adding to the answers already given, the solution using the string "nunique" seems much faster, tested here on ~21M rows dataframe, then grouped to ~2M

%time _=g.agg({"id": lambda x: x.nunique()})
CPU times: user 3min 3s, sys: 2.94 s, total: 3min 6s
Wall time: 3min 20s

%time _=g.agg({"id": pd.Series.nunique})
CPU times: user 3min 2s, sys: 2.44 s, total: 3min 4s
Wall time: 3min 18s

%time _=g.agg({"id": "nunique"})
CPU times: user 14 s, sys: 4.76 s, total: 18.8 s
Wall time: 24.4 s