I think df['word'].value_counts()
should serve. By skipping the groupby machinery, you'll save some time. I'm not sure why count
should be much slower than max
. Both take some time to avoid missing values. (Compare with size
.)
In any case, value_counts has been specifically optimized to handle object type, like your words, so I doubt you'll do much better than that.