Via awk:
awk '{dups[$1]++} END{for (num in dups) {print num,dups[num]}}' data
In the awk command, 'dups[$1]++' uses the variable $1, which holds the first field (column 1) of the current line, and the square brackets are array access. So, for each line of the data file, the element of the array named dups that is indexed by that line's first field is incremented.
In the END block, we loop over the dups array with num as the loop variable and print each saved number followed by how many times it occurred, dups[num].
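To see the counting in action, here is a small sketch on made-up input. The sample values and the temporary file are assumptions for illustration only; they are not your real data. Since awk does not guarantee any particular order for a for (num in dups) loop, the output is piped through sort to keep it stable.

```shell
# Hypothetical sample input (assumption: one number per line, no trailing spaces).
data=$(mktemp)
printf '%s\n' 10 20 10 30 20 10 > "$data"

# Count how often each first-column value occurs; sort -n fixes the output order.
out=$(awk '{dups[$1]++} END{for (num in dups) {print num,dups[num]}}' "$data" | sort -n)
printf '%s\n' "$out"

rm -f "$data"
```

Each output line is a value followed by its count, so a count greater than 1 marks a value that appears more than once.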
Note that some lines of your input file end with trailing spaces. If you clean those up, you can use $0 (the whole line) in place of $1 in the command above :)
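A rough sketch of why the trailing spaces matter with $0, again on made-up input: the key "10 " (with a space) is a different array index than "10". One possible way to clean up inside awk itself, rather than editing the file, is to strip trailing blanks with sub() before counting. That is my own suggestion, not something the command above does.

```shell
# Hypothetical input: the second line carries a trailing space.
data=$(mktemp)
printf '%s\n' '10' '10 ' '20' > "$data"

# With $0 as the key, "10" and "10 " are separate entries: count the distinct keys.
raw_keys=$(awk '{dups[$0]++} END{for (k in dups) n++; print n}' "$data")

# Strip trailing spaces/tabs from the line first, then count whole lines.
clean=$(awk '{sub(/[ \t]+$/, ""); dups[$0]++} END{for (num in dups) print num, dups[num]}' "$data" | sort -n)

printf 'distinct keys without cleanup: %s\n' "$raw_keys"
printf '%s\n' "$clean"

rm -f "$data"
```

Without the cleanup there are three distinct keys for what are really two values; with it, both forms of 10 are counted together.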