[file] Find duplicate lines in a file and count how many time each line was duplicated?

Via :

awk '{dups[$1]++} END{for (num in dups) {print num,dups[num]}}' data

In awk 'dups[$1]++' command, the variable $1 holds the entire contents of column1 and square brackets are array access. So, for each 1st column of line in data file, the node of the array named dups is incremented.

And at the end, we are looping over dups array with num as variable and print the saved numbers first then their number of duplicated value by dups[num].

Note that your input file has spaces on end of some lines, if you clear up those, you can use $0 in place of $1 in command above :)

Examples related to file

Gradle - Move a folder from ABC to XYZ Difference between opening a file in binary vs text Angular: How to download a file from HttpClient? Python error message io.UnsupportedOperation: not readable java.io.FileNotFoundException: class path resource cannot be opened because it does not exist Writing JSON object to a JSON file with fs.writeFileSync How to read/write files in .Net Core? How to write to a CSV line by line? Writing a dictionary to a text file? What are the pros and cons of parquet format compared to other formats?

Examples related to count

Count the Number of Tables in a SQL Server Database SQL count rows in a table How to count the occurrence of certain item in an ndarray? Laravel Eloquent - distinct() and count() not working properly together How to count items in JSON data Powershell: count members of a AD group How to count how many values per level in a given factor? Count number of rows by group using dplyr C++ - how to find the length of an integer JPA COUNT with composite primary key query not working

Examples related to find

Find a file by name in Visual Studio Code Explaining the 'find -mtime' command find files by extension, *.html under a folder in nodejs MongoDB Show all contents from all collections How can I find a file/directory that could be anywhere on linux command line? Get all files modified in last 30 days in a directory FileNotFoundError: [Errno 2] No such file or directory Linux find and grep command together find . -type f -exec chmod 644 {} ; Find all stored procedures that reference a specific column in some table

Examples related to duplicates

Remove duplicates from dataframe, based on two columns A,B, keeping row with max value in another column C Remove duplicates from a dataframe in PySpark How to "select distinct" across multiple data frame columns in pandas? How to find duplicate records in PostgreSQL Drop all duplicate rows across multiple columns in Python Pandas Left Join without duplicate rows from left table Finding duplicate integers in an array and display how many times they occurred How do I use SELECT GROUP BY in DataTable.Select(Expression)? How to delete duplicate rows in SQL Server? Python copy files to a new directory and rename if file name already exists

Examples related to lines

Plot multiple lines (data series) each with unique color in R More than 1 row in <Input type="textarea" /> Skip first couple of lines while reading lines in Python file remove empty lines from text file with PowerShell Find duplicate lines in a file and count how many time each line was duplicated? Batch / Find And Edit Lines in TXT file count (non-blank) lines-of-code in bash