[file] How to find the largest file in a directory and its subdirectories?

We're just starting a UNIX class and are learning a variety of Bash commands. Our assignment involves performing various commands on a directory that has a number of folders under it as well.

I know how to list and count all the regular files from the root folder using:

find . -type l | wc -l

But I'd like to know where to go from there in order to find the largest file in the whole directory. I've seen somethings regarding a du command, but we haven't learned that, so in the repertoire of things we've learned I assume we need to somehow connect it to the ls -t command.

And pardon me if my 'lingo' isn't correct, I'm still getting used to it!

This question is related to file bash directory find large-files

The answer is


ls -alR|awk '{ if ($5 > max) {max=$5;ff=$9}} END {print max "\t" ff;}'

That is quite simpler way to do it:

ls -l | tr -s " " " " | cut -d " " -f 5,9 | sort -n -r | head -n 1***

And you'll get this: 8445 examples.desktop


To list the larger file in a folder

ls -sh /pathFolder | sort -rh | head -n 1

The output of ls -sh is a sized s and human h understandable view of the file size number.

You could use ls -shS /pathFolder | head -n 1. The bigger S from ls already order the list from the larger files to the smaller ones but the first result its the sum of all files in that folder. So if you want just to list the bigger file, one file, you need to head -n 2 and check at the "second line result" or use the first example with ls sort head.


Quote from this link-

If you want to find and print the top 10 largest files names (not directories) in a particular directory and its sub directories

$ find . -printf '%s %p\n'|sort -nr|head

To restrict the search to the present directory use "-maxdepth 1" with find.

$ find . -maxdepth 1 -printf '%s %p\n'|sort -nr|head

And to print the top 10 largest "files and directories":

$ du -a . | sort -nr | head

** Use "head -n X" instead of the only "head" above to print the top X largest files (in all the above examples)


This lists files recursively if they're normal files, sorts by the 7th field (which is size in my find output; check yours), and shows just the first file.

find . -type f -ls | sort +7 | head -1

The first option to find is the start path for the recursive search. A -type of f searches for normal files. Note that if you try to parse this as a filename, you may fail if the filename contains spaces, newlines or other special characters. The options to sort also vary by operating system. I'm using FreeBSD.

A "better" but more complex and heavier solution would be to have find traverse the directories, but perhaps use stat to get the details about the file, then perhaps use awk to find the largest size. Note that the output of stat also depends on your operating system.


du -aS /PATH/TO/folder | sort -rn | head -2 | tail -1

or

du -aS /PATH/TO/folder | sort -rn | awk 'NR==2'


On Solaris I use:

find . -type f -ls|sort -nr -k7|awk 'NR==1{print $7,$11}' #formatted

or

find . -type f -ls | sort -nrk7 | head -1 #unformatted

because anything else posted here didn't work. This will find the largest file in $PWD and subdirectories.


find . -type f | xargs ls -lS | head -n 1

outputs

-rw-r--r--  1 nneonneo  staff  9274991 Apr 11 02:29 ./devel/misc/test.out

If you just want the filename:

find . -type f | xargs ls -1S | head -n 1

This avoids using awk and allows you to use whatever flags you want in ls.

Caveat. Because xargs tries to avoid building overlong command lines, this might fail if you run it on a directory with a lot of files because ls ends up executing more than once. It's not an insurmountable problem (you can collect the head -n 1 output from each ls invocation, and run ls -S again, looping until you have a single file), but it does mar this approach somewhat.


This will find the largest file or folder in your present working directory:

ls -S /path/to/folder | head -1

To find the largest file in all sub-directories:

find /path/to/folder -type f -exec ls -s {} \; | sort -nr | awk 'NR==1 { $1=""; sub(/^ /, ""); print }'

To find the top 25 files in the current directory and its subdirectories:

find . -type f -exec ls -al {} \; | sort -nr -k5 | head -n 25

This will output the top 25 files by sorting based on the size of the files via the "sort -nr -k5" piped command.

Same but with human-readable file sizes:

find . -type f -exec ls -alh {} \; | sort -hr -k5 | head -n 25


This script simplifies finding largest files for further action. I keep it in my ~/bin directory, and put ~/bin in my $PATH.

#!/usr/bin/env bash
# scriptname: above
# author: Jonathan D. Lettvin, 201401220235

# This finds files of size >= $1 (format ${count}[K|M|G|T], default 10G)
# using a reliable version-independent bash hash to relax find's -size syntax.
# Specifying size using 'T' for Terabytes is supported.
# Output size has units (K|M|G|T) in the left hand output column.

# Example:
#   ubuntu12.04$ above 1T
#   128T /proc/core

# http://stackoverflow.com/questions/1494178/how-to-define-hash-tables-in-bash
# Inspiration for hasch: thanks Adam Katz, Oct 18 2012 00:39
function hasch() { local hasch=`echo "$1" | cksum`; echo "${hasch//[!0-9]}"; }
function usage() { echo "Usage: $0 [{count}{k|K|m|M|g|G|t|T}"; exit 1; }
function arg1() {
    # Translate single arg (if present) into format usable by find.
    count=10; units=G;  # Default find -size argument to 10G.
    size=${count}${units}
    if [ -n "$1" ]; then
        for P in TT tT GG gG MM mM Kk kk; do xlat[`hasch ${P:0:1}`]="${P:1:1}"; done
        units=${xlat[`hasch ${1:(-1)}`]}; count=${1:0:(-1)}
        test -n "$units" || usage
        test -x $(echo "$count" | sed s/[0-9]//g) || usage
        if [ "$units" == "T" ]; then units="G"; let count=$count*1024; fi
        size=${count}${units}
    fi
}
function main() {
    sudo \
        find / -type f -size +$size -exec ls -lh {} \; 2>/dev/null | \
        awk '{ N=$5; fn=$9; for(i=10;i<=NF;i++){fn=fn" "$i};print N " " fn }'
}

arg1 $1
main $size

Linux Solution: For example, you want to see all files/folder list of your home (/) directory according to file/folder size (Descending order).

sudo du -xm / | sort -rn | more


Try the following one-liner (display top-20 biggest files):

ls -1Rs | sed -e "s/^ *//" | grep "^[0-9]" | sort -nr | head -n20

or (human readable sizes):

ls -1Rhs | sed -e "s/^ *//" | grep "^[0-9]" | sort -hr | head -n20

Works fine under Linux/BSD/OSX in comparison to other answers, as find's -printf option doesn't exist on OSX/BSD and stat has different parameters depending on OS. However the second command to work on OSX/BSD properly (as sort doesn't have -h), install sort from coreutils or remove -h from ls and use sort -nr instead.

So these aliases are useful to have in your rc files:

alias big='du -ah . | sort -rh | head -20'
alias big-files='ls -1Rhs | sed -e "s/^ *//" | grep "^[0-9]" | sort -hr | head -n20'

There is no simple command available to find out the largest files/directories on a Linux/UNIX/BSD filesystem. However, combination of following three commands (using pipes) you can easily find out list of largest files:

# du -a /var | sort -n -r | head -n 10

If you want more human readable output try:

$ cd /path/to/some/var
$ du -hsx * | sort -rh | head -10

Where,

  • Var is the directory you wan to search
  • du command -h option : display sizes in human readable format (e.g., 1K, 234M, 2G).
  • du command -s option : show only a total for each argument (summary).
  • du command -x option : skip directories on different file systems.
  • sort command -r option : reverse the result of comparisons.
  • sort command -h option : compare human readable numbers. This is GNU sort specific option only.
  • head command -10 OR -n 10 option : show the first 10 lines.

Try following command :

find /your/path -printf "%k %p\n" | sort -g -k 1,1 | awk '{if($1 > 500000) print $1/1024 "MB" " " $2 }' |tail -n 1 

This will print the largest file name and size and more than 500M. You can move the if($1 > 500000),and it will print the largest file in the directory.


Examples related to file

Gradle - Move a folder from ABC to XYZ Difference between opening a file in binary vs text Angular: How to download a file from HttpClient? Python error message io.UnsupportedOperation: not readable java.io.FileNotFoundException: class path resource cannot be opened because it does not exist Writing JSON object to a JSON file with fs.writeFileSync How to read/write files in .Net Core? How to write to a CSV line by line? Writing a dictionary to a text file? What are the pros and cons of parquet format compared to other formats?

Examples related to bash

Comparing a variable with a string python not working when redirecting from bash script Zipping a file in bash fails How do I prevent Conda from activating the base environment by default? Get first line of a shell command's output Fixing a systemd service 203/EXEC failure (no such file or directory) /bin/sh: apt-get: not found VSCode Change Default Terminal Run bash command on jenkins pipeline How to check if the docker engine and a docker container are running? How to switch Python versions in Terminal?

Examples related to directory

Moving all files from one directory to another using Python What is the reason for the error message "System cannot find the path specified"? Get folder name of the file in Python How to rename a directory/folder on GitHub website? Change directory in Node.js command prompt Get the directory from a file path in java (android) python: get directory two levels up How to add 'libs' folder in Android Studio? How to create a directory using Ansible Troubleshooting misplaced .git directory (nothing to commit)

Examples related to find

Find a file by name in Visual Studio Code Explaining the 'find -mtime' command find files by extension, *.html under a folder in nodejs MongoDB Show all contents from all collections How can I find a file/directory that could be anywhere on linux command line? Get all files modified in last 30 days in a directory FileNotFoundError: [Errno 2] No such file or directory Linux find and grep command together find . -type f -exec chmod 644 {} ; Find all stored procedures that reference a specific column in some table

Examples related to large-files

How can I import a large (14 GB) MySQL dump file into a new MySQL database? Read and parse a Json File in C# How to find the largest file in a directory and its subdirectories? Reading large text files with streams in C# Working with huge files in VIM What is the fastest way to create a checksum for large files in C# How to read large text file on windows? Managing large binary files with Git Number of lines in a file in Java Text editor to open big (giant, huge, large) text files