[linux] Using ls to list directories and their total sizes

Is it possible to use ls in Unix to list the total size of a sub-directory and all its contents as opposed to the usual 4K that (I assume) is just the directory file itself?

total 12K
drwxrwxr-x  6 *** *** 4.0K 2009-06-19 10:10 branches
drwxrwxr-x 13 *** *** 4.0K 2009-06-19 10:52 tags
drwxrwxr-x 16 *** *** 4.0K 2009-06-19 10:02 trunk

After scouring the man pages I'm coming up empty.

This question is related to linux unix

The answer is


place in init script like .bashrc ... adjust def as needed.

duh() {
  # shows disk utilization for a path and depth level
  path="${1:-$PWD}"
  level="${2:-0}"
  du "$path" -h --max-depth="$level"
}

These are all great suggestions, but the one I use is:

du -ksh * | sort -n -r

-ksh makes sure the files and folders are listed in a human-readable format and in megabytes, kilobytes, etc. Then you sort them numerically and reverse the sort so it puts the bigger ones first.

The only downside to this command is that the computer does not know that Gigabyte is bigger than Megabyte so it will only sort by numbers and you will often find listings like this:

120K
12M
4G

Just be careful to look at the unit.

This command also works on the Mac (whereas sort -h does not for example).


To display it in ls -lh format, use:

(du -sh ./*; ls -lh --color=no) | awk '{ if($1 == "total") {X = 1} else if (!X) {SIZES[$2] = $1} else { sub($5 "[ ]*", sprintf("%-7s ", SIZES["./" $9]), $0); print $0} }'

Awk code explained:

if($1 == "total") { // Set X when start of ls is detected
  X = 1 
} else if (!X) { // Until X is set, collect the sizes from `du`
  SIZES[$2] = $1
} else {
  // Replace the size on current current line (with alignment)
  sub($5 "[ ]*", sprintf("%-7s ", SIZES["./" $9]), $0); 
  print $0
}

Sample output:

drwxr-xr-x 2 root     root 4.0K    Feb 12 16:43 cgi-bin
drwxrws--- 6 root     www  20M     Feb 18 11:07 document_root
drwxr-xr-x 3 root     root 1.3M    Feb 18 00:18 icons
drwxrwsr-x 2 localusr www  8.0K    Dec 27 01:23 passwd

du -sch * in the same directory.


It is important to note here that du gives you disk usage. Different machines can use different block sizes, so on one machine a block could be 4096 bytes and another machine could contain block sizes of 2048. If I put 10 1 byte files in a machine using 4096 bytes blocks and 10 1 byte file in a machine using 2048 bytes blocks, du -h would report ~40k and ~20k respectively.

If you want to know the size of all the files in a directory, for each directory you can do something like:

for x in ./*;
do
    if [[ -f "$x" ]]; then
        ls -al "$x"
    fi
done | awk '{print $6}' | awk '{s+=$1}END{print s}'

This would give you the total size of all the files in a directory.


just a warning, if you want to compare sizes of files. du produces different results depending on file system, block size, ... .

It may happen that the size of the files is different, e.g. comparing the same directory on your local hard-disk and a USB mass storage device. I use the following script, including ls to sum up the directory size. The result in in bytes taking all sub directories into account.

echo "[GetFileSize.sh] target directory: \"$1\""

iRetValue=0

uiLength=$(expr length "$1")
if [ $uiLength -lt 2 ]; then
  echo "[GetFileSize.sh] invalid target directory: \"$1\" - exiting!"
  iRetValue=-1
else
  echo "[GetFileSize.sh] computing size of files..."

  # use ls to compute total size of all files - skip directories as they may
  # show different sizes, depending on block size of target disk / file system
  uiTotalSize=$(ls -l -R $1 | grep -v ^d | awk '{total+=$5;} END {print total;}')
  uiLength=$(expr length "$uiTotalSize")
  if [ $uiLength -lt 1 ]; then
    uiTotalSize=0
  fi
  echo -e "[GetFileSize.sh] total target file size: \"$uiTotalSize\""

fi

exit "$iRetValue"

I always use du -sk (-k flag showing file size in kilobytes) instead.


The following is easy to remember

ls -ltrapR

list directory contents

-l use a long listing format

-t sort by modification time, newest first

-r, --reverse reverse order while sorting

-a, --all do not ignore entries starting with .

-p, --indicator-style=slash append / indicator to directories

-R, --recursive list subdirectories recursively

https://explainshell.com/explain?cmd=ls+-ltrapR


I ran into an issue similar to what Martin Wilde described, in my case comparing the same directory on two different servers after mirroring with rsync.

Instead of using a script I added the -b flag to the du which counts the size in bytes and as far as I can determine eliminated the differences on the two servers. You still can use -s -h to get a comprehensible output.


To display current directory's files and subdirectories sizes recursively:

du -h .

To display the same size information but without printing their sub directories recursively (which can be a huge list), just use the --max-depth option:

du -h --max-depth=1 .

If you want more control over the size that you want to list the directories over, you can use the threshold (-t) switch as in:

$ du -ht 1000000000 | sort --reverse

du - disk usage
h - human readable format
t - threshold size

Here, we want to list all directories which are greater than 1GB in size.

$ du -ht 1G | sort --reverse

Explanation:

Units that are described in wiki follows:

K, M, G, T, P, E, Z, Y (powers of 1024) or
KB, MB, GB, TB, PB, EB, ZB, YB (powers of 1000).


look at du command for that


du -S

du have another useful option: -S, --separate-dirs telling du not include size of subdirectories - handy on some occasions.

Example 1 - shows only the file sizes in a directory:

du -Sh  * 
3,1G    10/CR2
280M    10

Example 2 - shows the file sizes and subdirectories in directory:

du -h  * 
3,1G    10/CR2 
3,4G    10

The command you want is 'du -sk' du = "disk usage"

The -k flag gives you output in kilobytes, rather than the du default of disk sectors (512-byte blocks).

The -s flag will only list things in the top level directory (i.e., the current directory, by default, or the directory specified on the command line). It's odd that du has the opposite behavior of ls in this regard. By default du will recursively give you the disk usage of each sub-directory. In contrast, ls will only give list files in the specified directory. (ls -R gives you recursive behavior.)


Put this shell function declaration in your shell initialization scripts:

function duls {
    paste <( du -hs -- "$@" | cut -f1 ) <( ls -ld -- "$@" )
}

I called it duls because it shows the output from both du and ls (in that order):

$ duls
210M    drwxr-xr-x  21 kk  staff  714 Jun 15 09:32 .

$ duls *
 36K    -rw-r--r--   1 kk  staff    35147 Jun  9 16:03 COPYING
8.0K    -rw-r--r--   1 kk  staff     6962 Jun  9 16:03 INSTALL
 28K    -rw-r--r--   1 kk  staff    24816 Jun 10 13:26 Makefile
4.0K    -rw-r--r--   1 kk  staff       75 Jun  9 16:03 Makefile.am
 24K    -rw-r--r--   1 kk  staff    24473 Jun 10 13:26 Makefile.in
4.0K    -rw-r--r--   1 kk  staff     1689 Jun  9 16:03 README
120K    -rw-r--r--   1 kk  staff   121585 Jun 10 13:26 aclocal.m4
684K    drwxr-xr-x   7 kk  staff      238 Jun 10 13:26 autom4te.cache
128K    drwxr-xr-x   8 kk  staff      272 Jun  9 16:03 build
 60K    -rw-r--r--   1 kk  staff    60083 Jun 10 13:26 config.log
 36K    -rwxr-xr-x   1 kk  staff    34716 Jun 10 13:26 config.status
264K    -rwxr-xr-x   1 kk  staff   266637 Jun 10 13:26 configure
8.0K    -rw-r--r--   1 kk  staff     4280 Jun 10 13:25 configure.ac
7.0M    drwxr-xr-x   8 kk  staff      272 Jun 10 13:26 doc
2.3M    drwxr-xr-x  28 kk  staff      952 Jun 10 13:26 examples
6.2M    -rw-r--r--   1 kk  staff  6505797 Jun 15 09:32 mrbayes-3.2.7-dev.tar.gz
 11M    drwxr-xr-x  42 kk  staff     1428 Jun 10 13:26 src

$ duls doc
7.0M    drwxr-xr-x  8 kk  staff  272 Jun 10 13:26 doc

$ duls [bM]*
 28K    -rw-r--r--  1 kk  staff  24816 Jun 10 13:26 Makefile
4.0K    -rw-r--r--  1 kk  staff     75 Jun  9 16:03 Makefile.am
 24K    -rw-r--r--  1 kk  staff  24473 Jun 10 13:26 Makefile.in
128K    drwxr-xr-x  8 kk  staff    272 Jun  9 16:03 build

Explanation:

The paste utility creates columns from its input according to the specification that you give it. Given two input files, it puts them side by side, with a tab as separator.

We give it the output of du -hs -- "$@" | cut -f1 as the first file (input stream really) and the output of ls -ld -- "$@" as the second file.

In the function, "$@" will evaluate to the list of all command line arguments, each in double quotes. It will therefore understand globbing characters and path names with spaces etc.

The double minuses (--) signals the end of command line options to du and ls. Without these, saying duls -l would confuse du and any option for du that ls doesn't have would confuse ls (and the options that exist in both utilities might not mean the same thing, and it would be a pretty mess).

The cut after du simply cuts out the first column of the du -hs output (the sizes).

I decided to put the du output on the left, otherwise I would have had to manage a wobbly right column (due to varying lengths of file names).

The command will not accept command line flags.

This has been tested in both bash and in ksh93. It will not work with /bin/sh.


For a while, I used Nautilus (on Gnome desktop on RHEL 6.0) to delete files on my home folder instead of using the rm command in bash. As a result, the total size shown by

du -sh

did not match the sum of disk usage of each sub-directory, when I used

du -sh *

It took me a while to realise Nautilus sends the deleted files to its Trash folder, and that folder is not listed in du -sh * command. So, just wanted to share this, in case somebody faced the same problem.


To list the largest directories from the current directory in human readable format:

du -sh * | sort -hr

A better way to restrict number of rows can be

du -sh * | sort -hr | head -n10

Where you can increase the suffix of -n flag to restrict the number of rows listed

Sample:

[~]$ du -sh * | sort -hr
48M app
11M lib
6.7M    Vendor
1.1M    composer.phar
488K    phpcs.phar
488K    phpcbf.phar
72K doc
16K nbproject
8.0K    composer.lock
4.0K    README.md

It makes it more convenient to read :)


du -h --max-depth=1 . | sort -n -r

du -sm * | sort -nr

Output by size


Retrieve only the size in bytes, from ls.

ls -ltr | head -n1 | cut -d' ' -f2

du -sk * | sort -n will sort the folders by size. Helpful when looking to clear space..


ncdu (ncurses du)

This awesome CLI utility allows you to easily find the large files and directories (recursive total size) interactively.

For example, from inside the root of a well known open source project we do:

sudo apt install ncdu
ncdu

The outcome its:

enter image description here

Then, I enter down and right on my keyboard to go into the /drivers folder, and I see:

enter image description here

ncdu only calculates file sizes recursively once at startup for the entire tree, so it is efficient.

"Total disk usage" vs "Apparent size" is analogous to du, and I have explained it at: why is the output of `du` often so different from `du -b`

Project homepage: https://dev.yorhel.nl/ncdu

Related questions:

Tested in Ubuntu 16.04.

Ubuntu list root

You likely want:

ncdu --exclude-kernfs -x /

where:

  • -x stops crossing of filesystem barriers
  • --exclude-kernfs skips special filesystems like /sys

MacOS 10.15.5 list root

To properly list root / on that system, I also needed --exclude-firmlinks, e.g.:

brew install ncdu
cd /
ncdu --exclude-firmlinks

otherwise it seemed to go into some link infinite loop, likely due to: https://www.swiftforensics.com/2019/10/macos-1015-volumes-firmlink-magic.html

The things we learn for love.

ncdu non-interactive usage

Another cool feature of ncdu is that you can first dump the sizes in a JSON format, and later reuse them.

For example, to generate the file run:

ncdu -o ncdu.json

and then examine it interactively with:

ncdu -f ncdu.json

This is very useful if you are dealing with a very large and slow filesystem like NFS.

This way, you can first export only once, which can take hours, and then explore the files, quit, explore again, etc.

The output format is just JSON, so it is easy to reuse it with other programs as well, e.g.:

ncdu -o -  | python -m json.tool | less

reveals a simple directory tree data structure:

[
    1,
    0,
    {
        "progname": "ncdu",
        "progver": "1.12",
        "timestamp": 1562151680
    },
    [
        {
            "asize": 4096,
            "dev": 2065,
            "dsize": 4096,
            "ino": 9838037,
            "name": "/work/linux-kernel-module-cheat/submodules/linux"
        },
        {
            "asize": 1513,
            "dsize": 4096,
            "ino": 9856660,
            "name": "Kbuild"
        },
        [
            {
                "asize": 4096,
                "dsize": 4096,
                "ino": 10101519,
                "name": "net"
            },
            [
                {
                    "asize": 4096,
                    "dsize": 4096,
                    "ino": 11417591,
                    "name": "l2tp"
                },
                {
                    "asize": 48173,
                    "dsize": 49152,
                    "ino": 11418744,
                    "name": "l2tp_core.c"
                },

Tested in Ubuntu 18.04.


Hmm, best way is to use this command:

du -h -x / | sort -hr >> /home/log_size.txt

Then you will be able to get all sizes folders over all your server. Easy to help to you to find the biggest sizes.


This is one I like

update: I didnt like the previous one because it didn't show files in the current directory, it only listed directories.

Example output for /var on ubuntu:

sudo du -hDaxd1 /var | sort -h | tail -n10

4.0K    /var/lock
4.0K    /var/run
4.0K    /var/www
12K     /var/spool
3.7M    /var/backups
33M     /var/log
45M     /var/webmin
231M    /var/cache
1.4G    /var/lib
1.7G    /var

type "ls -ltrh /path_to_directory"


du -sh * | sort -h

This will be displayed in human readable format.