[linux] How to list the size of each file and directory and sort by descending size in Bash?

I found that there is no easy to get way the size of a directory in Bash?

I want that when I type ls -<some options>, it can list of all the sum of the file size of directory recursively and files at the same time and sort by size order.

Is that possible?

This question is related to linux file bash

The answer is


Another simple solution.

$ for entry in $(ls); do du -s "$entry"; done | sort -n

the result will look like

2900    tmp
6781    boot
8428    bin
24932   lib64
34436   sbin
90084   var
106676  etc
125216  lib
3313136 usr
4828700 opt

changing "du -s" to "du -sh" will show human readable size, but we won't be able to sort in this method.


I tend to use du in a simple way.

du -sh */ | sort -n

This provides me with an idea of what directories are consuming the most space. I can then run more precise searches later.


[enhanced version]
This is going to be much faster and precise than the initial version below and will output the sum of all the file size of current directory:

echo `find . -type f -exec stat -c %s {} \; | tr '\n' '+' | sed 's/+$//g'` | bc

the stat -c %s command on a file will return its size in bytes. The tr command here is used to overcome xargs command limitations (apparently piping to xargs is splitting results on more lines, breaking the logic of my command). Hence tr is taking care of replacing line feed with + (plus) sign. sed has the only goal to remove the last + sign from the resulting string to avoid complains from the final bc (basic calculator) command that, as usual, does the math.

Performances: I tested it on several directories and over ~150.000 files top (the current number of files of my fedora 15 box) having what I believe it is an amazing result:

# time echo `find / -type f -exec stat -c %s {} \; | tr '\n' '+' | sed 's/+$//g'` | bc
12671767700

real    2m19.164s
user    0m2.039s
sys 0m14.850s

Just in case you want to make a comparison with the du -sb / command, it will output an estimated disk usage in bytes (-b option)

# du -sb /
12684646920 /

As I was expecting it is a little larger than my command calculation because the du utility returns allocated space of each file and not the actual consumed space.

[initial version]
You cannot use du command if you need to know the exact sum size of your folder because (as per man page citation) du estimates file space usage. Hence it will lead you to a wrong result, an approximation (maybe close to the sum size but most likely greater than the actual size you are looking for).

I think there might be different ways to answer your question but this is mine:

ls -l $(find . -type f | xargs) | cut -d" " -f5 | xargs | sed 's/\ /+/g'| bc

It finds all files under . directory (change . with whatever directory you like), also hidden files are included and (using xargs) outputs their names in a single line, then produces a detailed list using ls -l. This (sometimes) huge output is piped towards cut command and only the fifth field (-f5), which is the file size in bytes is taken and again piped against xargs which produces again a single line of sizes separated by blanks. Now take place a sed magic which replaces each blank space with a plus (+) sign and finally bc (basic calculator) does the math.

It might need additional tuning and you may have ls command complaining about arguments list too long.


you can use the below to list files by size du -h | sort -hr | more or du -h --max-depth=0 * | sort -hr | more


Command

du -h --max-depth=0 * | sort -hr

Output

3,5M    asdf.6000.gz
3,4M    asdf.4000.gz
3,2M    asdf.2000.gz
2,5M    xyz.PT.gz
136K    xyz.6000.gz
116K    xyz.6000p.gz
88K test.4000.gz
76K test.4000p.gz
44K test.2000.gz
8,0K    desc.common.tcl
8,0K    wer.2000p.gz
8,0K    wer.2000.gz
4,0K    ttree.3

Explanation

  • du displays "disk usage"
  • h is for "human readable" (both, in sort and in du)
  • max-depth=0 means du will not show sizes of subfolders (remove that if you want to show all sizes of every file in every sub-, subsub-, ..., folder)
  • r is for "reverse" (biggest file first)

ncdu

When I came to this question, I wanted to clean up my file system. The command line tool ncdu is way better suited to this task.

Installation on Ubuntu:

$ sudo apt-get install ncdu

Usage:

Just type ncdu [path] in the command line. After a few seconds for analyzing the path, you will see something like this:

$ ncdu 1.11 ~ Use the arrow keys to navigate, press ? for help
--- / ---------------------------------------------------------
.  96,1 GiB [##########] /home
.  17,7 GiB [#         ] /usr
.   4,5 GiB [          ] /var
    1,1 GiB [          ] /lib
  732,1 MiB [          ] /opt
. 275,6 MiB [          ] /boot
  198,0 MiB [          ] /storage
. 153,5 MiB [          ] /run
.  16,6 MiB [          ] /etc
   13,5 MiB [          ] /bin
   11,3 MiB [          ] /sbin
.   8,8 MiB [          ] /tmp
.   2,2 MiB [          ] /dev
!  16,0 KiB [          ] /lost+found
    8,0 KiB [          ] /media
    8,0 KiB [          ] /snap
    4,0 KiB [          ] /lib64
e   4,0 KiB [          ] /srv
!   4,0 KiB [          ] /root
e   4,0 KiB [          ] /mnt
e   4,0 KiB [          ] /cdrom
.   0,0   B [          ] /proc
.   0,0   B [          ] /sys
@   0,0   B [          ]  initrd.img.old
@   0,0   B [          ]  initrd.img
@   0,0   B [          ]  vmlinuz.old
@   0,0   B [          ]  vmlinuz

Delete the currently highlighted element with d, exit with CTRL + c


ls -S sorts by size. Then, to show the size too, ls -lS gives a long (-l), sorted by size (-S) display. I usually add -h too, to make things easier to read, so, ls -lhS.


du -s -- * | sort -n

(this willnot show hidden (.dotfiles) files)

Use du -sm for Mb units etc. I always use

du -smc -- * | sort -n

because the total line (-c) will end up at the bottom for obvious reasons :)

PS:

  • See comments for handling dotfiles
  • I frequently use e.g. 'du -smc /home// | sort -n |tail' to get a feel of where exactly the large bits are sitting

Simple and fast:

find . -mindepth 1 -maxdepth 1 -type d | parallel du -s | sort -n

*requires GNU Parallel.


I think I might have figured out what you want to do. This will give a sorted list of all the files and all the directories, sorted by file size and size of the content in the directories.

(find . -depth 1 -type f -exec ls -s {} \;; find . -depth 1 -type d -exec du -s {} \;) | sort -n

Apparently --max-depth option is not in Mac OS X's version of the du command. You can use the following instead.

du -h -d 1 | sort -n


Examples related to linux

grep's at sign caught as whitespace How to prevent Google Colab from disconnecting? "E: Unable to locate package python-pip" on Ubuntu 18.04 How to upgrade Python version to 3.7? Install Qt on Ubuntu Get first line of a shell command's output Cannot connect to the Docker daemon at unix:/var/run/docker.sock. Is the docker daemon running? Run bash command on jenkins pipeline How to uninstall an older PHP version from centOS7 How to update-alternatives to Python 3 without breaking apt?

Examples related to file

Gradle - Move a folder from ABC to XYZ Difference between opening a file in binary vs text Angular: How to download a file from HttpClient? Python error message io.UnsupportedOperation: not readable java.io.FileNotFoundException: class path resource cannot be opened because it does not exist Writing JSON object to a JSON file with fs.writeFileSync How to read/write files in .Net Core? How to write to a CSV line by line? Writing a dictionary to a text file? What are the pros and cons of parquet format compared to other formats?

Examples related to bash

Comparing a variable with a string python not working when redirecting from bash script Zipping a file in bash fails How do I prevent Conda from activating the base environment by default? Get first line of a shell command's output Fixing a systemd service 203/EXEC failure (no such file or directory) /bin/sh: apt-get: not found VSCode Change Default Terminal Run bash command on jenkins pipeline How to check if the docker engine and a docker container are running? How to switch Python versions in Terminal?