[docker] How to analyze disk usage of a Docker container

I can see that Docker takes 12GB of my filesystem:

2.7G    /var/lib/docker/vfs/dir
2.7G    /var/lib/docker/vfs
2.8G    /var/lib/docker/devicemapper/mnt
6.3G    /var/lib/docker/devicemapper/devicemapper
9.1G    /var/lib/docker/devicemapper
12G     /var/lib/docker

But, how do I know how this is distributed over the containers?

I tried to attach to the containers by running (the new v1.3 command)

docker exec -it <container_name> bash

and then running 'df -h' to analyze the disk usage. It seems to be working, but not with containers that use 'volumes-from'.

For example, I use a data-only container for MongoDB, called 'mongo-data'.

When I run docker run -it --volumes-from mongo-data busybox, and then df -h inside the container, it says that the filesystem mounted on /data/db (my 'mongo-data' data-only container) uses 11.3G, but when I run du -h /data/db, it says that it uses only 2.1G.

So, how do I analyze a container/volume disk usage? Or, in my case, how do I find out the 'mongo-data' container size?

This question is related to docker lxc device-mapper

The answer is


To see the file size of your containers, you can use the --size argument of docker ps:

docker ps --size
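
As a rough illustration (not part of the original answer), the SIZE column can be narrowed down with --format, and its value split into the writable-layer size and the virtual size. The helper name split_size is made up for this sketch, and the sample value is the one that appears in the example output further down this page.

```shell
# Hypothetical helper: split a SIZE value such as "2B (virtual 183MB)"
# into "<writable layer size> <virtual size>".
split_size() {
    printf '%s\n' "$1" | sed -E 's/^([^ ]+) \(virtual ([^)]+)\)$/\1 \2/'
}

split_size "2B (virtual 183MB)"   # prints: 2B 183MB

# With a running Docker daemon, only names and sizes could be listed with:
#   docker ps --size --format '{{.Names}}\t{{.Size}}'
```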

Alternative to docker ps --size

Because "docker ps --size" produces heavy I/O load on the host, it is not feasible to run it every minute in a production environment. A workaround is therefore needed to get the container size, or more precisely the size of the RW layer, with a low impact on system performance.

This approach gathers the "device name" of every container and then checks its size using the "df" command. Those "device names" are thin-provisioned volumes that are mounted to / on each container. One problem remains: the observed size also includes all the read-only layers of the underlying image. To address this, we can simply check the size of the image used by the container and subtract it from the size of the device/thin volume.

One should note that every image layer is realized as a kind of LVM snapshot when using device mapper. Unfortunately, I wasn't able to get my RHEL system to print out those snapshots/layers; otherwise we could simply collect the sizes of the "latest" snapshots. It would be great if someone could clarify this. However...

After some tests, it seems that creation of a container always adds an overhead of approx. 40MiB (tested with containers based on Image "httpd:2.4.46-alpine"):

  1. docker run -d --name apache httpd:2.4.46-alpine // now get device name from docker inspect and look it up using df
  2. df -T -> 90MB whereas "Virtual Size" from "docker ps --size" states 50MB and a very small payload of 2Bytes -> mysterious overhead 40MB
  3. curl/download of a 100MB file within container
  4. df -T -> 190MB whereas "Virtual Size" from "docker ps --size" states 150MB and payload of 100MB -> overhead 40MB
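
The arithmetic behind the steps above can be written out as a small sketch. The function name overhead_mb is made up here; the numbers are the ones from the test, in MB.

```shell
# Overhead = size reported by df for the thin device
#            minus the "Virtual Size" reported by docker ps --size.
overhead_mb() {
    # $1: size from df in MB, $2: virtual size from docker ps --size in MB
    echo $(( $1 - $2 ))
}

overhead_mb 90 50      # fresh container            -> prints 40
overhead_mb 190 150    # after the 100MB download   -> prints 40
```

The overhead stays the same in both measurements, which is why it looks like a fixed per-container cost rather than something that grows with the payload.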

The following shell loop prints results (in bytes) that match the results from "docker ps --size" (but keep in mind the 40MB overhead mentioned above):

for c in  $(docker ps -q); do \
container_name=$(docker inspect -f "{{.Name}}" ${c} | sed 's/^\///g' ); \
device_n=$(docker inspect -f "{{.GraphDriver.Data.DeviceName}}" ${c} | sed 's/.*-//g'); \
device_size_kib=$(df -T | grep ${device_n} | awk '{print $4}'); \
device_size_byte=$((1024 * ${device_size_kib})); \
image_sha=$(docker inspect -f "{{.Image}}" ${c} | sed 's/.*://g' ); \
image_size_byte=$(docker image inspect -f "{{.Size}}" ${image_sha}); \
container_size_byte=$((${device_size_byte} - ${image_size_byte})); \
\
echo my_node_dm_device_size_bytes\{cname=\"${container_name}\"\} ${device_size_byte}; \
echo my_node_dm_container_size_bytes\{cname=\"${container_name}\"\} ${container_size_byte}; \
echo my_node_dm_image_size_bytes\{cname=\"${container_name}\"\} ${image_size_byte}; \
done

Further reading about device mapper: https://test-dockerrr.readthedocs.io/en/latest/userguide/storagedriver/device-mapper-driver/


Keep in mind that docker ps --size may be an expensive command, taking more than a few minutes to complete. The same applies to container list API requests with size=1. It's better not to run it too often.

Take a look at alternatives we compiled, including the du -hs option for the docker persistent volume directory.


Since version 1.13.0, Docker includes a new command, docker system df, to show Docker disk usage.

$ docker system df
TYPE            TOTAL        ACTIVE     SIZE        RECLAIMABLE
Images          5            1          2.777 GB    2.647 GB (95%)
Containers      1            1          0 B         0B
Local Volumes   4            1          3.207 GB    2.261 (70%)

To show more detailed information on space usage:

$ docker system df --verbose

You can use

docker history IMAGE_ID

to see how the image size is distributed between its various sub-components.
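
To find the largest layers quickly, the history output can be printed in bytes and sorted. The following is only a sketch: largest_layers is a made-up helper name, and IMAGE_ID is the same placeholder as above.

```shell
# Hypothetical helper: read "bytes<TAB>command" lines on stdin and
# print the N largest entries (default 5), biggest first.
largest_layers() {
    sort -k1,1nr | head -n "${1:-5}"
}

# Usage with a running Docker daemon (sizes in bytes thanks to --human=false):
#   docker history --human=false --no-trunc \
#       --format '{{.Size}}\t{{.CreatedBy}}' IMAGE_ID | largest_layers 3
```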


The volume part did not work anymore, so if anyone is interested, I just changed the above script a little bit:

for d in `docker ps | awk '{print $1}' | tail -n +2`; do
    d_name=`docker inspect -f {{.Name}} $d`
    echo "========================================================="
    echo "$d_name ($d) container size:"
    sudo du -d 2 -h /var/lib/docker/aufs | grep `docker inspect -f "{{.Id}}" $d`
    echo "$d_name ($d) volumes:"
    for mount in `docker inspect -f "{{range .Mounts}} {{.Source}}:{{.Destination}}
    {{end}}" $d`; do
        size=`echo $mount | cut -d':' -f1 | sudo xargs du -d 0 -h`
        mnt=`echo $mount | cut -d':' -f2`
        echo "$size mounted on $mnt"
    done
done

(this answer is not useful, but leaving it here since some of the comments may be)

docker images will show the 'virtual size', i.e. how much in total including all the lower layers. So there is some double-counting if you have containers that share the same base image.



I use docker stats $(docker ps --format={{.Names}}) --no-stream to get:

  1. CPU usage,
  2. Mem usage/Total mem allocated to container (can be allocated with the docker run command)
  3. Mem %
  4. Block I/O
  5. Net I/O

Posting this as an answer because my comments above got hidden:

List the size of a container:

du -d 2 -h /var/lib/docker/devicemapper | grep `docker inspect -f "{{.Id}}" <container_name>`

List the sizes of a container's volumes:

docker inspect -f "{{.Volumes}}" <container_name> | sed 's/map\[//' | sed 's/]//' | tr ' ' '\n' | sed 's/.*://' | xargs sudo du -d 1 -h

Edit: List all running containers' sizes and volumes:

for d in `docker ps -q`; do
    d_name=`docker inspect -f {{.Name}} $d`
    echo "========================================================="
    echo "$d_name ($d) container size:"
    sudo du -d 2 -h /var/lib/docker/devicemapper | grep `docker inspect -f "{{.Id}}" $d`
    echo "$d_name ($d) volumes:"
    docker inspect -f "{{.Volumes}}" $d | sed 's/map\[//' | sed 's/]//' | tr ' ' '\n' | sed 's/.*://' | xargs sudo du -d 1 -h
done

NOTE: Change 'devicemapper' according to your Docker storage driver (e.g. 'aufs').


Improving Maxime's answer:

docker ps --size

You'll see something like this:

+---------------+---------------+--------------------+
| CONTAINER ID  | IMAGE         | SIZE               |
+===============+===============+====================+
| 6ca0cef8db8d  | nginx         | 2B (virtual 183MB) |
| 3ab1a4d8dc5a  | nginx         | 5B (virtual 183MB) |
+---------------+---------------+--------------------+

When starting a container, the image that the container is started from is mounted read-only (virtual).
On top of that, a writable layer is mounted, in which any changes made to the container are written.

So the Virtual size (183MB in the example) is used only once, regardless of how many containers are started from the same image - I can start 1 container or a thousand; no extra disk space is used.
The "Size" (2B in the example) is unique per container though, so the total space used on disk is:

183MB + 5B + 2B
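
Written out as a sketch, the sum looks like this. The function name total_bytes is made up, and 183MB is taken as 183,000,000 bytes on the assumption that Docker prints decimal (power-of-1000) units.

```shell
# Total on disk = image size (counted once, shared by all containers)
#                 + the writable-layer size of each container.
total_bytes() {
    total=$1          # $1: shared image size in bytes
    shift
    for rw in "$@"; do    # remaining args: per-container writable-layer sizes
        total=$(( total + rw ))
    done
    echo "$total"
}

total_bytes 183000000 2 5   # prints 183000007
```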

Be aware that the size shown does not include all disk space used for a container.
Things that are currently not included are:
- volumes
- swapping
- checkpoints
- disk space used for log-files generated by container

https://github.com/docker/docker.github.io/issues/1520#issuecomment-305179362
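
For the log files in the list above, one way to check their size separately is sketched below. It assumes the default json-file logging driver, for which docker inspect reports a LogPath; container_log_bytes is a made-up helper name and <container_name> is a placeholder.

```shell
# Hypothetical helper: print the size in bytes of a container log file.
# $1: path to the log file, e.g. as printed by
#     docker inspect -f '{{.LogPath}}' <container_name>
container_log_bytes() {
    wc -c < "$1" | tr -d ' '
}

# Usage with a running Docker daemon (reading under /var/lib/docker
# usually requires root):
#   container_log_bytes "$(docker inspect -f '{{.LogPath}}' <container_name>)"
```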