[docker] Is it safe to clean docker/overlay2/

I got some docker containers running on AWS EC2, the /var/lib/docker/overlay2 folder grows very fast in disk size.

I'm wondering if it is safe to delete its content? or if docker has some kind of command to free up some disk usage.


UPDATE:

I actually tried docker system prune -a already, which reclaimed 0Kb.

Also my /docker/overlay2 disk size is much larger than the output from docker system df

After reading docker documentation and BMitch's answer, I believe it is a stupid idea to touch this folder and I will try other ways to reclaim my disk space.

This question is related to docker docker-compose docker-machine

The answer is


I recently had a similar issue, overlay2 grew bigger and bigger, But I couldn’t figure out what consumed the bulk of the space.

df showed me that overlay2 was about 24GB in size.

With du I tried to figure out what occupied the space… and failed.

The difference came from the fact that deleted files (mostly log files in my case) where still being used by a process (Docker). Thus the file doesn’t show up with du but the space it occupies will show with df.

A reboot of the host machine helped. Restarting the docker container would probably have helped already… This article on linuxquestions.org helped me to figure that out.


Everything in /var/lib/docker are filesystems of containers. If you stop all your containers and prune them, you should end up with the folder being empty. You probably don't really want that, so don't go randomly deleting stuff in there. Do not delete things in /var/lib/docker directly. You may get away with it sometimes, but it's inadvisable for so many reasons.

Do this instead:

sudo bash
cd /var/lib/docker
find . -type f | xargs du -b  | sort -n

What you will see is the largest files shown at the bottom. If you want, figure out what containers those files are in, enter those containers with docker exec -ti containername -- /bin/sh and delete some files.

You can also put docker system prune -a -f on a daily/weekly cron job as long as you aren't leaving stopped containers and volumes around that you care about. It's better to figure out the reasons why it's growing, and correct them at the container level.


I found this worked best for me:

docker image prune --all

By default Docker will not remove named images, even if they are unused. This command will remove unused images.

Note each layer in an image is a folder inside the /usr/lib/docker/overlay2/ folder.


also had problems with rapidly growing overlay2

/var/lib/docker/overlay2 - is a folder where docker store writable layers for your container. docker system prune -a - may work only if container is stopped and removed.

in my i was able to figure out what consumes space by going into overlay2 and investigating.

that folder contains other hash named folders. each of those has several folders including diff folder.

diff folder - contains actual difference written by a container with exact folder structure as your container (at least it was in my case - ubuntu 18...)

so i've used du -hsc /var/lib/docker/overlay2/LONGHASHHHHHHH/diff/tmp to figure out that /tmp inside of my container is the folder which gets polluted.

so as a workaround i've used -v /tmp/container-data/tmp:/tmp parameter for docker run command to map inner /tmp folder to host and setup a cron on host to cleanup that folder.

cron task was simple:

  • sudo nano /etc/crontab
  • */30 * * * * root rm -rf /tmp/container-data/tmp/*
  • save and exit

NOTE: overlay2 is system docker folder, and they may change it structure anytime. Everything above is based on what i saw in there. Had to go in docker folder structure only because system was completely out of space and even wouldn't allow me to ssh into docker container.


DON'T DO THIS IN PRODUCTION

The answer given by @ravi-luthra technically works but it has some issues!

In my case, I was just trying to recover disk space. The lib/docker/overlay folder was taking 30GB of space and I only run a few containers regularly. Looks like docker has some issue with data leakage and some of the temporary data are not cleared when the container stops.

So I went ahead and deleted all the contents of lib/docker/overlay folder. After that, My docker instance became un-useable. When I tried to run or build any container, It gave me this error:

failed to create rwlayer: symlink ../04578d9f8e428b693174c6eb9a80111c907724cc22129761ce14a4c8cb4f1d7c/diff /var/lib/docker/overlay2/l/C3F33OLORAASNIYB3ZDATH2HJ7: no such file or directory

Then with some trial and error, I solved this issue by running

(WARNING: This will delete all your data inside docker volumes)

docker system prune --volumes -a

So It is not recommended to do such dirty clean ups unless you completely understand how the system works.


Friends, to keep everything clean you can use de commands:

  • docker system prune -a && docker volume prune.

I had this issue... It was the log that was huge. Logs are here :

/var/lib/docker/containers/<container id>/<container id>-json.log

You can manage this in the run command line or in the compose file. See there : Configure logging drivers

I personally added these 3 lines to my docker-compose.yml file :

my_container:
  logging:
    options:
      max-size: 10m

Backgroud

The blame for the issue can be split between our misconfiguration of container volumes, and a problem with docker leaking (failing to release) temporary data written to these volumes. We should be mapping (either to host folders or other persistent storage claims) all of out container's temporary / logs / scratch folders where our apps write frequently and/or heavily. Docker does not take responsibility for the cleanup of all automatically created so-called EmptyDirs located by default in /var/lib/docker/overlay2/*/diff/*. Contents of these "non-persistent" folders should be purged automatically by docker after container is stopped, but apparently are not (they may be even impossible to purge from the host side if the container is still running - and it can be running for months at a time).

Workaround

A workaround requires careful manual cleanup, and while already described elsewhere, you still may find some hints from my case study, which I tried to make as instructive and generalizable as possible.

So what happened is the culprit app (in my case clair-scanner) managed to write over a few months hundreds of gigs of data to the /diff/tmp subfolder of docker's overlay2

du -sch /var/lib/docker/overlay2/<long random folder name seen as bloated in df -haT>/diff/tmp

271G total

So as all those subfolders in /diff/tmp were pretty self-explanatory (all were of the form clair-scanner-* and had obsolete creation dates), I stopped the associated container (docker stop clair) and carefully removed these obsolete subfolders from diff/tmp, starting prudently with a single (oldest) one, and testing the impact on docker engine (which did require restart [systemctl restart docker] to reclaim disk space):

rm -rf $(ls -at /var/lib/docker/overlay2/<long random folder name seen as bloated in df -haT>/diff/tmp | grep clair-scanner | tail -1)

I reclaimed hundreds of gigs of disk space without the need to re-install docker or purge its entire folders. All running containers did have to be stopped at one point, because docker daemon restart was required to reclaim disk space, so make sure first your failover containers are running correctly on an/other node/s). I wish though that the docker prune command could cover the obsolete /diff/tmp (or even /diff/*) data as well (via yet another switch).

It's a 3-year-old issue now, you can read its rich and colorful history on Docker forums, where a variant aimed at application logs of the above solution was proposed in 2019 and seems to have worked in several setups: https://forums.docker.com/t/some-way-to-clean-up-identify-contents-of-var-lib-docker-overlay/30604


I used "docker system prune -a" it cleaned all files under volumes and overlay2

    [root@jasontest volumes]# docker system prune -a
    WARNING! This will remove:
            - all stopped containers
            - all networks not used by at least one container
            - all images without at least one container associated to them
            - all build cache
    Are you sure you want to continue? [y/N] y
    Deleted Images:
    untagged: ubuntu:12.04
    untagged: ubuntu@sha256:18305429afa14ea462f810146ba44d4363ae76e4c8dfc38288cf73aa07485005
    deleted: sha256:5b117edd0b767986092e9f721ba2364951b0a271f53f1f41aff9dd1861c2d4fe
    deleted: sha256:8c7f3d7534c80107e3a4155989c3be30b431624c61973d142822b12b0001ece8
    deleted: sha256:969d5a4e73ab4e4b89222136eeef2b09e711653b38266ef99d4e7a1f6ea984f4
    deleted: sha256:871522beabc173098da87018264cf3e63481628c5080bd728b90f268793d9840
    deleted: sha256:f13e8e542cae571644e2f4af25668fadfe094c0854176a725ebf4fdec7dae981
    deleted: sha256:58bcc73dcf4050a4955916a0dcb7e5f9c331bf547d31e22052f1b5fa16cf63f8
    untagged: osixia/openldap:1.2.1
    untagged: osixia/openldap@sha256:6ceb347feb37d421fcabd80f73e3dc6578022d59220cab717172ea69c38582ec
    deleted: sha256:a562f6fd60c7ef2adbea30d6271af8058c859804b2f36c270055344739c06d64
    deleted: sha256:90efa8a88d923fb1723bea8f1082d4741b588f7fbcf3359f38e8583efa53827d
    deleted: sha256:8d77930b93c88d2cdfdab0880f3f0b6b8be191c23b04c61fa1a6960cbeef3fe6
    deleted: sha256:dd9f76264bf3efd36f11c6231a0e1801c80d6b4ca698cd6fa2ff66dbd44c3683
    deleted: sha256:00efc4fb5e8a8e3ce0cb0047e4c697646c88b68388221a6bd7aa697529267554
    deleted: sha256:e64e6259fd63679a3b9ac25728f250c3afe49dbe457a1a80550b7f1ccf68458a
    deleted: sha256:da7d34d626d2758a01afe816a9434e85dffbafbd96eb04b62ec69029dae9665d
    deleted: sha256:b132dace06fa7e22346de5ca1ae0c2bf9acfb49fe9dbec4290a127b80380fe5a
    deleted: sha256:d626a8ad97a1f9c1f2c4db3814751ada64f60aed927764a3f994fcd88363b659
    untagged: centos:centos7
    untagged: centos@sha256:2671f7a3eea36ce43609e9fe7435ade83094291055f1c96d9d1d1d7c0b986a5d
    deleted: sha256:ff426288ea903fcf8d91aca97460c613348f7a27195606b45f19ae91776ca23d
    deleted: sha256:e15afa4858b655f8a5da4c4a41e05b908229f6fab8543434db79207478511ff7

    Total reclaimed space: 533.3MB
    [root@jasontest volumes]# ls -alth
    total 32K
    -rw-------  1 root root  32K May 23 21:14 metadata.db
    drwx------  2 root root 4.0K May 23 21:14 .
    drwx--x--x 14 root root 4.0K May 21 20:26 ..

adding to above comment, in which people are suggesting to prune system like clear dangling volumes, images, exit containers etc., Sometime your app become culprit, it generated too much logs in a small time and if you using empty directy volume (local volumes) this fill the /var partitions. In that case I found below command very interesting to figure out, what is consuming space on my /var partition disk.

du -ahx /var/lib | sort -rh | head -n 30

This command will list top 30, which is consuming most space on a single disk. Means if you are using external storage with your containers, it consumes a lot of time to run du command. This command will not count mount volumes. And is much faster. You will get the exact directories/files which are consuming space. Then you can go to those directories and check which files are useful or not. if these files are required then you can move them to some persistent storage by making change in app to use persistent storage for that location or change location of that files. And for rest you can clear them.


WARNING: DO NOT USE IN A PRODUCTION SYSTEM

/# df
...
/dev/xvda1      51467016 39384516   9886300  80% /
...

Ok, let's first try system prune

#/ docker system prune --volumes
...
/# df
...
/dev/xvda1      51467016 38613596  10657220  79% /
...

Not so great, seems like it cleaned up a few megabytes. Let's go crazy now:

/# sudo su
/# service docker stop
/# cd /var/lib/docker
/var/lib/docker# rm -rf *
/# service docker start
/var/lib/docker# df
...
/dev/xvda1      51467016 8086924  41183892  17% /
...

Nice! Just remember that this is NOT recommended in anything but a throw-away server. At this point Docker's internal database won't be able to find any of these overlays and it may cause unintended consequences.


Examples related to docker

standard_init_linux.go:190: exec user process caused "no such file or directory" - Docker What is the point of WORKDIR on Dockerfile? E: gnupg, gnupg2 and gnupg1 do not seem to be installed, but one of them is required for this operation How do I add a user when I'm using Alpine as a base image? docker: Error response from daemon: Get https://registry-1.docker.io/v2/: Service Unavailable. IN DOCKER , MAC How to fix docker: Got permission denied issue pull access denied repository does not exist or may require docker login Docker error: invalid reference format: repository name must be lowercase Docker: "no matching manifest for windows/amd64 in the manifest list entries" OCI runtime exec failed: exec failed: (...) executable file not found in $PATH": unknown

Examples related to docker-compose

E: gnupg, gnupg2 and gnupg1 do not seem to be installed, but one of them is required for this operation How to upgrade docker-compose to latest version How to fix docker: Got permission denied issue Docker error: invalid reference format: repository name must be lowercase Is it safe to clean docker/overlay2/ Docker: How to delete all local Docker images Docker "ERROR: could not find an available, non-overlapping IPv4 address pool among the defaults to assign to the network" How to run docker-compose up -d at system start up? How to create a DB for MongoDB container on start up? How to use local docker images with Minikube?

Examples related to docker-machine

Is it safe to clean docker/overlay2/ How to clear the logs properly for a Docker container? Command for restarting all running docker containers? How do I run a docker instance from a DockerFile? Docker is in volume in use, but there aren't any Docker containers