Finding the layers and layer sizes for each Docker image

Question

For research purposes I'm trying to crawl the public Docker registry (https://registry.hub.docker.com/) and find out 1) how many layers an average image has and 2) the sizes of these layers, to get an idea of the distribution.

However, I studied the API and public libraries as well as the details on GitHub, but I can't find any method to:

- retrieve all the public repositories/images (even if those are thousands, I still need a starting list to iterate through)
- find all the layers of an image
- find the size of a layer (so not of an image, but of the individual layer)

Can anyone help me find a way to retrieve this information? Thank you!

EDIT: is anyone able to verify that searching for '*' in the Docker registry returns all the repositories, and not just anything that mentions '*' anywhere? https://registry.hub.docker.com/search?q=*

User · Answer

Check out dive, written in Golang. Awesome tool!

User · Answer

You can find the layers of the images in the folder /var/lib/docker/aufs/layers, provided the storage driver is configured as aufs (the default option at the time).

Example:

    docker ps -a
    CONTAINER ID   IMAGE    COMMAND       CREATED          STATUS                      PORTS   NAMES
    0ca502fa6aae   ubuntu   "/bin/bash"   44 minutes ago   Exited (0) 44 seconds ago           DockerTest

Now, to view the layers of the containers that were created with the image "ubuntu", go to the /var/lib/docker/aufs/layers directory and cat the file whose name starts with the container ID (here it is 0ca502fa6aae*):

    root@viswesn-vm2:/var/lib/docker/aufs/layers# cat 0ca502fa6aaefc89f690736609b54b2f0fdebfe8452902ca383020e3b0d266f9-init
    d2a0ecffe6fa4ef3de9646a75cc629bbd9da7eead7f767cb810f9808d6b3ecb6
    29460ac934423a55802fcad24856827050697b4a9f33550bd93c82762fb6db8f
    b670fb0c7ecd3d2c401fbfd1fa4d7a872fbada0a4b8c2516d0be18911c6b25d6
    83e4dde6b9cfddf46b75a07ec8d65ad87a748b98cf27de7d5b3298c1f3455ae4

This shows the same result as running:

    root@viswesn-vm2:/var/lib/docker/aufs/layers# docker history ubuntu
    IMAGE          CREATED       CREATED BY                                      SIZE       COMMENT
    d2a0ecffe6fa   13 days ago   /bin/sh -c #(nop) CMD ["/bin/bash"]             0 B
    29460ac93442   13 days ago   /bin/sh -c sed -i 's/^#\s*\(deb.*universe\)$/   1.895 kB
    b670fb0c7ecd   13 days ago   /bin/sh -c echo '#!/bin/sh' > /usr/sbin/polic   194.5 kB
    83e4dde6b9cf   13 days ago   /bin/sh -c #(nop) ADD file:c8f078961a543cdefa   188.2 MB

To view the full layer IDs, add the --no-trunc option to the history command:

    docker history --no-trunc ubuntu

User · Answer

1) https://hub.docker.com/search?q=* shows all the images in the entire Docker Hub. It's not possible to get this via the search command, as it doesn't accept wildcards.

2) As of v1.10 you can find all the layers in an image by pulling it and using these commands:

    docker pull ubuntu
    ID=$(sudo docker inspect -f {{.Id}} ubuntu)
    jq .rootfs.diff_ids /var/lib/docker/image/aufs/imagedb/content/$(echo $ID|tr ':' '/')

3) The size can be found in /var/lib/docker/image/aufs/layerdb/sha256/{LAYERID}/size, although LAYERID != the diff_ids found with the previous command. For this you need to look at /var/lib/docker/image/aufs/layerdb/sha256/{LAYERID}/diff and compare it with the previous command's output to match the correct diff_id and size.
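The matching step in point 3 can be scripted. Below is a minimal Python sketch, assuming the layerdb layout described in this answer: each directory under layerdb/sha256 holds a `diff` file with the diff ID and a `size` file with the byte count. The function names `layer_sizes_by_diff_id` and `sizes_for_image` are illustrative, not part of Docker, and reading /var/lib/docker usually requires root.

```python
from pathlib import Path


def layer_sizes_by_diff_id(layerdb_root):
    """Map each diff ID to its size in bytes by scanning a layerdb tree.

    layerdb_root is assumed to look like
    /var/lib/docker/image/<storage-driver>/layerdb/sha256.
    """
    sizes = {}
    for layer_dir in Path(layerdb_root).iterdir():
        diff_file = layer_dir / "diff"
        size_file = layer_dir / "size"
        # Skip entries that do not carry both files.
        if not (diff_file.is_file() and size_file.is_file()):
            continue
        diff_id = diff_file.read_text().strip()
        sizes[diff_id] = int(size_file.read_text().strip())
    return sizes


def sizes_for_image(diff_ids, layerdb_root):
    """Return (diff_id, size) pairs in image order; size is None if unknown."""
    known = layer_sizes_by_diff_id(layerdb_root)
    return [(diff_id, known.get(diff_id)) for diff_id in diff_ids]
```

Feeding `sizes_for_image` the diff_ids printed by the jq command above would give the per-layer sizes of that image in order.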

User · Answer

This will inspect the Docker image and print the layers:

    $ docker image inspect nginx -f '{{.RootFS.Layers}}'
    [sha256:d626a8ad97a1f9c1f2c4db3814751ada64f60aed927764a3f994fcd88363b659 sha256:82b81d779f8352b20e52295afc6d0eab7e61c0ec7af96d85b8cda7800285d97d sha256:7ab428981537aa7d0c79bc1acbf208c71e57d9678f7deca4267cc03fba26b9c8]
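To count layers across many images, the same query can be driven from a script. This is a minimal Python sketch, assuming the docker CLI is on the PATH and the image has already been pulled. It uses the `json` template function so the output is a JSON array rather than the space-separated form. The helper names `parse_layers` and `image_layers` are illustrative.

```python
import json
import subprocess


def parse_layers(inspect_output):
    """Parse the JSON array printed by the inspect command into a list."""
    layers = json.loads(inspect_output)
    if not isinstance(layers, list):
        raise ValueError("expected a JSON array of layer digests")
    return layers


def image_layers(image):
    """Return the RootFS layer digests of a locally available image."""
    result = subprocess.run(
        ["docker", "image", "inspect", "-f", "{{json .RootFS.Layers}}", image],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_layers(result.stdout)
```

With this in place, `len(image_layers("nginx"))` gives the layer count for one image.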

User · Answer

You can first find the image ID using:

    $ docker images -a

Then find the image's layers and their sizes:

    $ docker history --no-trunc <Image ID>

Note: I'm using Docker version 1.13.1.

    $ docker -v
    Docker version 1.13.1, build 092cba3

User · Answer

Not exactly the original question, but to find the sum total of all the images without double-counting shared layers, the following is useful (Ubuntu 18):

    $ sudo du -h -d1 /var/lib/docker/overlay2 | sort -h

User · Answer

I've solved this problem by using the search function on Docker's website, where '*' is a valid search that returns 200k repositories, and then crawling each individual page. HTML parsing allows me to extract all the image names on each page.

User · Answer

One more tool: https://github.com/CenturyLinkLabs/dockerfile-from-image

For a GUI, use ImageLayers.io.

User · Answer

It's indeed doable to query the manifest or blob info from a Docker registry server without pulling the image to local disk.

You can refer to the Registry v2 API to fetch the manifest of an image:

    GET /v2/<name>/manifests/<reference>

Note: you have to handle different manifest versions. For a v2 manifest you can directly get the size of each layer and the digest of its blob. For a v1 manifest, you can HEAD the blob download URL to get the actual layer size.

There is a simple script for handling the above cases that will be continuously maintained.
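The v2 case can be sketched in Python using only the standard library. This is a rough sketch, not the script mentioned above. The Docker Hub hostnames (auth.docker.io, registry-1.docker.io), the anonymous token flow, and the helper names `layer_sizes` and `fetch_manifest` are assumptions that do not appear in this answer, and other registries will differ. Multi-architecture tags may return a manifest list or OCI index instead, which this sketch rejects rather than handles.

```python
import json
import urllib.request

# Assumed Docker Hub endpoints; other registries use different hosts and auth.
AUTH_URL = ("https://auth.docker.io/token"
            "?service=registry.docker.io&scope=repository:{name}:pull")
MANIFEST_URL = "https://registry-1.docker.io/v2/{name}/manifests/{reference}"
V2_MEDIA_TYPE = "application/vnd.docker.distribution.manifest.v2+json"


def layer_sizes(manifest):
    """Return (digest, size) pairs from a schema 2 image manifest."""
    if manifest.get("schemaVersion") != 2 or "layers" not in manifest:
        raise ValueError("not a schema 2 image manifest")
    return [(layer["digest"], layer["size"]) for layer in manifest["layers"]]


def fetch_manifest(name, reference="latest"):
    """Fetch a manifest anonymously, e.g. name='library/ubuntu'."""
    # Step 1: obtain an anonymous pull token for the repository.
    with urllib.request.urlopen(AUTH_URL.format(name=name)) as resp:
        token = json.load(resp)["token"]
    # Step 2: GET /v2/<name>/manifests/<reference>, asking for schema 2.
    request = urllib.request.Request(
        MANIFEST_URL.format(name=name, reference=reference),
        headers={"Authorization": "Bearer " + token, "Accept": V2_MEDIA_TYPE},
    )
    with urllib.request.urlopen(request) as resp:
        return json.load(resp)
```

A call such as `layer_sizes(fetch_manifest("library/ubuntu"))` would then yield one (digest, size) pair per layer without pulling any blobs. The sizes in the manifest are those of the compressed blobs.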

User · Answer

In my opinion, docker history <image> is sufficient. This returns the size of each layer:

    $ docker history jenkinsci-jnlp-slave:2019-1-9c
    IMAGE          CREATED      CREATED BY                                      SIZE     COMMENT
    93f48953d298   42 min ago   /bin/sh -c #(nop)  USER jenkins                 0B
    6305b07d4650   42 min ago   /bin/sh -c chown jenkins:jenkins -R /home/je…   1.45GB
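To build a size distribution from this output, the human-readable SIZE column has to be turned back into bytes. Here is a minimal Python sketch, assuming the sizes use decimal (power-of-1000) units as in the outputs quoted in this thread ("0B", "1.45GB", "1.895 kB"). The function name `parse_size` is illustrative. Because the printed values are rounded, the result is approximate. Running `docker history --human=false` prints raw byte counts and avoids this step.

```python
import re

# Decimal multipliers for the unit suffixes seen in `docker history` output.
_UNITS = {"B": 1, "KB": 1000, "MB": 1000**2, "GB": 1000**3, "TB": 1000**4}
_SIZE_RE = re.compile(r"^\s*([0-9]*\.?[0-9]+)\s*([kKMGT]?B)\s*$")


def parse_size(text):
    """Convert a size such as '188.2 MB' or '1.45GB' to a byte count."""
    match = _SIZE_RE.match(text)
    if match is None:
        raise ValueError("unrecognized size: %r" % text)
    number, unit = match.groups()
    # Round because the printed value is already an approximation.
    return round(float(number) * _UNITS[unit.upper()])
```

For example, `sum(parse_size(s) for s in ["0B", "1.45GB"])` approximates the total size of the two layers shown above.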

User · Answer

They have a very good answer here: https://stackoverflow.com/a/32455275/165865

Just run the image below:

    docker run --rm -v /var/run/docker.sock:/var/run/docker.sock nate/dockviz images -t
