dd: How to calculate optimal blocksize?

Question

How do you calculate the optimal blocksize when running a dd? I've researched it a bit and I've not found anything suggesting how this would be accomplished. I am under the impression that a larger blocksize would result in a quicker dd... is this true? I'm about to dd two identical 500 GB Hitachi HDDs that run at 7200 rpm on a box running an Intel Core i3 with 4 GB DDR3 1333 MHz RAM, so I'm trying to figure out what blocksize to use. (I'm going to be booting Ubuntu 10.10 x86 from a flash drive, and running it from that.)

User · Accepted Answer

The optimal block size depends on various factors, including the operating system (and its version), and the various hardware buses and disks involved. Several Unix-like systems (including Linux and at least some flavors of BSD) define the st_blksize member in the struct stat that gives what the kernel thinks is the optimal block size:

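#include <sys/stat.h>
#include <stdio.h>

int main(void)
{
    struct stat stats;

    /* stat() returns 0 on success */
    if (!stat("/", &stats))
    {
        /* st_blksize is a blksize_t; cast it to match the format specifier */
        printf("%ld\n", (long)stats.st_blksize);
    }
    return 0;
}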
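
The same hint can be read from the shell without compiling anything. This is a minimal sketch, assuming GNU coreutils stat, whose %o format prints the "optimal I/O transfer size hint" (st_blksize). BSD and macOS stat use a different format syntax, so treat this as Linux-specific. The function name preferred_io_size is made up for illustration.

```shell
#!/bin/sh
# Print the kernel's preferred I/O size hint (st_blksize) for a path.
# Assumes GNU coreutils stat; %o is its "optimal I/O transfer size hint".
preferred_io_size() {
    stat -c '%o' "${1:-/}"
}

# Query the root filesystem, as the C program above does.
preferred_io_size /
```

The value is only the kernel's hint for that filesystem. It is often much smaller than the block sizes that turn out fastest with dd, so it is better read as a lower bound than as an answer.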

The best way may be to experiment: copy a gigabyte with various block sizes and time each run. (Remember to clear the kernel buffer caches before each run; on Linux, as root: sync; echo 3 > /proc/sys/vm/drop_caches.)

However, as a rule of thumb, I've found that a large enough block size lets dd do a good job, and the differences between, say, 64 KiB and 1 MiB are minor, compared to 4 KiB versus 64 KiB. (Though, admittedly, it's been a while since I did that. I use a mebibyte by default now, or just let dd pick the size.)

User · Answer

This is totally system dependent. You should experiment to find the optimum solution. Try starting with bs=8388608. (As Hitachi HDDs seem to have 8 MB cache.)
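
For reference, 8388608 is simply 8 MiB written out in bytes. The sketch below shows the arithmetic; the helper name mib_to_bytes is hypothetical. It assumes a POSIX shell, and the remark about the 8M suffix assumes GNU dd (BSD dd spells its suffixes differently).

```shell
#!/bin/sh
# Convert a size in MiB to bytes: 1 MiB = 1024 * 1024 bytes.
mib_to_bytes() {
    echo $(( $1 * 1024 * 1024 ))
}

mib_to_bytes 8    # prints 8388608, the value suggested above
```

With GNU dd, bs=8M is shorthand for the same number of bytes, so bs=8388608 and bs=8M request the same block size.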

User · Answer

For better performance, use the biggest blocksize your RAM can accommodate (this sends fewer I/O calls to the OS). For better accuracy and data recovery, set the blocksize to the native sector size of the input.

"As dd copies data with the conv=noerror,sync option, any errors it encounters will result in the remainder of the block being replaced with zero-bytes. Larger block sizes will copy more quickly, but each time an error is encountered the remainder of the block is ignored." (source)
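
The padding behaviour can be seen without a failing disk. This sketch feeds dd a 3-byte input, which is a short read rather than a real read error, and counts what comes out when conv=sync is set. The helper name synced_output_size is made up for illustration. It only demonstrates that the amount of zero padding scales with the block size.

```shell
#!/bin/sh
# Pipe a 3-byte input through dd with conv=sync and count the output bytes.
# conv=sync pads every short input block with NUL bytes up to the full block
# size; the same padding fills the block after a read error when
# conv=noerror,sync is used.
synced_output_size() {
    printf 'abc' | dd bs="$1" conv=sync 2>/dev/null | wc -c | tr -d ' '
}

synced_output_size 8       # 3 data bytes + 5 NUL bytes    -> 8
synced_output_size 4096    # 3 data bytes + 4093 NUL bytes -> 4096
```

The trade-off described above follows from this: with a sector-sized block an error costs at most one sector of data, while with a multi-megabyte block a single bad sector can zero out the rest of that block.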

User · Answer

I've found my optimal blocksize to be 8 MB (equal to disk cache?). I needed to wipe (some say: wash) the empty space on a disk before creating a compressed image of it. I used:

cd /media/DiskToWash/
dd if=/dev/zero of=zero bs=8M; rm zero

I experimented with values from 4K to 100M.

After letting dd run for a while I killed it (Ctrl+C) and read the output:

36+0 records in
36+0 records out
301989888 bytes (302 MB) copied, 15.8341 s, 19.1 MB/s

As dd displays the input/output rate (19.1 MB/s in this case), it's easy to see if the value you've picked is performing better than the previous one or worse.

My scores:

bs     I/O rate
---------------
4K     13.5 MB/s
64K    18.3 MB/s
8M     19.1 MB/s  <--- winner!
10M    19.0 MB/s
20M    18.6 MB/s
100M   18.6 MB/s

Note: To check what your disk cache/buffer size is, you can use sudo hdparm -i /dev/sda
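
The figures in that dd summary can be checked by hand. This sketch recomputes them, assuming dd reports decimal megabytes (1 MB = 1,000,000 bytes), which matches the "302 MB" shown for 301989888 bytes. The helper name rate_mb_per_s is made up for illustration.

```shell
#!/bin/sh
# Recompute dd's reported transfer rate from a byte count and elapsed seconds.
# Assumes decimal megabytes: 1 MB = 1,000,000 bytes.
rate_mb_per_s() {
    LC_ALL=C awk -v bytes="$1" -v secs="$2" \
        'BEGIN { printf "%.1f\n", bytes / secs / 1000000 }'
}

echo $(( 36 * 8 * 1024 * 1024 ))     # 36 records of 8 MiB -> 301989888 bytes
rate_mb_per_s 301989888 15.8341      # -> 19.1
```

Being able to compute the rate from bytes and seconds is useful when comparing runs that were interrupted at different points, since the byte counts will differ between runs.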

User · Answer

As others have said, there is no universally correct block size; what is optimal for one situation or one piece of hardware may be terribly inefficient for another. Also, depending on the health of the disks, it may be preferable to use a different block size than what is "optimal".

One thing that is pretty reliable on modern hardware is that the default block size of 512 bytes tends to be almost an order of magnitude slower than a more optimal alternative. When in doubt, I've found that 64K is a pretty solid modern default. Though 64K usually isn't THE optimal block size, in my experience it tends to be a lot more efficient than the default. 64K also has a pretty solid history of being reliably performant: you can find a message from the Eug-Lug mailing list, circa 2002, recommending a block size of 64K here: http://www.mail-archive.com/eug-lug@efn.org/msg12073.html

For determining THE optimal output block size, I've written the following script that tests writing a 128M test file with dd at a range of different block sizes, from the default of 512 bytes to a maximum of 64M. Be warned, this script uses dd internally, so use with caution.

dd_obs_test.sh:

#!/bin/bash

# Since we're dealing with dd, abort if any errors occur
set -e

TEST_FILE=${1:-dd_obs_testfile}
TEST_FILE_EXISTS=0
if [ -e "$TEST_FILE" ]; then TEST_FILE_EXISTS=1; fi
TEST_FILE_SIZE=134217728

if [ $EUID -ne 0 ]; then
  echo "NOTE: Kernel cache will not be cleared between tests without sudo. This will likely cause inaccurate results." 1>&2
fi

# Header
PRINTF_FORMAT="%8s : %s\n"
printf "$PRINTF_FORMAT" 'block size' 'transfer rate'

# Block sizes of 512b 1K 2K 4K 8K 16K 32K 64K 128K 256K 512K 1M 2M 4M 8M 16M 32M 64M
for BLOCK_SIZE in 512 1024 2048 4096 8192 16384 32768 65536 131072 262144 524288 1048576 2097152 4194304 8388608 16777216 33554432 67108864
do
  # Calculate number of segments required to copy
  COUNT=$(($TEST_FILE_SIZE / $BLOCK_SIZE))

  if [ $COUNT -le 0 ]; then
    echo "Block size of $BLOCK_SIZE estimated to require $COUNT blocks, aborting further tests."
    break
  fi

  # Clear kernel cache to ensure more accurate test
  [ $EUID -eq 0 ] && [ -e /proc/sys/vm/drop_caches ] && echo 3 > /proc/sys/vm/drop_caches

  # Create a test file with the specified block size
  DD_RESULT=$(dd if=/dev/zero of="$TEST_FILE" bs=$BLOCK_SIZE count=$COUNT conv=fsync 2>&1 1>/dev/null)

  # Extract the transfer rate from dd's STDERR output
  TRANSFER_RATE=$(echo "$DD_RESULT" | grep --only-matching -E '[0-9.]+ ([MGk]?B|bytes)/s(ec)?')

  # Clean up the test file if we created one (i.e. it did not exist before)
  if [ $TEST_FILE_EXISTS -eq 0 ]; then rm "$TEST_FILE"; fi

  # Output the result
  printf "$PRINTF_FORMAT" "$BLOCK_SIZE" "$TRANSFER_RATE"
done

I've only tested this script on a Debian (Ubuntu) system and on OS X Yosemite, so it will probably take some tweaking to make it work on other Unix flavors.

By default the command will create a test file named dd_obs_testfile in the current directory. Alternatively, you can provide a path to a custom test file by providing a path after the script name:

$ ./dd_obs_test.sh /path/to/disk/test_file

The output of the script is a list of the tested block sizes and their respective transfer rates, like so:

$ ./dd_obs_test.sh
block size : transfer rate
       512 : 11.3 MB/s
      1024 : 22.1 MB/s
      2048 : 42.3 MB/s
      4096 : 75.2 MB/s
      8192 : 90.7 MB/s
     16384 : 101 MB/s
     32768 : 104 MB/s
     65536 : 108 MB/s
    131072 : 113 MB/s
    262144 : 112 MB/s
    524288 : 133 MB/s
   1048576 : 125 MB/s
   2097152 : 113 MB/s
   4194304 : 106 MB/s
   8388608 : 107 MB/s
  16777216 : 110 MB/s
  33554432 : 119 MB/s
  67108864 : 134 MB/s

(Note: The unit of the transfer rates will vary by OS.)

To test optimal read block size, you could use more or less the same process, but instead of reading from /dev/zero and writing to the disk, you'd read from the disk and write to /dev/null. A script to do this might look like so:

dd_ibs_test.sh:

#!/bin/bash

# Since we're dealing with dd, abort if any errors occur
set -e

TEST_FILE=${1:-dd_ibs_testfile}
TEST_FILE_SIZE=134217728

# Exit if file exists
if [ -e "$TEST_FILE" ]; then
  echo "Test file $TEST_FILE exists, aborting."
  exit 1
fi
TEST_FILE_EXISTS=1

if [ $EUID -ne 0 ]; then
  echo "NOTE: Kernel cache will not be cleared between tests without sudo. This will likely cause inaccurate results." 1>&2
fi

# Create test file
echo 'Generating test file...'
BLOCK_SIZE=65536
COUNT=$(($TEST_FILE_SIZE / $BLOCK_SIZE))
dd if=/dev/urandom of="$TEST_FILE" bs=$BLOCK_SIZE count=$COUNT conv=fsync > /dev/null 2>&1

# Header
PRINTF_FORMAT="%8s : %s\n"
printf "$PRINTF_FORMAT" 'block size' 'transfer rate'

# Block sizes of 512b 1K 2K 4K 8K 16K 32K 64K 128K 256K 512K 1M 2M 4M 8M 16M 32M 64M
for BLOCK_SIZE in 512 1024 2048 4096 8192 16384 32768 65536 131072 262144 524288 1048576 2097152 4194304 8388608 16777216 33554432 67108864
do
  # Clear kernel cache to ensure more accurate test
  [ $EUID -eq 0 ] && [ -e /proc/sys/vm/drop_caches ] && echo 3 > /proc/sys/vm/drop_caches

  # Read test file out to /dev/null with specified block size
  DD_RESULT=$(dd if="$TEST_FILE" of=/dev/null bs=$BLOCK_SIZE 2>&1 1>/dev/null)

  # Extract transfer rate
  TRANSFER_RATE=$(echo "$DD_RESULT" | grep --only-matching -E '[0-9.]+ ([MGk]?B|bytes)/s(ec)?')

  printf "$PRINTF_FORMAT" "$BLOCK_SIZE" "$TRANSFER_RATE"
done

# Clean up the test file if we created one
if [ $TEST_FILE_EXISTS -ne 0 ]; then rm "$TEST_FILE"; fi

An important difference in this case is that the test file is generated by the script itself and removed when the tests finish. Unlike the write test, which overwrites whatever file it is given with zeroes, this script aborts if the test file already exists, so point it at a path that does not exist yet.

For my particular hardware I found that 128K was the most optimal input block size on a HDD and 32K was most optimal on a SSD.

Though this answer covers most of my findings, I've run into this situation enough times that I wrote a blog post about it: http://blog.tdg5.com/tuning-dd-block-size/ You can find more specifics on the tests I performed there.
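
Once a results table like the one above exists, picking the fastest row can be automated. This is a minimal sketch and not part of the original scripts: the function name best_block_size is hypothetical, and it assumes rows of the form "size : rate unit/s" with the kB/MB/GB units that GNU dd prints. Output in other units (for example "bytes/sec" on other systems) would need extra cases.

```shell
#!/bin/sh
# Read "<block size> : <rate> <unit>/s" rows on stdin and print the block
# size with the highest rate. Rates are normalised to bytes per second so
# that rows reported in different units can be compared.
best_block_size() {
    LC_ALL=C awk '
        $2 == ":" && $1 ~ /^[0-9]+$/ {
            rate = $3
            if      ($4 ~ /^kB/) rate *= 1e3
            else if ($4 ~ /^MB/) rate *= 1e6
            else if ($4 ~ /^GB/) rate *= 1e9
            if (rate > best) { best = rate; size = $1 }
        }
        END { print size }'
}

# A few rows from the sample output above; the header line is skipped
# because its first field is not a number.
printf '%s\n' 'block size : transfer rate' \
              '       512 : 11.3 MB/s' \
              '     65536 : 108 MB/s' \
              '    524288 : 133 MB/s' \
              '  67108864 : 134 MB/s' | best_block_size
```

In practice the script output could be piped straight into the function. Differences of a few MB/s between neighbouring rows are small, as in the sample output, so it is worth repeating the run before committing to a particular size.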
