[linux] Quickly create a large file on a Linux system

How can I quickly create a large file on a Linux (Red Hat Linux) system?

dd will do the job, but reading from /dev/zero and writing to the drive can take a long time when you need a file several hundred GB in size for testing... If you need to do that repeatedly, the time really adds up.

I don't care about the contents of the file, I just want it to be created quickly. How can this be done?

Using a sparse file won't work for this. I need the file to be allocated disk space.

This question is related to: linux, file, filesystems

Answers:


truncate -s 10M output.file

will create a 10 M file instantaneously (M stands for 1024*1024 bytes, MB stands for 1000*1000 bytes - same with K, KB, G, GB, ...)

EDIT: as many have pointed out, this will not physically allocate space for the file on your device. With this you could actually create an arbitrarily large file, regardless of the available space on the device, as it creates a "sparse" file.

For example, notice that no HDD space is consumed by this command:

### BEFORE
$ df -h | grep lvm
/dev/mapper/lvm--raid0-lvm0
                      7.2T  6.6T  232G  97% /export/lvm-raid0

$ truncate -s 500M 500MB.file

### AFTER
$ df -h | grep lvm
/dev/mapper/lvm--raid0-lvm0
                      7.2T  6.6T  232G  97% /export/lvm-raid0

So, when doing this, you will be deferring physical allocation until the file is written to. If you're mapping this file to memory, you may not get the expected performance.

But this is still a useful command to know. For example, when benchmarking transfers using files, the specified size of the file will still get transferred.

$ rsync -aHAxvP --numeric-ids --delete --info=progress2 \
       [email protected]:/export/lvm-raid0/500MB.file \
       /export/raid1/
receiving incremental file list
500MB.file
    524,288,000 100%   41.40MB/s    0:00:12 (xfr#1, to-chk=0/1)

sent 30 bytes  received 524,352,082 bytes  38,840,897.19 bytes/sec
total size is 524,288,000  speedup is 1.00
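
Another quick way to see that the file is sparse is to compare its apparent size with the blocks actually allocated (assuming GNU coreutils; the file name matches the example above):

$ ls -lh 500MB.file    # apparent size: 500M
$ du -h 500MB.file     # disk blocks actually in use: 0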

One approach: if you can guarantee unrelated applications won't use the files in a conflicting manner, just create a pool of files of varying sizes in a specific directory, then create links to them when needed.

For example, have a pool of files called:

  • /home/bigfiles/512M-A
  • /home/bigfiles/512M-B
  • /home/bigfiles/1024M-A
  • /home/bigfiles/1024M-B

Then, if you have an application that needs a 1G file called /home/oracle/logfile, execute a "ln /home/bigfiles/1024M-A /home/oracle/logfile".

If it's on a separate filesystem, you will have to use a symbolic link.

The A/B/etc files can be used to ensure there's no conflicting use between unrelated applications.

The link operation is about as fast as you can get.
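
A rough sketch of setting up such a pool, using the paths from the example above (fallocate is assumed to be available; dd or xfs_mkfile would work as well):

$ mkdir -p /home/bigfiles
$ fallocate -l 512M /home/bigfiles/512M-A
$ fallocate -l 1G /home/bigfiles/1024M-A
$ ln /home/bigfiles/1024M-A /home/oracle/logfile    # hard link; use ln -s instead if it's on another filesystem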


This is the fastest I could do (which is not fast) with the following constraints:

  • The goal of the large file is to fill a disk, so it can't be compressible.
  • Using the ext3 filesystem (fallocate is not available).

This is the gist of it...

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    int32_t buf[256];                   // 1 KiB block.
    for (int i = 0; i < 256; ++i)
        buf[i] = rand();                // Random so it is not compressible.

    FILE *file = fopen("/file/on/your/system", "wb");
    if (file == NULL)
        return 1;

    int blocksToWrite = 1024 * 1024;    // 1M blocks of 1 KiB each = 1 GiB total.
    for (int i = 0; i < blocksToWrite; ++i)
        fwrite(buf, sizeof(int32_t), 256, file);

    fclose(file);
    return 0;
}

In our case this is for an embedded Linux system, and this works well enough, but I'd prefer something faster.

FYI the command dd if=/dev/urandom of=outputfile bs=1024 count=XX was so slow as to be unusable.
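
If /dev/urandom is still what you need, using a much larger block size alone usually helps a great deal; this is only a sketch and the actual speed depends on your hardware:

$ dd if=/dev/urandom of=outputfile bs=1M count=1024    # 1 GiB of non-compressible data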


You can use "yes" command also. The syntax is fairly simple:

#yes >> myfile

Press "Ctrl + C" to stop this, else it will eat up all your space available.

To clean this file run:

#>myfile

will clean this file.
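
If you would rather not have to hit Ctrl+C at the right moment, you can cap the size by piping through head (assuming GNU head, which accepts size suffixes):

$ yes | head -c 100M > myfile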


Shameless plug: OTFFS is a file system that provides arbitrarily large files (well, almost; exabytes is the current limit) of generated content. It is Linux-only, plain C, and in early alpha.

See https://github.com/s5k6/otffs.


dd if=/dev/zero of=filename bs=1 count=1 seek=1048575

where seek is the size of the file you want in bytes minus 1 (the single byte written at the end brings the file to 1048576 bytes, i.e. 1 MiB). Note that the skipped-over region ends up as a hole, i.e. a sparse file.

You could use https://github.com/flew-software/trash-dump to create a file of any size, filled with random data.

Here's a command you can run after installing trash-dump (creates a 1 GB file):

$ trash-dump --filename="huge" --seed=1232 --noBytes=1000000000

BTW, I created it.


I don't think you're going to get much faster than dd. The bottleneck is the disk; writing hundreds of GB of data to it is going to take a long time no matter how you do it.

But here's a possibility that might work for your application. If you don't care about the contents of the file, how about creating a "virtual" file whose contents are the dynamic output of a program? Instead of open()ing the file, use popen() to open a pipe to an external program. The external program generates data whenever it's needed. Once the pipe is open, the program that opened it can fread() from it much like a regular file, although a pipe is not seekable, so fseek() and rewind() won't work on it. You'll need to use pclose() instead of fclose() when you're done with the pipe.

If your application needs the file to be a certain size, it will be up to the external program to keep track of where in the "file" it is and send an eof when the "end" has been reached.
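
A shell-level sketch of the same idea, using a named pipe (mkfifo) instead of popen(); the path is only an example, and GNU head is assumed for the -c size suffix:

$ mkfifo /tmp/virtual.file
$ head -c 1G /dev/zero > /tmp/virtual.file &    # generator: only produces data once a reader opens the pipe
$ wc -c < /tmp/virtual.file                     # consumer reads 1 GiB that never touches the disk
$ rm /tmp/virtual.file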


dd from the other answers is a good solution, but it is slow for this purpose. In Linux (and other POSIX systems), we have fallocate, which reserves the desired space without actually having to write to it. It works with most modern disk-based file systems and is very fast:

For example:

fallocate -l 10G gentoo_root.img
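
Unlike the sparse-file tricks above, the space here is genuinely reserved, which you can confirm by comparing apparent size and disk usage (assuming GNU coreutils):

$ ls -lh gentoo_root.img    # apparent size: 10G
$ du -h gentoo_root.img     # disk usage is also about 10G, so the file is not sparse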

I don't know a whole lot about Linux, but here's the C code I wrote to fake huge files on DC Share many years ago.

#include <stdio.h>
#include <stdlib.h>

int main() {
    int i;
    FILE *fp;

    fp = fopen("bigfakefile.txt", "w");
    if (fp == NULL)
        return 1;

    for (i = 0; i < (1024 * 1024); i++) {
        /* skip ahead 1 MiB, then write a single byte; the result is a
           sparse file with an apparent size of roughly 1 TiB */
        fseek(fp, (1024 * 1024), SEEK_CUR);
        fprintf(fp, "C");
    }

    fclose(fp);
    return 0;
}

Examples where seek is the size of the file you want in bytes

#kilobytes
dd if=/dev/zero of=filename bs=1 count=0 seek=200K

#megabytes
dd if=/dev/zero of=filename bs=1 count=0 seek=200M

#gigabytes
dd if=/dev/zero of=filename bs=1 count=0 seek=200G

#terabytes
dd if=/dev/zero of=filename bs=1 count=0 seek=200T


From the dd manpage:

BLOCKS and BYTES may be followed by the following multiplicative suffixes: c=1, w=2, b=512, kB=1000, K=1024, MB=1000*1000, M=1024*1024, GB=1000*1000*1000, G=1024*1024*1024, and so on for T, P, E, Z, Y.
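
To see how much the binary and decimal suffixes differ, you can let the shell do the arithmetic (following the suffix table above):

$ echo $((200 * 1024 * 1024))    # 200M  = 209715200 bytes
$ echo $((200 * 1000 * 1000))    # 200MB = 200000000 bytes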


To make a 1 GB file:

dd if=/dev/zero of=filename bs=1G count=1
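
Note that bs=1G makes dd allocate a 1 GiB buffer in memory; if that is a concern, the same amount can be written in smaller blocks (a sketch):

$ dd if=/dev/zero of=filename bs=1M count=1024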


Linux & all filesystems

xfs_mkfile 10240m 10Gigfile

Linux & some filesystems (ext4, xfs, btrfs and ocfs2)

fallocate -l 10G 10Gigfile

OS X, Solaris, SunOS and probably other UNIXes

mkfile 10240m 10Gigfile

HP-UX

prealloc 10Gigfile 10737418240

Explanation

Try mkfile <size> myfile as an alternative to dd. With the -n option the size is noted, but disk blocks aren't allocated until data is written to them. Without the -n option, the space is zero-filled, which means writing to the disk, which means taking time.

mkfile is derived from SunOS and is not available everywhere. Most Linux systems have xfs_mkfile, which works exactly the same way, and not just on XFS file systems despite the name. It's included in xfsprogs (for Debian/Ubuntu) or similarly named packages.

Most Linux systems also have fallocate, which only works on certain file systems (such as btrfs, ext4, ocfs2, and xfs), but is the fastest, as it allocates all the file space (creates non-holey files) but does not initialize any of it.


The GPL mkfile is just a (ba)sh script wrapper around dd; BSD's mkfile just memsets a buffer with non-zero and writes it repeatedly. I would not expect the former to out-perform dd. The latter might edge out dd if=/dev/zero slightly since it omits the reads, but anything that does significantly better is probably just creating a sparse file.

Absent a system call that actually allocates space for a file without writing data (on Linux, fallocate(2) now does exactly this, as other answers note), you might get a small improvement in performance by using ftruncate(2)/truncate(1) to extend the file to the desired size, mmap the file into memory, then write non-zero data to the first bytes of every disk block (use fgetconf to find the disk block size).


This is a common question -- especially in today's environment of virtual machines. Unfortunately, the answer is not as straightforward as one might assume.

dd is the obvious first choice, but dd is essentially a copy, and that forces you to write every block of data (thus, initializing the file contents)... And that initialization is what takes up so much I/O time. (Want to make it take even longer? Use /dev/random instead of /dev/zero! Then you'll use CPU as well as I/O time!) In the end though, dd is a poor choice (though it's essentially the default used by the VM "create" GUIs). E.g.:

dd if=/dev/zero of=./gentoo_root.img bs=4k iflag=fullblock,count_bytes count=10G

truncate is another choice -- and is likely the fastest... But that is because it creates a "sparse file". Essentially, a sparse file is a section of disk that has a lot of the same data, and the underlying filesystem "cheats" by not really storing all of the data, but just "pretending" that it's all there. Thus, when you use truncate to create a 20 GB drive for your VM, the filesystem doesn't actually allocate 20 GB, but it cheats and says that there are 20 GB of zeros there, even though as little as one track on the disk may actually (really) be in use. E.g.:

 truncate -s 10G gentoo_root.img

fallocate is the final -- and best -- choice for use with VM disk allocation, because it essentially "reserves" (or "allocates") all of the space you're seeking, but it doesn't bother to write anything. So, when you use fallocate to create a 20 GB virtual drive space, you really do get a 20 GB file (not a "sparse file"), and you won't have bothered to write anything to it -- which means virtually anything could be in there -- kind of like a brand new disk! E.g.:

fallocate -l 10G gentoo_root.img
