[testing] How to create large PDF files (10MB, 50MB, 100MB, 200MB, 500MB, 1GB, etc.) for testing purposes?

I tried this to create a 100MB file, but very quickly ran out of RAM:

for ((i=1; i<=10; i++)); do convert 100MB.pdf 10MB.pdf 100MB.pdf; done

Any ideas?

This question is related to testing pdf qa

The answer is


One possibility, if you are familiar with the PDF format:

  1. Create a simple PDF with one page (the page should be contained within one object)
  2. Copy the object multiple times, giving each copy its own object number
  3. Add references to the copied objects to the page tree (the /Kids array of the /Pages object) and update its /Count
  4. Fix the xref table

You get a valid document of any size, and the entire file will be processed by a reader.
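
The steps above can be sketched in bash. This is only an illustration under some assumptions: make_pdf, put and obj are made-up names, each page gets a separate content stream object (so a page is two objects rather than one), and the stream is padded with spaces to control the size. Instead of patching an existing xref table, the sketch tracks byte offsets while writing and emits the table at the end.

```bash
#!/usr/bin/env bash
# Sketch only: make_pdf, put and obj are illustrative names, not an existing tool.
# Writes a PDF with PAGES identical pages; each page's content stream is padded
# with PAD spaces, so the file grows to roughly PAGES * PAD bytes.
export LC_ALL=C   # make ${#string} count bytes

make_pdf() {
  local out=$1 pages=$2 pad=$3
  local nl=$'\n' pos=0 xref="" kids="" line filler data i p c start size

  # Append text to the file and advance the running byte offset.
  put() { printf '%s' "$1" >> "$out"; pos=$((pos + ${#1})); }
  # Record the current offset as the next xref entry, then write the object.
  obj() { printf -v line '%010d 00000 n \n' "$pos"; xref+=$line; put "$1"; }

  : > "$out"
  printf -v filler '%*s' "$pad" ''   # PAD spaces; whitespace is ignored in content streams
  data="0 0 m 100 100 l S${nl}${filler}"
  for ((i = 0; i < pages; i++)); do kids+="$((3 + 2 * i)) 0 R "; done

  put "%PDF-1.4${nl}"
  obj "1 0 obj${nl}<< /Type /Catalog /Pages 2 0 R >>${nl}endobj${nl}"
  obj "2 0 obj${nl}<< /Type /Pages /Kids [ ${kids}] /Count ${pages} >>${nl}endobj${nl}"
  for ((i = 0; i < pages; i++)); do
    p=$((3 + 2 * i)); c=$((p + 1))
    obj "${p} 0 obj${nl}<< /Type /Page /Parent 2 0 R /MediaBox [0 0 612 792] /Resources << >> /Contents ${c} 0 R >>${nl}endobj${nl}"
    obj "${c} 0 obj${nl}<< /Length ${#data} >>${nl}stream${nl}${data}${nl}endstream${nl}endobj${nl}"
  done

  start=$pos
  size=$((2 * pages + 3))            # object 0 plus catalog, page tree, pages and streams
  put "xref${nl}0 ${size}${nl}0000000000 65535 f ${nl}${xref}"
  put "trailer${nl}<< /Size ${size} /Root 1 0 R >>${nl}startxref${nl}${start}${nl}%%EOF${nl}"
}

# Example (not run here): about 100 pages of 1 MB each
# make_pdf big.pdf 100 1000000
```

The file size is approximately pages times pad bytes plus a small overhead per page, so the two arguments can be chosen to hit a target size.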


On Linux, pdfunite (part of poppler) can concatenate the same PDF file several times to get one large PDF file:

pdfunite in.pdf in.pdf in.pdf out.pdf

See the man page (man pdfunite) for details.
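
To aim for a particular size, the number of copies can be computed from the input size. A rough sketch, assuming pdfunite is installed; copies_needed and unite_to_size are hypothetical helper names, and the output size is only approximately the input size times the number of copies:

```bash
#!/usr/bin/env bash
# Sketch only: copies_needed and unite_to_size are hypothetical helpers.

# Print ceil(target_bytes / input_bytes).
copies_needed() {
  local target=$1 input=$2
  echo $(( (target + input - 1) / input ))
}

# Repeat the input file often enough to reach roughly target_bytes.
unite_to_size() {
  local in=$1 out=$2 target=$3
  local size n i args=()
  size=$(wc -c < "$in")
  n=$(copies_needed "$target" "$size")
  for ((i = 0; i < n; i++)); do args+=("$in"); done
  pdfunite "${args[@]}" "$out"   # same call shape as: pdfunite in.pdf in.pdf in.pdf out.pdf
}

# Example (not run here): about 100 MiB from in.pdf
# unite_to_size in.pdf out.pdf $((100 * 1024 * 1024))
```

Using an array for the arguments keeps file names with spaces intact.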


Partly it depends on what you are trying to increase the size of... number of pages, number of images, size of a single image. In my experience, the vast bulk (90%+) of any given 'large' PDF file will be the images.

You could try using a pro product like Adobe InDesign to quickly build a large project and export it as a PDF.

Adobe Acrobat Pro has built-in tools to optimize PDF files -- you could try using the tools to 'un-optimize' your file. :)


Windows: fsutil

Usage:

fsutil file createnew [filename].[extension] [# of bytes]

Source: https://www.windows-commandline.com/how-to-create-large-dummy-file/


Linux: fallocate

Usage:

fallocate -l 10G [filename].[extension]

Source: Quickly create a large file on a Linux system?
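
Where fallocate is missing or the filesystem does not support it, dd can serve as a fallback. A minimal sketch, with make_dummy as a made-up name and the size given in bytes; the dd branch seeks past the end without writing data, which normally gives a sparse file:

```bash
#!/usr/bin/env bash
# Sketch only: make_dummy is a hypothetical helper; the size is given in bytes.
make_dummy() {
  local file=$1 bytes=$2
  # Prefer fallocate when it is installed and works on this filesystem.
  if command -v fallocate >/dev/null 2>&1 &&
     fallocate -l "$bytes" "$file" 2>/dev/null; then
    return 0
  fi
  # Fallback: write nothing, just set the file length to $bytes.
  dd if=/dev/zero of="$file" bs=1 count=0 seek="$bytes" 2>/dev/null
}

# Example (not run here): a 100 MiB dummy file
# make_dummy dummy.pdf $((100 * 1024 * 1024))
```

As with fallocate itself, the result has the requested size but no PDF structure.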


If you want to generate a file on Windows, follow these steps:

  1. Go to the directory where you want to save the generated file
  2. Open a command prompt in that directory
  3. Run fsutil file createnew [filename].[extension] [# of bytes]. For example: fsutil file createnew test.pdf 999999999
  4. A file of 999,999,999 bytes (about 953.7 MB as Windows counts it) will be generated

Update: The generated file will not be a valid pdf file. It just holds the given size.
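
A quick way to see the difference between such a dummy file and a real PDF is to look for the PDF markers. This is a heuristic sketch, not a validator; looks_like_pdf is a made-up name, and it only checks for the %PDF- header and an %%EOF marker near the end of the file:

```bash
#!/usr/bin/env bash
# Sketch only: looks_like_pdf is a hypothetical helper and a rough heuristic.
looks_like_pdf() {
  local file=$1
  # A PDF starts with "%PDF-"; NUL bytes are dropped so zero-filled files compare cleanly.
  [ "$(head -c 5 "$file" | tr -d '\0')" = "%PDF-" ] || return 1
  # The trailer normally ends with "%%EOF" within the last kilobyte.
  tail -c 1024 "$file" | tr -d '\0' | grep -aq '%%EOF'
}

# Example (not run here):
# looks_like_pdf test.pdf && echo "has PDF markers" || echo "not a PDF"
```

A zero-filled file from fsutil or fallocate fails the first check immediately.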


Have you tried using cat to combine the files?

cat 10MB.pdf 10MB.pdf > 20MB.pdf

That should result in a 20MB file.


If you want a really big valid PDF file:

  1. Take the biggest valid PDFs you can find
  2. Merge them with a tool like PDF24Creator

This worked for me: it created a big file (140MB) after a few minutes.


For those using macOS, mkfile might be a good alternative to fallocate or dd:

mkfile 100m some100mfile.pdf

reference - https://stackoverflow.com/a/33478049/711401


According to http://www.maketecheasier.com/combine-multiple-pdf-files-with-pdftk/ the command should be

pdftk file1.pdf file2.pdf file3.pdf cat output newfile.pdf

Note that on Windows you need to download the Windows version of pdftk.


I had problems using pdftk with the cat parameter and had better success with output alone.

The following command worked for me:

pdftk file_1.pdf file_1.pdf file_1.pdf file_1.pdf output output.pdf

Using cat followed directly by the file name (cat output.pdf, without the output keyword) produced the following error:

Error: Unexpected text in page range end, here: 
    output.pdf
    Exiting.
    Acceptable keywords, for example: "even" or "odd".
    To rotate pages, use: "north" "south" "east"
        "west" "left" "right" or "down"
Errors encountered.  No output created.
Done.  Input errors, so no output created.

http://www.pdflabs.com/docs/pdftk-cli-examples/.

I created a 172MB PDF in no time at all.


The simplest tool: use pdftk (or pdftk.exe, if you are on Windows):

pdftk 10_MB.pdf 100_MB.pdf cat output 110_MB.pdf

This will be a valid PDF. Download pdftk here.

Update: if you want really large (and valid!), non-optimized PDFs, use this command:

pdftk 100MB.pdf 100MB.pdf 100MB.pdf 100MB.pdf 100MB.pdf cat output 500_MB.pdf

or even (if you are on Linux, Unix or Mac OS X):

pdftk $(for i in $(seq 1 100); do echo -n "100MB.pdf "; done) cat output 10_GB.pdf