[testing] How to create large PDF files (10MB, 50MB, 100MB, 200MB, 500MB, 1GB, etc.) for testing purposes?

I tried the following to create a 100MB file, but it very quickly ran out of RAM:

for ((i=1; i<=10; i++)); do convert 100MB.pdf 10MB.pdf 100MB.pdf; done

Any ideas?

Tags: testing, pdf, qa

Answers:


The simplest tool: use pdftk (or pdftk.exe if you are on Windows):

pdftk 10_MB.pdf 100_MB.pdf cat output 110_MB.pdf

This will be a valid PDF. Download pdftk here.

Update: if you want really large (and valid!), non-optimized PDFs, use this command:

pdftk 100MB.pdf 100MB.pdf 100MB.pdf 100MB.pdf 100MB.pdf cat output 500_MB.pdf

or even (if you are on Linux, Unix or Mac OS X):

pdftk $(for i in $(seq 1 100); do echo -n "100MB.pdf "; done) cat output 10_GB.pdf
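
Where the shell substitution above is not available (for example in a plain Windows command prompt), the same argument list can be built in a small script. This is only a sketch in Python and not part of the original answer: the function names build_pdftk_cat_command and repeat_pdf are made up for illustration, and it assumes pdftk is installed and on PATH.

```python
import shutil
import subprocess


def build_pdftk_cat_command(source_pdf, copies, output_pdf):
    """Return the argument list for: pdftk SRC SRC ... cat output OUT."""
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return ["pdftk"] + [source_pdf] * copies + ["cat", "output", output_pdf]


def repeat_pdf(source_pdf, copies, output_pdf):
    """Run pdftk with the repeated input list (assumes pdftk is on PATH)."""
    if shutil.which("pdftk") is None:
        raise RuntimeError("pdftk was not found on PATH")
    command = build_pdftk_cat_command(source_pdf, copies, output_pdf)
    subprocess.run(command, check=True)


if __name__ == "__main__":
    # Only prints the command; call repeat_pdf(...) to actually run it.
    print(" ".join(build_pdftk_cat_command("100MB.pdf", 3, "300_MB.pdf")))
```

Passing the arguments as a list avoids quoting problems with file names that contain spaces.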

Windows: fsutil

Usage:

fsutil file createnew [filename].[extension] [# of bytes]

Source: https://www.windows-commandline.com/how-to-create-large-dummy-file/


Linux: fallocate

Usage:

fallocate -l 10G [filename].[extension]

Source: Quickly create a large file on a Linux system?


For those using macOS, mkfile might be a good alternative to fallocate or dd:

mkfile 100m some100mfile.pdf

reference - https://stackoverflow.com/a/33478049/711401


I had problems using pdftk with the cat parameter and had better success with output.

The following command worked for me:

pdftk file_1.pdf file_1.pdf file_1.pdf file_1.pdf cat output.pdf

Using cat produced the following error:

Error: Unexpected text in page range end, here: 
    output.pdf
    Exiting.
    Acceptable keywords, for example: "even" or "odd".
    To rotate pages, use: "north" "south" "east"
        "west" "left" "right" or "down"
Errors encountered.  No output created.
Done.  Input errors, so no output created.

http://www.pdflabs.com/docs/pdftk-cli-examples/.

I created a 172MB PDF in no time at all.


According to http://www.maketecheasier.com/combine-multiple-pdf-files-with-pdftk/ the command should be:

pdftk file1.pdf file2.pdf file3.pdf cat output newfile.pdf

Note that on Windows you need to download the Windows version of pdftk.


If you want to generate a file on Windows, follow these steps:

  1. Go to the directory where you want to save the generated file
  2. Open a command prompt in that directory
  3. Run the command fsutil file createnew [filename].[extension] [# of bytes]. For example: fsutil file createnew test.pdf 999999999
  4. A file of 999,999,999 bytes (about 953 MB) will be generated

Update: The generated file will not be a valid pdf file. It just holds the given size.
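
For a platform-independent equivalent of fsutil, fallocate and mkfile, something like the following Python sketch should work. It is my own illustration, not taken from any of the answers: create_dummy_file and bytes_to_mib are hypothetical helper names, and like the commands above it only produces a file of the requested size, not a valid PDF.

```python
import os


def create_dummy_file(path, size_bytes):
    """Create a file of exactly size_bytes bytes and return its size on disk.

    The content is whatever the platform uses to extend a file (zero bytes
    on most systems), so the result is NOT a valid PDF.
    """
    if size_bytes < 0:
        raise ValueError("size_bytes must be non-negative")
    with open(path, "wb") as handle:
        handle.truncate(size_bytes)  # extends the empty file to the given size
    return os.path.getsize(path)


def bytes_to_mib(size_bytes):
    """Convert a byte count to mebibytes (1 MiB = 1024 * 1024 bytes)."""
    return size_bytes / (1024 * 1024)


if __name__ == "__main__":
    size = create_dummy_file("test.pdf", 10 * 1024 * 1024)
    print(size, "bytes =", bytes_to_mib(size), "MiB")
```

On file systems that support sparse files the file may take up little real disk space, which is worth keeping in mind if the test is about storage rather than upload size.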


If you want a really big valid PDF file, then

  1. Collect the largest valid PDFs you can find
  2. Merge them with a tool like PDF24 Creator

This worked for me: it produced a big file (140MB) after a few minutes.


Partly it depends on what you are trying to increase the size of... number of pages, number of images, size of a single image. In my experience, the vast bulk (90%+) of any given 'large' PDF file will be the images.

You could try using a pro product like Adobe InDesign to quickly build a large project and export it as a PDF.

Adobe Acrobat Pro has built-in tools to optimize PDF files -- you could try using those tools to 'un-optimize' your file. :)


One possibility, if you are familiar with the PDF format:

  1. Create a simple PDF with one page (the page should be contained within one object)
  2. Copy the object multiple times
  3. Add references to the copied objects to the page tree (the /Kids array of the Pages object) and update /Count
  4. Fix the xref table so every byte offset matches the new file

You get a valid document of any size, and the entire file will be processed by a reader.
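
The steps above can be sketched in Python. Note this is a variant, not the exact procedure from the answer: instead of editing an existing file it writes a minimal PDF 1.4 from scratch, gives every page its own content stream padded with spaces, and computes the xref offsets while writing. The name write_padded_pdf and its parameters are made up for illustration, and I have not checked the output against every reader.

```python
def write_padded_pdf(path, page_count, padding_per_page, chunk_size=1 << 20):
    """Write a minimal multi-page PDF and return the number of bytes written.

    Each page's content stream is padded with padding_per_page spaces, so the
    file grows by roughly that much per page. Data is streamed to disk in
    chunks to keep memory use low. The classic xref table uses 10-digit
    offsets, so the file must stay below 10**10 bytes.
    """
    if page_count < 1 or padding_per_page < 0:
        raise ValueError("page_count must be >= 1 and padding_per_page >= 0")

    offsets = []  # offsets[i] is the byte offset of object number i + 1
    drawing = b"0 0 m 100 100 l S\n"  # one diagonal line per page

    with open(path, "wb") as out:
        out.write(b"%PDF-1.4\n")

        def begin_object(number):
            offsets.append(out.tell())
            out.write(b"%d 0 obj\n" % number)

        # Object 1 = catalog, 2 = page tree, then (page, content) pairs.
        kids = b" ".join(b"%d 0 R" % (3 + 2 * i) for i in range(page_count))
        begin_object(1)
        out.write(b"<< /Type /Catalog /Pages 2 0 R >>\nendobj\n")
        begin_object(2)
        out.write(b"<< /Type /Pages /Kids [%s] /Count %d >>\nendobj\n"
                  % (kids, page_count))

        for i in range(page_count):
            page_number = 3 + 2 * i
            begin_object(page_number)
            out.write(b"<< /Type /Page /Parent 2 0 R /MediaBox [0 0 200 200] "
                      b"/Resources << >> /Contents %d 0 R >>\nendobj\n"
                      % (page_number + 1))
            begin_object(page_number + 1)
            out.write(b"<< /Length %d >>\nstream\n"
                      % (len(drawing) + padding_per_page))
            out.write(drawing)
            remaining = padding_per_page
            while remaining > 0:
                step = min(chunk_size, remaining)
                out.write(b" " * step)
                remaining -= step
            out.write(b"\nendstream\nendobj\n")

        # Cross-reference table: one 20-byte entry per object, plus entry 0.
        xref_start = out.tell()
        size = len(offsets) + 1
        out.write(b"xref\n0 %d\n0000000000 65535 f \n" % size)
        for offset in offsets:
            out.write(b"%010d 00000 n \n" % offset)
        out.write(b"trailer\n<< /Size %d /Root 1 0 R >>\nstartxref\n%d\n%%%%EOF\n"
                  % (size, xref_start))
        return out.tell()
```

For example, write_padded_pdf("100MB.pdf", 100, 1024 * 1024) should give a 100-page file a little over 100 MiB. Because the padding is plain whitespace it compresses extremely well, so this is less suitable if the test involves compression or transfer over a compressing link.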


Under Linux there is pdfunite (part of poppler) that can concatenate the same pdf files to get one large pdf file:

pdfunite in.pdf in.pdf in.pdf out.pdf

See the pdfunite man page for details.


Have you tried using cat to combine the files?

cat 10MB.pdf 10MB.pdf > 20MB.pdf

That should result in a 20MB file.
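
One caveat, as far as I understand the format: a byte-for-byte concatenation gives the combined size, but the result is two PDF documents back to back rather than one merged document, so whether a given reader opens it (and which pages it shows) should be checked before relying on it. The Python sketch below, with the made-up helper names concatenate_files and count_occurrences, does the same concatenation and counts the header and end-of-file markers to make that visible.

```python
import mmap
import os
import shutil


def concatenate_files(input_paths, output_path):
    """Byte-for-byte concatenation, like `cat a b > out`; returns the new size."""
    with open(output_path, "wb") as out:
        for path in input_paths:
            with open(path, "rb") as source:
                shutil.copyfileobj(source, out)
    return os.path.getsize(output_path)


def count_occurrences(path, marker):
    """Count non-overlapping occurrences of marker in a non-empty file.

    Uses mmap so large files are not loaded into memory all at once.
    """
    count = 0
    with open(path, "rb") as handle:
        with mmap.mmap(handle.fileno(), 0, access=mmap.ACCESS_READ) as view:
            position = view.find(marker)
            while position != -1:
                count += 1
                position = view.find(marker, position + len(marker))
    return count
```

If count_occurrences("20MB.pdf", b"%PDF-") reports more than one header for a file built this way, the inputs were stacked rather than merged. A count above one is not proof of a problem on its own, though, since incrementally updated PDFs can legitimately contain several %%EOF markers.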

