[node.js] HTML to PDF with Node.js

I'm looking to create a printable PDF version of my website's pages. Something like express.render(), except that it renders the page as a PDF.

Does anyone know a Node module that does that?

If not, how would you go about implementing one? I've seen some methods that use a headless browser like PhantomJS, but I'm not sure what the flow is.

This question is related to node.js express pdf-generation

The answer is


You can also use the pdf-creator-node package.

Package URL - https://www.npmjs.com/package/pdf-creator-node


For those who don't want to install PhantomJS along with an instance of Chrome/Firefox on their server, or who are put off because the PhantomJS project is currently suspended, here's an alternative.

You can externalize the conversion to an API. Many exist, and they vary, but what you get is a reliable service with up-to-date features (I'm thinking of CSS3, web fonts, SVG and Canvas support).

For instance, with PDFShift (disclaimer: I'm the founder), you can do this with the request package:

const request = require('request');

request.post(
    'https://api.pdfshift.io/v2/convert/',
    {
        'auth': {'user': 'your_api_key'},
        'json': {'source': 'https://www.google.com'},
        'encoding': null  // return `body` as a Buffer instead of a string
    },
    (error, response, body) => {
        if (error || response === undefined) {
            // Network failure, or no response from the server
            return console.error('Invalid response from the server.', error);
        }
        if (response.statusCode === 200) {
            // Do what you want with `body`, which contains the binary PDF,
            // like returning it to the client, or saving it locally or on AWS S3
            return;
        }

        // Handle any other errors that might have occurred
        console.error('Conversion failed with status', response.statusCode);
    }
);

In addition to @Jozzhart's answer, you can create a local HTML file, serve it with Express, and use phantom to make a PDF from it; something like this:

const exp = require('express');
const app = exp();
const pth = require("path");
const phantom = require('phantom');
const ip = require("ip");

const PORT = 3000;
const PDF_SOURCE = "index"; //index.html
const PDF_OUTPUT = "out"; //out.pdf

const source = pth.join(__dirname, "", `${PDF_SOURCE}.html`);
const output = pth.join(__dirname, "", `${PDF_OUTPUT}.pdf`);

app.use("/" + PDF_SOURCE, exp.static(source));
app.use("/" + PDF_OUTPUT, exp.static(output));

app.listen(PORT);

// Render the locally served page to a PDF file with PhantomJS
const makePDF = async () => {
    const local = `http://${ip.address()}:${PORT}/${PDF_SOURCE}`;
    const ph = await phantom.create();
    try {
        const page = await ph.createPage();
        await page.open(local);
        await page.render(output);
    } finally {
        // Always shut PhantomJS down, even if rendering fails
        await ph.exit();
    }
};

makePDF()
    .then(() => {
        console.log("PDF Created From Local File");
        console.log("PDF is downloadable from link:");
        console.log(`http://${ip.address()}:${PORT}/${PDF_OUTPUT}`);
    })
    .catch((err) => console.error("PDF creation failed:", err));

and index.html can be anything:

<h1>PDF HEAD</h1>
<a href="#">LINK</a>



Create PDF from External URL

Here's an adaptation of the previous answers that uses html-pdf and combines it with requestify so that it works with an external URL:

Install your dependencies:

npm i -S html-pdf requestify

Then, create the script:

//MakePDF.js

var pdf = require('html-pdf');
var requestify = require('requestify');
var externalURL = 'http://www.google.com';

requestify.get(externalURL).then(function (response) {
   // Get the raw HTML response body
   var html = response.body;
   var config = {format: 'A4'}; // or format: 'letter' - see https://github.com/marcbachmann/node-html-pdf#options

   // Create the PDF
   pdf.create(html, config).toFile('pathtooutput/generated.pdf', function (err, res) {
      if (err) return console.log(err);
      console.log(res); // { filename: '/pathtooutput/generated.pdf' }
   });
});

Then you just run from the command line:

node MakePDF.js

Watch your beautiful, pixel-perfect PDF be created for you (for free!)


PhantomJS is a headless WebKit browser. It will load any web page and render it in memory. Although you can't see the page, there is a Screen Capture feature that lets you export the current view as PNG, PDF, JPEG or GIF. Have a look at the example in the PhantomJS documentation.


In case you arrive here looking for a way to make PDFs from view templates in Express, a colleague and I made express-template-to-pdf, which allows you to generate a PDF from whatever templates you're using in Express: Pug, Nunjucks, whatever.

It depends on html-pdf and is written to be used in your routes just like res.render:

const express = require('express')
const path = require('path')
const pdfRenderer = require('@ministryofjustice/express-template-to-pdf')

const app = express()

app.set('views', path.join(__dirname, 'views'))
app.set('view engine', 'pug')

app.use(pdfRenderer())

If you've used res.render then using it should look obvious:

app.use('/pdf', (req, res) => {
    res.renderPDF('helloWorld', { message: 'Hello World!' });
})

You can pass options through to html-pdf to control the PDF page size and so on.

Merely building on the excellent work of others.


Use html-pdf

var fs = require('fs');
var pdf = require('html-pdf');
var html = fs.readFileSync('./test/businesscard.html', 'utf8');
var options = { format: 'Letter' };

pdf.create(html, options).toFile('./businesscard.pdf', function(err, res) {
  if (err) return console.log(err);
  console.log(res); // { filename: '/app/businesscard.pdf' } 
});

Package

I used html-pdf

It's easy to use, and it lets you not only save the PDF as a file but also pipe the PDF content to a WriteStream (so I could stream my reports directly to Google Storage and save them there).
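
As a rough sketch of the streaming part: html-pdf has a toStream method next to toFile, and the stream it hands back can be piped into any writable stream. The helper name pipePdf and the default A4 option are my own choices for illustration, and html-pdf is assumed to be installed (it is required inside the function so the file loads without it).

```javascript
// Convert an HTML string to PDF and pipe the result into a writable stream.
// Resolves once the destination has received all of the data.
function pipePdf(html, destination, options = { format: 'A4' }) {
  // Assumption: `npm i html-pdf` has been run before this is called.
  const pdf = require('html-pdf');
  return new Promise((resolve, reject) => {
    pdf.create(html, options).toStream((err, stream) => {
      if (err) return reject(err);
      stream.on('error', reject);
      stream.pipe(destination).on('finish', resolve).on('error', reject);
    });
  });
}
```

For a local file the destination could be fs.createWriteStream('./report.pdf'); for cloud storage you would pass whatever writable upload stream your storage client gives you.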

Using css + images

It takes CSS into account. The only problem I faced was that it ignored my images. The solution I found was to replace the URL in the src attribute value with base64 data, e.g.

<img src="data:image/png;base64,iVBOR...kSuQmCC">

You can do this in your own code or use one of the online converters, e.g. https://www.base64-image.de/
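
Doing it in your own code needs only Node's built-in modules. Here is a minimal sketch; the function names toDataUri and fileToDataUri and the small MIME table are made up for this example, so extend them as needed.

```javascript
const fs = require('fs');
const path = require('path');

// A few common image extensions mapped to their MIME types.
const MIME_TYPES = {
  '.png': 'image/png',
  '.jpg': 'image/jpeg',
  '.jpeg': 'image/jpeg',
  '.gif': 'image/gif',
  '.svg': 'image/svg+xml',
};

// Build a data URI from raw image bytes and a MIME type.
function toDataUri(buffer, mimeType) {
  return `data:${mimeType};base64,${buffer.toString('base64')}`;
}

// Read an image from disk and return it as a data URI.
function fileToDataUri(filePath) {
  const mimeType = MIME_TYPES[path.extname(filePath).toLowerCase()];
  if (!mimeType) throw new Error(`Unsupported image type: ${filePath}`);
  return toDataUri(fs.readFileSync(filePath), mimeType);
}
```

The returned string can go straight into the src attribute of an img tag before the HTML is handed to html-pdf.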

Compile valid html code from html fragment + css

  1. I had to get a fragment of my HTML document (I just applied the .html() method to a jQuery selector).
  2. Then I read the content of the relevant CSS file.

Using these two values (stored in the variables html and css respectively), I compiled a valid HTML document using a template string:

var htmlContent = `
<!DOCTYPE html>
<html>
  <head>
    <style>
      ${css}
    </style>
  </head>
  <body id=direct-sellers-bill>
    ${html}
  </body>
</html>`

and passed it to the create method of html-pdf.


Try using Puppeteer to create a PDF from HTML.

Example from here https://github.com/chuongtrh/html_to_pdf

Or https://github.com/GoogleChrome/puppeteer
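
For a rough idea of what the Puppeteer approach looks like, here is a minimal sketch. It is not taken from either repository above: the function name htmlToPdf and the A4 / printBackground settings are my own choices, and puppeteer is assumed to be installed (it is required inside the function so the file still loads without it).

```javascript
// Render an HTML string to a PDF file using headless Chrome via Puppeteer.
async function htmlToPdf(html, outputPath) {
  // Assumption: `npm i puppeteer` has been run before this is called.
  const puppeteer = require('puppeteer');
  const browser = await puppeteer.launch();
  try {
    const page = await browser.newPage();
    // Wait until network activity settles so fonts and images are loaded.
    await page.setContent(html, { waitUntil: 'networkidle0' });
    await page.pdf({ path: outputPath, format: 'A4', printBackground: true });
  } finally {
    // Close the browser even if rendering throws.
    await browser.close();
  }
}
```

To render a live page instead of an HTML string, page.goto(url) can replace the page.setContent call.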


In my view, the best way to do this is via an API, so that you do not add a large and complex dependency to your app that runs unmanaged code and needs to be updated frequently.

Here is a simple way to do this, which is free for 800 requests/month:

var CloudmersiveConvertApiClient = require('cloudmersive-convert-api-client');
var defaultClient = CloudmersiveConvertApiClient.ApiClient.instance;

// Configure API key authorization: Apikey
var Apikey = defaultClient.authentications['Apikey'];
Apikey.apiKey = 'YOUR API KEY';



var apiInstance = new CloudmersiveConvertApiClient.ConvertWebApi();

var input = new CloudmersiveConvertApiClient.HtmlToPdfRequest(); // HtmlToPdfRequest | HTML to PDF request parameters
input.Html = "<b>Hello, world!</b>";


var callback = function(error, data, response) {
  if (error) {
    console.error(error);
  } else {
    console.log('API called successfully. Returned data: ' + data);
  }
};
apiInstance.convertWebHtmlToPdf(input, callback);

With the above approach you can also install the API on-premises or on your own infrastructure if you prefer.


The best solution I found is html-pdf. It's simple and works with large HTML documents.

https://www.npmjs.com/package/html-pdf

It's as simple as this:

    pdf.create(htm, options).toFile('./pdfname.pdf', function(err, res) {
        if (err) {
          console.log(err);
        }
    });

If you want to export HTML to PDF, you have many options, even without Node.

Option 1: Put a button on your HTML page that calls the window.print() function, and use the browser's native HTML-to-PDF support. Use media queries to make your page look good as a PDF. You can also use the beforeprint and afterprint events to make changes to the page before printing.

Option 2: html2canvas or rasterizeHTML. Convert your HTML to a canvas, call toDataURL() on the canvas object to get the image, and use a JavaScript library like jsPDF to add that image to a PDF file. The disadvantage of this approach is that the resulting PDF is not editable. If you want to extract data from the PDF, there are different ways to do that.

Option 3: @Jozzhart's answer.
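
For Option 1, wiring up the print events might look roughly like this. The helper name installPrintHooks, the 'printing' class and the #print-button selector are invented for the example; the window is passed in as a parameter only so the helper can be tried outside a browser.

```javascript
// Attach handlers that run just before and just after the page is printed.
// `win` is normally the global `window` object.
function installPrintHooks(win, onBefore, onAfter) {
  win.addEventListener('beforeprint', onBefore);
  win.addEventListener('afterprint', onAfter);
  // Return a cleanup function that detaches both handlers.
  return () => {
    win.removeEventListener('beforeprint', onBefore);
    win.removeEventListener('afterprint', onAfter);
  };
}

// In a page script (browser only):
// installPrintHooks(window,
//   () => document.body.classList.add('printing'),
//   () => document.body.classList.remove('printing'));
// document.querySelector('#print-button')
//   .addEventListener('click', () => window.print());
```

Static styling for the printed version is better handled with an @media print block in your CSS; the hooks are only for changes that need script.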

