Add text to Existing PDF using Python

Question

I need to add some extra text to an existing PDF using Python. What is the best way to go about this, and what extra modules will I need to install?

Note: Ideally I would like to be able to run this on both Windows and Linux, but at a push Linux only will do.

Edit: pyPDF and ReportLab look good, but neither one will allow me to edit an existing PDF. Are there any other options?

User · Answer

If you're on Windows, this might work: PDF Creator Pilot.

There's also a whitepaper on a PDF creation and editing framework in Python. It's a little dated, but it may give you some useful info: Using Python as PDF Editing and Processing Framework.

User · Answer

I know this is an older post, but I spent a long time trying to find a solution. I came across a decent one using only ReportLab and PyPDF, so I thought I'd share:

1. Read your PDF using PdfFileReader(); we'll call this input.
2. Create a new PDF containing your text to add using ReportLab, and save this as a string object.
3. Read the string object using PdfFileReader(); we'll call this text.
4. Create a new PDF object using PdfFileWriter(); we'll call this output.
5. Iterate through input and apply .mergePage(text.getPage(0)) for each page you want the text added to, then use output.addPage() to add the modified pages to a new document.

This works well for simple text additions. See PyPDF's sample for watermarking a document.

Here is some code to answer the question below:

    import StringIO
    from pyPdf import PdfFileReader
    from reportlab.pdfgen import canvas
    from reportlab.lib.pagesizes import letter

    packet = StringIO.StringIO()
    can = canvas.Canvas(packet, pagesize=letter)
    # ... do something with canvas ...
    can.save()
    packet.seek(0)
    input = PdfFileReader(packet)

From here you can merge the pages of the input file with another document.
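The five steps above can be sketched as a single function. This is only an illustration, not the answerer's code: it assumes the current pypdf package, where PdfReader, PdfWriter, merge_page and add_page take the place of the older PdfFileReader, PdfFileWriter, mergePage and addPage names, and the function name overlay_text and its parameters are made up for this example. It also assumes each page's media box starts at the origin and that the page is not rotated.

```python
import io


def overlay_text(src_path, dst_path, text, x=72, y=72, pages=None):
    """Stamp `text` at (x, y) points on selected pages of src_path; write dst_path.

    `pages` is an iterable of zero-based page indices, or None for every page.
    """
    # Third-party imports are kept local so the module loads without them.
    from pypdf import PdfReader, PdfWriter
    from reportlab.pdfgen import canvas

    reader = PdfReader(src_path)  # step 1: the existing PDF ("input")
    wanted = set(range(len(reader.pages))) if pages is None else set(pages)
    writer = PdfWriter()  # step 4: the new document ("output")

    for index, page in enumerate(reader.pages):
        if index in wanted:
            # Steps 2-3: draw the text on an in-memory one-page PDF
            # that has the same size as the page being stamped.
            width = float(page.mediabox.width)
            height = float(page.mediabox.height)
            packet = io.BytesIO()
            can = canvas.Canvas(packet, pagesize=(width, height))
            can.drawString(x, y, text)
            can.save()
            packet.seek(0)
            overlay = PdfReader(packet).pages[0]
            # Step 5: merge the overlay onto the existing page.
            page.merge_page(overlay)
        writer.add_page(page)

    with open(dst_path, "wb") as out:
        writer.write(out)
```

A call such as overlay_text("original.pdf", "destination.pdf", "Hello world", x=10, y=100, pages=[0]) would stamp only the first page and copy the remaining pages unchanged.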

User · Answer

Example for [Python 2.7]:

    from pyPdf import PdfFileWriter, PdfFileReader
    import StringIO
    from reportlab.pdfgen import canvas
    from reportlab.lib.pagesizes import letter

    packet = StringIO.StringIO()
    # create a new PDF with Reportlab
    can = canvas.Canvas(packet, pagesize=letter)
    can.drawString(10, 100, "Hello world")
    can.save()

    # move to the beginning of the StringIO buffer
    packet.seek(0)
    new_pdf = PdfFileReader(packet)
    # read your existing PDF
    existing_pdf = PdfFileReader(file("original.pdf", "rb"))
    output = PdfFileWriter()
    # add the "watermark" (which is the new pdf) on the existing page
    page = existing_pdf.getPage(0)
    page.mergePage(new_pdf.getPage(0))
    output.addPage(page)
    # finally, write "output" to a real file
    outputStream = file("destination.pdf", "wb")
    output.write(outputStream)
    outputStream.close()

Example for Python 3.x:

    from PyPDF2 import PdfFileWriter, PdfFileReader
    import io
    from reportlab.pdfgen import canvas
    from reportlab.lib.pagesizes import letter

    packet = io.BytesIO()
    # create a new PDF with Reportlab
    can = canvas.Canvas(packet, pagesize=letter)
    can.drawString(10, 100, "Hello world")
    can.save()

    # move to the beginning of the BytesIO buffer
    packet.seek(0)
    new_pdf = PdfFileReader(packet)
    # read your existing PDF
    existing_pdf = PdfFileReader(open("original.pdf", "rb"))
    output = PdfFileWriter()
    # add the "watermark" (which is the new pdf) on the existing page
    page = existing_pdf.getPage(0)
    page.mergePage(new_pdf.getPage(0))
    output.addPage(page)
    # finally, write "output" to a real file
    outputStream = open("destination.pdf", "wb")
    output.write(outputStream)
    outputStream.close()

User · Answer

Leveraging David Dehghan's answer above, the following works in Python 2.7.13:

    from PyPDF2 import PdfFileWriter, PdfFileReader, PdfFileMerger

    import StringIO

    from reportlab.pdfgen import canvas
    from reportlab.lib.pagesizes import letter

    packet = StringIO.StringIO()
    # create a new PDF with Reportlab
    can = canvas.Canvas(packet, pagesize=letter)
    can.drawString(290, 720, "Hello world")
    can.save()

    # move to the beginning of the StringIO buffer
    packet.seek(0)
    new_pdf = PdfFileReader(packet)
    # read your existing PDF
    existing_pdf = PdfFileReader("original.pdf")
    output = PdfFileWriter()
    # add the "watermark" (which is the new pdf) on the existing page
    page = existing_pdf.getPage(0)
    page.mergePage(new_pdf.getPage(0))
    output.addPage(page)
    # finally, write "output" to a real file
    outputStream = open("destination.pdf", "wb")
    output.write(outputStream)
    outputStream.close()
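The answers in this thread pass different coordinates to drawString (10, 100 in one, 290, 720 in another). ReportLab measures positions in points (1/72 inch) from the bottom-left corner of the page, and a letter page is 612 by 792 points, so 290, 720 lands near the top centre. The helper below is a hypothetical convenience for thinking in inches from the top-left corner instead; from_top_left is not part of ReportLab, and the sketch assumes an unrotated page.

```python
POINTS_PER_INCH = 72
LETTER = (612, 792)  # same dimensions as reportlab.lib.pagesizes.letter


def from_top_left(inches_from_left, inches_from_top, pagesize=LETTER):
    """Convert an offset in inches from the top-left corner to (x, y) points.

    The result uses the bottom-left origin that canvas.drawString expects.
    """
    _, page_height = pagesize
    x = inches_from_left * POINTS_PER_INCH
    y = page_height - inches_from_top * POINTS_PER_INCH
    return x, y


# One inch in from the left edge and one inch down from the top of a letter page.
x, y = from_top_left(1, 1)
print(x, y)  # 72 720
```

The returned pair can be passed straight to can.drawString(x, y, "Hello world").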

User · Answer

pdfrw will let you read in pages from an existing PDF and draw them to a reportlab canvas (similar to drawing an image). There are examples for this in the pdfrw examples/rl1 subdirectory on GitHub.

Disclaimer: I am the pdfrw author.

User · Answer

cpdf will do the job from the command line. It isn't Python, though (afaik):

    cpdf -add-text "Line of text" input.pdf -o output.pdf
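If the rest of the workflow is in Python, the same command can be driven through subprocess. This is a minimal sketch that assumes the cpdf binary is installed and on the PATH; the function names are illustrative, and only the -add-text and -o arguments shown above are used.

```python
import subprocess


def cpdf_add_text_command(text, input_pdf, output_pdf, cpdf="cpdf"):
    """Build the argument list for the cpdf invocation shown above."""
    return [cpdf, "-add-text", text, input_pdf, "-o", output_pdf]


def add_text_with_cpdf(text, input_pdf, output_pdf):
    """Run cpdf; raises CalledProcessError if it exits with a non-zero status."""
    subprocess.run(cpdf_add_text_command(text, input_pdf, output_pdf), check=True)
```

Passing the arguments as a list avoids shell quoting problems when the text contains spaces. If cpdf is not installed, subprocess.run raises FileNotFoundError.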

User · Answer

You may have better luck breaking the problem down into converting the PDF into an editable format, writing your changes, then converting it back into PDF. I don't know of a library that lets you directly edit PDF, but there are plenty of converters between DOC and PDF, for example.
