How can I convert a Word document to PDF?

Question

How can I convert a Word document to PDF when the document contains various elements, such as tables? When I try iText, the converted PDF looks different from the original document. Is there an open source API/library (rather than calling out to an executable) that I can use?

User · Answer

You can use the Cloudmersive native Java library. It is free for up to 50,000 conversions/month and, in my experience, gives much higher fidelity than iText or Apache POI-based methods. The documents actually look the same as they do in Microsoft Word, which for me is the key. Incidentally, it can also convert XLSX, PPTX, and the legacy DOC, XLS and PPT formats to PDF.

Here is what the code looks like. First, add your imports:

import java.io.File; // needed for the input file used below
import com.cloudmersive.client.invoker.ApiClient;
import com.cloudmersive.client.invoker.ApiException;
import com.cloudmersive.client.invoker.Configuration;
import com.cloudmersive.client.invoker.auth.*;
import com.cloudmersive.client.ConvertDocumentApi;

Then convert a file:

ApiClient defaultClient = Configuration.getDefaultApiClient();

// Configure API key authorization: Apikey
ApiKeyAuth Apikey = (ApiKeyAuth) defaultClient.getAuthentication("Apikey");
Apikey.setApiKey("YOUR API KEY");

ConvertDocumentApi apiInstance = new ConvertDocumentApi();
File inputFile = new File("/path/to/input.docx"); // File to perform the operation on.
try {
  byte[] result = apiInstance.convertDocumentDocxToPdf(inputFile);
  System.out.println(result);
} catch (ApiException e) {
  System.err.println("Exception when calling ConvertDocumentApi#convertDocumentDocxToPdf");
  e.printStackTrace();
}

You can get a document conversion API key for free from the portal.
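The snippet above prints `result`, which is a byte array, so the console shows only an array reference rather than the PDF. A minimal sketch of saving those bytes to disk with the standard library follows; the class name `SavePdfBytes`, the helper `save`, the output name `output.pdf` and the placeholder bytes are all assumptions made for illustration and are not part of the Cloudmersive client.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class SavePdfBytes {

    // Writes the converted bytes to the target path, creating the file
    // or overwriting it if it already exists.
    static Path save(byte[] pdfBytes, Path target) throws IOException {
        return Files.write(target, pdfBytes);
    }

    public static void main(String[] args) throws IOException {
        // Placeholder standing in for the byte[] returned by the conversion call.
        byte[] result = "%PDF-1.4 placeholder".getBytes(StandardCharsets.US_ASCII);

        // Hypothetical output location; adjust to your environment.
        Path out = save(result, Paths.get("output.pdf"));
        System.out.println("Wrote " + Files.size(out) + " bytes to " + out);
    }
}
```

In the answer's code, the call to `save` would go inside the `try` block in place of the `System.out.println(result)` line, with `IOException` handled alongside `ApiException`.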

User · Answer

It's already 2019, and I can't believe there is still no easy, convenient way to convert the most popular Microsoft Word document format to Adobe PDF in the Java world.

I tried almost every method mentioned in the answers above, and the best (and only) way I found to satisfy my requirements is to use OpenOffice or LibreOffice. I don't actually know the exact difference between them; both seem to provide the soffice command line.

My requirements are:

1. It must run on Linux (more specifically CentOS), not on Windows, so we cannot install Microsoft Office on it.
2. It must support Chinese characters, so ISO-8859-1 character encoding is not an option; it must support Unicode.

The first thing that came to mind was doc-to-pdf-converter, but it lacks maintenance: the last update was 4 years ago, and I will not use a solution that nobody maintains. Xdocreport seems a promising choice, but it can only convert docx, not binary doc files, which is mandatory for me. Using Java to call the OpenOffice API seems good, but it is too complicated for such a simple requirement.

Finally I found the best solution: use the OpenOffice command line to do the job:

Runtime.getRuntime().exec("soffice --convert-to pdf -outdir . /path/some.doc");

I always believe the shortest code is the best code (of course, it should be understandable). That's it.
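A slightly more defensive way to run the same command is sketched below. It is only an illustration under a few assumptions: `soffice` is on the PATH, the flags are spelled `--headless` and `--outdir` as in LibreOffice's command-line help (the answer writes `-outdir`), and the class name `SofficeConverter`, its method names and the 120-second timeout are all made up for this example. Passing the arguments as a list avoids trouble with spaces in paths, and waiting on the process lets the caller find out whether the conversion succeeded.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.TimeUnit;

public class SofficeConverter {

    // Builds the argument list for soffice; each argument is a separate
    // element so that paths containing spaces are passed through intact.
    static List<String> buildCommand(Path input, Path outDir) {
        return Arrays.asList("soffice", "--headless", "--convert-to", "pdf",
                "--outdir", outDir.toString(), input.toString());
    }

    // Runs soffice and returns true only if it exits with code 0 in time.
    static boolean convert(Path input, Path outDir, long timeoutSeconds)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(buildCommand(input, outDir))
                .inheritIO() // forward soffice's output to this process's console
                .start();
        if (!process.waitFor(timeoutSeconds, TimeUnit.SECONDS)) {
            process.destroyForcibly(); // give up on a hung conversion
            return false;
        }
        return process.exitValue() == 0;
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 2) {
            System.err.println("usage: SofficeConverter <input.doc> <output-dir>");
            return;
        }
        boolean ok = convert(Paths.get(args[0]), Paths.get(args[1]), 120);
        System.out.println(ok ? "Conversion finished" : "Conversion failed");
    }
}
```

The PDF is written into the output directory under the input file's base name, which is how `--convert-to` behaves when no output file name is given.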

User · Answer

Check out docs-to-pdf-converter on GitHub. It's a lightweight solution designed specifically for converting documents to PDF.

Why?

I wanted a simple program that can convert Microsoft Office documents to PDF but without dependencies like LibreOffice or expensive proprietary solutions. Seeing as how code and libraries to convert each individual format is scattered around the web, I decided to combine all those solutions into one single program. Along the way, I decided to add ODT support as well since I encountered the code too.

User · Answer

Docx4j is open source and the best API for converting Docx to PDF without any alignment or font issues.

Maven dependencies:

<dependency>
    <groupId>org.docx4j</groupId>
    <artifactId>docx4j-JAXB-Internal</artifactId>
    <version>8.0.0</version>
</dependency>
<dependency>
    <groupId>org.docx4j</groupId>
    <artifactId>docx4j-JAXB-ReferenceImpl</artifactId>
    <version>8.0.0</version>
</dependency>
<dependency>
    <groupId>org.docx4j</groupId>
    <artifactId>docx4j-JAXB-MOXy</artifactId>
    <version>8.0.0</version>
</dependency>
<dependency>
    <groupId>org.docx4j</groupId>
    <artifactId>docx4j-export-fo</artifactId>
    <version>8.0.0</version>
</dependency>

Code:

import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.InputStream;

import org.docx4j.Docx4J;
import org.docx4j.openpackaging.packages.WordprocessingMLPackage;
import org.docx4j.openpackaging.parts.WordprocessingML.MainDocumentPart;

public class DocToPDF {

    public static void main(String[] args) {
        try {
            // Load the source document
            InputStream templateInputStream = new FileInputStream("D:\\Workspace\\New\\Sample.docx");
            WordprocessingMLPackage wordMLPackage = WordprocessingMLPackage.load(templateInputStream);
            MainDocumentPart documentPart = wordMLPackage.getMainDocumentPart();

            // Write the PDF
            String outputfilepath = "D:\\Workspace\\New\\Sample.pdf";
            FileOutputStream os = new FileOutputStream(outputfilepath);
            Docx4J.toPDF(wordMLPackage, os);
            os.flush();
            os.close();
        } catch (Throwable e) {
            e.printStackTrace();
        }
    }
}

User · Answer

Using JACOB to call Office Word is a 100% perfect solution, but it is only supported on the Windows platform because it needs Office Word installed.

1. Download the JACOB archive (the latest version is 1.19).
2. Add jacob.jar to your project classpath.
3. Add jacob-1.19-x86.dll or jacob-1.19-x64.dll (depending on whether your JDK is 32-bit or 64-bit) to the Java\jdk1.x.x_xxx\jre\bin directory of your JDK installation.
4. Use the JACOB API to call Office Word to convert doc/docx to pdf:

public void convertDocx2pdf(String docxFilePath) {
    File docxFile = new File(docxFilePath);
    String pdfFile = docxFilePath.substring(0, docxFilePath.lastIndexOf(".docx")) + ".pdf";

    if (docxFile.exists()) {
        if (!docxFile.isDirectory()) {
            ActiveXComponent app = null;
            long start = System.currentTimeMillis();
            try {
                ComThread.InitMTA(true);
                app = new ActiveXComponent("Word.Application");
                Dispatch documents = app.getProperty("Documents").toDispatch();
                Dispatch document = Dispatch.call(documents, "Open", docxFilePath, false, true).toDispatch();
                File target = new File(pdfFile);
                if (target.exists()) {
                    target.delete();
                }
                // 17 is Word's wdFormatPDF save format
                Dispatch.call(document, "SaveAs", pdfFile, 17);
                Dispatch.call(document, "Close", false);
                long end = System.currentTimeMillis();
                logger.info("Convert Finished: " + (end - start) + "ms");
            } catch (Exception e) {
                logger.error(e.getLocalizedMessage(), e);
                throw new RuntimeException("pdf convert failed.");
            } finally {
                if (app != null) {
                    app.invoke("Quit", new Variant[] {});
                }
                ComThread.Release();
            }
        }
    }
}

User · Answer

I agree with the posters listing OpenOffice as a high-fidelity import/export facility for Word/PDF docs with a Java API, and it also works across platforms. The OpenOffice import/export filters are pretty powerful and preserve most formatting during conversion to various formats, including PDF.

Docmosis and JODReports add value by making life easier than learning the OpenOffice API directly, which can be challenging because of the style of the UNO API and the crash-related bugs.

User · Answer

This is quite a hard task, even harder if you want perfect results (impossible without using Word). As such, the number of APIs that just do it all for you in pure Java and are open source is zero, I believe. (Update: I am wrong, see below.)

Your basic options are as follows:

1. Using JNI, a C# web service, etc., script MS Office (the only option for 100% perfect results).
2. Using the available APIs, script Open Office (90%+ perfect).
3. Use Apache POI & iText (very large job, will never be perfect).

Update - 2016-02-11

Here is a cut-down copy of my blog post on this subject, which outlines existing products that support Word-to-PDF in Java.

Converting Microsoft Office (Word, Excel) documents to PDFs in Java

Three products that I know of can render Office documents:

yeokm1/docs-to-pdf-converter
Irregularly maintained, Pure Java, Open Source
Ties together a number of libraries to perform the conversion.

xdocreport
Actively developed, Pure Java, Open Source
It's a Java API to merge XML documents created with MS Office (docx) or OpenOffice (odt), LibreOffice (odt) with a Java model to generate reports and convert them, if you need, to another format (PDF, XHTML...).

Snowbound Imaging SDK
Closed Source, Pure Java
Snowbound appears to be a 100% Java solution and costs over $2,500. It contains samples describing how to convert documents in the evaluation download.

OpenOffice API
Open Source, Not Pure Java - Requires Open Office installed
OpenOffice is a native Office suite which supports a Java API. This supports reading Office documents and writing PDF documents. The SDK contains an example in document conversion (examples/java/DocumentHandling/DocumentConverter.java). To write PDFs you need to pass the "writer_pdf_Export" writer rather than the "MS Word 97" one. Or you can use the wrapper API JODConverter.

JDocToPdf - Dead as of 2016-02-11
Uses Apache POI to read the Word document and iText to write the PDF. Completely free, 100% Java, but has some limitations.

User · Answer

Spire.Doc for Java is a professional Java API that enables Java applications to create, convert, manipulate and print Word documents without using Microsoft Office. You can easily convert Word to PDF with several lines of code, as follows:

import com.spire.doc.Document;
import com.spire.doc.FileFormat;
import com.spire.doc.ToPdfParameterList;

public class WordToPDF {
    public static void main(String[] args) {

        // Create Document object
        Document doc = new Document();

        // Load the file from disk.
        doc.loadFromFile("Sample.docx");

        // Create an instance of ToPdfParameterList.
        ToPdfParameterList ppl = new ToPdfParameterList();

        // Embeds full fonts by default when IsEmbeddedAllFonts is set to true.
        ppl.isEmbeddedAllFonts(true);

        // Set setDisableLink to true to remove the hyperlink effect for the result PDF page.
        // Set setDisableLink to false to preserve the hyperlink effect for the result PDF page.
        ppl.setDisableLink(true);

        // Set the output image quality as 40% of the original image. 80% is the default setting.
        doc.setJPEGQuality(40);

        // Save to file.
        doc.saveToFile("output/ToPDF.pdf", FileFormat.PDF);
    }
}

After running the code snippet above, all formatting of the original Word document is copied into the PDF perfectly.

User · Answer

unoconv is a Python tool that works on UNIX. I use Java to invoke it through the shell on UNIX, and it works perfectly for me. My source code: UnoconvTool.java. Both JODConverter and unoconv are said to use OpenOffice/LibreOffice.

docx4j/docxreport, POI and PDFBox are good, but they are missing some formats in conversion.

User · Answer

You can use JODConverter for this purpose. It can be used to convert documents between different office formats, such as:

1. Microsoft Office to OpenDocument, and vice versa
2. Any format to PDF

It supports many more conversions as well. It can also convert MS Office 2007 documents to PDF, with almost all formats.

More details about it can be found here: http://www.artofsolving.com/opensource/jodconverter
