[c#] Combine two (or more) PDF's

Background: I need to provide a weekly report package for my sales staff. This package contains several (5-10) crystal reports.

Problem: I would like to allow a user to run all reports and also just run a single report. I was thinking I could do this by creating the reports and then doing:

List<ReportClass> reports = new List<ReportClass>();
reports.Add(new WeeklyReport1());
reports.Add(new WeeklyReport2());
reports.Add(new WeeklyReport3());
<snip>

foreach (ReportClass report in reports)
{
    report.ExportToDisk(ExportFormatType.PortableDocFormat, @"c:\reports\" + report.ResourceName + ".pdf");
}

This would provide me a folder full of the reports, but I would like to email everyone a single PDF with all the weekly reports. So I need to combine them.

Is there an easy way to do this without install any more third party controls? I already have DevExpress & CrystalReports and I'd prefer not to add too many more.

Would it be best to combine them in the foreach loop or in a seperate loop? (or an alternate way)

This question is related to c# .net winforms pdf

The answer is


This is something that I figured out, and wanted to share with you, using PdfSharp.

Here you can join multiple Pdfs in one, without the need of an output directory (following the input list order)

    public static byte[] MergePdf(List<byte[]> pdfs)
    {
        List<PdfSharp.Pdf.PdfDocument> lstDocuments = new List<PdfSharp.Pdf.PdfDocument>();
        foreach (var pdf in pdfs)
        {
            lstDocuments.Add(PdfReader.Open(new MemoryStream(pdf), PdfDocumentOpenMode.Import));
        }

        using (PdfSharp.Pdf.PdfDocument outPdf = new PdfSharp.Pdf.PdfDocument())
        { 
            for(int i = 1; i<= lstDocuments.Count; i++)
            {
                foreach(PdfSharp.Pdf.PdfPage page in lstDocuments[i-1].Pages)
                {
                    outPdf.AddPage(page);
                }
            }

            MemoryStream stream = new MemoryStream();
            outPdf.Save(stream, false);
            byte[] bytes = stream.ToArray();

            return bytes;
        }           
    }

To solve a similar problem i used iTextSharp like this:

//Create the document which will contain the combined PDF's
Document document = new Document();

//Create a writer for de document
PdfCopy writer = new PdfCopy(document, new FileStream(OutPutFilePath, FileMode.Create));
if (writer == null)
{
     return;
}

//Open the document
document.Open();

//Get the files you want to combine
string[] filePaths = Directory.GetFiles(DirectoryPathWhereYouHaveYourFiles);
foreach (string filePath in filePaths)
{
     //Read the PDF file
     using (PdfReader reader = new PdfReader(vls_FilePath))
     {
         //Add the file to the combined one
         writer.AddDocument(reader);
     }
}

//Finally close the document and writer
writer.Close();
document.Close();

I combined the two above, because I needed to merge 3 pdfbytes and return a byte

internal static byte[] mergePdfs(byte[] pdf1, byte[] pdf2,byte[] pdf3)
        {
            MemoryStream outStream = new MemoryStream();
            using (Document document = new Document())
            using (PdfCopy copy = new PdfCopy(document, outStream))
            {
                document.Open();
                copy.AddDocument(new PdfReader(pdf1));
                copy.AddDocument(new PdfReader(pdf2));
                copy.AddDocument(new PdfReader(pdf3));
            }
            return outStream.ToArray();
        } 

Following method merges two pdfs( f1 and f2) using iTextSharp. The second pdf is appended after a specific index of f1.

 string f1 = "D:\\a.pdf";
 string f2 = "D:\\Iso.pdf";
 string outfile = "D:\\c.pdf";
 appendPagesFromPdf(f1, f2, outfile, 3);




  public static void appendPagesFromPdf(String f1,string f2, String destinationFile, int startingindex)
        {
            PdfReader p1 = new PdfReader(f1);
            PdfReader p2 = new PdfReader(f2);
            int l1 = p1.NumberOfPages, l2 = p2.NumberOfPages;


            //Create our destination file
            using (FileStream fs = new FileStream(destinationFile, FileMode.Create, FileAccess.Write, FileShare.None))
            {
                Document doc = new Document();

                PdfWriter w = PdfWriter.GetInstance(doc, fs);
                doc.Open();
                for (int page = 1; page <= startingindex; page++)
                {
                    doc.NewPage();
                    w.DirectContent.AddTemplate(w.GetImportedPage(p1, page), 0, 0);
                    //Used to pull individual pages from our source

                }//  copied pages from first pdf till startingIndex
                for (int i = 1; i <= l2;i++)
                {
                    doc.NewPage();
                    w.DirectContent.AddTemplate(w.GetImportedPage(p2, i), 0, 0);
                }// merges second pdf after startingIndex
                for (int i = startingindex+1; i <= l1;i++)
                {
                    doc.NewPage();
                    w.DirectContent.AddTemplate(w.GetImportedPage(p1, i), 0, 0);
                }// continuing from where we left in pdf1 

                doc.Close();
                p1.Close();
                p2.Close();

            }
        }

You could try pdf-shuffler gtk-apps.org


I used iTextsharp with c# to combine pdf files. This is the code I used.

string[] lstFiles=new string[3];
    lstFiles[0]=@"C:/pdf/1.pdf";
    lstFiles[1]=@"C:/pdf/2.pdf";
    lstFiles[2]=@"C:/pdf/3.pdf";

    PdfReader reader = null;
    Document sourceDocument = null;
    PdfCopy pdfCopyProvider = null;
    PdfImportedPage importedPage;
    string outputPdfPath=@"C:/pdf/new.pdf";


    sourceDocument = new Document();
    pdfCopyProvider = new PdfCopy(sourceDocument, new System.IO.FileStream(outputPdfPath, System.IO.FileMode.Create));

    //Open the output file
    sourceDocument.Open();

    try
    {
        //Loop through the files list
        for (int f = 0; f < lstFiles.Length-1; f++)
        {
            int pages =get_pageCcount(lstFiles[f]);

            reader = new PdfReader(lstFiles[f]);
            //Add pages of current file
            for (int i = 1; i <= pages; i++)
            {
                importedPage = pdfCopyProvider.GetImportedPage(reader, i);
                pdfCopyProvider.AddPage(importedPage);
            }

            reader.Close();
         }
        //At the end save the output file
        sourceDocument.Close();
    }
    catch (Exception ex)
    {
        throw ex;
    }


private int get_pageCcount(string file)
{
    using (StreamReader sr = new StreamReader(File.OpenRead(file)))
    {
        Regex regex = new Regex(@"/Type\s*/Page[^s]");
        MatchCollection matches = regex.Matches(sr.ReadToEnd());

        return matches.Count;
    }
}

I know a lot of people have recommended PDF Sharp, however it doesn't look like that project has been updated since june of 2008. Further, source isn't available.

Personally, I've been playing with iTextSharp which has been pretty easy to work with.


I've done this with PDFBox. I suppose it works similarly to iTextSharp.


PDFsharp seems to allow merging multiple PDF documents into one.

And the same is also possible with ITextSharp.


Following method gets a List of byte array which is PDF byte array and then returns a byte array.

using ...;
using PdfSharp.Pdf;
using PdfSharp.Pdf.IO;

public static class PdfHelper
{
    public static byte[] PdfConcat(List<byte[]> lstPdfBytes)
    {
        byte[] res;

        using (var outPdf = new PdfDocument())
        {
            foreach (var pdf in lstPdfBytes)
            {
                using (var pdfStream = new MemoryStream(pdf))
                using (var pdfDoc = PdfReader.Open(pdfStream, PdfDocumentOpenMode.Import))
                    for (var i = 0; i < pdfDoc.PageCount; i++)
                        outPdf.AddPage(pdfDoc.Pages[i]);
            }

            using (var memoryStreamOut = new MemoryStream())
            {
                outPdf.Save(memoryStreamOut, false);

                res = Stream2Bytes(memoryStreamOut);
            }
        }

        return res;
    }

    public static void DownloadAsPdfFile(string fileName, byte[] content)
    {
        var ms = new MemoryStream(content);

        HttpContext.Current.Response.Clear();
        HttpContext.Current.Response.ContentType = "application/pdf";
        HttpContext.Current.Response.AddHeader("content-disposition", $"attachment;filename={fileName}.pdf");
        HttpContext.Current.Response.Buffer = true;
        ms.WriteTo(HttpContext.Current.Response.OutputStream);
        HttpContext.Current.Response.End();
    }

    private static byte[] Stream2Bytes(Stream input)
    {
        var buffer = new byte[input.Length];
        using (var ms = new MemoryStream())
        {
            int read;
            while ((read = input.Read(buffer, 0, buffer.Length)) > 0)
                ms.Write(buffer, 0, read);

            return ms.ToArray();
        }
    }
}

So, the result of PdfHelper.PdfConcat method is passed to PdfHelper.DownloadAsPdfFile method.

PS: A NuGet package named [PdfSharp][1] need to be installed. So in the Package Manage Console window type:

Install-Package PdfSharp


Here is a link to an example using PDFSharp and ConcatenateDocuments


Combining two byte[] using iTextSharp up to version 5.x:

internal static MemoryStream mergePdfs(byte[] pdf1, byte[] pdf2)
{
    MemoryStream outStream = new MemoryStream();
    using (Document document = new Document())
    using (PdfCopy copy = new PdfCopy(document, outStream))
    {
        document.Open();
        copy.AddDocument(new PdfReader(pdf1));
        copy.AddDocument(new PdfReader(pdf2));
    }
    return outStream;
}

Instead of the byte[]'s it's possible to pass also Stream's



Here is a single function that will merge X amount of PDFs using PDFSharp

using PdfSharp;
using PdfSharp.Pdf;
using PdfSharp.Pdf.IO;

public static void MergePDFs(string targetPath, params string[] pdfs) {
    using(PdfDocument targetDoc = new PdfDocument()){
        foreach (string pdf in pdfs) {
            using (PdfDocument pdfDoc = PdfReader.Open(pdf, PdfDocumentOpenMode.Import)) {
                for (int i = 0; i < pdfDoc.PageCount; i++) {
                    targetDoc.AddPage(pdfDoc.Pages[i]);
                }
            }
        }
        targetDoc.Save(targetPath);
    }
}

There's some good answers here already, but I thought I might mention that pdftk might be useful for this task. Instead of producing one PDF directly, you could produce each PDF you need and then combine them together as a post-process with pdftk. This could even be done from within your program using a system() or ShellExecute() call.


Here is a example using iTextSharp

public static void MergePdf(Stream outputPdfStream, IEnumerable<string> pdfFilePaths)
{
    using (var document = new Document())
    using (var pdfCopy = new PdfCopy(document, outputPdfStream))
    {
        pdfCopy.CloseStream = false;
        try
        {
            document.Open();
            foreach (var pdfFilePath in pdfFilePaths)
            {
                using (var pdfReader = new PdfReader(pdfFilePath))
                {
                    pdfCopy.AddDocument(pdfReader);
                    pdfReader.Close();
                }
            }
        }
        finally
        {
            document?.Close();
        }
    }
}

The PdfReader constructor has many overloads. It's possible to replace the parameter type IEnumerable<string> with IEnumerable<Stream> and it should work as well. Please notice that the method does not close the OutputStream, it delegates that task to the Stream creator.


Examples related to c#

How can I convert this one line of ActionScript to C#? Microsoft Advertising SDK doesn't deliverer ads How to use a global array in C#? How to correctly write async method? C# - insert values from file into two arrays Uploading into folder in FTP? Are these methods thread safe? dotnet ef not found in .NET Core 3 HTTP Error 500.30 - ANCM In-Process Start Failure Best way to "push" into C# array

Examples related to .net

You must add a reference to assembly 'netstandard, Version=2.0.0.0 How to use Bootstrap 4 in ASP.NET Core No authenticationScheme was specified, and there was no DefaultChallengeScheme found with default authentification and custom authorization .net Core 2.0 - Package was restored using .NetFramework 4.6.1 instead of target framework .netCore 2.0. The package may not be fully compatible Update .NET web service to use TLS 1.2 EF Core add-migration Build Failed What is the difference between .NET Core and .NET Standard Class Library project types? Visual Studio 2017 - Could not load file or assembly 'System.Runtime, Version=4.1.0.0' or one of its dependencies Nuget connection attempt failed "Unable to load the service index for source" Token based authentication in Web API without any user interface

Examples related to winforms

How to set combobox default value? Get the cell value of a GridView row Getting the first and last day of a month, using a given DateTime object Check if a record exists in the database Delete a row in DataGridView Control in VB.NET How to make picturebox transparent? Set default format of datetimepicker as dd-MM-yyyy Changing datagridview cell color based on condition C# Inserting Data from a form into an access Database How to use ConfigurationManager

Examples related to pdf

ImageMagick security policy 'PDF' blocking conversion How to extract table as text from the PDF using Python? Extract a page from a pdf as a jpeg How can I read pdf in python? Generating a PDF file from React Components Extract Data from PDF and Add to Worksheet How to extract text from a PDF file? How to download PDF automatically using js? Download pdf file using jquery ajax Generate PDF from HTML using pdfMake in Angularjs