Like just about every C# programmer or .NET developer, you’ll eventually need to work with PDF files, and you may be tempted to build your own PDF library. The process is similar for everyone: search the web for an existing library, try to mimic its features and advantages, reimplement them in your own code, and decide whether you want to contribute anything back.
Most of the time, though, you end up maintaining your own PDF library, adding new features and fixing bugs yourself. In this article, we explain the benefits of contributing to existing projects instead of writing your own C# PDF library from scratch, and which open-source alternatives are currently available.
Converting a PDF File to a String
Before getting into the tools you can use for development, let’s take a look at the first feature any PDF library should have. If you want to create a PDF file from scratch, you will need to deal with images and other data sources within your document.
To work with these images, text, or forms, you will need to convert them to and from their corresponding formats (for example PNG, HTML, or ODT files).
In C#, this is not something the System.Drawing namespace can do for you: it covers images and drawing, but it cannot open a PDF file. Getting a string representation of an existing PDF takes a PDF library, such as the PdfReader class from iTextSharp.
It’s a crucial step for any developer who wants to create their own C# PDF library. The following snippet, which assumes iTextSharp 5.x, shows how easy it is to extract the text of a page as a string:
using iTextSharp.text.pdf; using iTextSharp.text.pdf.parser;
var reader = new PdfReader("Input.pdf"); // open the existing PDF
string pageText = PdfTextExtractor.GetTextFromPage(reader, 1); // pages are numbered from 1
reader.Close();
As you can see, in your C# PDF library you will need a PdfReader instance to read existing documents and a PdfWriter instance to write new ones. System.Drawing only comes into play when you load or manipulate the images you want to embed.
Other than that, you will need to use the System.IO namespace in order to open or save files on disk.
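As a minimal sketch of the System.IO side, the class below opens a file, checks whether it begins with the "%PDF-" marker that PDF files start with, and saves a copy to disk. The names PdfFileIo, LooksLikePdf, and SaveCopy are hypothetical and chosen only for illustration; nothing here depends on a PDF library.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper class, for illustration only.
public static class PdfFileIo
{
    // Returns true when the stream begins with the "%PDF-" marker.
    public static bool LooksLikePdf(Stream stream)
    {
        byte[] marker = Encoding.ASCII.GetBytes("%PDF-");
        byte[] head = new byte[marker.Length];
        int read = 0;
        while (read < head.Length)
        {
            int n = stream.Read(head, read, head.Length - read);
            if (n == 0) return false; // file is shorter than the marker
            read += n;
        }
        for (int i = 0; i < marker.Length; i++)
        {
            if (head[i] != marker[i]) return false;
        }
        return true;
    }

    // Opens the file at the given path and checks its first bytes.
    public static bool LooksLikePdf(string path)
    {
        using (FileStream stream = File.OpenRead(path))
        {
            return LooksLikePdf(stream);
        }
    }

    // Reads the whole source file and writes it to the target path.
    public static void SaveCopy(string sourcePath, string targetPath)
    {
        File.WriteAllBytes(targetPath, File.ReadAllBytes(sourcePath));
    }
}
```

A check like this only inspects the first five bytes, so it is a quick sanity test before handing a file to a real parser, not a validation of the document.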
Depending on your needs and the features of your library, you may want to create your own writer class (called PdfWriter, for example) with the properties and methods needed to convert images, text, and other content into their corresponding formats (PNG, HTML, or ODT files).
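To give a sense of what even a tiny hand-written writer involves, here is a rough sketch that produces a one-page PDF containing a single line of text, using only System.IO. The class is hypothetical and named MinimalPdfWriter so it does not clash with iTextSharp’s PdfWriter. It assumes plain ASCII text, a US Letter page, and the built-in Helvetica font; a real library handles far more (fonts, images, compression, encodings).

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

// Hypothetical class, for illustration only.
public static class MinimalPdfWriter
{
    // Builds a one-page PDF showing a single line of ASCII text.
    public static byte[] Build(string text)
    {
        // Escape the characters that are special inside a PDF literal string.
        string escaped = text.Replace("\\", "\\\\").Replace("(", "\\(").Replace(")", "\\)");
        string content = "BT /F1 24 Tf 72 720 Td (" + escaped + ") Tj ET";

        // Objects 1..5: catalog, page tree, page, content stream, font.
        string[] objects =
        {
            "<< /Type /Catalog /Pages 2 0 R >>",
            "<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
            "<< /Type /Page /Parent 2 0 R /MediaBox [0 0 612 792] " +
                "/Contents 4 0 R /Resources << /Font << /F1 5 0 R >> >> >>",
            "<< /Length " + content.Length + " >>\nstream\n" + content + "\nendstream",
            "<< /Type /Font /Subtype /Type1 /BaseFont /Helvetica >>"
        };

        var output = new MemoryStream();
        var offsets = new List<long>();
        Write(output, "%PDF-1.4\n");
        for (int i = 0; i < objects.Length; i++)
        {
            offsets.Add(output.Position); // byte offset needed by the xref table
            Write(output, (i + 1) + " 0 obj\n" + objects[i] + "\nendobj\n");
        }

        // Cross-reference table: every entry must be exactly 20 bytes long.
        long xrefStart = output.Position;
        Write(output, "xref\n0 " + (objects.Length + 1) + "\n");
        Write(output, "0000000000 65535 f \n");
        foreach (long offset in offsets)
        {
            Write(output, offset.ToString("D10") + " 00000 n \n");
        }
        Write(output, "trailer\n<< /Size " + (objects.Length + 1) + " /Root 1 0 R >>\n");
        Write(output, "startxref\n" + xrefStart + "\n%%EOF\n");
        return output.ToArray();
    }

    private static void Write(Stream stream, string text)
    {
        byte[] bytes = Encoding.ASCII.GetBytes(text);
        stream.Write(bytes, 0, bytes.Length);
    }
}
```

Calling File.WriteAllBytes("hello.pdf", MinimalPdfWriter.Build("Hello")) should give a file that a PDF viewer can open. Most of the code is bookkeeping for byte offsets, which is exactly the kind of detail an existing library takes off your hands.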
Which Open-source PDF Library Is Best For You?
After you have chosen which features your C# PDF library should provide, it’s time to look for an open-source alternative. In this section, let’s go over the differences between Poppler and iTextSharp, two of the most widely used free, open-source libraries.
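Poppler is a PDF rendering library: its main strength is reading, rendering, and converting existing PDF documents rather than creating new ones from scratch.
It is written in C++ and can be used from other languages, such as Python, through bindings and wrappers. There is no official .NET binding, so in a C# application you would typically call its command-line tools (such as pdftotext) or go through a third-party wrapper rather than share one code base.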
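One way to reach Poppler from C# is to start its pdftotext command-line tool as a child process and read what it prints. The sketch below assumes the Poppler utilities are installed and on the PATH, and that you target .NET Core 2.1 or later (ProcessStartInfo.ArgumentList is not available on .NET Framework). The class and method names (PopplerText, BuildStartInfo, ExtractText) are hypothetical.

```csharp
using System;
using System.Diagnostics;
using System.Text;

// Hypothetical helper class, for illustration only.
public static class PopplerText
{
    // Describes the command: pdftotext -layout <pdfPath> -
    // The trailing "-" asks pdftotext to write the text to standard output.
    public static ProcessStartInfo BuildStartInfo(string pdfPath)
    {
        var info = new ProcessStartInfo("pdftotext")
        {
            RedirectStandardOutput = true,
            StandardOutputEncoding = Encoding.UTF8, // pdftotext emits UTF-8 by default
            UseShellExecute = false,
            CreateNoWindow = true
        };
        info.ArgumentList.Add("-layout");
        info.ArgumentList.Add(pdfPath);
        info.ArgumentList.Add("-");
        return info;
    }

    // Runs pdftotext and returns the extracted text of the whole document.
    public static string ExtractText(string pdfPath)
    {
        using (Process process = Process.Start(BuildStartInfo(pdfPath)))
        {
            string text = process.StandardOutput.ReadToEnd();
            process.WaitForExit();
            if (process.ExitCode != 0)
            {
                throw new InvalidOperationException(
                    "pdftotext exited with code " + process.ExitCode);
            }
            return text;
        }
    }
}
```

Passing arguments through ArgumentList avoids manual quoting of paths that contain spaces. The trade-off of this approach is that every machine running your application needs the Poppler tools installed.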
iTextSharp, unlike Poppler, targets .NET directly: it is the .NET port of the popular Java PDF library iText, created by iText Group.
It’s also a great alternative to creating your own C# PDF library, for several reasons:
- It’s open-source and it provides a great API that is easy to understand
- The memory footprint of iTextSharp is small in comparison with Poppler
- Poppler can pull in additional native dependencies such as Cairo, which means shipping more and larger binaries, whereas iTextSharp is distributed as a managed .NET assembly
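To give a feel for the API mentioned in the list above, here is a minimal sketch that writes a one-paragraph PDF. It assumes the iTextSharp 5.x NuGet package is referenced (the newer iText 7 API looks different), and the class and method names HelloPdf and Create are hypothetical.

```csharp
using System.IO;
using iTextSharp.text;
using iTextSharp.text.pdf;

// Hypothetical helper class, for illustration only; assumes iTextSharp 5.x.
public static class HelloPdf
{
    // Creates a PDF at outputPath containing a single paragraph.
    public static void Create(string outputPath, string message)
    {
        using (var stream = new FileStream(outputPath, FileMode.Create))
        {
            var document = new Document();            // default page size is A4
            PdfWriter.GetInstance(document, stream);  // binds the document to the stream
            document.Open();
            document.Add(new Paragraph(message));
            document.Close();                         // finishes and flushes the PDF
        }
    }
}
```

A call such as HelloPdf.Create("hello.pdf", "Hello from iTextSharp") is all it takes; compare this with the offset bookkeeping needed when you write the file format yourself.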
You can check out this resource to learn more about how to convert HTML to PDF with iTextSharp.
Choosing a PDF Library Doesn’t Have to Be Difficult
In fact, it’s often much easier than people first believe. The information above should help you choose a PDF library that meets your needs, so you are less likely to run into costly surprises later.