Nov
2022

Fill Out a PDF Form Using PHP – Tutorial

Looking for help on a project? Please feel free to contact me for a free consultation!

I recently took on a task where I had to use an HTML web form, whose back-end is PHP, to fill out a PDF form. The entire process is extremely annoying, and I was really hoping to use some kind of self-contained library that I could drop into a shared webhost for a client and be done with it. Unfortunately, at the tail end of 2022, there doesn’t seem to be any such library that handles everything. Below is my research as well as the methodology I ended up going with.

Download Links

If you’d like, you can skip right to the good stuff and download a set of files here – just note that you’ll need to install PDFtk on your server to use it (spoilers?)

The Alternatives

I tried a few different options, and unfortunately the following didn’t really work for me. Maybe this list will prevent others from going down the same rabbit holes.

As you can see, the landscape is a little bleak. Lots of projects that deal with PDFs in one way or another, some with forms in mind, but none that do exactly what I want.

The Winner – PDFtk

My only solution was to use a system-level command-line tool, PDFtk, which would do the job for me. Basically, the goal is to dump the form data into a file format called FDF, and then to use PDFtk to combine the data from the FDF with the PDF form. The main caveat is that PDFtk needs to be installed on the server you’re running the code on. If you have sudo access to your system, this is easy. If you don’t, your shared host might already have it installed (my client’s BlueHost server did); just check with tech support.
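If you want to check for yourself before contacting support, a quick probe from PHP can tell you whether the binary is reachable. This is only a minimal sketch: the function name isPdftkAvailable() is made up for illustration, and it assumes that your host allows exec(), that PDFtk (if installed) is on the PATH, and that `pdftk --version` exits with status 0.

```php
<?php
// Hypothetical helper (not part of the sample project): returns true when
// the given binary can be launched and exits successfully.
function isPdftkAvailable(string $binary = 'pdftk'): bool
{
    // exec() can be disabled on shared hosts; on PHP 8+ a disabled
    // function is reported as missing.
    if (!function_exists('exec')) {
        return false;
    }

    $output = [];
    $exitCode = 1;
    // 2>&1 keeps "command not found" messages out of the page output.
    exec(escapeshellarg($binary) . ' --version 2>&1', $output, $exitCode);

    return $exitCode === 0;
}

echo isPdftkAvailable() ? "PDFtk found\n" : "PDFtk not found\n";
```

If this prints "PDFtk not found" on a host where you expected it to work, the binary may simply live outside the PATH, in which case passing its full path as the argument is worth a try.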

Steps

So, the goal is as follows:

  1. Create a PDF form
  2. Generate an HTML form with a submit button
  3. Send the data from that form over to a PHP file
  4. Process the data and dump it into an FDF file
  5. Call PDFtk and use it to combine the FDF data and the PDF form

You can download and run the sample project from here. Note that you’ll need to install PDFtk – you can find more information about it here, or you can Google “PDFtk install” + your operating system of choice.

Create a PDF Form

I created a sample PDF using Adobe Acrobat Pro. It’s a very simple file that contains each of the types of inputs a form can have in a PDF: a textbox, checkboxes, radio buttons, and a dropdown (called a list box within Acrobat).

The names of these fields are important – this is how they will be identified later on. It’s easiest not to use spaces in the names because

Create a Corresponding HTML Form

Next, I need to make a corresponding HTML file that matches the naming scheme of the PDF file. Easy enough: check out index.html in the sample project. It’s a boilerplate HTML5 document with a bunch of input fields. See the snippet below:

...
<div>
	<input type="checkbox" id="Checkbox_2_checkbox" name="Checkbox_2_checkbox">
	<label for="Checkbox_2_checkbox">Checkbox 2</label>
</div>
<div>
	<input type="radio" id="Radio_Group_1_radio_Choice_1" name="Radio_Group_1_radio" value="Choice_1">
	<label for="Radio_Group_1_radio_Choice_1">Choice 1</label>
</div>
<div>
	<input type="radio" id="Radio_Group_1_radio_Choice_2" name="Radio_Group_1_radio" value="Choice_2">
	<label for="Radio_Group_1_radio_Choice_2">Choice 2</label>
</div>
...

Of note here is that each field’s name ends in _fieldtype, whether that’s _textbox, _checkbox, _radio, or _dropdown. This makes it easier to process in the next step.
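To show why the suffix convention is convenient, here is a small sketch that splits a submitted key into the PDF field name and the field type. The helper name parseFieldName() is hypothetical and is not taken from the sample project; it only assumes the four suffixes listed above.

```php
<?php
// Hypothetical helper: splits "Checkbox_2_checkbox" into the PDF field name
// ("Checkbox_2") and the field type ("checkbox"). Returns null when the
// key carries none of the known suffixes.
function parseFieldName(string $key): ?array
{
    $types = ['textbox', 'checkbox', 'radio', 'dropdown'];

    foreach ($types as $type) {
        $suffix = '_' . $type;
        $suffixLength = strlen($suffix);

        // Compare only the tail of the key, and require a non-empty base name.
        if (strlen($key) > $suffixLength && substr($key, -$suffixLength) === $suffix) {
            return [substr($key, 0, -$suffixLength), $type];
        }
    }

    return null;
}

print_r(parseFieldName('Radio_Group_1_radio'));
```

Because only the end of the key is inspected, a field whose base name happens to contain a type word (say, a textbox named radio_notes_textbox) is still classified by its real suffix.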

Examine FDF File Format

At this point, if we were to dump the contents of $_POST in pdfFiller.php, it would look like this:

Array
(
    [Name_textbox] => asdf
    [Checkbox_1_checkbox] => on
    [Checkbox_2_checkbox] => on
    [Radio_Group_1_radio] => Choice_1
    [Dropdown_dropdown] => Option_2
)

So, what we want to do is loop through these and dump their data into an FDF file. But what format is the FDF file in? The easiest way to determine that is actually to fill out the PDF yourself, hit save, and run the following command, replacing the file names where necessary:

.\pdftk.exe .\yourPDF.pdf generate_fdf output filledOutData.fdf

That line was written for Windows (hence pdftk.exe), but the same command should otherwise work on all platforms supported by PDFtk. The output, once cleaned up a bit, is as follows:

%FDF-1.2
1 0 obj << /FDF << /Fields [
    << /V (Jake Binstein) /T (Name) >> 
    << /V /Yes /T (Checkbox_1) >>
    << /V / /T (Checkbox_2) >> 
    << /V /Choice_1 /T (Radio_Group_1) >>
    << /V (Option_2) /T (Dropdown) >> 
] >> >> endobj  trailer
<< /Root 1 0 R >>
%%EOF

We can ignore the first 2 lines and the final 3 lines, because they are just boilerplate FDF information. We’ll be sure to include them in our code, though. The real things to focus on are the individual fields. The format appears to be, in most cases:

<< /V (Value) /T (Field Name) >>

Exceptions are checkboxes and radio buttons, which use the format /Value instead of (Value). Also note that checking a checkbox requires /Yes whereas unchecking it requires just /.
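To make these formats concrete, here is a minimal sketch that builds one entry per field type and wraps the entries in the boilerplate lines. The function names are hypothetical, the format is inferred only from the generate_fdf output above, and the /Yes export value is an assumption that holds for this sample PDF but may differ in yours.

```php
<?php
// Sketch only: function names are hypothetical, and the format is inferred
// from the generate_fdf output of the sample PDF.

// Escape text for use inside an FDF literal string "( ... )".
function fdfEscapeString(string $text): string
{
    // Backslash goes first so the backslashes added for parentheses survive.
    return str_replace(['\\', '(', ')'], ['\\\\', '\\(', '\\)'], $text);
}

// Build one entry of the /Fields array.
function fdfField(string $name, string $value, string $type): string
{
    $title = '/T (' . fdfEscapeString($name) . ')';

    switch ($type) {
        case 'checkbox':
            // A checked box is written as the name /Yes (assumed export value).
            return '<< /V /Yes ' . $title . ' >>';
        case 'radio':
            // Radio values are names (/Choice_1), not strings. Characters
            // outside a conservative whitelist are dropped.
            $safe = preg_replace('/[^A-Za-z0-9_.-]/', '', $value);
            return '<< /V /' . $safe . ' ' . $title . ' >>';
        default:
            // Textboxes and dropdowns use a literal string.
            return '<< /V (' . fdfEscapeString($value) . ') ' . $title . ' >>';
    }
}

// Wrap the entries in the boilerplate header and trailer.
function fdfDocument(array $fields): string
{
    return "%FDF-1.2\n1 0 obj << /FDF << /Fields [\n"
        . implode("\n", $fields)
        . "\n] >> >> endobj trailer\n<< /Root 1 0 R >>\n%%EOF\n";
}

echo fdfDocument([
    fdfField('Name', 'Jake (test)', 'textbox'),
    fdfField('Checkbox_1', 'on', 'checkbox'),
    fdfField('Radio_Group_1', 'Choice_1', 'radio'),
]);
```

Note that browsers do not submit unchecked checkboxes at all, so a sketch like this never sees them; clearing a box that is checked by default would need separate handling.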

With that in mind, we can quickly create a basic PHP file that will output an FDF file.

Process HTML Form Data and Output an FDF File

What we need to do now is loop through the different key/value pairs passed to us in $_POST, and depending on the type of field, add a string to the FDF data in the correct format. Below is an example regarding a textbox, but the code files I provide cover all of the field types.

// Loop through the $_POST data, creating a new row in the FDF file for each key/value pair
$fdf = "";
foreach($_POST as $key => $value) {
    // If the user filled nothing in the field, like a text field, just skip it.
    // Note that if the PDF you provide already has text in it by default,
    // doing this will leave the text as-is.
    // If you prefer to remove the text, you should remove the lines below so you
    // overwrite the text with nothing.
    if($value == "") {
        continue;
    }

    // Figure out what kind of field it is by its name,
    // which should be in the format name_fieldtype.

    // Textbox
    if(stringEndsWith($key, "_textbox")) {
        // Strip only the trailing suffix (str_replace would remove every occurrence)
        $key = substr($key, 0, -strlen("_textbox"));
        // Format:
        // << /V (Text) /T (Fieldname) >> 

        // Backslashes in the value are encoded as double backslashes
        $value = str_replace("\\", "\\\\", $value);
        // Parentheses are escaped with a backslash in front
        $value = str_replace("(", "\(", $value);
        $value = str_replace(")", "\)", $value);

        $fdf .= "<< /V (" . $value . ")" . " /T (" . $key . ") >>" . "\r\n";
    }
    // Checkbox
    else if(stringEndsWith($key, "_checkbox")) { ... }
    // Radio Button
    else if(stringEndsWith($key, "_radio")) { ... }
    // Dropdown
    else if(stringEndsWith($key, "_dropdown")) { ... }
}

Once we have our FDF data stored safely in $fdf, we’ll need to output it to a file. I created a folder named output – you may need to change the permissions or ownership of this folder in order to allow PHP to write to it. In addition to the text in $fdf, we’ll also need the boilerplate information that I mentioned above. I hid those away in functions to make things a bit cleaner. Finally, we’ll need a filename – because I don’t want the file to overwrite itself every time the script is run, I used a timestamp in the name of the file. You may want to add other unique elements to the file (like data from the form), or some extra checks to make sure that no such file exists already.

// Set location for FDF and PDF files
$outputLocation = "output/";
// Dump FDF data to file
$timestamp = time();
$outputFDF = $outputLocation . $timestamp . ".fdf";
$outputPDF = $outputLocation . $timestamp . ".pdf";
file_put_contents($outputFDF, $fdf);
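A timestamp alone can collide when two submissions arrive within the same second. One possible way to add the extra uniqueness check mentioned above is sketched here; uniqueOutputBase() is a hypothetical helper, and it assumes PHP 7 or later for random_bytes().

```php
<?php
// Hypothetical helper: returns a base path (without extension) inside $dir
// that is not yet taken by an .fdf or .pdf file.
function uniqueOutputBase(string $dir): string
{
    do {
        // Timestamp for readability, random suffix to separate submissions
        // that arrive within the same second.
        $base = rtrim($dir, '/') . '/' . time() . '_' . bin2hex(random_bytes(4));
    } while (file_exists($base . '.fdf') || file_exists($base . '.pdf'));

    return $base;
}

$base = uniqueOutputBase('output');
$outputFDF = $base . '.fdf';
$outputPDF = $base . '.pdf';
```

The existence check and the later write are still two separate steps, so this narrows the window for a clash rather than closing it completely; the random suffix is what does most of the work.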

Use PDFtk to Combine the FDF File and the PDF File

Finally, we get to the good part – using PDFtk to combine this FDF form data with the original PDF and outputting a new PDF. If you haven’t yet installed PDFtk, now’s the time – you can find more information about it here, or you can Google “PDFtk install” + your operating system of choice.

The shell command for PDFtk is as follows:

pdftk originalForm.pdf fill_form formData.fdf output filledFormWithData.pdf

Roughly translated, this means: with the PDFtk program, use the file originalForm.pdf. We want to fill that form, and specifically we want to fill it using the FDF data in formData.fdf. Once the form is filled, save its output in filledFormWithData.pdf.

In PHP, we can execute this shell command by way of the exec() function. Finally, to make debugging a bit easier, I output not only where the PDF is on the server, but also a link back home (to easily fill the form out again), and finally I embed the new PDF as an iframe so it’s easy to see. Any of this can be easily customized.

// Location of original PDF form
$pdfLocation = "Example.pdf";
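// Generate the PDF (escapeshellarg() quotes each path so spaces or shell characters can't break the command)
exec("pdftk " . escapeshellarg($pdfLocation) . " fill_form " . escapeshellarg($outputFDF) . " output " . escapeshellarg($outputPDF));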

echo "<p>Done! Your application will be reviewed shortly.</p>";
echo "<p>It is stored in: " . $outputPDF . "</p>";
echo "<p><br/><a href='/'>Home</a></p>";
echo "<iframe src='" . $outputPDF . "' width='100%' height='100%'></iframe>";
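The exec() call above does not look at whether PDFtk actually succeeded. A minimal sketch of one way to check is below; both function names are hypothetical, and it assumes PDFtk signals failure with a non-zero exit code.

```php
<?php
// Hypothetical helper: builds the pdftk command line with every path quoted.
function buildFillFormCommand(string $form, string $fdf, string $output): string
{
    return 'pdftk ' . escapeshellarg($form)
        . ' fill_form ' . escapeshellarg($fdf)
        . ' output ' . escapeshellarg($output);
}

// Hypothetical helper: runs pdftk and reports success based on its exit code.
function fillPdfForm(string $form, string $fdf, string $output): bool
{
    $lines = [];
    $exitCode = 1;
    // 2>&1 captures pdftk's error messages in $lines so they can be logged.
    exec(buildFillFormCommand($form, $fdf, $output) . ' 2>&1', $lines, $exitCode);

    if ($exitCode !== 0) {
        error_log('pdftk failed: ' . implode(' | ', $lines));
        return false;
    }

    return true;
}
```

With something like this in place, the "Done!" message and the iframe could be shown only when fillPdfForm($pdfLocation, $outputFDF, $outputPDF) returns true, and a friendlier error shown otherwise.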

Conclusion

Although it’s unfortunate that there’s no pure PHP, drop-in solution, I found PDFtk to be very easy to work with. Check out the sample project here and let me know what you think. I rearranged the code a bit to make it read nicely, and over-commented so it’s very obvious what I’m doing.
