Sign a web page with PGP

If you want your audience to be absolutely certain that the web page you just served them has not been tampered with, read on. This method may be a good way to calm down even your most paranoid readers.

TLS/SSL authenticates the server and protects every page of a web site in transit. However, there are occasions where you might want an additional layer of security for individual pages.

Fewer trustees

On large corporate networks, many staff members have access to the publishing platform, making your page more vulnerable to
attacks from the inside. Hence, it makes sense to sign HTML pages containing critical information with a private key controlled
by a few trusted persons only.

While working on an HTML file to be hosted as an information page with no dependencies, I realized that I needed a script that could do three things:

Compile everything (images, web fonts, scripts, styles) into one large, distributable HTML file.
Minify the content of the file to make it as small as possible.
Sign all the content in the HTML file with a private PGP key.

The pagesign script

Fortunately, this was very easily accomplished with PHP. I needed the gnupg extension from the PECL repository and the PHP minify application from Ryan Grove and Steve Clay.

The script is named pagesign.php. It takes your source HTML file as an input parameter, together with the desired name
for the destination file, the fingerprint of the key to be used for page validation and, optionally, the path to a text file
with information to be inserted into the HTML code as a comment block.

I have published the source code on GitHub for anyone to use or improve on. There you will find a description of how to use the script as well.

One big security issue

Even though the whole HTML document is signed, here’s one big gotcha:

If an attacker manages to put a script element before or after the root <html> element, your browser will happily execute that code as well, even though it sits outside the root element in the source (!). The HTML parser simply folds such stray elements into the document.

Thus, an attacker might append or prepend script tags outside of the signed block that can then access the DOM and alter the page content without breaking our PGP signature.

That’s a serious security issue! Users should therefore also inspect the HTML source to see that no code has been prepended or appended to the signed block.
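To make the gotcha concrete, here is a minimal Python sketch. It assumes the signed block is a standard PGP clearsigned message, delimited by the usual armor lines; the page content and the signature are placeholders, and the helper name signed_block is made up for this illustration. The point is that appending a script changes the file but leaves the signed bytes untouched, so a signature check on its own has nothing to complain about.

```python
# Armor lines of a PGP clearsigned message (assumed to delimit the signed block).
BEGIN = b"-----BEGIN PGP SIGNED MESSAGE-----"
END = b"-----END PGP SIGNATURE-----"


def signed_block(page: bytes) -> bytes:
    """Return the bytes from the BEGIN armor line through the END armor line."""
    start = page.index(BEGIN)
    end = page.index(END, start) + len(END)
    return page[start:end]


# A placeholder page: DOCTYPE first, then the signed block.
original = (
    b"<!DOCTYPE html>\n"
    + BEGIN
    + b"\nHash: SHA256\n\n<html>...</html>\n"
    b"-----BEGIN PGP SIGNATURE-----\n\n(placeholder, not a real signature)\n"
    + END
    + b"\n"
)

# The attacker appends a script after the signed block.
tampered = original + b"<script>document.title = 'altered';</script>\n"

print(signed_block(original) == signed_block(tampered))  # True: signed bytes are identical
print(original == tampered)                              # False: the file itself has changed
```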

Due to the security issue mentioned above, the simple approach of verifying the HTML code by merely running it through GPG is insufficient. A proper validation should check both the beginning of the file (prologue) and the end (epilogue), to catch script injections outside the signed block.

Additional check points

An easy way out would be to disallow any content before or after the signed block. However, this creates problems for browsers insisting on finding the DOCTYPE declaration at the very beginning of the file.

To allow for content outside the signed block, pagesign.php inserts three parameters at the top of the signed block:

Length: The exact total length of the HTML file, including the PGP signature.
Prologue: A SHA256 hash of everything that comes before the start of the signed block.
Epilogue: A SHA256 hash of everything that comes after the end of the signed block.

These values can easily be used to verify content outside the signed block. Since the parameters themselves are signed, they cannot be altered without breaking the signature. A proper validation script should:

Check that the PGP signature corresponds to the right key and is valid.
Check that the byte length of the page equals the length value written in the signed block.
Check that the SHA256 hash of the beginning of the file matches the prologue value written in the signed block.
Check that the SHA256 hash of the end of the file matches the epilogue value written in the signed block.
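The last three checks could be sketched in Python roughly as follows. This is not the validation code that ships with pagesign, and it rests on several assumptions: that the signed block runs from the BEGIN PGP SIGNED MESSAGE line through the END PGP SIGNATURE line, that the hashes are written as hex digests, and that the three expected values have already been read out of the signed block (how they are formatted there is not covered here). The function names are made up for this example. The signature itself still has to be verified with gpg first, since the expected values can only be trusted once the signature is known to be good.

```python
import hashlib

# Armor lines of a PGP clearsigned message (assumed to delimit the signed block).
BEGIN_MARKER = b"-----BEGIN PGP SIGNED MESSAGE-----"
END_MARKER = b"-----END PGP SIGNATURE-----"


def split_signed_page(page: bytes) -> tuple[bytes, bytes, bytes]:
    """Split a page into (prologue, signed block, epilogue)."""
    start = page.find(BEGIN_MARKER)
    if start == -1:
        raise ValueError("no clearsigned block found")
    end = page.find(END_MARKER, start)
    if end == -1:
        raise ValueError("signed block is not terminated")
    end += len(END_MARKER)
    return page[:start], page[start:end], page[end:]


def check_outside_content(
    page: bytes,
    expected_length: int,
    expected_prologue: str,
    expected_epilogue: str,
) -> dict[str, bool]:
    """Compare the file against the Length, Prologue and Epilogue values.

    The expected values are assumed to come from the signed block, after
    gpg has confirmed the signature.
    """
    prologue, _signed, epilogue = split_signed_page(page)
    return {
        "length": len(page) == expected_length,
        "prologue": hashlib.sha256(prologue).hexdigest() == expected_prologue.lower(),
        "epilogue": hashlib.sha256(epilogue).hexdigest() == expected_epilogue.lower(),
    }
```

A page passes only if all three values in the returned dictionary are True. A script appended after the signed block changes both the byte length and the epilogue hash, so the injection is caught even though the PGP signature still verifies.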

All files included

Binary data such as images and web fonts are base64 encoded and embedded into the HTML file. External scripts and CSS files are moved into the HTML file as well. The HTML code is then minified and cryptographically signed.
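The embedding step relies on data URIs. As a rough illustration in Python (pagesign itself is written in PHP, and the function name to_data_uri is made up for this sketch), a binary file can be turned into a string that goes straight into a src or href attribute:

```python
import base64


def to_data_uri(data: bytes, mime_type: str) -> str:
    """Encode binary content as a base64 data URI for embedding in HTML."""
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


# Tiny example with text content so the result is easy to read.
print(to_data_uri(b"hello", "text/plain"))  # data:text/plain;base64,aGVsbG8=
```

For an image, the bytes would be read from the file and the result used as the value of the src attribute, with a MIME type such as image/png. Base64 encoding makes the data roughly a third larger, which is the price for having everything in one signed file.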

This way, even the pixels in your images become signed. Nothing can be altered without breaking the signature.

After running the HTML source through pagesign, you can verify the web page by first using gpg to check the signature, then inspecting the source code to see that nothing was injected before or after the signed block.

If the file is uploaded to the web server, anyone can verify the signature of the file by using curl and gpg like this:

curl https://espenandersen.no/signed_page.html | gpg > /dev/null

Curl fetches the page content, the output is piped into gpg and then redirected to /dev/null to avoid outputting the HTML contents to the console. It should look something like this at the command prompt: