Adding self-hosted URL shortening to a static website

2 January 2014

I really wanted thb.lt to have its own URL shortener, since (obviously) its base URL is really short. There were still a few problems: the site is static (generated with Jekyll) and I didn’t want to add messy PHP code to the source tree [1]. Yet I preferred a dynamic solution to one that would have required editing a file, committing and pushing each time I needed a short link.

The easy way to decouple the website from the URL shortening script is to put that script somewhere else (in my case at http://dynamic.thb.lt/), and to use a simple Apache .htaccess rule to redirect everything that looks like a short URL to this script.

This implies two steps. The first one is to make sure the shortened aliases cannot conflict with “real” paths. The obvious solution is to use a specific character set. Since no URL on this website starts with an uppercase letter or a digit, I just configured the URL shortener to use only those (that leaves a good base 36 for encoding) [2].
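To make the encoding concrete, here is a minimal Python sketch of a base 36 alias codec restricted to digits and uppercase letters. It is not Yapus's code (Yapus is written in PHP); the function names and the symbol order (digits first, then letters) are assumptions made for illustration.

```python
import string

# Assumed alphabet: digits, then uppercase letters (36 symbols).
# The symbol order the real shortener uses is not stated in this post.
ALPHABET = string.digits + string.ascii_uppercase
BASE = len(ALPHABET)  # 36


def encode(n: int) -> str:
    """Turn a non-negative integer ID into an alias such as '10' or 'ZZZ'."""
    if n < 0:
        raise ValueError("ID must be non-negative")
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n:
        n, remainder = divmod(n, BASE)
        digits.append(ALPHABET[remainder])
    # Digits were produced least significant first, so reverse them.
    return "".join(reversed(digits))


def decode(alias: str) -> int:
    """Turn an alias back into its integer ID (ValueError on a bad symbol)."""
    n = 0
    for symbol in alias:
        n = n * BASE + ALPHABET.index(symbol)
    return n
```

With this alphabet, encode(36) gives "10" and encode(46655) gives "ZZZ", the last ID that still fits in three characters.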

The next step is to actually provide the redirection; for this an .htaccess file is the natural way. Since Jekyll doesn’t publish dot-files (and that’s a Good Thing), one just needs to create an htaccess file (without the leading dot) at the root of the Jekyll tree, and use the permalink front matter option:

---
permalink: .htaccess
---

The whole magic fits in a single RedirectMatch directive:

RedirectMatch 301 ^/([0-9A-Z]+)$ http://dynamic.thb.lt/yapus?$1
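To check which paths such a rule would catch, the same pattern can be tried out in Python. This is only a sketch: it mimics the regular expression and the target URL from the directive above, not Apache itself, and the helper name redirect_target and the sample paths are made up for illustration.

```python
import re

# Same pattern and target as in the RedirectMatch directive above.
SHORT_ALIAS = re.compile(r"^/([0-9A-Z]+)$")
TARGET = "http://dynamic.thb.lt/yapus?{}"


def redirect_target(path):
    """Return the URL a path would be redirected to, or None if it is left alone."""
    match = SHORT_ALIAS.match(path)
    if match is None:
        return None
    return TARGET.format(match.group(1))


# Hypothetical sample paths, not actual pages or aliases of the site.
for path in ("/A1", "/2B", "/about/", "/a1", "/"):
    print(path, "->", redirect_target(path))
```

Only paths made up entirely of digits and uppercase letters are redirected; anything containing a lowercase letter, a slash or a dot after the first slash is served by the static site as usual.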

I wrote my own shortener script for this, Yapus (standing for, obviously, Yet Another PHP URL Shortener). It’s written in PHP, which is awful but works everywhere; the script itself really sucks, and comes with no documentation at all.

This doesn’t automatically provide short URLs for this site’s pages, but that’s a matter of three lines of code if one needs such a thing.


  1. That would have worked, it just felt quite dirty; but it may be useful for other things.

  2. Another solution I tried was to restrict the character set only for the first character of the shortcut string, but that seemed a bit messy for no great advantage. Base 36 still leaves 46656 URLs of three characters or fewer, which is clearly fewer than the 238328 of base 62 (a-zA-Z0-9), but still much more than I need.