I have been meaning to move away from WordPress to a static site generator for
a very long time, due to:
- The slowness of WP, since every page request makes multiple database calls
due to the spaghetti code nature of WP and its plugin architecture. Caching
can help somewhat, but it has brittle edge cases.
- Its record of security holes. I mitigated this somewhat by isolating PHP as
much as possible.
- It is almost impossible to follow front-end optimization best practices like
minimizing the number of CSS and JS files because each WP plugin ships its own.
My original plan was to go with Acrylamid, but about a year ago I started
experimenting with Hugo. Hugo is blazing fast because it is implemented
in Go rather than a slow language like Python or Ruby, and this is
game-changing. Nonetheless, it took me over a year to migrate. This post is
about the issues I encountered and the workflow I adopted to make it work.
WordPress content migration
There is a migration tool, but it is far from perfect despite the author’s
best efforts, mostly because of the baroque nature of WordPress itself when
combined with plugins and an old site that used several generations of image
gallery technology.
Unfortunately, the imperfect conversion meant rewriting many posts, especially
those with photos or embedded code.
Photo galleries
Hugo does not (yet) support image galleries natively. I started looking at the
HugoPhotoSwipe project, but got frustrated by bugs in its home-grown
YAML parser that broke round-trip editing, and made it very difficult to
get galleries with text before and after the gallery proper. The Python-based
smartcrop for thumbnails is also excruciatingly slow.
I wrote hugopix to address this. It uses a simpler one-way index file
generation method, and the much faster Go smartcrop implementation by
Artyom Pervukhin.
Broken asset references
Posts with photo galleries were particularly broken, due to WP’s insistence on
replacing photos with links to image pages. I wrote a tool to help me
find broken images and other assets, and to organize them in a more rational way
(e.g. keeping PDFs and source code samples out of static/images).
It also has a mode to identify unused assets, e.g. 1.5GB of images that no
longer belong in the Hugo tree as their galleries are moving elsewhere.
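The kind of check described above can be sketched in Go. This is not the actual tool, only a minimal illustration under some assumptions: posts are Markdown files under content/, assets live under static/, and only site-local references with a file extension count as assets. The names extractRefs and audit are hypothetical.

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path"
	"path/filepath"
	"regexp"
	"sort"
)

// assetRef matches Markdown link/image targets and HTML src/href attributes
// that point at site-local paths (those starting with "/").
var assetRef = regexp.MustCompile(`\]\((/[^)\s]+)|(?:src|href)="(/[^"]+)"`)

// extractRefs returns the site-local paths referenced in a post body.
// Assumption: only paths with a file extension are assets; links to other
// posts such as /2017/01/foo/ are skipped.
func extractRefs(body string) []string {
	var refs []string
	for _, m := range assetRef.FindAllStringSubmatch(body, -1) {
		ref := m[1]
		if ref == "" {
			ref = m[2]
		}
		if path.Ext(ref) != "" {
			refs = append(refs, ref)
		}
	}
	return refs
}

// audit compares referenced paths with the files that exist and returns
// the broken references and the unused files, both sorted.
func audit(referenced, existing []string) (broken, unused []string) {
	refSet := make(map[string]bool)
	for _, r := range referenced {
		refSet[r] = true
	}
	fileSet := make(map[string]bool)
	for _, f := range existing {
		fileSet[f] = true
	}
	for r := range refSet {
		if !fileSet[r] {
			broken = append(broken, r)
		}
	}
	for f := range fileSet {
		if !refSet[f] {
			unused = append(unused, f)
		}
	}
	sort.Strings(broken)
	sort.Strings(unused)
	return broken, unused
}

func main() {
	var referenced, existing []string
	// Collect asset references from every Markdown file under content/.
	filepath.WalkDir("content", func(p string, d fs.DirEntry, err error) error {
		if err != nil || d.IsDir() || filepath.Ext(p) != ".md" {
			return nil
		}
		if data, err := os.ReadFile(p); err == nil {
			referenced = append(referenced, extractRefs(string(data))...)
		}
		return nil
	})
	// Collect the files under static/ as the site-relative URLs they are served at.
	filepath.WalkDir("static", func(p string, d fs.DirEntry, err error) error {
		if err != nil || d.IsDir() {
			return nil
		}
		rel, _ := filepath.Rel("static", p)
		existing = append(existing, "/"+filepath.ToSlash(rel))
		return nil
	})
	broken, unused := audit(referenced, existing)
	for _, b := range broken {
		fmt.Println("broken:", b)
	}
	for _, u := range unused {
		fmt.Println("unused:", u)
	}
}
```

Run from the root of the site tree, this prints one line per broken reference and one per unused file. A real tool would also need to handle relative paths, URL-encoded names and assets referenced from templates.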
Password-protected galleries
I used to have galleries of family events on my site, until an incident where
some Dutch forum started linking to one of my cousin’s wedding photos and
making fun of her. At that point I put up a pointed error message for that
referrer and controlled access using WP’s password protection feature. That said,
private family photos do not belong on a public blog and I have other
dedicated password-protected galleries with Lightroom integration that make
more sense for that use case, so I just removed them from the blog, shaving
off 1.5GB of disk in the bargain.
Search
There are systems that can provide search without any server component,
e.g. the JavaScript-based search in Sphinx, and I looked at some of the
options referenced by the Hugo documentation like the Bleve-based
hugoidx but the poor documentation gave me pause, and I’d rather not run
Node.js on my server as needed by hugo-lunr.
Having recently implemented full-text search in Temboz using SQLite’s
FTS5 extension, I felt more comfortable building my own search
server in Go. Because Hugo and fts5index share the same Go template
language, integrating search seamlessly into the site’s navigation and page
structure is easy.
Theme
There is no avoiding this: moving to a new blogging system requires writing
a new theme if you do not want to go with a canned one. Fortunately,
Hugo’s theme system is sane, unlike WordPress’, because it does not have to
rely on callbacks and hooks as much as WP plugins do.
RSS GUIDs and permalinks
One pet peeve of mine is when sites change platforms and issue new GUIDs or
permalinks in their RSS feeds, causing a flood of old articles to reappear as
new in my feed reader. Since I believe in showing respect to my readers, I had
to avoid this at all costs, and also put in place redirects as needed to avoid
404s for the few pages that did change permalinks (mostly image galleries).
Doing so required copying the embedded RSS template and changing:
<guid>{{ .Permalink }}</guid>
to:
<guid isPermaLink="false">{{ .Params.rss_guid | default .Permalink }}</guid>
The next step was to add rss_guid to the front matter of the last 10
articles in my legacy RSS feed.