WordPress to Markdown and then on to 11ty
This is a continuation of the previous post "In Search of a Better Writing Experience". In this post, I'm going to cover the steps I took to move my existing blog from WordPress to what I have now - an 11ty static blog.
Before I started the conversion process, I'd already looked at a few static site generators and had a shortlist of projects in mind. In the end, it came down to Astro and 11ty. I chose 11ty because it's focused on static site generation and you don't need much to get going.
Just run the following command and you're off.
npx @11ty/eleventy --serve
The Plan
The plan was simple. I already had HTML markup stored inside my WordPress blog. I just needed to convert that into markdown and 11ty would take care of the rest. That should be fairly simple. So, here are the steps I had in mind:
- Export the WordPress blog, parse the export, and generate markdown
- Set up and configure 11ty
- Early lunch
Exporting & Parsing WordPress Posts
Exporting the WordPress content was easy enough. WordPress comes with a handy exporter and it spat out an XML file with all my post content. I used the following PHP script to parse the XML and convert it into usable 11ty templates.
<?php
require '../vendor/autoload.php';
use League\HTMLToMarkdown\HtmlConverter;
$converter = new HtmlConverter();
$xml = simplexml_load_file('wordpress-posts.xml');
$posts = [];
// For each post item inside the feed, I create a data array with all the info I need
foreach ($xml->channel->item as $item) {
    // WordPress uses XML namespaces, so we have to account for that
    $body = (string) $item->children('http://purl.org/rss/1.0/modules/content/')->encoded;
    $data = [
        'id' => (int) $item->children('http://wordpress.org/export/1.2/')->post_id,
        // The title is wrapped in double quotes so it is a quoted string in the front matter
        'title' => "\"$item->title\"",
        // Plain-text excerpt: strip tags, collapse whitespace, keep the first 100 characters
        'description' => substr(preg_replace('/\s+/', ' ', trim(strip_tags($body))), 0, 100) . '...',
        'body' => $converter->convert($body),
        'link' => (string) $item->link,
        // Kept as an integer timestamp because date() expects one
        'created' => (int) strtotime((string) $item->pubDate),
        'slug' => (string) $item->children('http://wordpress.org/export/1.2/')->post_name,
        'status' => (string) $item->children('http://wordpress.org/export/1.2/')->status,
        'categories' => [],
        'tags' => []
    ];
    // Tags and categories share the <category> element and differ only by the domain attribute
    foreach ($item->category as $metaItem) {
        switch ((string) $metaItem->attributes()['domain']) {
            case 'post_tag':
                $data['tags'][] = (string) $metaItem->attributes()['nicename'];
                break;
            case 'category':
                $data['categories'][] = (string) $metaItem->attributes()['nicename'];
                break;
        }
    }
    $data['categories'] = array_unique($data['categories']);
    $data['tags'] = array_unique($data['tags']);
    $posts[] = $data;
    // Use buffering to capture the output and then create a folder and dump the contents inside it
    ob_start();
    require 'post_template.php';
    $output = ob_get_clean();
    $name = date('Y-m-d', $data['created']) . '-' . generateSlug($data['title']);
    // Create the folder (and ./posts itself) only if it is not already there
    if (!is_dir("./posts/{$name}")) {
        mkdir("./posts/{$name}", 0777, true);
    }
    file_put_contents(
        "./posts/{$name}/{$name}.md",
        $output
    );
}
// Lowercase, hyphen-separated slug containing only letters and digits
function generateSlug($string) {
    $string = preg_replace('/[^a-zA-Z0-9\s]/', '', $string);
    $string = preg_replace('/\s+/', '-', $string);
    $string = strtolower($string);
    return trim($string, '-');
}
// Echo a value followed by a newline (used by post_template.php)
function _e($var) {
    echo $var . PHP_EOL;
}
I used the HTML to Markdown package by the PHP League to convert the HTML inside the XML to markdown.
A quick breakdown of the code:
- Loop through each item in the XML as they represent blog posts.
- The WordPress exporter uses XML namespaces, so take those into account when reading the XML items.
- The $data array contains all the information I need to create a single 11ty template for my blog post.
- Use buffering to capture the template content and create a folder to store it.
- The file post_template.php contains the 11ty post template (see below for details).
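I now had a posts folder with one sub-folder per post, named with the publish date and a slug of the title, each holding a single markdown file of the same name.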
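To make the naming concrete, here is a rough JavaScript port of the slug and path logic from the PHP script above. It is an illustration only: `postPath` is a hypothetical helper, the sample title and timestamp are made up, and the date is formatted in UTC, whereas PHP's `date()` uses the server's timezone.

```javascript
// Illustrative JavaScript port of the PHP generateSlug() helper above.
function generateSlug(text) {
  return text
    .replace(/[^a-zA-Z0-9\s]/g, "") // drop punctuation, including the quotes around the title
    .replace(/\s+/g, "-")           // turn each run of whitespace into a single hyphen
    .toLowerCase()
    .replace(/^-+|-+$/g, "");       // trim leading and trailing hyphens
}

// Hypothetical helper: builds the "YYYY-MM-DD-slug" name used for both the
// folder and the markdown file inside it.
// Assumption: the date is taken in UTC; the PHP script uses the server timezone.
function postPath(title, unixSeconds) {
  const day = new Date(unixSeconds * 1000).toISOString().slice(0, 10);
  const name = `${day}-${generateSlug(title)}`;
  return `./posts/${name}/${name}.md`;
}

// Made-up title and timestamp, for illustration only.
console.log(postPath('"Hello, World!"', 1700000000));
// → ./posts/2023-11-14-hello-world/2023-11-14-hello-world.md
```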
Setting up 11ty
I used this super simple starter kit to set up a blog template in 11ty. 11ty supports a bunch of templating languages. To keep things simple, I decided to go with markdown, since I can then easily convert it to HTML.
When you're creating templates, you need to be familiar with 11ty's Front Matter Data concept. Simply put, it's a way of adding metadata to your template that you can use when generating the static file.
Remember the template file that I used in the previous script? This is what it looked like:
---
title: <?=_e($data['title'])?>
description: <?=_e($data['description'])?>
date: <?=_e(date('Y-m-d', $data['created']))?>
tags:
<?php foreach ($data['tags'] as $tag) { ?>
- <?=_e($tag)?>
<?php } ?>
categories:
<?php foreach ($data['categories'] as $category) { ?>
- <?=_e($category)?>
<?php } ?>
---
<?=$data['body']?>
As you can see, my conversion script added all the metadata I needed and created the posts folder, which I could simply drop into my blog starter kit.
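One thing to watch with a template like this: only the title is quoted, so a description that happens to contain something like a colon followed by a space could produce front matter that does not parse as YAML. Below is a minimal sketch of one way to guard against that, written in JavaScript rather than the PHP used above. `toFrontMatter` and the sample post are hypothetical, and the quoting relies on the assumption that the YAML parser accepts JSON-style double-quoted strings.

```javascript
// Hypothetical helper: renders a key with a list of quoted values, or an
// explicit empty list when there is nothing to show.
function yamlList(key, items) {
  if (items.length === 0) {
    return [`${key}: []`];
  }
  return [`${key}:`, ...items.map((item) => `  - ${JSON.stringify(item)}`)];
}

// Hypothetical helper: serialises a post object into front matter plus body.
// Assumption: JSON.stringify() output is valid as a YAML double-quoted string.
function toFrontMatter(post) {
  return [
    "---",
    `title: ${JSON.stringify(post.title)}`,
    `description: ${JSON.stringify(post.description)}`,
    `date: ${post.date}`,
    ...yamlList("tags", post.tags),
    ...yamlList("categories", post.categories),
    "---",
    "",
    post.body,
  ].join("\n");
}

// Made-up post data, for illustration only.
console.log(
  toFrontMatter({
    title: "Moving to 11ty",
    description: "Tips: what went wrong...",
    date: "2023-11-14",
    tags: ["11ty", "wordpress"],
    categories: [],
    body: "Post body in markdown.",
  })
);
```

Here the raw title goes in without the extra quotes the PHP script adds, since the helper does the quoting itself.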
At this point, I felt like I was on the home stretch. It felt safe enough to think about what I was going to have for lunch.
Early Lunch
I served up the site using npx @11ty/eleventy --serve and it was all downhill from there. There were quite a few broken pages and all the images were borked.
Suffice to say, there was no early lunch.
Clean Up Tasks
Once the initial conversion was done, it turned out there was still a lot of cleanup left. I've listed the biggest issues I ran into below.
Bad HTML Blocks
Problem: The HTML that WordPress generated was fine for the most part. But there were weird-looking blocks anytime I used the blocks feature of the WordPress editor.
Solution: These needed to be manually cleaned out.
Breaking Embeds
Problem: Some of my blog posts had embeds for things like videos and slides. These don't convert to markdown so they needed to be altered manually.
Solution: Use 11ty shortcodes and Iframely. Once I include the Iframely JS in my layout file, I can have embeds in my templates.
//Create a file _includes/shortcodes/embed.js
module.exports = async function(url) {
  return `
    <div class="iframely-embed">
      <div class="iframely-responsive">
        <a data-iframely-url href="${url}"></a>
      </div>
    </div>
  `.replace(/(\r\n|\n|\r)/gm, "");
}
// Then add this in your eleventy.config.js file
const shortCodeEmbed = require("./_includes/shortcodes/embed");
eleventyConfig.addShortcode("embed", shortCodeEmbed);
Once that's done, you can use it in your template like so:
{% embed "https://www.youtube.com/watch?v=HzV5bmguVCE" %}
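For context, the `addShortcode` call has to sit inside the function that the config file exports, because that is where `eleventyConfig` comes from. Here is a minimal sketch of how that wiring might look. To keep it self-contained, the shortcode is inlined as a plain synchronous function instead of being loaded with `require()`, and the name `configureEleventy` is made up for the example.

```javascript
// eleventy.config.js (sketch)
// The shortcode is inlined here so the example stands alone; in the post it
// lives in _includes/shortcodes/embed.js and is loaded with require().
function embed(url) {
  return [
    '<div class="iframely-embed">',
    '<div class="iframely-responsive">',
    `<a data-iframely-url href="${url}"></a>`,
    "</div>",
    "</div>",
  ].join("");
}

// Eleventy calls the exported function with its config object.
// "configureEleventy" is a hypothetical name; any function works.
function configureEleventy(eleventyConfig) {
  // Registers the shortcode so templates can use {% embed "..." %}.
  eleventyConfig.addShortcode("embed", embed);
}

module.exports = configureEleventy;
```

The markup is the same as in the shortcode file; building it from an array just avoids stripping newlines afterwards.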
Breaking Images
Problem: ALL the images were broken. They all pointed to the old WordPress installation, which no longer exists at that URL.
Solution: Bit of a slog. First I copied all the images to my local drive. Since I don't use that many images in a post, it was a quick manual job to copy them to the appropriate blog post folder. After that, I created a few shortcodes to help me with adding images to my posts.
//_includes/shortcodes/attribute-photo.js
module.exports = function(author = "Unknown", authorLink = null, source = null, sourceLink = null) {
  let text = "Photo by ";
  if (authorLink) {
    text = `${text} <a target="_blank" href="${authorLink}">${author}</a>`;
  } else {
    text = `${text} ${author}`;
  }
  if (sourceLink) {
    text = `${text} on <a target="_blank" href="${sourceLink}">${source}</a>`;
  } else if (source) {
    text = `${text} on ${source}`;
  }
  return text;
}
//_includes/shortcodes/image-url.js
const Image = require('@11ty/eleventy-img');
const path = require('path');
module.exports = async function(image) {
  // Resolve the image relative to the post that uses the shortcode
  let imagePath = path.join(path.dirname(this.page.inputPath), image);
  let stats = await Image(imagePath, {
    outputDir: "./_site/img/",
  });
  return stats.webp[0].url;
}
//_includes/shortcodes/captioned-photo.js
const attrPhoto = require('./attribute-photo');
const imageUrl = require('./image-url');
module.exports = async function(image, caption, author = null, authorLink = null, source = null, sourceLink = null) {
  const attribution = author === null ? '' : attrPhoto(author, authorLink, source, sourceLink);
  // Pass `this` through so image-url.js can read this.page
  const url = await imageUrl.call(this, image);
  return `
    <div class="captioned-photo">
      <img src="${url}" alt="${caption}">
      <small>${caption} ${attribution}</small>
    </div>
  `.replace(/(\r\n|\n|\r)/gm, "");
}
Once I had these, I manually updated the posts to serve images through the shortcodes.
{% captionedPhoto "./muppet.jpg", "Which Muppet Are You" %}
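I did this part by hand, but a small script could help spot posts that still reference the old WordPress host. A rough sketch follows, assuming the images sat under WordPress's default /wp-content/uploads/ path. `findWordPressImages`, the sample markdown, and its URL are all hypothetical.

```javascript
// Hypothetical helper: returns the unique URLs in a markdown string that still
// point at a WordPress uploads folder.
// Assumption: the old site used the default /wp-content/uploads/ path.
function findWordPressImages(markdown) {
  const pattern = /https?:\/\/[^\s)"'<>]+\/wp-content\/uploads\/[^\s)"'<>]+/g;
  return [...new Set(markdown.match(pattern) || [])];
}

// Made-up post content, for illustration only.
const sample = [
  "![Muppet](https://old-blog.example/wp-content/uploads/2020/01/muppet.jpg)",
  "Some text with a [normal link](https://example.com/about).",
].join("\n");

console.log(findWordPressImages(sample));
```

Running it over each generated markdown file would give a checklist of images still to copy and swap over to the shortcodes.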
Almost There
By this point, lunch was a couple of days behind me. But I was close in terms of how I wanted the blog to behave. I still had unoptimized images, bad canonical links, and AWS Lambda issues to deal with, but I was Almost There™.
In the next post, we'll cover my automation woes. I'll share the 11ty config I use.