Monday, 19 January 2026

To Markdown or not to Markdown

The Griffmonster Walks has always relied upon XML workflows where the raw data is composed either in walkML, our very own flavour of XML dedicated to describing routes and trails, or more recently as an HTML extension to the GPX metadata.


In either case the markup has to be added manually to the authored content, which can be time-consuming. Even the walkML can take custom HTML sections in addition to the native XML format. It has historically been one of the tedious parts of the entire workflow, and great use has been made of copying and pasting existing data to provide a template framework.


In addition, over the years, it has become apparent that there are no totally free or open source options to undertake the authoring, and we have relied upon Notepad++ using the XML plugin as our authoring tool. This is the mainstay of all development and authoring for Griffmonsters Walks.

More recently we have considered employing Markdown to author the content and then converting this to the required XML/HTML. Markdown is a lot easier to write than hand-coded HTML/XML, and with many years of experience in authoring Markdown, it seems that this may be a way forward. A similar project was undertaken some years ago during my employment days, but for whatever reason it was abandoned. Unfortunately I was not directly involved in the project and therefore was not party to why it was shelved.


So, it is time to look at this from the viewpoint of Griffmonsters Walks. Thus far, two options have presented themselves:


  1. An online solution, StackEdit https://stackedit.io/app# with HTML export
  2. An offline solution using Notepad++ with the Markdown Viewer plugin which also has a native HTML export

The idea is to author the data in Markdown, export to HTML and use XSLT to adjust the data into the HTML code required for the walk data. This sounds fairly simple to undertake, provided we use a few simple rules in the Markdown to define the various sections of the data.

Further to this, an initial investigation has revealed more. One issue with most of the HTML export routines is that the output is not structured. This can be overcome in a subsequent XSLT, but I would rather start with properly structured data. There are a great many Markdown editors, but I am favouring Notepad++ because it is my go-to tool for authoring and code development.

Another tool that has been found is Pandoc https://github.com/jgm/pandoc which can return structured HTML. This is a command line tool which I can integrate into an Ant workflow.
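As a rough illustration of what calling Pandoc from a script might look like, here is a minimal Python sketch. It is a stand-in for the Ant integration, not the actual build file. The file names `walk.md` and `walk.html` are hypothetical, and the choice of `--section-divs` (which asks Pandoc to wrap each heading and its content in a `section` element) is an assumption about which switch gives the structured output; the post does not say which switches are used.

```python
import shutil
import subprocess
from pathlib import Path


def build_pandoc_command(source: Path, target: Path) -> list[str]:
    """Return the argument list for converting one Markdown file to HTML."""
    return [
        "pandoc",
        "--from", "markdown",
        "--to", "html5",
        "--section-divs",  # assumed switch: wrap each heading's content in <section>
        "--output", str(target),
        str(source),
    ]


def convert(source: Path, target: Path) -> None:
    """Run Pandoc, failing loudly if it is missing or reports an error."""
    if shutil.which("pandoc") is None:
        raise RuntimeError("pandoc was not found on the PATH")
    subprocess.run(build_pandoc_command(source, target), check=True)
```

Calling `convert(Path("walk.md"), Path("walk.html"))` would then produce the HTML file. Keeping the command in its own function makes it easy to inspect or to copy across into an Ant `exec` task.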

This will be another little project to keep me out of trouble!

Postscript

This was supposed to be a little investigation, but it has gone so well that it has turned into a whole new workflow. So here is what we have done:

  1. Having looked around at the options, Notepad++ was the most familiar and easy to use for authoring in Markdown. It doesn't really matter, as any Markdown editor can be employed as long as the output is consistent
  2. Use Pandoc to convert Markdown to HTML. This runs a lot better than expected, with switches to provide a template to output into, adjustment of white space, and most importantly structured HTML markup. It even, by default, marks up images exactly how we mark up images within the blog, using the figure and figcaption elements
  3. Use a simple XSLT to adjust the Pandoc output to the HTML required by the current pipelines, which basically adjusts identifiers and classes
  4. Added another XSLT to merge the HTML into the GPX Metadata Extension
  5. Integrated this into an Ant workflow which now enables authoring, hitting the button and seeing the end result churned out ready to publish

The only caveats are that it does require adherence to the h1 headings which drive it. That is not a big issue, as a Markdown template will be a sufficient starting point. The file name convention also needs to be adhered to, and currently the metadata still sits in the GPX file. This too could be moved into the Markdown, although it is simply a matter of filling in fields in the GPX.
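Since the h1 headings drive everything, a quick check before running the build could catch a missing section early. The following Python sketch shows one way this might be done. The heading names in `REQUIRED_HEADINGS` are invented for the example and are not the real template's section names, and only `#`-style headings are recognised, not underlined ones.

```python
import re

# Hypothetical section names; the real template's headings may differ.
REQUIRED_HEADINGS = ["Overview", "Directions", "Notes"]

# One '#', whitespace, the title, then optional closing hashes.
H1_PATTERN = re.compile(r"^#\s+(.+?)\s*#*\s*$")


def h1_headings(markdown_text: str) -> list[str]:
    """Return the text of every '# ' heading, skipping fenced code blocks."""
    headings = []
    in_fence = False
    for line in markdown_text.splitlines():
        if line.startswith("```"):
            in_fence = not in_fence
            continue
        if in_fence:
            continue
        match = H1_PATTERN.match(line)
        if match:
            headings.append(match.group(1))
    return headings


def missing_headings(markdown_text: str, required=REQUIRED_HEADINGS) -> list[str]:
    """List the required headings that the document does not contain."""
    found = set(h1_headings(markdown_text))
    return [name for name in required if name not in found]
```

An empty list from `missing_headings` means the document has every expected section. Anything else names the sections still to be written.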

This has taken just a single day to both author a sample document and develop the workflow. I really never expected that. This will speed up the blog post generation no end in the future. I think I deserve a drink for effort!
