Tag Archives: tech-writing

Why Markdown is not my favorite text markup language

There are many text markup languages that purport to allow you to write in a simple markup format and publish to the web. Markdown has arguably emerged as the “king” of these formats. I quite like it myself when it’s used for writing short documents with relatively simple formatting needs. However, it falls a bit short when you start to do more elaborate work. This is especially the case when you are trying to do any kind of “serious” technical authoring.

I know that “Markdown” has been used to write technical books. Game Programming Patterns is one excellent example; you can read more about the author’s use of Markdown here, and the script he uses to extend Markdown to meet his needs is here. (I recommend reading all of his essays about how he wrote the book, by the way. They’re truly inspiring.) Based on that author’s experience (and some of my own), I know that Markdown can absolutely be used as a base upon which to build ebooks, websites, wikis, and more. However, this is exactly why I put the term “Markdown” in quotes at the beginning of this paragraph. By the time you’ve extended Markdown to cover your more featureful technical authoring use cases, it really isn’t “just” Markdown anymore. This is fine if you just want to get something done quickly that meets your own needs, but it’s not ideal if you want to work with a meaningful system that can be standardized and built on.

Below I’ll address just a few of the needs of “industrial” technical writing (the kind that I do, ostensibly) where Markdown falls a little short. Lest this come off as too negative, it’s worth stating for the record that a homegrown combination of Markdown and a few scripts in a git repo with a Makefile is still an absolute paradise compared to almost all of the clunky proprietary tooling that is marketed and sold for the purposes of “mainstream” technical writing. I have turned to such a homebrewed setup myself in times of need. I’ve even written about how awesome writing in Markdown can be. However, this essay is an attempt to capture my thoughts on Markdown’s shortcomings. Like any good internet crank, I reserve the right to pull a Nickieben Bourbaki at a later date.

I. No native table support

If you are doing any kind of large-scale tech docs, you need tables. Although constraints are always good, and a simple list can probably replace 80% of your table usage if you’re disciplined, there are times when you really just need a big honkin’ table. And as much as I’m used to editing raw XML and HTML directly in Emacs using its excellent tooling to completely sidestep the unwanted “upgrade” to the Confluence editor at $WORK, most writers probably don’t want to be authoring tables directly in HTML (which is the “native” Markdown solution).

II. No native table of contents support

Yes, I can write a script myself to do this. I can also use one of the dozens of such scripts written by others. However, I’d rather have something built in, and consider it a weakness of the format.
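For illustration, here is roughly what such a script might look like. This is a minimal sketch in Python, not any particular existing tool. It assumes ATX-style (`#`) headings only and triple-backtick code fences, and the `slugify` rule is a made-up convention; real renderers generate heading anchors in different ways.

```python
import re

# Matches ATX headings such as "## Setup steps" (levels 1 through 6).
HEADING_RE = re.compile(r"^(#{1,6})\s+(.+?)\s*$")


def slugify(title):
    """Hypothetical anchor rule: lowercase, drop punctuation, hyphenate spaces."""
    slug = re.sub(r"[^\w\s-]", "", title.lower())
    return re.sub(r"\s+", "-", slug.strip())


def build_toc(markdown_text):
    """Return a nested Markdown list linking to every heading in the text."""
    entries = []
    in_fence = False
    for line in markdown_text.splitlines():
        # Skip fenced code blocks so "# comment" lines are not treated as headings.
        if line.startswith("```"):
            in_fence = not in_fence
            continue
        if in_fence:
            continue
        match = HEADING_RE.match(line)
        if match:
            level = len(match.group(1))
            title = match.group(2)
            indent = "  " * (level - 1)
            entries.append(f"{indent}- [{title}](#{slugify(title)})")
    return "\n".join(entries)
```

Calling `build_toc` on a document's text yields a list you could paste at the top of the file. The fact that the anchor format has to be guessed per renderer is part of why a built-in feature would be preferable.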

III. Forcing the user to fall back to inline HTML is not really OK

Like tables, there are a number of other formatting and layout use cases that Markdown can’t handle natively. As with tables, you must resort to just slapping in some raw HTML. Two reasons why this isn’t so amazing are:

  • It’s hard for an editor to support well, since editing “regular” text markup and tag-based markup languages are quite different beasts
  • It punts complexity to thousands of users in order to preserve implementation simplicity for a small number of implementors

I can sympathize with the reasoning behind this design decision, since I am usually the guy making his own little hacks that meet simple use cases, but again: not really OK for serious work.

IV. Too many different ways to express the same formatting

This has led to a number of incompatibilities among the different “Markdown” renderers out there. Just a few of the areas where ambiguity exists are: headers, lists, code sections, and links. For an introduction to Markdown’s flexible semantics, see the original syntax docs. Then, for a more elaborate description of the inconsistencies and challenges of rendering Markdown properly, see Why is a spec needed?, written by the CommonMark folks.

V. Too many incompatible flavors

There are too many incompatible flavors of Markdown that each render a document slightly differently. For a good description of the ways different Markdown implementations diverge, see the Babelmark 2 FAQ.

The “incompatible flavors” issue will hopefully be addressed with the advent of the CommonMark Standard, but if you read the spec it doesn’t address points I, II, or III at all. This makes sense from the perspective of the author of a standards document: a spec isn’t very useful unless you can achieve consensus and adoption among all the slightly different implementations out there right now, and Markdown as commonly understood doesn’t try to support those cases anyway.

VI. No native means of validation

There will of course be a reference implementation and tests for CommonMark, which will ensure that the content is valid Markdown, but for large-scale documentation deployments, you really need the ability to validate that the documentation sets you’re publishing have certain properties. These properties might include, but aren’t limited to:

  • “Do all of the links have valid targets?”
  • “Is every page reachable from some other page?”

Markdown doesn’t care about this. And to be fair it never said it would! You are of course free to use other tools to perform all of the validations you care about on the resulting HTML output. This isn’t necessarily so bad (in fact it’s not as bad as points I and II in my opinion, since those actually affect you while you’re authoring), but it’s an issue to be aware of.

This is one area where XML has some neat tooling and properties, although I suppose you could do something workable with a strict subset of HTML. You could also use pandoc to generate XML, which you then validate according to your needs.
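As a rough sketch of what checking the rendered HTML could look like, the two questions listed above can be answered with nothing but the Python standard library. Everything here is an assumption made for illustration: the `check_site` name, the idea that pages live in a flat directory and link to each other by bare file name, and the choice to exempt an `index.html` entry page from the orphan check.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def internal_links(html):
    """Return link targets within the doc set, with any #fragment removed."""
    collector = LinkCollector()
    collector.feed(html)
    return [
        href.split("#")[0]
        for href in collector.hrefs
        # Ignore external URLs, in-page anchors, and mail links.
        if "://" not in href and not href.startswith(("#", "mailto:"))
    ]


def check_site(pages, entry="index.html"):
    """pages maps file name -> HTML text. Returns (broken_links, orphans)."""
    broken = []
    linked = set()
    for name, html in pages.items():
        for target in internal_links(html):
            if target not in pages:
                broken.append((name, target))  # link with no valid target
            elif target != name:
                linked.add(target)  # linked from some *other* page
    # Pages nobody links to, apart from the assumed entry page.
    orphans = sorted(set(pages) - linked - {entry})
    return broken, orphans
```

A build script could call `check_site` on the generated output and fail the build when either list is non-empty. It would need extending to handle subdirectories and relative paths before being used on a real documentation set.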

Conclusion

Markdown solves its original use case well, while punting on many others in classic Worse is Better fashion. To be fair to Markdown, it was never purported to be anything other than a simple set of formatting conventions for web writing. And it’s worth saying once more that, even given its limitations, a homegrown combination of Markdown and a few scripts in a git repo with a Makefile is still an absolute paradise compared to almost all of the clunky proprietary tooling that is marketed and sold for the purposes of “mainstream” technical writing.

Even so, I hope I’ve presented an argument for why Markdown is not ideal for large scale technical documentation work.

(Image courtesy Gerwin Sturm under a Creative Commons license.)

Applying Lean Principles to the Documentation Lifecycle

Earlier, I promised to post my notes from talks I attended at the 2014 STC Summit. This talk, by Alan Houser, was probably the most impactful of the Summit for me. The tl;dr version is simply this: Find out what your customers value, and spend your time doing that.

Below is a lightly edited version of the notes I took during the session. The content of the talk is copyright Mr. Houser, and any errors are mine.

Big Ideas

  • Build/measure/learn
  • get out of the building
  • minimum viable product
  • pivot

How much of what we do truly provides value to the customer?

What we care about

  • deliverables
  • schedules
  • tools
  • org structure
  • office politics
  • legacy file formats

What customers care about

  • can i find it?
  • does it help me?

The Pivot

Can we, based on data, adjust what we do?

“We’ve always done it this way”.

How Companies Pivot

  • budget cuts
  • re-org
  • reduction in force

What Works?

Do That.

What Doesn’t?

Don’t Do That.

What do you measure?

  • pages?
  • topics?
  • words/topic?
  • word count of doc set
  • average word count of headings?
  • readability score?
  • hours/topic?
  • percentage of reuse?
  • revisions/time
  • customer views/topic
  • number of unique words

Do You Get Out of the Building?

What is Waste?

  • things that don’t provide customer value
  • waste time, money, resources, focus
  • (some orgs try to do too much)
  • let’s document this corner case
  • let’s adjust this formatting
  • let’s deliver a CHM file

Let It Go!

Are you continually asking: How does this provide value?

Do you pivot when your process is not aligned with customer value?

Rocky Balboa did two things in the story:

1. Transformed himself

2. Massively Exceeded Expectations

How to exceed expectations?

1. learn something new

2. try something different

3. talk to customers

4. measure something you haven’t before

(Image courtesy dirtyf under Creative Commons License)

Thoughts on the 2014 STC Summit

This is a collection of random thoughts based on my attendance at the 2014 STC Summit earlier this week. I will try to post my more detailed notes from the various individual talks over the next days and weeks.

Lots of proprietary tools, not so much open source

There are lots of proprietary document creation and management tools, and their vendors seem to be well-represented here. Coming from a hybrid tech-writing/programming background, I have to admit that some of the proprietary solutions looked sort of strange to me. Many appeared to be Windows-only to boot.

It seems like there is a lot of opportunity for open-source software to make inroads here. I wonder what it would take to bring the XML-editing capabilities of open source editors like Emacs and Vim up to date (if they aren’t already) to match the capabilities of proprietary tools like FrameMaker, Oxygen XML Editor, MadCap Flare, and the like.

There are a few reasons why open source and tech comm could be a match made in heaven, especially when you consider that whatever improvements in process or tooling you create in open source environments are yours to keep, free of charge, forever. This is definitely not the case in proprietary environments. When you develop your own automation and tooling against proprietary tools, and the vendor breaks stuff, you’re often out of luck.

DITA and XML

It turns out that XML and the transformation tools that work on it such as XSLT and friends are pretty powerful. I have been aware of the existence of these technologies but haven’t used them much thus far in my career.

I feel like I understood the appeal of the DITA XML spec/style better after attending a great talk given by Caitlin Cronkhite and Ted Kuster from Salesforce. As I understood them to say, DITA is just a way of structuring your XML into topics that other tools can then use to create your documentation set with minimal repetition on the part of the writer. (I will put up my notes from that talk in another post.)

However, I admit I still don’t fully understand the reason for layering the proprietary environments over the top of the structure provided by DITA. I would probably prefer to author directly in XML using Emacs and nXML-mode. Alternatively, I’d use a markup language, such as a Markdown variant, that could be translated to XML with a script much like my own confluence2html, and build the various document sets I needed using Makefiles.

Key Takeaway: Do More Professional Development

The number one lesson from this trip was that I have a lot to learn (this should be evident from the preceding paragraphs). There are so many tools and techniques out there that I am not aware of. I’ve only been at this tech writing gig for a couple of years now, after all.

I look forward to engaging more tech writers working in other industries to learn about how they do what they do. My hope is that this will allow me to develop my own skills by stealing some of their best ideas while sharing some of my own crazy notions as well.