Life, Death and XML
Professional Publishing

The Joy of Reuse

Reuse is an interesting practice. The idea behind it basically boils down to finding something that already exists that can be used to fill a need. If everything works out, you can avoid the sweat and anguish that invariably come with creating something anew. Seems pretty straightforward. Of course, nothing is quite as simple as it appears. And reuse is no different in this regard.

Actually, reuse as a practice and as an idea comes in many different shades. I had considered naming this post “Fifty Shades of Reuse”, but I figured that if I was going to invite, and then disappoint, a new bevy of visitors to my blog, it’s probably better that they have a range of interests, including a love of cooking. Maybe then there is a chance that some of these new visitors, or at least those who have a prior sense for handling ingredients and tools, will tolerate what they encounter here.

As with almost everything I address myself to, reuse can be considered on an abstract, theoretical level. On this level, it is possible to ask whether or not there is that much around that can be considered genuinely novel. Absolute creativity, it turns out, is extraordinarily uncommon because in almost all cases of creativity there are a variety of precedents and inputs that together make up whatever is being unveiled as new and revolutionary. This is true of documents, works of art, manufactured consumer products, political convictions, or whatever. There is, as I have been known to say, always an antecedent.

Entertaining as some of this may be, I find that it is best to move on quickly to more practical considerations. One of the shades of reuse that gets recurrent attention is “content reuse”. This essentially zeroes in on the assertion that if you have taken the time to write, illustrate or animate something on a given topic, then you should reuse that content wherever it makes sense to do so. No one in their right mind, you would think, would tolerate the practice of re-creating content over and over again. And yet this bizarre practice is closer to the norm than it is to being an abhorrent behaviour that we occasionally hold up to hearty ridicule.

Wheeled Vehicle Platform

As one illustration from the past, there was a manufacturer of rugged military wheeled vehicles who had enjoyed great success with a line of vehicles that offered a number of advantages to its customers. Among the key advantages was the fact that the core platform vehicle had been consciously designed to support a plug-and-play approach to adding and subtracting specialized components to create new configurations. This would allow a customer to tailor their fleet to meet their unique needs without incurring the major support and logistics headaches that would normally come with such a range of capabilities. Reuse of system components, within a framework that facilitates this strategy, was an approach to designing complex equipment systems that was rapidly gaining ground – and this was over 20 years ago.

When I arrived, leading a team to review the documentation practices on this project, we expected that their content would likewise be modular and geared to reuse so it could be efficiently reconfigured to align with whatever selections the customer made. It seemed obvious. But for a number of reasons this was not the case. I recall asking the question, slowly and deliberately, during a review of their documentation practices: “Let me get this straight, you rewrite the content for each of the configurations that your customers select?” They had an answer ready: “Not exactly. We do copy and paste as much text and as many illustrations as we can.” I think my immediate response betrayed some frustration on my part. “How charming” is what I recall saying. And this I followed with “How do you track down the content that will need to be modified when something changes in the base platform configuration?”

It was around this time that we put forward some suggestions on how they could modernize and rationalize their content holdings so that it really could be produced, managed and maintained in a way that mirrored the design of their equipment. The savings would be massive because the level of content reuse across the different configurations hovered between 60% and 90% depending on the configuration. What this meant was that once a base configuration was completely documented, and its content was rationalized internally to maximize the reuse of standardized text and illustration, then it would provide up to 90% of the content needed to complete the documentation for another configuration. The savings and efficiencies were staggering when compared to what was being done at the time with tedious cutting-and-pasting.
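The arithmetic behind those savings is easy to sketch. Here is a back-of-the-envelope calculation; the 60% and 90% reuse rates come from the project described above, but the topic counts and per-topic cost are invented for illustration:

```python
def authoring_cost(topic_count, reuse_rate, cost_per_topic):
    """Cost to document one configuration when reuse_rate of its
    topics can be pulled unchanged from an existing base."""
    fresh_topics = topic_count * (1 - reuse_rate)
    return fresh_topics * cost_per_topic

# Hypothetical figures: 500 topics per configuration, $400 per topic.
base = authoring_cost(500, 0.0, 400)        # base config, written fresh
low_reuse = authoring_cost(500, 0.6, 400)   # variant with 60% reuse
high_reuse = authoring_cost(500, 0.9, 400)  # variant with 90% reuse

print(f"base: {base:,.0f}  60% reuse: {low_reuse:,.0f}  90% reuse: {high_reuse:,.0f}")
```

The same arithmetic applies to maintenance, only more dramatically: a change to a reused topic propagates everywhere it is referenced, while copy-and-paste multiplies the update cost by the number of copies that have to be found and fixed.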

When I sketched out the alternative to their current practices, the senior manager overseeing this documentation group was skeptical and used the tried and true method of feigning management prudence and requiring that a more detailed business case be prepared before any actions would be taken. By this time, I had grown very tired of this rhetorical posture and being a brash young buck I immediately fired back with “No, I don’t think so. I have a better idea. I would like you to prepare a business case that explains how wasting time and money by creating and maintaining redundant content is a good idea. Please also include information on why the reduced equipment availability, elevated repair costs, and increased safety risk to operators are an acceptable outcome of your current practices.” As occurred on each occasion when I deployed this counter-argument, in the military and elsewhere, things got a little animated at around this time.

Angry Management

So it is that content reuse is one side to the practice of reuse and one that merits a lot of attention. It is fully astounding at times how much improvement can be made through the application of even a little bit of discipline to how content is designed and managed so that reuse can be done efficiently and effectively. There are many more sides to reuse, such as technology reuse which, as in the example of our wheeled equipment vehicle above, can be deployed to make systems more scalable, more extensible and more maintainable. But for now, I would like to keep my eye on the joys of content reuse.

DITA Metrics 101

And this brings me to the relatively new book DITA Metrics 101 by my colleague and good friend Mark Lewis. It is a practical guide to identifying, and more importantly quantifying, the specific ways in which content reuse can deliver tangible benefits to an organization. Among the things that recommend this book is that it really does zero in on the details that a documentation team can use to enumerate how they will leverage content reuse to streamline their activities, to save money and to deliver higher quality information products. The practical examples provided in DITA Metrics 101 equip readers with a framework to plan out their approach to modernizing and rationalizing their content, and to prepare the associated business case. It is certainly to be recommended over my more confrontational and definitely more explosive rhetorical manoeuvres as touched upon in the above project recollection. Also associated with the DITA Metrics 101 book is a set of spreadsheet templates that can be acquired separately so that a project team can jump right in and start exploring their future benefits. See ditametrics101.com for more information.

Now the name of Mark’s book DITA Metrics 101 does highlight one acronym that we should deconstruct for those readers who may not be familiar with it. This is DITA, or the Darwin Information Typing Architecture. I will need to come back to DITA on a separate occasion, but one story can help to situate it for the uninitiated. I was walking by the river in the parkland behind my house when I was confronted by the last question you would expect in such a location – “What is DITA?” Perhaps because I was caught off guard, I blurted out the first thing that came to mind – “It’s a collection of recycled SGML dirty tricks that help people to handle content that has been optimized for reuse.” Needless to say, this answer just made things worse for my sylvan inquisitor.

Sometimes, however, the first answer is better than it first appears. By invoking the memory of SGML (Standard Generalized Markup Language), I was summoning up a key truth: a generation of content technology specialists had invested great energy and imagination in addressing the challenges that emerge when you really try to rationalize your content so as to bring effective automation to bear in support of a modernized content process. And the people who were behind the origins of DITA were definitely veterans of these earlier adventures, and DITA reflects this heritage in many ways. Now some DITA features are not what we would usually term beautiful or elegant, hence my use of the phrase SGML dirty tricks, but they do have the advantage that they work when deployed as part of the overall DITA solution framework. And in general, we shouldn't quibble too much about something that works.

And this brings us to another form of reuse – the reuse of past experience and evolving knowledge that is really only available from a community of practitioners who are organized in a way that facilitates sharing. The DITA community fits this description. So when any organization takes up the DITA standard and its associated reference solution framework they are in effect reusing an approach that has a long heritage. And by their efforts, they will hopefully contribute new insights to the store of reusable knowledge that will be available to those that come after them.

The Virtuous Cycle of Reuse

Epilogue

At the CM Strategies / DITA North America conference, convened in Providence RI in April of 2013, I delivered a presentation on this topic. There was a little extra excitement when my presentation was interrupted 5 minutes in by an evacuation order due to a security situation just outside the conference center. We were able to reconvene after a few minutes and we started more or less where we left off. I noticed, after the fact, that the interruption did in fact knock me off stride and while I had plenty to say on the topic (as usual) I did not return to all the points that I had planned out in my notes. I have consequently promised to prepare an article for the Center for Information Development Management (CIDM) newsletter. The slides - slightly augmented - are below...

Comments


Scott Abel

As usual, great content. Thanks for sharing, Joe!

Don Day

"SGML dirty tricks" were used in DITA out of necessity, Joe, because the then-new XML spec was not quite complete enough to deserve the X for "eXtensible." In fact, out of the highly popularized X-troika (XSL, XLink and XPointer) of follow-on standards, only XSL-T (for transform) was reliably functional at the time, so it was explored deeply for whatever it could do to provide, in a standards-compatible way, some of the richness of SGML in lieu of the other nascent standards. Using @spec (later changed to @class) as an architectural form identifier tied to an absolute base class enabled the eXtensibility that we felt was missing in XML, and an XSL-T trick for copying XML "snippets" became the basis for the missing SGML conref feature (which met the need for general entities, which we could see were unusable for the distributed page model implicit in XML for the Web, and which didn't exist in XML Schema).

Thus the earliest talking points about the architecture that I recall Michael Priestley expressing were that DITA provided reuse of content (via conref and topicref), reuse of design (by referencing prior classes and their content models as the basis for new elements and their content) and reuse of processing (the ability of XSL-T templates and CSS properties to trigger on the base class if nothing else, ensuring always-reasonable default processing of newly-specialized elements). Dave Schell, the development manager who sponsored the DITA design activity, would often say, "and DITA is all about reuse." Dirty tricks indeed!
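For readers new to the mechanisms Don names here, a minimal sketch of conref in DITA markup may help. The element names and the conref addressing syntax are standard DITA; the file name, ids and warning text are invented for illustration:

```xml
<!-- base-platform.dita : the canonical warning, written once -->
<topic id="base-platform" class="- topic/topic ">
  <title>Base Platform Maintenance</title>
  <body>
    <note id="brake-warning" type="warning">Chock the wheels before
      servicing the brake assembly.</note>
  </body>
</topic>

<!-- In a variant configuration's topic, the warning is pulled in by
     reference (conref) rather than copied and pasted. The @class
     attribute above is the "architectural form" identifier that ties
     any specialized element back to its base vocabulary. -->
<note conref="base-platform.dita#base-platform/brake-warning"/>
```

When the base warning changes, every configuration that conrefs it picks up the change on the next build, which is precisely the maintenance problem that defeated the copy-and-paste approach described earlier in the post.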

Joe Gollner

Hi Don

I would add one important point here and that is that in my vernacular the epithet "dirty tricks" is a compliment. It's the sort of thing that "old hands", or as the Germans delightfully put it "old rabbits", do in order to get things done and to save a little energy and grief doing so.

I can't resist one story. It was 2004, and I received a visit from an enthusiastic sales resource from an XML product vendor. The question this individual repeated, to the point of being annoying, was "how much DITA experience does your team have?" I had to smile. As it was only 2004, there was not much DITA experience around simply because there were not many DITA implementations around. But the question conveyed some interesting things about my innocent inquisitor. To this individual, DITA was something akin to manna from heaven as it seemed to supply a powerful religious sign that would finally bring unity and momentum to a fractured content management industry that had grown tired of wrestling with so much complexity.

Now having been around awhile, I took a look at DITA and recognized almost everything I saw in it - from the delightful reuse of the term conref through to the echoes of HyTime architectural forms. All this I took as exceedingly good signs even if the precise implementation was different than what we were in the habit of doing in the good old days of SGML.

I recognized all these items precisely because in the second generation defense standards I had worked on under the CALS (Continuous Acquisition and Lifecycle Support) umbrella we had similarly pursued reuse to almost every possible conclusion. And our implementations worked remarkably well, I must say. In fact, one of our strict rules was constraining the natural tendency of technical resources to gravitate towards self-defeating sophistication by setting exceptionally harsh tests for the standards we were creating and their associated reference implementations. One such constraining test was that validation and resolution of all content artifacts must be something we could do on a 286 computer, running only DOS and a conformant public domain SGML parser. An admittedly harsh test but it had the desired effect. Our final reference implementation addressed a very wide range of publishing requirements and did so with essentially a tea cup full of application code. By leveraging ideas such as information typing and publishing overrides, we managed to distill literally hundreds of pre-existing DTDs into one that had fewer elements and attributes than HTML and to compress literally dozens of complex FOSI stylesheets into one articulated process leading to one modular (and multi-output) stylesheet. So in DITA I recognized a fellow traveller, albeit ten years after the work we had been doing.

And the leap into the nascent world of XML recommendations and tools definitely did complicate matters for those seeking to carry the lessons from that earlier era forward. I can definitely sympathize. In some of my more cantankerous moods, I have been heard to say that in several regards the new galaxy of XML development tactics did not do us many favours when it came to tackling the hard problems of handling content well. Elsewhere I have explored the details and implications of this phenomenon, which I dubbed "XML in the Wilderness".

So in response to your comment, Don, I would say that my reference to SGML dirty tricks is meant as a compliment to, and a major plus for, DITA. From very early on in DITA's public life as an OASIS standard, I used the argument that DITA's standing in a longer markup history is a key reason to accord it serious attention. These arguments I found myself using in environments like aerospace and defense projects where DITA did not originally receive a particularly warm reception.

Now it is true that there is another implication behind my connecting DITA back to this earlier generation of standards and implementation efforts. It does tend to deflate those people, like that strange sales resource who came to my office in 2004 essentially looking to cash in on this "new" thing, who see DITA as a revelation and as a deliverance from an earlier era of enshrouding darkness. These are the same people who, given to a form of standards hagiography, quickly descend into attacks on everything different than DITA as "proprietary XML" and other forms of heresy. As you may already know, I have been inclined to treat these people rather roughly - for their own good of course.

Don Day

Agreed, I was hearing your "SGML dirty tricks" as I hear Roger Whittaker singing "Dirty Old Town" with his interpretation of sometimes-too-close familiarity (not a bad thing!).

Your post makes me wonder, though, What is needed in XML to take reuse and contextual adaptiveness (to camp onto the newest content trend) to any higher levels, short of looking for other application-level tricks to emulate such function? I'm pleased that several vendors have been able to create XProc-based processing implementations, for example, but that only applies on the back end. The newest challenges in content architectures (which all seem to be about applying content intelligence) aren't any easier without still relying on the emulation of SGML tricks to improve how we match markup design to new requirements. And increasingly, we've got the challenge of how to incorporate the best of HTML5 and related advances in Web architectures as part of the design portfolio that we are obliged to work with.

You may sense that I'm posing that question with my OASIS DITA Technical Committee hat on... if DITA 2.0 is to be a worthy successor to DITA 1.X, what might we do any differently besides falling back on these SGML dirty tricks, reliable as they may be? It would be nice if the XML standard could give us some new genetic material to work with!

Joe Gollner

You raise a very good point here Don and one that is going to have me tossing and turning for several nights (so thanks for that). In my paper on the Emergence of Intelligent Content, I painted a very optimistic picture of the opportunities that lie before us now that we have several threads potentially coming together - XML returning from the wilderness, social media models for collaboration becoming common practice, DITA bringing a welcome return of attention to core "content challenges", the proliferation in devices driving a renewed interest in multi-channel publishing, and the next generation in Web standards and architectures.

The optimism part is easy although I think that there was merit in formally declaring it. The hard work, as you point out, is creating something that is well and truly a creature of the future. I believe that there are some core architectural strategies, also reused from the past, that can help us but they will not really change the fact that hard work lies ahead. Substantial rewards though are also out there and that in itself might tell us a little something about the path to get there....
