Publishers and publishing services vendors who compose highly designed pages face real difficulties when they try to reuse content from print composition tools. You may need to publish your print content to the web or meet one of many other reuse requirements. Providing an XML version of your content is the logical choice, but how do you make it happen?
If you've wrestled with this scenario before, you might worry about what seems to be an inherent contradiction between the highly designed page and well-structured XML content. Would you be more successful locking a designer and an engineer in a room and asking them to build something with blocks? Can this relationship even work?
Putting that initial thought aside, you might then start to worry about tinkering with your print workflow while your production is traveling 100 miles per hour. You might worry that the technology will become more of an impediment than an aid to getting publications out the door. You might worry about the risks and expenses of developing an XML workflow with today's technology. Where do you find the skill sets needed to work with XML?
The good news is that Adobe InDesign has made strides in enabling successful import and export of XML content. InDesign offers the ability to structure content on the page using a decent manual mapping interface and provides capabilities to help make this process more efficient. Efficiencies include:
While Adobe has provided a good set of functionality out of the box (some of which was introduced in CS2), there is still room to grow. It will be nice to see improvements such as:
Of course, InDesign alone will not give you everything you need. One example is the ability to easily produce MathML for math equations, which is critical to some publishers. Math equation editing has historically been provided by extension and plug-in developers (or by clever techniques in electronic production), and neither equation editing nor MathML is provided out of the box. InMath from i.t.i.p. now provides equation editing and other features, and many of us are eagerly waiting to see i.t.i.p. produce great things with MathML.
It is also very likely that additional transformation work will need to be applied to the exported XML to load it effectively into a content management system or to deliver it in its final form. One example is that InDesign produces a non-standard table structure. While all of the elements are there to recreate a table, the XML will require another transformation to make it standardized or otherwise useful. Really Strategies recently developed XSLT stylesheets to automate the transformation of InDesign pages from the K4 XML export format into valid XML and then on to HTML, and the table conversion was a straightforward part of that transformation.
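As a rough illustration of that kind of table conversion, here is a minimal sketch in Python (standing in for the XSLT mentioned above, which is not reproduced here). It assumes a table exported as a flat list of Cell elements inside a Table element, with the column count and cell spans carried in aid: attributes (tcols, crows, ccols). That shape, the namespace URI, and the sample data are assumptions for illustration; the K4 export format or your own InDesign export may differ.

```python
import xml.etree.ElementTree as ET

# Namespace assumed for the "aid" attributes; verify against your own export.
AID = "http://ns.adobe.com/AdobeInDesign/4.0/"

# Hypothetical sample: a two-column table whose first cell spans both columns.
SAMPLE = (
    f'<Table xmlns:aid="{AID}" aid:table="table" aid:trows="2" aid:tcols="2">'
    '<Cell aid:table="cell" aid:crows="1" aid:ccols="2">Price list</Cell>'
    '<Cell aid:table="cell" aid:crows="1" aid:ccols="1">Widget</Cell>'
    '<Cell aid:table="cell" aid:crows="1" aid:ccols="1">9.99</Cell>'
    '</Table>'
)


def indesign_table_to_html(table):
    """Rebuild rows from a flat list of cells and return an HTML <table> element."""
    n_cols = int(table.get(f"{{{AID}}}tcols"))
    html = ET.Element("table")
    rows = []         # <tr> elements created so far
    occupied = set()  # (row, col) grid slots already covered by a cell or its span
    row = col = 0
    for cell in table:
        # Move to the next free grid slot, wrapping to a new row at the last column.
        while (row, col) in occupied:
            row, col = (row, col + 1) if col + 1 < n_cols else (row + 1, 0)
        rowspan = int(cell.get(f"{{{AID}}}crows", "1"))
        colspan = int(cell.get(f"{{{AID}}}ccols", "1"))
        # Mark every slot this cell covers so later cells skip over it.
        for r in range(row, row + rowspan):
            for c in range(col, col + colspan):
                occupied.add((r, c))
        while len(rows) <= row:
            rows.append(ET.SubElement(html, "tr"))
        td = ET.SubElement(rows[row], "td")
        td.text = "".join(cell.itertext()).strip()
        if rowspan > 1:
            td.set("rowspan", str(rowspan))
        if colspan > 1:
            td.set("colspan", str(colspan))
    return html


html_table = indesign_table_to_html(ET.fromstring(SAMPLE))
print(ET.tostring(html_table, encoding="unicode"))
```

The sketch flattens any inline markup inside a cell to plain text; a production transformation would need to carry that markup through.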
It should also be noted that there are alternate ways of extracting XML from InDesign, including use of InCopy article files, and export features of related tools such as the K4 workflow system or the newer versions of Adobe's Acrobat Professional. It is also possible to use AppleScript or JavaScript with InDesign to do much more complex automated tagging work.
Finally, don't expect to use InDesign as an XML editor. If you need to create complex, deeply tagged XML that conforms 100% to a DTD or schema, you should consider ways to do so through either post-processing or significant changes to your layouts and composition processes, possibly along with the use of another layout program such as FrameMaker. InDesign is still primarily a page design tool, and page design is different from creating structured content. That's why, even though we'd like to see some of the features listed above, we don't believe they are needed to derive significant value from the XML exported from InDesign today.
With all of this in mind, InDesign is certainly capable of being an integrated part of certain XML based solutions. Adobe's case studies are a good starting place. The first case details how T+L Golf tags magazine pages so that articles can be pushed onto the web. The second case details how the Texas Lawyer newspaper tags articles for repurposing into new electronic products using a content management system.
Perhaps the key observation from these case studies is that the two companies have matched the XML tagging effort to very specific repurposing objectives. While we don't know anything about the struggle to succeed, or how they reached their current formats, it is clear that T+L Golf ended up requiring only the minimal tagging necessary to put articles on the web. The article format is frugal, with just paragraph and style tags. While a programmer might look down on this simplicity, the resulting tagging process is much more manageable in the editorial workflow.
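To make the point about frugal formats concrete, here is a minimal sketch in Python of how little code a paragraph-and-style format needs to reach the web. The element names, the style attribute, the style names, and the sample text are all hypothetical; this is not T+L Golf's actual format, which the case study does not spell out.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from paragraph style names to HTML elements.
STYLE_TO_HTML = {"headline": "h1", "byline": "h2", "body": "p"}

# Hypothetical article: a flat run of paragraphs, each tagged with a style.
ARTICLE = (
    '<article>'
    '<p style="headline">Links Lessons</p>'
    '<p style="body">Keep your <i>head</i> down.</p>'
    '</article>'
)


def article_to_html(xml_text):
    """Convert a flat <article> of styled <p> elements into an HTML string."""
    source = ET.fromstring(xml_text)
    html = ET.Element("article")
    for para in source.iter("p"):
        # Unknown or missing styles fall back to a plain paragraph.
        tag = STYLE_TO_HTML.get(para.get("style"), "p")
        out = ET.SubElement(html, tag)
        # Inline markup is flattened to plain text in this sketch.
        out.text = "".join(para.itertext()).strip()
    return ET.tostring(html, encoding="unicode")


print(article_to_html(ARTICLE))
```

With a flat format like this, the whole conversion is a lookup table plus one loop, which is part of why a simpler structure is easier both to tag and to post-process.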
The lesson here is that minimizing the complexity of one's XML structure is important, considering that the effort (cost, time, and expertise) required to manually tag content in InDesign to a specific DTD grows dramatically as that DTD becomes more complex. At some point, more complex XML structural requirements will break down the ability of a web editor or other technical resources to tag that XML inside of InDesign, and make it difficult to achieve the correct tagging through automated post-processing. Reaching a proper balance is key.
As with many technology projects, a major struggle is clearly articulating the business objectives. Deriving XML from InDesign for a loosely defined business purpose might be successful (perhaps if the project is conservative), but it is not likely to align fully with that objective once it becomes fully defined. One risks developing an XML structure that is overly complex for the business purpose and therefore requires much more work to tag than is really necessary, translating to a higher cost of production and a reduced project ROI.
The change management required to support a project is another potential struggle. The complexity of most publishing processes means that just getting other business units or departments to understand what you are trying to achieve is difficult. Getting support for projects that will affect editorial, design, electronic production, and various web production groups is not easy. Working with professionals who do not have a natural interest in structured content is as tough for the programmers as working with technologists who don't naturally care for editorial work is for the editors. Given this situation, it may be easiest to separate the tagging process from the main editorial process, at least initially. T+L Golf used web editors, who are likely to have been recruited from the more technically minded editorial staff. We have seen this type of approach work for many different publishers.
Finally, it's critical to make sure that the project is managed strictly to its well-defined objectives. Managers should keep in mind that technology is not as much of an art form as it used to be: there are professional practices that a competent vendor can use to minimize risks and achieve success. If a project is lined up and managed well by those who understand your specific needs, then it is very possible to run a reasonably budgeted project.
It is important to keep in mind all of the components of an investment in managing XML in an InDesign workflow. There may be a one-time investment in technology infrastructure for a CMS or other technologies outside of InDesign itself (plus ongoing maintenance and upgrades), and there may be an investment in customized software as well as scripting, such as XSLT, AppleScript, and/or JavaScript. But the other components of the investment are just as important: the change management necessary to train staff, the learning-by-doing required for workflow changes, and the ongoing costs for new staff positions.
Some careful analysis of all of these components is worthwhile before you begin. In many cases, the trick is to get as much efficiency as possible from the one-time investment in technology, so that the workflow changes, added staff time, and other ongoing costs are minimized. But there may be cases where it makes sense to have a smaller technology investment because extra staffing and more manual processes will cost less than the technology infrastructure in the long run.
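One simple way to frame that analysis is a break-even comparison between a technology-heavy option and a staffing-heavy one. The sketch below is written in Python, and every figure in it is invented purely for illustration; substitute your own estimates for one-time and annual costs.

```python
def total_cost(one_time, annual, years):
    """Total cost of a workflow option after a given number of years."""
    return one_time + annual * years


def break_even_years(heavy, light):
    """Years until the technology-heavy option becomes the cheaper one.

    Each option is a (one_time, annual) pair, and `heavy` is assumed to have
    the larger one-time cost. Returns None when its annual cost is not lower,
    meaning the larger upfront investment never pays for itself.
    """
    heavy_once, heavy_annual = heavy
    light_once, light_annual = light
    if heavy_annual >= light_annual:
        return None
    return max(0.0, (heavy_once - light_once) / (light_annual - heavy_annual))


# Hypothetical figures: (one-time investment, ongoing annual cost).
automated = (120_000, 20_000)  # larger technology build, less manual tagging
manual = (20_000, 60_000)      # light scripting, more staff time

print(break_even_years(automated, manual))
print(total_cost(*automated, 5), total_cost(*manual, 5))
```

If the break-even point falls beyond the period you expect the workflow to last, the smaller technology investment with more manual work may be the better choice.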
In any case, the time when publishers need to produce XML along with their page files is here or nearly here. Resolving the inherent conflict between designed layout and structured content is on everyone's plate. A recent conversation with Rick Ferrie of Pearson Education, who is chairman of the Serving Students with Disabilities Subcommittee for the AAP School Division and a member of the National Instructional Materials Accessibility Standard (NIMAS) committee sponsored by the Department of Education, is just one illustration of this point: "Publishers need to provide web and other electronic versions of their products. And some publishers may have specialized requirements, like textbook publishers, who will soon be required to provide a standardized XML format in order to make their content more accessible to those with disabilities." Given the growing need for XML, it is good to know that InDesign can be counted on as an XML-friendly tool.
Any publisher working on these issues should also keep in mind that the point at which the complexity of an XML structure becomes cost prohibitive within InDesign is likely to shift over time. According to Gary Cosimini of Adobe: "Regarding XML, Adobe's goal is to enable the average writer, editor or designer to take advantage of tagging and structure. This will take a few product cycles, but our plan is for anyone to be able to use and benefit from tagging content in InDesign." We can hope, therefore, that Adobe will continue to press forward in solving this engineer vs. designer problem. We can hope that Adobe will keep making the publishing industry's XML challenges easier to solve.