RESOURCE CENTER

Adobe's New World Order: Metadata-Driven Workflows with XMP

In 2001, Adobe published the specification for XMP, the eXtensible Metadata Platform. The platform has three components:

  1. A framework and rules for expressing (coding) metadata along with foundational sets of metadata fields (such as Dublin Core)
  2. A method for embedding XMP metadata in files
  3. A software development kit (SDK) for working with the metadata

At first glance, this platform might not seem revolutionary. But, in fact, Adobe is attempting to fundamentally change the way we think about asset management processes in a creative environment.

The "old school" approach

Publishers who create highly designed publications—magazines, newspapers, some books—have struggled to figure out how content management technologies can benefit them. Some have invested in a digital asset management system (DAMS), typically to store completed content assets. Some have invested in an editorial and production system (E&PS) to facilitate the coordination of content development with page layout. Some publishers have invested in both. Both types of systems are usually highly centralized (content must be loaded into the system to be used) and depend heavily on metadata to be useful.

There's no question that these systems provide value, but, like any centralized system, they impose restrictions on content and users that aren't always welcome. In particular, metadata must be identified early in the system requirements phase, and it can be difficult to add specialized metadata fields to a subset of content, especially if those fields don't present themselves until after the system has gone live.

In addition, these systems provide limited value to graphic artists and other creative professionals during the content creation phase, because metadata isn't assigned until after the content has been completed and loaded into the system. Metadata is extra work for those users rather than information that helps them.

    Bob Schaffel, the Adobe Product Manager for XMP, calls this combination of centralized systems and workflow the "old school" approach, one in which workflow is driven more by systems and their needs than by the requirements of users and content.

    The new paradigm

    Adobe doesn't believe DAMS and E&PS products should be eliminated; far from it. But Adobe does believe that centralized solutions should be augmented by a distributed approach, that inflexible systems should be changed so they can react to legitimate content variations, and that metadata should help creative professionals rather than annoy them. XMP is Adobe's technical means of making this possible. Here's how it works:

    By supporting a distributed environment: Content is created and manipulated in many locations before a publisher is able to load it into a management system. Photographs are created on digital cameras. Images are created on the computers of freelancers. The best time to begin capturing metadata—who created it? when? in what format?—is at these points of creation. The best place to store the metadata is in the content object itself, because there probably isn't a larger system in which to store it and because the object is what will be delivered into the publishing environment anyway. XMP makes this possible by defining methods for embedding metadata in binary file formats.
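As a rough illustration of what "embedded in the file" means in practice, the sketch below scans raw bytes for an XMP packet. It relies on the packet wrapper that the XMP specification defines (a leading `<?xpacket begin=...?>` processing instruction and a trailing `<?xpacket end=...?>`). The function name, the sample bytes, and the assumption that the packet is UTF-8 encoded are ours, for illustration only; a format-aware reader is generally a better choice than byte scanning where one is available.

```python
from typing import Optional

# Markers of the XMP packet wrapper (assumes a UTF-8 encoded packet).
PACKET_START = b"<?xpacket begin="
PACKET_END = b"<?xpacket end="


def extract_xmp_packet(data: bytes) -> Optional[bytes]:
    """Return the first XMP packet found in data, or None if there is none."""
    start = data.find(PACKET_START)
    if start == -1:
        return None
    end_marker = data.find(PACKET_END, start)
    if end_marker == -1:
        return None
    # The trailer is itself a processing instruction; include it up to "?>".
    end = data.find(b"?>", end_marker)
    if end == -1:
        return None
    return data[start:end + 2]


# Hypothetical stand-in for an image file: binary data around a packet.
fake_file = (
    b"\x89FAKEIMAGE\x00\x01"
    b'<?xpacket begin="" id="W5M0MpCehiHzreSzNTczkc9d"?>'
    b"<x:xmpmeta xmlns:x='adobe:ns:meta/'></x:xmpmeta>"
    b'<?xpacket end="w"?>'
    b"\x00\xffMOREBINARY"
)
packet = extract_xmp_packet(fake_file)
```

Because the packet is plain text inside the binary file, even a tool that knows nothing about the host format can locate it this way, which is part of what makes the distributed approach workable.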

    By supporting the use of metadata by many systems and tools: If metadata travels with objects, then the tools and systems that will handle the objects should be able to work with that metadata. A graphics program like Illustrator or a digital camera should be able to create a file that contains metadata. A design product like Photoshop should be able to read the metadata, let a user add to it, and properly write it out again. A DAMS should be able to load and search the metadata. This wide usage cries out for standardization of the metadata format, and XMP provides that standardization.

    With shared and specialized metadata fields: Sometimes standardized metadata fields are essential to a particular publishing vertical. But individual businesses, content, and systems also have their own needs. XMP includes foundational and widely used metadata sets directly in the specification, and includes a method for adding metadata fields as needed to meet specific requirements. Adobe has also been taking steps to facilitate the development of XMP metadata sets for industry groups that need it. For example, IPTC (news) metadata is now expressed in XMP. See www.iptc.org/pages/index.php for details.

    By encouraging adoption by vendors and customers: Adobe knows that changing publishing tools and workflows will take a lot of time and evangelism. They're trying to win the market over in a number of ways. They spend a lot of time talking with publishers about XMP. They made the specification available under an open license rather than a proprietary one. They added XMP support to their product suite. (Look at the File Info or Document Properties dialog in an Adobe application, especially the Advanced panel; you're looking at XMP.) They created an SDK for working with XMP in third-party applications.

    Adobe's applications for creative development are ever more dominant, and Adobe is serious about XMP. Given these factors and the apparently widespread acceptance of XMP concepts, it makes sense for other players in that space to adopt the framework. Many DAMS vendors have announced XMP support for their systems, and a few actually have releases available. Vendors like Pound Hill Software have created products specifically for the development of XMP schemas and metadata entry forms. (Learn about the MetaGrove product suite at http://www.poundhill.com/.) Unfortunately, to date there appears to be no open-source work in the area of writing out XMP metadata (reading it is easier and there are plenty of tools), but that could change as interest increases.

    What next?

    Most publishers are already considering changes to capture more and better metadata. If you're one of those publishers, we suggest investigating whether XMP is an appropriate part of that change.

    If you like what you see, encourage your software vendors to add XMP support where it isn't already present—but specifically in ways that are meaningful to your redesigned workflows. In this respect, different products need different XMP capabilities. Some might only need to read XMP. Some don't need any direct awareness of XMP, but do need to be able to modify files without inadvertently corrupting or removing the XMP metadata. DAMS or E&PS products need sophisticated support for searching the metadata. Any software that reads metadata should be flexible enough to ingest any fields that conform to the XMP specification. If users can edit the metadata, the software should also facilitate the creation of custom forms for entering custom metadata.

    And let us know if you're using XMP, and how it's going.

    Details of the XMP metadata format and schemas

    XMP is based on an XML expression of RDF. RDF is the Resource Description Framework, a W3C specification for defining metadata vocabularies.

    XMP must therefore follow the rules of XML as well as RDF's rules for associating a subject with a predicate and an object. For example, in the metadata description, "This article was authored by Sara Smith," article is the subject, authored by is the predicate, and Sara Smith is the object. An RDF XML expression of this might look like:

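       <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
                xmlns:dc="http://purl.org/dc/elements/1.1/">
         <rdf:Description rdf:about="http://www.publisherdomain.com/article123/">
           <dc:creator>Sara Smith</dc:creator>
         </rdf:Description>
       </rdf:RDF>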

    rdf and dc indicate that the element names come from the RDF and Dublin Core namespaces. You can add your own elements and namespaces as needed.
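To make the subject/predicate/object reading concrete, here is a minimal sketch that parses the example with Python's standard `xml.etree.ElementTree` and lists each statement. The `rdf:RDF` wrapper and namespace declarations are added so the snippet is well-formed XML, and the `pub` namespace with its `issue` field is a hypothetical custom extension invented for this illustration. The function only handles simple text-valued properties.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# The article's example, plus one property in a hypothetical
# custom namespace ("pub") to show a publisher-defined field.
RDF_XML = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/"
         xmlns:pub="http://www.publisherdomain.com/ns/1.0/">
  <rdf:Description rdf:about="http://www.publisherdomain.com/article123/">
    <dc:creator>Sara Smith</dc:creator>
    <pub:issue>Spring</pub:issue>
  </rdf:Description>
</rdf:RDF>
"""


def read_statements(rdf_xml: str):
    """Return (subject, predicate, object) triples for simple text properties."""
    root = ET.fromstring(rdf_xml)
    triples = []
    for desc in root.iter(f"{{{RDF_NS}}}Description"):
        subject = desc.get(f"{{{RDF_NS}}}about", "")
        for prop in desc:
            # ElementTree reports names as "{namespace-uri}local-name".
            triples.append((subject, prop.tag, (prop.text or "").strip()))
    return triples


statements = read_statements(RDF_XML)
for statement in statements:
    print(statement)
```

Note that the reader never names `dc:creator` or `pub:issue` explicitly. It picks up whatever properties are present, which is the kind of flexibility a tool needs if it is to ingest fields it has not seen before.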

    The XMP implementation of RDF does not need a meaningful rdf:about value (it is typically left empty) because the subject is always the containing object (file). Adobe makes many common-sense decisions like these in XMP, and RDF purists might occasionally disagree with them. For example, sequence indicators are sometimes inserted where they are not technically needed (if you add one creator, they are the first in the sequence rather than simply a single creator). XMP imposes restrictions where RDF would not (such as limiting a file to a single creation date). In our view these choices make practical sense for the context in which this metadata is to be used.
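The sequence behavior described above can be sketched as follows. The XML is our approximation of how an XMP-style description expresses a single creator as the first item of an `rdf:Seq`; the `rdf:about` attribute is left empty here, as Adobe tools commonly do, since the subject is the containing file. The `creators` helper is a hypothetical name, and the sketch assumes the packet wrapper has already been stripped away.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# One creator, still wrapped in an ordered sequence.
XMP_STYLE = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="">
    <dc:creator>
      <rdf:Seq>
        <rdf:li>Sara Smith</rdf:li>
      </rdf:Seq>
    </dc:creator>
  </rdf:Description>
</rdf:RDF>
"""


def creators(xmp_rdf: str) -> list:
    """Return creator names in document (sequence) order."""
    root = ET.fromstring(xmp_rdf)
    items = root.findall("rdf:Description/dc:creator/rdf:Seq/rdf:li", NS)
    return [(li.text or "").strip() for li in items]


creator_list = creators(XMP_STYLE)
```

The practical benefit is that a consumer always gets a list back, so code that handles one creator handles several without a special case.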

    XMP also includes foundation metadata sets. They are:

    • Dublin Core (a widely-used set of common metadata fields)
    • XMP Basic Schema
    • XMP Rights Management Schema
    • XMP Media Management Schema
    • XMP Basic Job Ticket Schema
    • XMP Paged-Text Schema
    • XMP Dynamic Media Schema
    • Adobe PDF Schema
    • Photoshop Schema
    • Camera Raw Schema
    • EXIF Schemas (for the image metadata format used in digital cameras)

    Including these schemas directly in the specification muddies the water a bit; typically a meta-standard like XMP is defined separately from the vocabularies that leverage it. For example, if one of these schemas needs to be extended for broad use, should the XMP specification itself be extended? Also, there's plenty of other standardization to be done in the world of schemas, as is illustrated by Adobe's support of IPTC and other efforts. Should those become part of the specification? In practice, it doesn't really matter. If these are good schemas, they will be found useful and adopted. If not, XMP allows extensions so that other groups can implement and even share their own schemas.

    Note that the specification does not actually include formal schemas for these metadata sets, but rather tables of field descriptions. To that extent the term "schema" is something of an overstatement. Because the W3C XML Schema language cannot effectively express RDF constraints, a different language (perhaps RELAX NG) would probably be needed if Adobe did choose to publish a formal schema.
