Copyright 2006 Robert J. Glushko
The Interoperability Problem
How Bad Can it Be?
Attacking the Interoperability Problem
Hub Languages
The ease with which you can use XML to create a new vocabulary means it is easy to create a poor one
No way around the classical problems of classification and naming we know from philosophy, linguistics, cognitive psychology, and information science
XML is NOT "self-describing"
There are often multiple vocabularies for the same or related domains and especially for the common information models that are used in more than one domain
The vocabulary problem implies an interoperability problem
This means that two applications or services can't use each other's models or document instances "as is"
Some interoperability problems can be detected and resolved by completely automated mechanisms
Other problems can be detected and resolved with some human intervention
Other problems can be detected but not resolved
Some problems can go undetected
Vertical:
Particular industry or vertical market
Detailed product semantics
Specialized process semantics
Sometimes called "domain-specific" languages
Horizontal:
Concepts that are common to all (or a large number of) vocabularies
Each new XML vocabulary for a particular industry is a step forward for that community, but proliferates definitions of information models that are common to many of them
Since the distinctive or specialized parts of each vocabulary are the industry-specific "vertical" parts, a lot of attention gets paid to them
In contrast, relatively less effort is given to the "horizontal" parts that seem more familiar or understandable
Nevertheless, any large company – even a highly verticalized one – engages in diverse business activities that require it to understand multiple vocabularies at different times
Different industries need specialized terms and properties to describe their products and processes, but if they don't build them on top of a standard base document, there will be no interoperability across industries

Suppose you publish your web service interface description and tell the world "my ordering service requires a purchase order that conforms to this schema"
This says "send me MY purchase order" not "send me YOUR purchase order"
How likely is it that the purchase orders being used by other firms will be able to meet your interface requirement, either directly or after being transformed?


<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           elementFormDefault="qualified">
  <xs:element name="Order" type="OrderType"/>
  <xs:complexType name="OrderType">
    <xs:sequence>
      <xs:element name="BuyersID" type="xs:string"/>
      <xs:element name="BuyerParty" type="PartyType"/>
      <xs:element name="OrderLine" type="OrderLineType"
                  maxOccurs="unbounded"/>
    </xs:sequence>
  </xs:complexType>
  <xs:complexType name="PartyType">
    <xs:sequence>
      <xs:element name="ID" type="xs:string"/>
      <xs:element name="PartyName" type="PartyNameType"/>
      <xs:element name="Address" type="AddressType"/>
    </xs:sequence>
  </xs:complexType>
  <xs:complexType name="PartyNameType">
    <xs:sequence>
      <xs:element name="Name" type="xs:string" minOccurs="0"/>
    </xs:sequence>
  </xs:complexType>
  <xs:complexType name="AddressType">
    <xs:sequence>
      <xs:element name="Room" type="xs:string"/>
      <xs:element name="BuildingNumber" type="xs:string"/>
      <xs:element name="StreetName" type="xs:string"/>
      <xs:element name="CityName" type="xs:string"/>
      <xs:element name="PostalZone" type="xs:string"/>
      <xs:element name="CountrySubentity" type="xs:string"/>
      <xs:element name="Country" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>
  <xs:complexType name="OrderLineType">
    <xs:sequence>
      <xs:element name="LineItem" type="LineItemType"/>
    </xs:sequence>
  </xs:complexType>
  <xs:complexType name="LineItemType">
    <xs:sequence>
      <xs:element name="BookItem" type="BookItemType"/>
      <xs:element name="BasePrice" type="xs:decimal"/>
      <xs:element name="Quantity" type="xs:int"/>
    </xs:sequence>
  </xs:complexType>
  <xs:complexType name="BookItemType">
    <xs:sequence>
      <xs:element name="Title" type="xs:string"/>
      <xs:element name="Author" type="xs:string"/>
      <xs:element name="ISBN" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>
<Order>
  <BuyersID>91604</BuyersID>
  <BuyerParty>
    <ID>KEEN</ID>
    <PartyName>
      <Name>Maynard James Keenan</Name>
    </PartyName>
    <Address>
      <Room>505</Room>
      <BuildingNumber>11271</BuildingNumber>
      <StreetName>Ventura Blvd.</StreetName>
      <CityName>Studio City</CityName>
      <PostalZone>91604</PostalZone>
      <CountrySubentity>California</CountrySubentity>
      <Country>USA</Country>
    </Address>
  </BuyerParty>
  <OrderLine>
    <LineItem>
      <BookItem>
        <Title>Foucault's Pendulum</Title>
        <Author>Umberto Eco</Author>
        <ISBN>0345368754</ISBN>
      </BookItem>
      <BasePrice>7.99</BasePrice>
      <Quantity>1</Quantity>
    </LineItem>
  </OrderLine>
</Order>
<Customer> <Number>KEEN</Number> <Name> <BusinessName>Maynard James Keenan</BusinessName> </Name> <Location> <Unit>505</Unit> <StreetNumber>11271</StreetNumber> <Street>Ventura Blvd.</Street> <City>Studio City</City> <ZipCode>91604</ZipCode> <State>California</State> <Country>USA</Country> </Location> </Customer>
<Acheteur> <ID>KEEN</ID> <Nom> <NomCommercial>Maynard James Keenan</NomCommercial> </Nom> <Addresse> <Appartment>505</Appartment> <Bâtiment>11271</Bâtiment> <Rue>Ventura Blvd.</Rue> <Ville>Studio City</Ville> <CodePostal>91604</CodePostal> <Etat>California</Etat> <Pays>USA</Pays> </Addresse> </Acheteur>
<BuyerParty ID="KEEN" Name="Maynard James Keenan" Room="505" BuildingNumber="11271" StreetName="Ventura Blvd." City="Studio City" State="California" PostalCode="91604"/>
<Address> <StreetAddress>11271 Ventura Blvd. #505</StreetAddress> <City>Studio City 91604</City> <CountrySubentity>California</CountrySubentity> <Country>USA</Country> </Address> <PartyName> <FamilyName>Keenan</FamilyName> <MiddleName>James</MiddleName> <FirstName>Maynard</FirstName> </PartyName>
<BuyerParty> <ID>KEEN</ID> <PartyName> <Name>Maynard James Keenan</Name> </PartyName> <Address> <Room>505</Room> <BuildingNumber>11271</BuildingNumber> <StreetName>Ventura Blvd.</StreetName> <CityName>Studio City</CityName> <PostalZone>91604</PostalZone> <CountrySubentity>California</CountrySubentity> <Country>USA</Country> </Address> </BuyerParty>
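The Customer fragment and the BuyerParty fragment above carry the same content in the same arrangement and differ only in their tag names, so a simple rename table is enough to convert one into the other. The following is a minimal sketch using Python's standard library; the table CUSTOMER_TO_BUYERPARTY and the function rename are hypothetical names introduced for illustration, not part of any standard or tool.

```python
import xml.etree.ElementTree as ET

# Hypothetical rename table: tag in the "Customer" vocabulary -> tag in the
# "BuyerParty" vocabulary. Both vocabularies come from the fragments above.
CUSTOMER_TO_BUYERPARTY = {
    "Customer": "BuyerParty",
    "Number": "ID",
    "Name": "PartyName",
    "BusinessName": "Name",
    "Location": "Address",
    "Unit": "Room",
    "StreetNumber": "BuildingNumber",
    "Street": "StreetName",
    "City": "CityName",
    "ZipCode": "PostalZone",
    "State": "CountrySubentity",
    "Country": "Country",
}


def rename(element, table):
    """Return a copy of element with every tag replaced via table.

    Raises KeyError for a tag the table does not cover, so an unmapped
    element is detected rather than silently passed through.
    """
    copy = ET.Element(table[element.tag])
    copy.text = element.text
    copy.tail = element.tail
    for child in element:
        copy.append(rename(child, table))
    return copy


CUSTOMER = (
    "<Customer> <Number>KEEN</Number> <Name> <BusinessName>Maynard James Keenan"
    "</BusinessName> </Name> <Location> <Unit>505</Unit> <StreetNumber>11271"
    "</StreetNumber> <Street>Ventura Blvd.</Street> <City>Studio City</City> "
    "<ZipCode>91604</ZipCode> <State>California</State> <Country>USA</Country> "
    "</Location> </Customer>"
)

buyer = rename(ET.fromstring(CUSTOMER), CUSTOMER_TO_BUYERPARTY)
print(ET.tostring(buyer, encoding="unicode"))
```

A rename table only works when the two models have the same structure and granularity. The variants that merge the street address into one element or split the name into parts need real parsing logic, and some of that logic cannot be fully automated.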
<Order>
  <BuyersID>91604</BuyersID>
  <BuyerParty>
    <ID>KEEN</ID>
  </BuyerParty>
  <OrderLine>
    <LineItem>
      <BookItem>
        <Title>Foucault's Pendulum</Title>
        <Author>Umberto Eco</Author>
        <ISBN>0345368754</ISBN>
      </BookItem>
      <BasePrice>7.99</BasePrice>
      <Quantity>1</Quantity>
    </LineItem>
  </OrderLine>
</Order>
<Address> <Latitude direction="N">37.871</Latitude> <Longitude direction="W">122.271</Longitude> </Address>
The names are the same but the semantics aren't
<BuyerParty> <ID>555-22-1234</ID> <Address> <Room>505</Room> <BuildingNumber>11271</BuildingNumber> <StreetName>Ventura Blvd.</StreetName> <CityName>Studio City</CityName> <PostalZone>91604-3136</PostalZone> <CountrySubentity>California</CountrySubentity> <Country>USA</Country> </Address> </BuyerParty>
After all these cases where interoperability may or may not be possible because the conceptual or implementation models differ, we need to talk about the "easy" case and recognize that it might not be easy
Suppose the document validates against the recipient's schema
The semantics can still be different in important ways (as in the example where the ID element holds a Social Security number) – even the strongest level of validation can fall short of establishing that the "same tags" have exactly the "same meaning" to the sender and recipient
Furthermore, the recipient may not be able to validate all of the business rules that are important
This is a good argument for using industry standards or reference models in your conceptual models, or for using XML vocabularies that represent them in authoritative ways
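The gap between structural agreement and semantic agreement can be sketched in a few lines. In the example below, two abbreviated BuyerParty fragments have identical element structure, and since ID is typed as xs:string a schema would accept either value, yet one ID is an account code and the other looks like a Social Security number. The function tag_paths and the ACCOUNT_CODE pattern are hypothetical; the pattern stands in for a business rule the recipient would have to check outside the schema.

```python
import re
import xml.etree.ElementTree as ET


def tag_paths(element, prefix=""):
    """List the path of every element in document order, ignoring text values."""
    path = f"{prefix}/{element.tag}"
    paths = [path]
    for child in element:
        paths.extend(tag_paths(child, path))
    return paths


# Abbreviated fragments based on the examples above.
RECIPIENT_STYLE = ET.fromstring(
    "<BuyerParty><ID>KEEN</ID>"
    "<Address><PostalZone>91604</PostalZone></Address></BuyerParty>"
)
SENDER_STYLE = ET.fromstring(
    "<BuyerParty><ID>555-22-1234</ID>"
    "<Address><PostalZone>91604-3136</PostalZone></Address></BuyerParty>"
)

# Hypothetical business rule: the recipient expects a four-letter account code.
ACCOUNT_CODE = re.compile(r"[A-Z]{4}")


def id_is_account_code(party):
    """Check the ID value against the assumed account-code rule."""
    return ACCOUNT_CODE.fullmatch(party.findtext("ID", default="")) is not None


same_structure = tag_paths(RECIPIENT_STYLE) == tag_paths(SENDER_STYLE)
print("same structure:", same_structure)
print("recipient-style ID ok:", id_is_account_code(RECIPIENT_STYLE))
print("sender-style ID ok:", id_is_account_code(SENDER_STYLE))
```

The structural comparison succeeds while the rule check fails for the sender's document, which is the situation a schema alone cannot detect.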
Everyone has to learn to "speak" all the languages – clearly impractical
Everyone has to learn just one language but it has to be the same one
Multiple vocabularies exist, but there is at least one "interchange" or "hub" language designed to facilitate translations between "native" vocabularies
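The third option can be sketched as a two-step translation: each native vocabulary defines one mapping to the hub, and any pair of vocabularies is connected by going through it. The example below is an illustrative sketch only. It works on flat field names rather than full XML documents, the TO_HUB tables are hypothetical, and the hub names are borrowed from the BuyerParty examples earlier.

```python
# Hypothetical tables: native field name -> hub field name, one per vocabulary.
TO_HUB = {
    "customer": {"Number": "ID", "City": "CityName", "ZipCode": "PostalZone"},
    "acheteur": {"ID": "ID", "Ville": "CityName", "CodePostal": "PostalZone"},
}


def to_hub(vocabulary, record):
    """Rename a native record's fields to hub names."""
    table = TO_HUB[vocabulary]
    return {table[name]: value for name, value in record.items()}


def from_hub(vocabulary, record):
    """Rename a hub record's fields to the native names of vocabulary."""
    inverse = {hub: native for native, hub in TO_HUB[vocabulary].items()}
    return {inverse[name]: value for name, value in record.items()}


def translate(source, target, record):
    """Translate between two native vocabularies by way of the hub."""
    return from_hub(target, to_hub(source, record))


record = {"Number": "KEEN", "City": "Studio City", "ZipCode": "91604"}
print(translate("customer", "acheteur", record))
# prints {'ID': 'KEEN', 'Ville': 'Studio City', 'CodePostal': '91604'}
```

Adding a new vocabulary means adding one table to TO_HUB; no existing table has to change.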

Information components should be defined using standard semantics for precise datatyping and validation
Information components must have an "appropriate" granularity for reuse
Should not assume any specific protocol for message delivery
Should facilitate internationalization and localization
Early 1990s - Ad hoc efforts in EDIFACT to "harmonize" core components across verticals
1997 - XML Common Business Library (xCBL) is the first XML horizontal vocabulary; it incorporated EDIFACT semantics and code lists
1999 - ebXML initiative of UN/CEFACT (the body that maintains EDIFACT) and OASIS to develop syntax-neutral "core components"
2001 - Universal Business Language (UBL) effort begins, building on xCBL and ebXML Core Components
DOCUMENT ARCHITECTURE: A generic XML interchange format for business documents that can be extended to meet the requirements of particular industries
CORE COMPONENTS: A library of XML schemas for reusable data components such as "Address," "Item," and "Payment" -- the common data elements of everyday business documents
STANDARD DOCUMENTS: A small set of XML schemas for common business documents such as "Order," "Despatch Advice," and "Invoice" that are constructed from the UBL library components and can be used in a generic order-to-invoice trading context
METHODOLOGY R&D: UBL components and documents developed with evolving Document Engineering methodology (Tim McGrath headed "library content" effort)




If all parties/applications/services rely on a hub language for their external interfaces, the number of mappings needed grows linearly with the number of parties instead of with its square
Mapping tools for transforming instances from an internal information model to another one are ubiquitous as standalone tools and as parts of application servers
EXAMPLE: Altova MapForce
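Counting mappings makes the benefit of a hub language concrete. With n vocabularies and no hub, each one needs a mapping to every other one, for n × (n − 1) directed mappings; with a hub, each needs only a mapping to the hub and one back, for 2 × n. The function names below are hypothetical and the sketch assumes one mapping per direction.

```python
def point_to_point_mappings(n):
    """Directed mappings needed when each of n vocabularies maps to every other."""
    return n * (n - 1)


def hub_mappings(n):
    """Directed mappings needed when each of n vocabularies maps to and from a hub."""
    return 2 * n


for n in (3, 10, 50):
    print(n, point_to_point_mappings(n), hub_mappings(n))
# prints:
# 3 6 6
# 10 90 20
# 50 2450 100
```

The two approaches break even at three vocabularies; beyond that the hub needs fewer mappings, and the gap widens quickly.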
There are a large number of ways that two implementation models that are supposed to be equivalent can fail that test
But no matter how different they look, with different syntaxes, tag names, or assembly models, if their conceptual model is the same, it is possible to transform one implementation model to another
Validation is not sufficient to guarantee complete interoperability
Designing (and precisely defining semantics for) a good vocabulary is challenging, essential, and impossible to do perfectly
You should try to reuse other vocabularies if you can, especially for any horizontal components
Follow the "Golden Rule" for the parts of your vocabulary that you design yourself
Chapter 5 of Document Engineering
"Principles of Service Oriented Integration" Sean McGrath and Fergal Murray
"The strategic implications of Wal-Mart's RFID mandate" David Williams
On Demand Business: The New Agenda for Value Creation, Randall Hancock, Peter Korsten and George Pohle