Copyright 2006 Robert J. Glushko
Revisiting Accelerating RosettaNet
The Key Ideas of Document Engineering
The Document Engineering Approach
UBL 1.0 Case Study
Arrow Electronics (http://www.arrow.com/) is a distributor for products from Linear Technology (http://www.linear.com/) and other suppliers
Linear is in the process of going from two national distributors to just one
Syncata Corporation (http://www.syncata.com) was engaged to automate the flow of purchase orders and order status between Arrow and Linear as well as the subsequent introduction of orders into Linear's back end systems
Linear's implementation is one of the first deployments of Microsoft's BizTalk Server (http://www.microsoft.com/biztalk/default.asp) Accelerator for RosettaNet (http://www.rosettanet.org)
The RosettaNet Partner Interface Processes (PIPs) implemented were 3A4 (Request Purchase Order) and 3A7 (Notify of Purchase Order Update)
Linear is planning to link up external partners
Linear hopes to exchange inventory information with Arrow (PIP 4C1)
Documents {and,or,vs} Processes
Overlapping Information Components as "Process Glue"
Pattern Granularity and Abstraction in the Model Matrix
The Document Type Spectrum
Crossing the Data/Document Divide
"Getting to the Middle" of the Model Matrix
Documents are always the result of some process and often the input to another one
This is most evident for transactional documents where patterns of paired document exchange are the building blocks for supply chains, marketplaces, auctions and other business patterns
By understanding the information in the documents, we learn what kinds of processes are possible
By understanding the processes, we learn what kinds of information are needed
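To make the "output of one process, input to another" idea concrete, here is a minimal sketch of a paired document exchange. The document type names, field names, part identifier, and stock check are illustrative assumptions, not taken from RosettaNet or UBL.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    doc_type: str
    content: dict


def place_order(buyer: str, item: str, quantity: int) -> Document:
    """Buyer-side process: its output is a purchase order document."""
    return Document("PurchaseOrder", {"buyer": buyer, "item": item, "quantity": quantity})


def confirm_order(order: Document, in_stock: int) -> Document:
    """Seller-side process: consumes the purchase order and produces a response."""
    if order.doc_type != "PurchaseOrder":
        raise ValueError("expected a PurchaseOrder document")
    accepted = order.content["quantity"] <= in_stock
    return Document(
        "OrderResponse",
        {"buyer": order.content["buyer"], "item": order.content["item"], "accepted": accepted},
    )


# "part-123" is a hypothetical item identifier used only for illustration.
order = place_order("Arrow", "part-123", 500)
response = confirm_order(order, in_stock=1000)
print(response.doc_type, response.content["accepted"])
```

The seller's process can only decide what the order's content lets it decide, which is the sense in which the information in a document bounds the processes that are possible.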

Document exchange is the "mother of all patterns" for business models, business processes, and business information:
Business model or organizational patterns: marketplace, auction, supply chain, build to order, drop shipment, vendor managed inventory, etc.
Business process patterns: procurement, payment, shipment, reconciliation, etc.
Business information patterns: catalog, purchase order, invoice, etc. and the components they contain for party, time, location, measurement, etc.
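One way to picture overlapping information components acting as "process glue" is two document types that reuse the same party component. The sketch below is an assumption-laden illustration: the class and field names are hypothetical and do not follow any particular standard's vocabulary.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Party:
    name: str
    country: str


@dataclass(frozen=True)
class PurchaseOrder:
    buyer: Party
    seller: Party
    item: str
    quantity: int


@dataclass(frozen=True)
class Invoice:
    buyer: Party
    seller: Party
    item: str
    amount_due: float


def shared_components(a, b) -> list:
    """Names of components that appear with the same type in both document types."""
    a_fields = {(f.name, f.type) for f in fields(a)}
    b_fields = {(f.name, f.type) for f in fields(b)}
    return sorted(name for name, _ in a_fields & b_fields)


print(shared_components(PurchaseOrder, Invoice))  # ['buyer', 'item', 'seller']
```

The shared components are what allow an invoice to be matched back to the order that caused it, so the overlap is what links the procurement and payment processes.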


Document-centric analysis – from text processing, publishing, hypertext systems
Data-centric analysis – from database systems design, computer science
Business process analysis – from business strategy, process design and re-engineering
User task analysis – from application and user interface design; generalization of use cases from observation

Many people have contrasted "documents" and "data" and concluded that documents and data cannot be understood and handled with the same terminology, techniques, and tools.
This document vs. data distinction is embedded and reinforced in XML textbooks, technology, and product marketing
And it doesn't always help
Documents are Artifacts or Renditions that combine content, structure and appearance
The goal of document analysis is a model of a document's content and structure that is separate from its presentational characteristics
The optimal prescriptive schema for a set of documents is one that best satisfies the requirements of current and prospective users for carrying out specific tasks with new instances
Finally, one or more stylesheets can be used to assign formatting or rendering characteristics in a consistent manner to any valid document
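The separation of content and structure from presentation can be sketched with one content model and two interchangeable renderers standing in for stylesheets. This is only an illustration: a real project would more likely use XML with XSLT or CSS, and the field names and quantity here are assumptions.

```python
from html import escape

# Content and structure only: nothing here says how the document should look.
order = {"type": "PurchaseOrder", "buyer": "Arrow", "seller": "Linear", "quantity": 500}


def render_text(doc: dict) -> str:
    """A plain-text 'stylesheet': upper-case title, one field per line."""
    body = [f"{key}: {value}" for key, value in doc.items() if key != "type"]
    return "\n".join([doc["type"].upper()] + body)


def render_html(doc: dict) -> str:
    """An HTML 'stylesheet': heading plus a two-column table."""
    rows = "".join(
        f"<tr><th>{escape(key)}</th><td>{escape(str(value))}</td></tr>"
        for key, value in doc.items()
        if key != "type"
    )
    return f"<h1>{escape(doc['type'])}</h1><table>{rows}</table>"


print(render_text(order))
print(render_html(order))
```

Either renderer works on any document that follows the same model, which is the consistency that applying a stylesheet to valid instances is meant to provide.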
The goal of data analysis is to understand and describe the properties of information components or objects and the relationships among them.
This understanding is represented in conceptual models that organize the components efficiently to support a broad range of contexts or applications.
The conceptual model is also typically called a schema, but this is generally meant to be a "database schema" rather than a "document schema"




There is systematic and continuous variation in document instances and types and there is no clear boundary between documents and data
But the traditional tools, terminology, and techniques for analyzing documents and data have made it into a chasm
How do we cross the chasm?
Document Engineering harmonizes the terminology of document analysis and data analysis and emphasizes what the two have in common rather than their differences
Identifying the presentational, content, and structural components and defining their relationships to each other
Identifying "good" content components
Designing, describing, and organizing components to facilitate their reuse
Assembling hierarchical document models that organize components according to the requirements of a specific context for information exchange
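The last two steps, organizing reusable components and assembling them into a hierarchical document model for a given context, might look like the following sketch. The component library and its names are hypothetical and chosen only to keep the example small.

```python
# A small library of reusable components; each entry lists its child components.
# Names that do not appear as keys are treated as leaf components.
COMPONENTS = {
    "Party": ["Name", "Address"],
    "Address": ["Street", "City", "Country"],
    "Item": ["Description", "Quantity"],
}


def assemble(component: str) -> dict:
    """Expand a component into a nested tree; leaves become empty dicts."""
    return {child: assemble(child) for child in COMPONENTS.get(component, [])}


def document_model(doc_type: str, required: list) -> dict:
    """Build a hierarchical model containing only what this context requires."""
    return {doc_type: {name: assemble(name) for name in required}}


model = document_model("PurchaseOrder", ["Party", "Item"])
print(model)
```

A different context would call `document_model` with a different list of required components and reuse the same library unchanged.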
The document analysis and data modeling approaches focus from the beginning on the structure and content of the "document payload" that will be exchanged – a "bottom up" approach that emphasizes "Does this work from a technical perspective?"
In contrast, business process analysis begins with an abstract or broadly scoped perspective on business activities
Emphasizes "Does this work from a business perspective?"
Inherently a "top down" approach that starts with business models and processes and gets to the "document payloads" only at the end
Task analysis (or user analysis) is the observation of people performing the tasks or use cases when the application or system must support human interfaces and not just other applications
Task analysis and document analysis are closely related; document analysis reveals candidate information components and task analysis reveals rules about their intent and usage.
Task analysis is especially important when few documents or information sources exist because human problems or errors can suggest that important information is missing
We need to achieve both business and technical interoperability – the former is necessary but insufficient for the latter
We need models of the desired business processes and the documents that they will produce and consume at the same level of detail and implementability
This is represented in the Model Matrix as "meeting in the middle"
Document Engineering is a systematic approach for "getting to the middle"
Any Document Engineering project worth doing will involve some set of document types and information components that take part in some set of business processes
Because "no document (or process) is an island" there will always be some point at which the documents and processes you care about will intersect or overlap with some that you don't care about
We'll call the Context whatever characteristics of the situation define what is in or out of scope, that is, inside or outside the boundary within which our solution has to work
A business process pattern implies a set of documents and some regular choreographies of document exchanges
A pattern can be thought of as a typical cluster or configuration of requirements
Selecting an appropriate pattern will help expose the information requirements, rules and constraints for our subsequent document analysis and design
Choosing a pattern suggests which document payloads we'll need to find or design and in which business processes we are likely to deploy them
How we describe context influences what patterns we identify and how we apply them
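A pattern's "set of documents and regular choreographies" can be sketched as an ordered list of exchanges from which the needed document payloads are derived. The role names, document names, and the three-step procurement sequence below are simplifying assumptions for illustration.

```python
from typing import NamedTuple


class Exchange(NamedTuple):
    sender: str
    receiver: str
    document: str


# A hypothetical, simplified procurement choreography.
PROCUREMENT = [
    Exchange("buyer", "seller", "PurchaseOrder"),
    Exchange("seller", "buyer", "OrderResponse"),
    Exchange("seller", "buyer", "Invoice"),
]


def document_payloads(choreography: list) -> list:
    """Document types implied by a pattern, in order of first use."""
    seen = []
    for step in choreography:
        if step.document not in seen:
            seen.append(step.document)
    return seen


def payloads_sent_by(choreography: list, role: str) -> list:
    """Documents that a given role must be able to produce."""
    return [step.document for step in choreography if step.sender == role]


print(document_payloads(PROCUREMENT))           # ['PurchaseOrder', 'OrderResponse', 'Invoice']
print(payloads_sent_by(PROCUREMENT, "seller"))  # ['OrderResponse', 'Invoice']
```

Choosing the pattern first produces a concrete list of documents to find or design before any detailed document analysis starts.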
We've chosen or developed a set of recommended modeling artifacts for each phase of the Document Engineering approach
There is a natural progression that yields some overlap or correlation between them as later artifacts refine or consolidate earlier ones
These artifacts have evolved to optimize the "step size" and to encourage more systematic, traceable, and predictable efforts






"Electronic Health Records: Just around the Corner? Or over the Cliff?" R. Baron, E. Fabens, M. Schiffman and E. Wolf
"Adoption of UBL in Denmark -- business cases and experiences" M. Brun, J. Brown and R. Lohde
"RosettaNet for Intel's Trading Entity Automation" J. Cartwright, J. Hahn-Steichen, J. He and T. Miller
D -- data types and document types
O -- organizational processes
C -- context (types of products or services, industry, geography, regulatory considerations)
U -- user types and special user requirements
M -- models, patterns, or standards that apply
E -- enterprises and ecosystems (e.g., trading communities, standards bodies)
N -- the needs (business case) driving the enterprise(s)
T -- technology constraints and opportunities
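The eight dimensions above can be treated as a checklist that shows which parts of a project's context are still undescribed. The sketch below fills in a few entries from the Arrow and Linear case as stated earlier in these notes; the class name, field names, and the choice of which dimensions to leave empty are assumptions for illustration.

```python
from dataclasses import dataclass, field, fields


@dataclass
class ContextChecklist:
    data_and_document_types: list = field(default_factory=list)    # D
    organizational_processes: list = field(default_factory=list)   # O
    context: list = field(default_factory=list)                    # C
    user_types: list = field(default_factory=list)                 # U
    models_patterns_standards: list = field(default_factory=list)  # M
    enterprises_and_ecosystems: list = field(default_factory=list) # E
    needs: list = field(default_factory=list)                      # N
    technology: list = field(default_factory=list)                 # T


def open_questions(checklist: ContextChecklist) -> list:
    """Dimensions that have not been described yet, in checklist order."""
    return [f.name for f in fields(checklist) if not getattr(checklist, f.name)]


case = ContextChecklist(
    data_and_document_types=["purchase order", "order status"],
    models_patterns_standards=["RosettaNet PIP 3A4", "RosettaNet PIP 3A7"],
    enterprises_and_ecosystems=["Arrow Electronics", "Linear Technology"],
    technology=["BizTalk Server Accelerator for RosettaNet"],
)
print(open_questions(case))
```

Running it lists the dimensions a project team would still need to investigate before fixing the scope.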
Assignment 3 - Business Patterns, graded, due 3 March
Assignment 4 - Course Project, ungraded description, due 13 March
Assignment 5 - UML, ungraded, due 13 March