Copyright © 2006 Robert J. Glushko
Who Are We, and Why Are We Here?
DE and IA in the news
Introduction to Key Concepts
Syllabus Overview and Administrivia
Instructor: Bob Glushko
Teaching Assistant: Mano Marks
The Rest of You?
This course introduces the discipline of Document Engineering: specifying, designing, and deploying electronic documents and information repositories that enable document-centric applications. These applications include web services, virtual enterprises, information supply chains, single-source publishing, and syndication.
Document Engineering has much in common with the field of Information Architecture, but extends its scope beyond web site and web application design.
All stories are from the last month or two
What are the common issues and themes?
Wall Street Journal, 2 November 2005 (wsj.com)
US Food and Drug Administration is requiring all new and changed labels and package inserts to be submitted in XML "Structured Product Labeling" format instead of in PDF
All will be available at http://dailymed.nlm.nih.gov/dailymed/about.cfm
What are the benefits of a standard format in XML?
Wall Street Journal, 17 January 2006 (wsj.com)
http://www.salesforce.com/appexchange
Salesforce has radically transformed the CRM business from being application-oriented to services-oriented
Is now further transforming into a "services platform" -- calling it a "marketplace for online software" -- "the iTunes Music Store of enterprise applications"
CIO, 15 January 2006
http://www.cio.com/archive/011506/pharma.html
The path that drugs take from manufacture to consumption can be very complex and opaque
How can we better "track and trace" drugs in the supply chain?
Borneo Bulletin, 12 January 2006
http://www.brunei-online.com/bb/thu/jan12w11.htm
74 shipping containers with aid for tsunami survivors are stuck in the port of Belawan in Sumatra (other side of the island from Banda Aceh) because the organizations sending the aid didn't complete all the shipping and customs documents
NY Times, 11 November 2005
Dept of Health and Human Services awarded $18.6 M to 4 groups of companies doing pilot projects in automating patient records and linking doctors' offices, clinics, and hospitals using open standards
70 teams competed for these contracts (winning teams led by Accenture, CSC, IBM, and Northrop Grumman)
Financial Times, 16 January 2006
http://news.ft.com/cms/s/86e29848-8635-11da-bee0-0000779e2340.html
In contrast, UK is making pretty good progress with govt-funded initiative
Enormous amounts of existing (paper) documents and legacy system functionality would benefit from automation, process re-engineering, transformation to SOA
New business processes are created / coordinated / choreographed via the management and exchange of electronic documents
Standards / patterns for documents and business processes are essential
Information technology and business processes are co-evolving with many ways to create business value
But projects can be challenging, and there are not enough people with the needed skills for all the work
XML is a useful technology for Document Engineering, but using XML doesn't make you a document engineer
The best thing about XML is the ease with which you can create a new vocabulary for a particular type of document
XML is just the syntax in which we encode document models... what really matters is how we modeled the documents
The worst thing about XML is the same as the best thing – the ease with which you can create a new vocabulary
No way around the classical problems of classification and naming we know from philosophy, linguistics, cognitive psychology, and information science
XML is NOT "self-describing"
The same content will inevitably be described using different names, and different content will be given the same names
There are often multiple vocabularies for the same or related domains and especially for the common information models that are used in more than one domain
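The naming problem can be made concrete with a small sketch. The two order vocabularies below are hypothetical (element names such as `PurchaseOrder`, `Buyer`, `Order`, and `Customer` are invented for illustration), and so is the mapping table. The point is only that a program cannot discover that `Buyer` and `Customer` mean the same thing unless someone has modeled that equivalence explicitly.

```python
import xml.etree.ElementTree as ET

# Two hypothetical vocabularies describing the same order.
# All element names here are invented for illustration.
DOC_A = "<PurchaseOrder><Buyer>Acme</Buyer><Total>100.00</Total></PurchaseOrder>"
DOC_B = "<Order><Customer>Acme</Customer><Amount>100.00</Amount></Order>"

# A hand-built mapping from each vocabulary's names to a shared model.
# Nothing in the XML itself supplies this; a person had to decide it.
TO_COMMON = {
    "PurchaseOrder": {"Buyer": "party", "Total": "amount"},
    "Order": {"Customer": "party", "Amount": "amount"},
}


def naive_names(xml_text):
    """What a program sees without the mapping: just child element names."""
    return {child.tag for child in ET.fromstring(xml_text)}


def to_common_model(xml_text):
    """Translate a document into the shared model using the mapping."""
    root = ET.fromstring(xml_text)
    mapping = TO_COMMON[root.tag]
    return {mapping[child.tag]: child.text for child in root}


if __name__ == "__main__":
    # The two documents share no element names at all...
    print(naive_names(DOC_A) & naive_names(DOC_B))
    # ...yet they carry the same content once the names are mapped.
    print(to_common_model(DOC_A) == to_common_model(DOC_B))
```

The syntax is identical in both documents; what makes them comparable is the model behind the mapping, which is the part XML does not provide.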
Businesses have long dealt with each other by exchanging documents
Halfat's clay pot receipt for taxes is certainly one of the oldest documents that record a business transaction (355 BCE)

Very natural thing to do
The simplest case is "here's my catalog, do you want to buy anything?", with the exchanged document being "here's my order"
We use concepts like "supply chains" and "distribution channels" as metaphors for the coordinated or choreographed flow of information and materials/products between businesses
These are complex patterns composed from the document exchange pattern
Document exchange is the "mother of all patterns" for business models, business processes, and business information
Business model or organizational patterns: marketplace, auction, supply chain, build to order, drop shipment, vendor managed inventory, etc.
Business process patterns: procurement, payment, shipment, reconciliation, etc.
Business information patterns: catalog, purchase order, invoice, etc. and the components they contain for party, time, location, measurement, etc.
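As a rough illustration of how business information patterns share components, the sketch below builds a purchase order and an invoice from the same reusable parts. The class and field names (`Party`, `LineItem`, `sku`, and so on) are assumptions made for this example, not definitions taken from any standard vocabulary.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Party:
    """Reusable component: a business or person taking part in an exchange."""
    name: str
    address: str


@dataclass
class LineItem:
    """Reusable component: one ordered or billed item."""
    sku: str
    quantity: int
    unit_price: float


@dataclass
class PurchaseOrder:
    buyer: Party
    seller: Party
    items: List[LineItem] = field(default_factory=list)


@dataclass
class Invoice:
    buyer: Party
    seller: Party
    items: List[LineItem] = field(default_factory=list)

    def total(self) -> float:
        """Sum of quantity times unit price over all line items."""
        return sum(i.quantity * i.unit_price for i in self.items)


def invoice_from_order(order: PurchaseOrder) -> Invoice:
    """One step of a procurement process: the invoice reuses the order's components."""
    return Invoice(buyer=order.buyer, seller=order.seller, items=list(order.items))


if __name__ == "__main__":
    order = PurchaseOrder(
        buyer=Party("Acme", "1 Main St"),
        seller=Party("Widgets Inc", "2 Side St"),
        items=[LineItem("A100", 2, 10.0), LineItem("B200", 1, 5.5)],
    )
    print(invoice_from_order(order).total())
```

Because both document types are assembled from the same `Party` and `LineItem` components, moving information from one document to the next in a process requires no translation.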
Web services is today's biggest buzzword
The idea is simple – encapsulate or "wrap" some specific and discrete unit of functionality to hide its implementation and make it reusable by sending it an XML message, to which it replies with an XML message
Many business patterns like supply chains or virtual enterprises are a natural fit for web services, and they make the idea of service composition easy to see
But exchanging information does no good if the information can't be understood by the parties (or applications) doing the exchanging.
The Web services "standards" not only don't solve this problem – they completely ignore it
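A minimal sketch of the "wrap it and exchange XML messages" idea follows, written as a plain function rather than a real networked service. The message formats (`PriceRequest`, `PriceResponse`, `Sku`, `ItemCode`) and the price table are hypothetical. The second request shows the understanding problem: it is well-formed XML with the same intent, but the service does not recognize the vocabulary.

```python
import xml.etree.ElementTree as ET

# Hypothetical price list hidden behind the service interface.
_PRICES = {"A100": "19.95", "B200": "5.00"}


def price_service(request_xml: str) -> str:
    """Accept an XML request message and reply with an XML response message."""
    request = ET.fromstring(request_xml)
    sku = request.findtext("Sku")  # None if the sender used a different name
    response = ET.Element("PriceResponse")
    if sku in _PRICES:
        ET.SubElement(response, "Sku").text = sku
        ET.SubElement(response, "Price").text = _PRICES[sku]
    else:
        ET.SubElement(response, "Error").text = "unknown item"
    return ET.tostring(response, encoding="unicode")


if __name__ == "__main__":
    # The sender uses the vocabulary the service expects.
    print(price_service("<PriceRequest><Sku>A100</Sku></PriceRequest>"))
    # Same intent, different element name: the message cannot be understood.
    print(price_service("<PriceRequest><ItemCode>A100</ItemCode></PriceRequest>"))
```

The caller never sees how prices are stored, which is the encapsulation benefit. The second call fails only because the two sides never agreed on what the elements are called and what they mean.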
We've just touched on almost every topic in the course
Models of Business Organization and Business Processes
Models of Business Information; XML Vocabularies
Models of Business Architecture; Web Services
Analyzing and Modeling Business Processes, Documents and Information Components [ABOUT 10 LECTURES]
Model-Based Applications and User Interfaces; UI Design Patterns
Management and Strategy Issues, Case Studies
Glushko & McGrath, Document Engineering, MIT Press, 2005.
All readings will be available online or as paper handouts
7 assignments throughout the semester, 5 of which will be graded. These assignments are designed to develop and reinforce practical skills in analysis, modeling, and implementation of document-centric and model-based applications
Students taking the class for a letter grade will also be required to carry out a "mini-project" during the second half of the semester working in teams of 2 or 3.
Students taking the class S/U will not be required to do the mini-project but will instead serve as reviewers or consultants for other mini-projects.
There is no final exam or midterm.
Assignments are 50% of final grade
Some of your assignments will be created in XML and turned in as HTML to preserve and extend the skills you are assumed to already have
Individual assignments teach separate skills that you'll bring together in a 2-4 person team project
Last half of the semester; pick project by 1 March, incremental reports up to final presentation and report at semester end
One possibility: Requirements through core schema and rudimentary project plan, but completely negotiable and you might be able to do a single project in 213/mix-remix/DE+IA
30% of final grade
For SIMS 2007ers, could be incubator for MIMS project (or summer internship, or 2006-7 GSR appt.)
2002-2003
Course Approval System -- analysis and redesign of system by which new courses are born (primary clients: Academic Senate, IS&T)
2003-2004
System Map -- interactive inventory and visualization of campus IT systems, precursor to campus-wide Data Dictionary (primary client: Central Computing Services [Shel Waggener])
Digital Chemistry -- data modeling to enable content sharing across delivery platforms (primary client: Chemistry department [Mark Kubinec])
Event Calendar Network -- replace hodge-podge of calendars that can't share events with repository and syndication/reuse network (primary client: public affairs [Jeff Kahn])
Role-based Access Control -- single sign-on for campus systems, with field-level access control and dynamic assembly of form user interfaces (primary client: graduate division [Chris Hoffman])
Center in a Box -- dynamic assembly of web site for Centers, all of which have (or ought to have) the same data model (mission, people, projects, publications, news, events, resources)
2004-2005
Syllabus project -- common data model for all syllabi to enable dynamic generation of custom and aggregated views (primary client: SIMS)
2005 Class Projects
redesign of Center in a Box
Genentech process control documents
peer market
generic model of content management
personal health record
Class Participation: 20% (in class, blog, list server)
Document Engineering is practical but also intellectually challenging
I'm not a formal person and will be as accessible as I can to all of you – my official office hours are proposed as M 11-12 & Tu 2-3
But my informality doesn't mean I'm casual about what goes on in my class.
Sylvia is familiar to SIMS students
Sign up for "is243" list server
e-mail to majordomo@sims.berkeley.edu
Subject: Leave blank
Body of message: subscribe is243
part of Chapter 16 of Document Engineering [Textbook, 554-571]
"Accelerating RosettaNet" Burgert, E-Commerce World (November 2001) [Online]
Paperless Trading: Benefits to APEC, Australian Department of Foreign Affairs and Trade (2001) [Online]
"HIT and MIS: Implications of Health Information Technology and Medical Information Systems" P. Goldschmidt Communications of the ACM (October 2005) [Online, 69-74]
Find a story (or advertisement) that involves Document Engineering
Post a bibliographic citation or URL to the course blog
Write a sentence or two about the story, highlighting some Document Engineering aspect
If you can't find a story on your own, read one that someone else has posted and add some insightful comment
Do this by next Monday before class