Imago
Assignment 3
Imago Metadata [Excel Spreadsheet]
Rationale for Metadata Scheme
1.0 Introduction
We defined our Faceted Monohierarchical Metadata scheme using a
top-down design process. Initially, we established a framework at
a generic level for the types of metadata that might be appropriate
to describe photos. Then, we established the objectives and boundaries
of the types of applications that might be consumers of the metadata
scheme, consistent with the assignment. This led us to some organizing
principles to use in designing and refining our classification scheme.
This document parallels the design process in order to establish
the framework and constraints of the metadata scheme. This allows
us to describe the major design decisions we confronted, and to
establish the context for these when we provide the rationale for
the final scheme itself. We then conclude with some comments on
what we see as the strengths and weaknesses of our design.
2.0 General Framework
Prior to designing the specifics of our photo metadata scheme,
we examined the problem in generic terms. We proposed that photo
metadata falls into three general classes of facets:
1) Content
2) Context
3) Qualities (or Form)
Let us examine each in turn.
2.1 Content
This is the content of the photo itself, indicating what the picture
is about. The general facets describing content would include: People,
Location, Event, and Thing.
We discussed and rejected adding more abstract content facets for
our general framework, to capture such concepts as emotion, truth,
beauty, and the like. We elected to stick to the more concrete view
of photo content.
2.2 Context
We described context as the information describing the circumstances
in which the photo was taken. For example, the photo was taken on
a specific date and time, most likely by a specific person, and
at a certain resolution. As digital cameras become more sophisticated
over time, metadata may be automatically generated when a photo
is taken. This might include such information as the GPS coordinates
of the camera and compass direction the camera was facing, as well
as camera model and mode information.
Some general facets describing the context of the photo when it
was created might include: who took the photo, date/time, GPS location,
camera model, resolution, and color/grayscale planes.
2.3 Qualities (or Form)
These facets describe the qualities of the photo itself, such as
the dominant color in the photo, or the "mood" of the
photo. Such metadata can be useful in design contexts, such as when
a designer is trying to find a "somber green" photo of
flowers to fit into an advertisement. Qualities might include dark/light,
dominant color(s), portrait/action shot/still life, and mood.
3.0 Application Needs and Constraints
Given this general framework, the Imago team then considered the
uses of the metadata scheme for this assignment.
The applications in this assignment, including but not limited
to our own, are to depict "life at SIMS" through photos.
This objective bounds the design of our faceted metadata scheme.
Further, to be useful in this context, the scheme should be kept
simple and to the point, and not grow too elaborate.
When embarking on a design, it is also important to consider what
the design is NOT intended to be. With this metadata design, we
are not attempting to design a general-purpose categorization scheme
for all possible images.
We are describing photos taken by SIMS students with digital cameras.
We are NOT considering non-photo images, such as "screen shots."
Similarly, we are not addressing the complexity that
arises for images that have been edited by applications such as
Photoshop, where an image can carry a mixture of origin information
and the like.
In particular, we decided that our scheme would focus on the Content
facets of the photos. Most of the Qualities facets of the photos
would not be particularly useful for the targeted application set,
and hence would add to the complexity of the metadata design without
contributing much value. Additionally, we focused on the Context
facets that are available today, as opposed to hypothetical future
features (such as GPS latitude and longitude stamps), and that did
not seem obscure.
These design decisions led us to the following top-level facets:
People
Place
Event
Thing
Date_Time
Sequence
Resolution
Photo Type
Environment
Owner
As we worked through these, we concluded that Thing was too general
a category; if we attempted to categorize every type of thing in
the world, we would go well beyond the scope of the project. Further,
life at SIMS is not primarily about things, and hence a Thing facet
would not enhance our primary objective of being useful for the targeted
class of applications. So, we elected to handle photos of "things"
through the Photo Type facet, and eliminated the top-level Thing
facet. (We discuss this in more detail below, in the section on
the Photo Type facet.)
When we explored the People category (in particular), we found
that affiliation would serve well as an additional, complementary
facet. Affiliation (e.g. SIMS, UC Berkeley, etc.) repeatedly appeared
in our initial attempts at building a monohierarchical classification.
Breaking it out as a separate facet simplified our design significantly.
These additional design decisions thus led us to the following
refinement of our top-level facets:
People
Affiliation
Place
Event
Date_Time
Sequence
Resolution
Photo Type
Environment
Owner
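As an illustration only, the refined facet list can be pictured as a record holding exactly one value per top-level facet. In the sketch below, the `make_record` helper, the choice to default every missing facet to "unknown", and the sample values are assumptions made for this example; they are not taken from the spreadsheet.

```python
# The ten top-level facets from the refined list above.
TOP_LEVEL_FACETS = (
    "People", "Affiliation", "Place", "Event", "Date_Time",
    "Sequence", "Resolution", "Photo Type", "Environment", "Owner",
)


def make_record(values):
    """Return a photo record with exactly one value per top-level facet.

    Assumption for this sketch: facets that are not supplied default to
    "unknown". Names outside the scheme (e.g. the dropped Thing facet)
    are rejected.
    """
    extra = set(values) - set(TOP_LEVEL_FACETS)
    if extra:
        raise ValueError(f"not a top-level facet: {sorted(extra)}")
    return {facet: values.get(facet, "unknown") for facet in TOP_LEVEL_FACETS}


# Hypothetical sample photo: only two facets are known.
record = make_record({
    "Photo Type": "Photo Type:Static:Object",
    "Environment": "outdoor",
})
```

Because the scheme is monohierarchical, each facet holds a single path rather than a list of paths.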
3.1 Additional Design Principles
As the applications we are targeting with our classification scheme
are SIMS and U.C. Berkeley oriented, we adopted a parallel approach
for designing the hierarchies within each top-level facet. Thus, for
example, we did not classify Place using a classic
World:NorthAmerica:USA:California:etc. style scheme. Rather, we
chose an "inside-out" approach, specifying U.C. Berkeley,
Berkeley-Non-UC, and Non-Berkeley as our first-level sub-facets.
Restating this principle more generally, we defined our classification
scheme using a SIMS-centric approach. We adopted the analogy of
the "Berkeley's View of the World" map, which was mentioned
in class.
Using this principle, for example, when expanding the tree under
the Place facet, South Hall was put on the same level with the remainder
of U.C. Berkeley, which in turn was put on par with the remainder
of Berkeley, which in turn was put on par with the rest of the world.
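One reading of this inside-out layout can be sketched as a small nested tree. The three first-level names come from the text; the label "Other U.C. Berkeley" for the remainder of campus and the `leaf_paths` helper are hypothetical, introduced only for this example.

```python
# Inside-out Place tree: South Hall sits beside the rest of campus,
# which sits beside the rest of Berkeley and the rest of the world.
PLACE = {
    "U.C. Berkeley": {
        "South Hall": {},
        "Other U.C. Berkeley": {},  # hypothetical label for the remainder
    },
    "Berkeley-Non-UC": {},
    "Non-Berkeley": {},
}


def leaf_paths(tree, prefix=()):
    """Yield every leaf of the tree as a colon-delimited path string."""
    for name, children in tree.items():
        here = prefix + (name,)
        if children:
            yield from leaf_paths(children, here)
        else:
            yield ":".join(here)


place_paths = list(leaf_paths(PLACE, ("Place",)))
```

The tree is only three levels deep, in contrast to a World:NorthAmerica:USA:California style hierarchy, which would need several levels before reaching anything relevant to SIMS.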
Another principle was to keep our scheme simple and small, so that
it would be usable by students and others without being unwieldy
or requiring training to use.
Finally, we adopted the principle of frequency-of-use. Simply stated,
include categories that are used frequently, and consolidate or
omit the others. At this early point in the evolution of the image
library, this principle might better be described as expected-frequency-of-use
(or, perhaps, our-best-guess-at-frequency-of-use).
In summary, we established constraints around our metadata scheme
by considering the application set and users of this scheme. We
elaborated our design using the principles of (1) usefulness for
the anticipated application set and users, (2) SIMS-centricity,
(3) simplicity and brevity, and (4) frequency-of-use.
4.0 Discussion of the Metadata Scheme
Now that we have established the context, let us examine the final
metadata facets and sub-facets we have defined. Please also refer
to the accompanying Excel spreadsheet depicting our full metadata
scheme.
4.1 People
We have enumerated the people at SIMS. (If the person in the photo
is not among these, then one of the "unknown" sub-categories
of the People facet is assigned.) We also allowed for a photo to
be described as a "group" of people. Here we applied our
SIMS-centric lens to the definition of this category.
4.2 Affiliation
The affiliation facet should be highly effective when used in conjunction
with other facets, in particular with the People facet. We again applied
our SIMS-centric lens, as well as the frequency-of-use principle,
when defining the subcategories under this facet.
4.3 Place
Clearly the guiding principle for the Place facet is SIMS-centricity.
We were careful to specify places in a mutually exclusive manner
- specifically, UCBerkeley vs. Berkeley-nonUCB vs. nonBerkeley.
This avoids ambiguity as we are defining our monohierarchical tree.
There may be some errant examples here, such as a U.C. Berkeley
Forensics lab located in Richmond, California, which would be both
UCBerkeley and nonBerkeley.
However, these were seen as obscure and unlikely to be a relevant
factor in "depicting life at SIMS". We could not justify
introducing the complexity to handle such cases; our guiding design
principles here were simplicity, SIMS-centricity, and frequency-of-use.
4.4 Event
For the Event facet, we used the principles of SIMS-centricity
and frequency-of-use in defining our subcategories. In addition,
we did not try to enumerate a very large number of events, as that
would make the scheme unwieldy and violate our simplicity principle.
4.5 Owner
We included the "Owner" facet to represent the person
who took the photo or who owns it. For example, there may be a shot
of David and his children on the South Hall steps that he had a
passerby (e.g. Phil Walz) take; in that case David, not the passerby,
is listed as the "owner" of the photo. Note that "unknown" is
considered a valid Owner. We felt that this facet was justified by
our usefulness principle.
4.6 Date_Time, Environment, and Resolution
A date and time stamp indicating when the photo was taken was considered
a potentially very useful feature. Similarly, a photo's environment
(indoor or outdoor) was also seen as a useful subdivision.
Photo resolution seemed generally useful, as an application may
have minimum or maximum size constraints based on available screen
real-estate or other user interface considerations. We elected to
enumerate the standard resolutions, based on the frequency-of-use
principle, and let non-standard sizes fall into the "other"
subcategory.
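A minimal sketch of this bucketing follows. The list of "standard" sizes here is an assumption chosen for illustration (the actual enumeration is in the spreadsheet), and the `resolution_category` function and the `Resolution:WIDTHxHEIGHT` path format are likewise hypothetical.

```python
# Assumed "standard" sizes for this sketch only; see the spreadsheet
# for the sizes the scheme actually enumerates.
STANDARD_RESOLUTIONS = {(640, 480), (800, 600), (1024, 768), (1600, 1200)}


def resolution_category(width, height):
    """Map pixel dimensions to a Resolution path, or to "other"."""
    if (width, height) in STANDARD_RESOLUTIONS:
        return f"Resolution:{width}x{height}"
    return "Resolution:other"


standard = resolution_category(640, 480)
cropped = resolution_category(700, 500)
```

An application with a maximum display size could then filter on these categories without inspecting the image files themselves.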
4.7 Sequence
The team felt that it would be important to preserve the sequence
in which a series of photos is taken. Photos may be taken in a particular
sequence to tell a story or convey particular information. To address
this, we incorporated the Sequence facet.
The remainder of the metadata associated with a photo can be
used to establish a de facto sequence ID identifying the series to
which the photo belongs, and the "sequence numbers" then put the
photos in that series in order.
(The "number space" of the photos' metadata is sufficiently
large that conflicts, while possible, are exceedingly unlikely.
Hence we erred on the side of simplicity and did not attempt to
enumerate a large number of Sequence IDs as a subcategory of the
Sequence facet.)
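The idea of a de facto sequence ID can be sketched as follows. Which facets make up the ID is not specified in the scheme; the choice of Owner, Event, and the date portion of Date_Time below is an assumption, as are the `group_sequences` helper, the `File` key, and all sample values.

```python
from collections import defaultdict

# Assumed choice of facets whose combined values act as the sequence ID.
KEY_FACETS = ("Owner", "Event", "Date")


def group_sequences(photos):
    """Group photos by their de facto sequence ID, ordered by Sequence."""
    groups = defaultdict(list)
    for photo in photos:
        key = tuple(photo[facet] for facet in KEY_FACETS)
        groups[key].append(photo)
    for members in groups.values():
        members.sort(key=lambda p: p["Sequence"])
    return dict(groups)


# Hypothetical sample data, deliberately listed out of order.
photos = [
    {"File": "c.jpg", "Owner": "David", "Event": "Picnic", "Date": "2001-10-05", "Sequence": 3},
    {"File": "a.jpg", "Owner": "David", "Event": "Picnic", "Date": "2001-10-05", "Sequence": 1},
    {"File": "x.jpg", "Owner": "unknown", "Event": "Lecture", "Date": "2001-10-08", "Sequence": 1},
    {"File": "b.jpg", "Owner": "David", "Event": "Picnic", "Date": "2001-10-05", "Sequence": 2},
]
series = group_sequences(photos)
```

Two unrelated series collide only if every key facet matches, which is the sense in which conflicts are possible but unlikely.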
4.8 Photo Type
Finally, the Photo Type facet was defined as follows:
Photo Type:Static:PortraitOf Person
Photo Type:Static:Object
Photo Type:Action
This allowed us to capture the notion of Portraits and Action photos,
which we had grouped as Quality (or Form) facets in our original
framework. These may be important aspects of photos for certain
applications (e.g. profile pages or home pages may want a "Portrait"
of "John", as opposed to the 20 other pictures of John).
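The profile-page case can be sketched as a simple filter over photo records. The three paths are those defined above; the `portraits_of` helper, the record layout, and the sample photos are hypothetical and serve only to illustrate the lookup.

```python
# The three Photo Type paths defined by the scheme.
PHOTO_TYPE_PATHS = (
    "Photo Type:Static:PortraitOf Person",
    "Photo Type:Static:Object",
    "Photo Type:Action",
)
PORTRAIT = PHOTO_TYPE_PATHS[0]


def portraits_of(photos, person):
    """Return the photos of `person` that are classified as portraits."""
    for photo in photos:
        if photo["Photo Type"] not in PHOTO_TYPE_PATHS:
            raise ValueError(f"unknown Photo Type path: {photo['Photo Type']}")
    return [
        p for p in photos
        if p["People"] == person and p["Photo Type"] == PORTRAIT
    ]


# Hypothetical sample records.
photos = [
    {"File": "john1.jpg", "People": "John", "Photo Type": "Photo Type:Action"},
    {"File": "john2.jpg", "People": "John", "Photo Type": PORTRAIT},
    {"File": "desk.jpg", "People": "unknown", "Photo Type": "Photo Type:Static:Object"},
]
profile_candidates = portraits_of(photos, "John")
```

Combining two facets (People and Photo Type) in one query is what lets an application pick the single portrait out of many pictures of the same person.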
As noted earlier in the document, the Object classification was
introduced to this hierarchy as an expedient and simple way of handling
photos of objects, allowing us to eliminate the entire top-level
Thing category in our original framework. We concluded that building
a categorization scheme for "things" was excessive for
our targeted range of applications. We made this decision because it
was consistent with our SIMS-centricity and simplicity principles.
However, this was the most challenging point during our entire design
process.
5.0 Strengths and Weaknesses
The main strengths of our design are its simplicity and usefulness
for the task at hand. It is not just a theoretical exercise. It
is a practical scheme. Given the boundaries we established based
on the application set and on the objectives of the assignment,
our scheme should allow for the photos to be used and reused easily
in a variety of likely applications.
The most pronounced difficulty in our metadata scheme is that it
handles photos of objects extremely generally. Essentially, they
all wind up categorized as "Photo Type:Static:Object"
without additional explanation of the type of object.
We definitely did not want to go down the path of trying to establish
a top-down hierarchy for everything from hamburgers to fine cigars,
and herbaceous plants to pot-bellied pigs. Such a Sisyphean pursuit
would go against all of our stated design principles.
Earlier we gave the rationale that "life at SIMS is not primarily
about things". However, we realize that this is an oversimplification
of the issue, and foresee the possibility that some applications
may want to retrieve or display "objects" that have been
captured in the photos.
Indeed, while brevity is a stated and desirable design goal, we
also see it as the greatest general weakness of our overall scheme.
It is a "double-edged sword." While the overall scheme
holds together cogently, we expect that there will be a tendency
for certain paths to be used substantially more often than others
within a given facet. Expressed a different way, the frequency distribution
of the use of the paths will not be even.
Our approach to addressing this general weakness is to apply our
frequency-of-use principle, and examine the most frequently used
paths to identify the problem areas. Where appropriate, we can then
address these by dividing a sub-category or by expanding the next
sub-level of the hierarchy (e.g. Photo Type:Static:Object:xxx) to
allow further discrimination of photos for both classification and
retrieval. Conversely, very infrequently used paths that are not
providing much utility could be consolidated (e.g. into
Photo Type:Static:Other Objects).
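A review of this kind could be sketched as below, once real usage data exists. The `review_paths` function, the two threshold values, and the sample counts are assumptions for illustration; the scheme itself does not define any thresholds.

```python
from collections import Counter


def review_paths(assigned_paths, split_share=0.5, merge_share=0.1):
    """Flag heavily used paths to split and rarely used paths to merge.

    `assigned_paths` holds one path per classified photo. The share
    thresholds are assumed values, not part of the scheme.
    """
    counts = Counter(assigned_paths)
    total = sum(counts.values())
    to_split = sorted(p for p, n in counts.items() if n / total >= split_share)
    to_merge = sorted(p for p, n in counts.items() if n / total <= merge_share)
    return to_split, to_merge


# Hypothetical usage: 20 photos classified under the Photo Type facet.
usage = (
    ["Photo Type:Static:Object"] * 14
    + ["Photo Type:Action"] * 5
    + ["Photo Type:Static:PortraitOf Person"] * 1
)
to_split, to_merge = review_paths(usage)
```

With these sample counts, the Object path would be flagged for expansion into a further sub-level, which matches the weakness identified above.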
We speculate that this approach would be particularly effective
for addressing the weaknesses our scheme has in handling objects.
There are a finite number of objects that would be commonly associated
with "life at SIMS" and these would likely fall into a
Zipfian distribution. We would expect the same to be true for Places
and Events.
However, this approach is empirical. If we attempted to enumerate
these categories now, with no experience or feedback from
using our metadata scheme, we would only be guessing. Our assessment
is that such guesses would only reduce the quality of the initial version
of our metadata scheme. So our solution is to gain some practical
experience with the applications and the photos, to see what is
really used, and then to iterate on the design to expand our
classification scheme.