Generalization Hierarchies
Up to this point, we have described an object,
the entity, by its shared characteristics, the
attributes. For example, we can characterize an
employee by employee id, name, job title, and
skill set.
Another method of characterizing entities is
by both similarities and differences. For
example, suppose an organization categorizes the
work it does into internal and external
projects. Internal projects are done on behalf
of some unit within the organization. External
projects are done for entities outside of the
organization. We can recognize that both types
of projects are similar in that each involves
work done by employees of the organization
within a given schedule. Yet we also recognize
that there are differences between them.
External projects have unique attributes, such
as a customer identifier and the fee charged to
the customer. This process of categorizing
entities by their similarities and differences
is known as generalization.
Description
A generalization hierarchy is a structured
grouping of entities that share common
attributes. It is a powerful and widely used
method for representing common characteristics
among entities while preserving their
differences. It is the relationship between an
entity and one or more refined versions of it. The
entity being refined is called the supertype,
and each refined version is called a
subtype. The general form for a
generalization hierarchy is shown in Figure 1.
Generalization hierarchies should be used
when (1) a large number of entities appear to be
of the same type, (2) attributes are repeated
for multiple entities, or (3) the model is
continually evolving. Generalization hierarchies
improve the stability of the model by confining
each change to the entities it actually affects,
and they simplify the model by reducing the
number of entities it contains.
Creating a Generalization Hierarchy
To construct a generalization hierarchy, all
common attributes are assigned to the supertype.
The supertype is also assigned an attribute,
called a discriminator, whose values identify
the categories of the subtypes. Attributes
unique to a category are assigned to the
appropriate subtype. Each subtype also inherits
the primary key of the supertype. Subtypes that
have only a primary key should be eliminated.
Subtypes are related to the supertype through a
one-to-one relationship.
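
The steps above can be sketched with Python's
standard sqlite3 module, using the internal and
external projects from the introduction. This is
only an illustration under stated assumptions, not
a prescribed implementation: the table and column
names, the project_type discriminator, the
name/start_date/end_date attributes, and the
sponsoring_unit attribute given to internal
projects are all hypothetical. Only the customer
identifier and the fee come from the example in
the text.

```python
import sqlite3


def build_project_hierarchy(conn):
    """Create one supertype table and two subtype tables (hypothetical names)."""
    conn.executescript("""
        -- Supertype: common attributes plus the discriminator.
        CREATE TABLE project (
            project_id   INTEGER PRIMARY KEY,
            name         TEXT NOT NULL,
            start_date   TEXT,
            end_date     TEXT,
            project_type TEXT NOT NULL
                CHECK (project_type IN ('INTERNAL', 'EXTERNAL'))
        );
        -- Subtype: reuses the supertype's primary key, which makes the
        -- relationship one-to-one, and holds only its unique attributes.
        CREATE TABLE external_project (
            project_id  INTEGER PRIMARY KEY REFERENCES project(project_id),
            customer_id INTEGER NOT NULL,
            fee         REAL NOT NULL
        );
        -- Subtype with an assumed unique attribute (sponsoring_unit).
        CREATE TABLE internal_project (
            project_id      INTEGER PRIMARY KEY REFERENCES project(project_id),
            sponsoring_unit TEXT NOT NULL
        );
    """)


conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
build_project_hierarchy(conn)

# Common attributes go to the supertype row, unique ones to the subtype row.
conn.execute(
    "INSERT INTO project VALUES (1, 'Website redesign', "
    "'2024-01-01', '2024-06-30', 'EXTERNAL')"
)
conn.execute("INSERT INTO external_project VALUES (1, 42, 15000.0)")

row = conn.execute("""
    SELECT p.name, p.project_type, e.customer_id, e.fee
    FROM project AS p
    JOIN external_project AS e ON e.project_id = p.project_id
""").fetchone()
```

Had internal projects carried no attribute of
their own, the internal_project table would hold
nothing but a primary key and, by the guideline
above, would be dropped.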
Types of Hierarchies
A generalization hierarchy can be either
overlapping or disjoint. In an overlapping
hierarchy, an entity instance can be part of
multiple subtypes. For example, to represent
people at a university, suppose you have
identified the supertype entity PERSON, which has
three subtypes: FACULTY, STAFF, and STUDENT. An
individual may well belong to more than one
subtype, such as a staff member who is also
registered as a student.
In a disjoint hierarchy, an entity instance
can be in only one subtype. For example, the
entity EMPLOYEE may have two subtypes,
CLASSIFIED and WAGES. An employee may be one
type or the other but not both. Figure 1 shows
(A) an overlapping and (B) a disjoint
generalization hierarchy.
Figure 1: Examples of
Generalization Hierarchies
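
The distinction between the two types can also be
checked mechanically. The following is a minimal
sketch, assuming each subtype is represented as a
set of instance identifiers; the function names
and the sample ids are hypothetical.

```python
from itertools import combinations


def overlapping_instances(subtypes):
    """Return the instance ids that appear in more than one subtype."""
    shared = set()
    for members_a, members_b in combinations(subtypes.values(), 2):
        shared |= members_a & members_b
    return shared


def is_disjoint(subtypes):
    """A hierarchy is disjoint when no instance is in two subtypes."""
    return not overlapping_instances(subtypes)


# PERSON hierarchy: person 4 is both a staff member and a student.
person = {"FACULTY": {1, 2}, "STAFF": {3, 4}, "STUDENT": {4, 5}}

# EMPLOYEE hierarchy: every employee is in exactly one subtype.
employee = {"CLASSIFIED": {10, 11}, "WAGES": {12}}

print(overlapping_instances(person))  # {4}
print(is_disjoint(employee))          # True
```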

Rules
The primary rule of generalization
hierarchies is that each instance of the
supertype entity must appear in at least one
subtype; likewise, an instance of the subtype
must appear in the supertype.
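
As a rough illustration, the primary rule can be
expressed as two set differences. The sketch below
assumes instances are identified by simple ids;
check_primary_rule and its return convention are
hypothetical, not part of any modeling tool.

```python
def check_primary_rule(supertype_ids, subtypes):
    """Return (uncovered, orphaned) for a generalization hierarchy.

    uncovered: supertype instances that appear in no subtype.
    orphaned:  subtype instances that are missing from the supertype.
    Both sets are empty when the primary rule holds.
    """
    in_some_subtype = set().union(*subtypes.values())
    uncovered = set(supertype_ids) - in_some_subtype
    orphaned = in_some_subtype - set(supertype_ids)
    return uncovered, orphaned


# Employee 3 has no subtype row; employee 4 has no supertype row.
uncovered, orphaned = check_primary_rule(
    {1, 2, 3},
    {"CLASSIFIED": {1}, "WAGES": {2, 4}},
)
print(uncovered, orphaned)  # {3} {4}
```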
Subtypes can be a part of only one
generalization hierarchy. That is, a subtype
cannot be related to more than one supertype.
However, generalization hierarchies may be
nested by having the subtype of one hierarchy be
the supertype for another.
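
A short sketch of this rule, assuming the model is
recorded as (supertype, subtype) pairs: nesting is
accepted, while a subtype with two supertypes is
reported. The function name, the GRADUATE subtype,
and the second EMPLOYEE link are hypothetical and
appear only for illustration.

```python
from collections import defaultdict


def subtypes_with_multiple_supertypes(links):
    """Map each offending subtype to the set of supertypes it is linked to."""
    parents = defaultdict(set)
    for supertype, subtype in links:
        parents[subtype].add(supertype)
    return {sub: sups for sub, sups in parents.items() if len(sups) > 1}


# Nested hierarchy: STUDENT is a subtype of PERSON and a supertype of GRADUATE.
nested = [
    ("PERSON", "FACULTY"),
    ("PERSON", "STAFF"),
    ("PERSON", "STUDENT"),
    ("STUDENT", "GRADUATE"),
]

# Invalid: STAFF would belong to two generalization hierarchies.
invalid = nested + [("EMPLOYEE", "STAFF")]

nested_problems = subtypes_with_multiple_supertypes(nested)
invalid_problems = subtypes_with_multiple_supertypes(invalid)
```

Here nested_problems is empty, while
invalid_problems reports STAFF together with both
of its supertypes.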
Subtypes may be the parent entity in a
relationship but not the child. If a subtype were
allowed to be the child, it would inherit two
primary keys: one from its supertype and one from
the parent entity.
Summary
Generalization hierarchies are a structure
that enables the modeler to represent entities
that share common characteristics but also have
differences.
The next and final step in the modeling
process is to add data Integrity
Rules.