Data modeling is a method used to define and analyze the data requirements needed to support the business processes of
an organization. These requirements are recorded as a conceptual data model (also known as a domain model or an information model) with associated data definitions. The implementation of
a conceptual model is called a logical data model. More than one logical data model might be needed to implement a single
conceptual data model [WKIP_DML].
Both [AMB01] and [WKIP_DML] provide brief descriptions of a data modeling process.
Ambler's is summarized as follows:
- Identify entity types
- Identify attributes
- Apply naming conventions
- Identify relationships
- Apply data model patterns
- Assign keys
- Normalize to reduce data redundancy
- Denormalize to improve performance
Details for each step of the process are provided in [AMB01].
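As a rough illustration of several of these steps, the following minimal sketch builds a small logical model using Python's standard sqlite3 module. The entity types (customer, customer_order), their attributes, and the sample rows are hypothetical and chosen only for illustration; they do not come from [AMB01]. The sketch covers identifying entity types and attributes, applying a naming convention, identifying a one-to-many relationship, assigning keys, and keeping the tables normalized. Data model patterns and denormalization are not shown.

```python
import sqlite3

# Hypothetical entity types for illustration only: customer and customer_order.
# Naming convention assumed here: lower_snake_case, singular table names.
SCHEMA = """
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,   -- assigned key
    full_name   TEXT NOT NULL          -- attribute
);
CREATE TABLE customer_order (
    order_id    INTEGER PRIMARY KEY,   -- assigned key
    -- relationship: each order belongs to exactly one customer,
    -- and a customer may have zero or more orders
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
    order_date  TEXT NOT NULL          -- attribute (ISO 8601 date string)
);
"""


def build_model():
    """Create an in-memory database holding the sketch schema."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is enabled.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


conn = build_model()
conn.execute(
    "INSERT INTO customer (customer_id, full_name) VALUES (?, ?)",
    (1, "Ada Example"),
)
conn.executemany(
    "INSERT INTO customer_order (order_id, customer_id, order_date) "
    "VALUES (?, ?, ?)",
    [(10, 1, "2024-01-05"), (11, 1, "2024-02-17")],
)

# Normalization: the customer's name is stored once and reached via a join,
# rather than being repeated on every order row.
rows = conn.execute(
    "SELECT c.full_name, COUNT(o.order_id) "
    "FROM customer AS c "
    "JOIN customer_order AS o ON o.customer_id = c.customer_id "
    "GROUP BY c.customer_id"
).fetchall()
print(rows)
```

Storing the customer name only in the customer table is what the normalization step aims for. A denormalized variant might copy the name onto each order row to avoid the join, at the cost of redundant data.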
There is no universally adopted notation for representing data models. The two most popular notations, that of
the Information Engineering Method [FNK89] and Barker's notation [BRK90], emerged during the 1970s and 1980s, respectively. Each
uses a variant of the "pitchfork" notation, which describes multiplicity in entity-to-entity relationships and
is commonly seen in data model Entity-Relationship diagrams (ERDs). Another notation, IDEF1X, was developed
with the sponsorship of the United States Air Force during the early 1980s [BRC92]. IDEF1X ERDs are characterized by a "ball" notation that
denotes the various multiplicities. Unified Modeling Language (UML) has seen limited penetration into the data
modeling domain, even though UML is semantically richer than any of the other three notations and is able to
represent relationships that the other notations cannot.
[AMB01] provides a useful tabular comparison of the graphical elements offered by each of the
four notations above.