Jump to content

Entity–relationship model

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by Klausness (talk | contribs) at 11:43, 7 November 2008 (adding back lists of diagramming tools (minus general-purpose diagram-drawing tools with no actual database design features), since it seemed useful; left in red links for future articles). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

An entity-relationship model (ERM) is an abstract conceptual representation of structured data. Entity-relationship modeling is a relational schema database modeling method, used in software engineering to produce a type of conceptual data model (or semantic data model) of a system, often a relational database, and its requirements in a top-down fashion. Diagrams created using this process are called entity-relationship diagrams, or ER diagrams or ERDs for short. The definitive reference for entity relationship modelling is generally given as Peter Chen's 1976 paper[1]. However, variants of the idea existed previously (see for example A. P. G. Brown[2]) and have been devised subsequently.

Overview

The first stage of information system design uses these models during the requirements analysis to describe information needs or the type of information that is to be stored in a database. The data modeling technique can be used to describe any ontology (i.e. an overview and classifications of used terms and their relationships) for a certain universe of discourse (i.e. area of interest). In the case of the design of an information system that is based on a database, the conceptual data model is, at a later stage (usually called logical design), mapped to a logical data model, such as the relational model; this in turn is mapped to a physical model during physical design. Note that sometimes, both of these phases are referred to as "physical design".

There are a number of conventions for entity-relationship diagrams (ERDs). The classical notation is described in the remainder of this article, and mainly relates to conceptual modeling. There are a range of notations more typically employed in logical and physical database design, including IDEF1x (ICAM DEFinition Language) and dimensional modeling.

Connection

Two related entities
An entity with an attribute
A relationship with an attribute
Primary key
A sample ER diagram

An entity may be defined as a thing which is recognised as being capable of an independent existence and which can be uniquely identified. An entity is an abstraction from the complexities of some domain. When we speak of an entity we normally speak of some aspect of the real world which can be distinguished from other aspects of the real world (Beynon-Davies, 2004).

An entity may be a physical object such as a house or a car, an event such as a house sale or a car service, or a concept such as a customer transaction or order. Although the term entity is the one most commonly used, following Chen we should really distinguish between an entity and an entity-type. An entity-type is a category. An entity, strictly speaking, is an instance of a given entity-type. There are usually many instances of an entity-type. Because the term entity-type is somewhat cumbersome, most people tend to use the term entity as a synonym for this term.

Entities can be thought of as nouns. Examples: a computer, an employee, a song, a mathematical theorem. Entities are represented as rectangles.

A relationship captures how two or more entities are related to one another. Relationships can be thought of as verbs, linking two or more nouns. Examples: an owns relationship between a company and a computer, a supervises relationship between an employee and a department, a performs relationship between an artist and a song, a proved relationship between a mathematician and a theorem. Relationships are represented as diamonds, connected by lines to each of the entities in the relationship.

Entities and relationships can both have attributes. Examples: an employee entity might have a Social Security Number (SSN) attribute; the proved relationship may have a date attribute. Attributes are represented as ellipses connected to their owning entity sets by a line.

Every entity (unless it is a weak entity) must have a minimal set of uniquely identifying attributes, which is called the entity's primary key.

Entity-relationship diagrams don't show single entities or single instances of relations. Rather, they show entity sets and relationship sets. Example: a particular song is an entity. The collection of all songs in a database is an entity set. The eaten relationship between a child and her lunch is a single relationship. The set of all such child-lunch relationships in a database is a relationship set.

Lines are drawn between entity sets and the relationship sets they are involved in. If all entities in an entity set must participate in the relationship set, a thick or double line is drawn. This is called a participation constraint. If each entity of the entity set can participate in at most one relationship in the relationship set, an arrow is drawn from the entity set to the relationship set. This is called a key constraint. To indicate that each entity in the entity set is involved in exactly one relationship, a thick arrow is drawn.

Associative entity is used to solve the problem of two entities with a many-to-many relationship [1].

[2]. Unary Relationships - a unary relationship is a relationship between the rows of a single table.

Alternative diagramming conventions

Two related entities shown using Crow's Foot notation

Chen's notation for entity-relationship modelling uses rectangles to represent entities, and diamonds to represent relationships. This notation is appropriate because Chen's relationships are first-class objects: they can have attributes and relationships of their own.

Alternative conventions, with partly historical meaning are:

Crow's Foot

One alternative notation, known as "crow's foot" notation, was developed independently: in these diagrams, entities are represented by boxes, and relationships by labelled arcs.

The "Crow's Foot" notation represents relationships with connecting lines between entities, and pairs of symbols at the ends of those lines to represent the cardinality of the relationship. Crow's Foot notation is used in Barker's Notation and in methodologies such as SSADM and Information Engineering.

For a while Chen's notation was more popular in the United States, while Crow's Foot notation was used primarily in the UK, being used in the 1980s by the then-influential consultancy practice CACI. Many of the consultants at CACI (including Barker) subsequently moved to Oracle UK, where they developed the early versions of Oracle's CASE tools; this had the effect of introducing the notation to a wider audience, and it is now used in many tools including System Architect, Visio, PowerDesigner, ModelRight, Toad Data Modeler, DeZign for Databases, OmniGraffle, MySQL Workbench and Dia. Crow's foot notation has the following benefits:

  • Clarity in identifying the many, or child, side of the relationship, using the crow's foot.
  • Concise notation for identifying mandatory relationship, using a perpendicular bar, or an optional relation, using an open circle.
  • Shows a clear and concise notation that identifies all classes

ER diagramming tools

While ER model diagrams can be drawn with general-purpose diagramming software, there are a number of specialized database-design tools that allow entity-relationship models to be designed and edited.

See also

Proprietary ER diagramming tools

  • ABACUS: data, process, people, application and infrastructure modelling suite with support for many notations including UML, ERD, BPMN, TOGAF and DoDAF.
  • Casewise Corporate Modeler Suite: The Corporate Modeler Suite provides a modeling toolset and extensible repository
  • DeZign for Databases
  • ER/Studio: ER modeling tool from Embarcadero.
  • ERwin: ERD tool, ability to generate HTML reports.
  • MEGA International: The MEGA Modeling Suite is a repository-based suite of modeling products for enterprise architecture, including E/R diagramming, data modeling, and database design.
  • ModelRight: physical modeling tool - free community edition for MySQL.
  • Oracle Designer: Oracle's CASE tool for designing an information system and generating it
  • PowerDesigner: modeling suite from Sybase which includes Data Architect for constructing or reverse engineering conceptual, logical and physical models with many of the leading RDBMS brands.
  • Rational Rose: software designed for producing UML diagrams.
  • RISE Editor: ER tool for designing information systems and generating database and application layer code (web services).
  • Sparx Enterprise Architect: full UML 2.1 which includes data modeling.
  • SQLyog Enterprise edition (GUI client for MySQL) has a diagramming tool that will generate ER-diagrams at the same time as reading or generating physical database objects and can save in a XML-based format for later use.
  • Toad Data Modeler: ER modeling tool with support for both logical and physical modeling. Includes reverse engineering, SQL generation and report generation features for several db systems.
  • Visio: The Enterprise Architect version supports generating and reverse engineering databases
  • Visual Paradigm: Cross platform UML tool which supports round-trip engineering an ERD with a database.
  • Zim : Entity Relationship Integrated Design

Free software ER diagramming tools

References

  1. ^ The Entity Relationship Model - Toward A Unified View of Data
  2. ^ A. P. G. Brown, Modelling a Real-World System and Designing a Schema to Represent It, in Data Base Description, ed Douque and Nijssen, North-Holland, 1975, ISBN 0-7204-2833-5

Further reading

  • Richard Barker (1990). CASE Method: Tasks and Deliverables. Wokingham, England: Addison-Wesley.
  • Paul Beynon-Davies (2004). Database Systems. Houndmills, Basingstoke, UK: Palgrave
  • Chen, Peter (March 1976). "The Entity-Relationship Model - Toward a Unified View of Data". ACM Transactions on Database Systems. 1 (1): 9–36. doi:10.1145/320434.320440.