Dimension (data warehouse)
From Wikipedia, the free encyclopedia
| This article is in need of attention from an expert on the subject. WikiProject Databases may be able to help recruit one. |
- For other senses of dimension, see dimension (disambiguation).
In a data warehouse, a dimension is a data element that categorizes each item in a data set into non-overlapping regions. A data warehouse dimension provides the means to "slice and dice" data in a data warehouse. Dimensions provide structured labeling information to otherwise unordered numeric measures. For example, "Customer", "Date", and "Product" are all dimensions that could be applied meaningfully to a sales receipt. A dimensional data element is similar to a categorical variable in statistics.
The primary function of dimensions is threefold: to provide filtering, grouping and labeling. For example, in a data warehouse where each person is categorized as having a gender of male, female or unknown, a user of the data warehouse would then be able to filter or categorize each presentation or report by either filtering based on the gender dimension or displaying results broken out by the gender.
Each dimension in a data warehouse may have one or more hierarchies applied to it. For the "Date" dimension, there are several possible hierarchies: "Day > Month > Year", "Day > Week > Year", "Day > Month > Quarter > Year", etc.
Contents |
[edit] Type of Dimension
- Conformed Dimension
Dimensions are conformed when they are either exactly the same (including keys) or one is a perfect subset of the other. Most important, the row headers produced in the answer sets from two different conformed dimensions must be able to match perfectly.
Conformed dimensions are either identical or strict mathematical subsets of the most granular, detailed dimension. Dimension tables are not conformed if the attributes are labeled differently or contain different values. Conformed dimensions come in several different flavors. At the most basic level, conformed dimensions mean the exact same thing with every possible fact table to which they are joined. The date dimension table connected to the sales facts is identical to the date dimension connected to the inventory facts.[1]
- Junk Dimension
A junk dimension is a convenient grouping of typically low-cardinality flags and indicators. By creating an abstract dimension, these flags and indicators are removed from the fact table while placing them into a useful dimensional framework.[2]
- Degenerated Dimension
A dimension key, such as a transaction number, invoice number, ticket number, or bill-of-lading number, that has no attributes and hence does not join to an actual dimension table. Degenerate dimensions are very common when the grain of a fact table represents a single transaction item or line item because the degenerate dimension represents the unique identifier of the parent. Degenerate dimensions often play an integral role in the fact table's primary key.[3]
- Role-Playing Dimensions
Dimensions are often recycled for multiple applications within the same database. For instance, a "Date" dimension can be used for "Date of Sale", as well as "Date of Delivery", or "Date of Hire". This is often referred to as a "role-playing dimension".
[edit] Use of ISO representation terms
When referencing data from a metadata registry such as ISO/IEC 11179, representation terms such as Indicator (a boolean true/false value), Code (a set of non-overlapping enumerated values) are typically used as dimensions. For example using the National Information Exchange Model the data element name would be PersonGenderCode and the enumerated values would be male, female and unknown.
[edit] Relationship to other components of a data warehouse
A data warehouse cube is frequently composed of both dimensions or measures. These can then be placed into dimension and fact tables in a relational database.
[edit] See also
- categorical variable
- data warehouse
- dimension table
- degenerate dimension
- slowly changing dimension
- fact table
- ISO/IEC 11179
- measure (data warehouse)
- metadata
[edit] References
- Kimball, Ralph et al. (1998); The Data Warehouse Lifecycle Toolkit, p17. Pub. Wiley. ISBN 0-471-25547-5.
- Kimball, Ralph (1996); The Data Warehouse Toolkit, p. 100. Pub. Wiley. ISBN 0-471-15337-0.
- Notes
- ^ Ralph Kimball, Margy Ross, The Data Warehouse Toolkit: The Complete Guide to Dimensional Modeling, Second Edition, Wiley Computer Publishing, 2002. ISDN 0471-20024-7, Pages 82-87, 394
- ^ Ralph Kimball, Margy Ross, The Data Warehouse Toolkit: The Complete Guide to Dimensional Modeling, Second Edition, Wiley Computer Publishing, 2002. ISDN 0471-20024-7, Pages 202, 405
- ^ Ralph Kimball, Margy Ross, The Data Warehouse Toolkit: The Complete Guide to Dimensional Modeling, Second Edition, Wiley Computer Publishing, 2002. ISDN 0471-20024-7, Pages 50, 398

