Jump to content

Unnormalized form: Difference between revisions

From Wikipedia, the free encyclopedia
Content deleted Content added
Added a section about UNF as a prerequisite stage to normalization, so the UNF link from the normalization page makes sense. (But I wonder if this should be a separate page?)
Added distinction between non-relational and non-first-normal-form
Line 9: Line 9:
}}</ref> At that time, [[office automation]] was the major use of data storage systems, which resulted in the proposal of many NF<sup>2</sup> data models like the Schek model, Jaeschke models (non-recursive and recursive algebra), and the Nested Table Data (NTD) model.<ref name=":0" /> [[IBM]] organized the first international workshop exclusively on this topic in 1987 which was held in [[Darmstadt|Darmstadt, Germany]].<ref name=":0" /> Moreover, a lot of research has been done and journals have been published to address the shortcomings of the [[relational model]]. Since the turn of the century, [[NoSQL]] databases have become popular owing to the demands of [[Web 2.0]].
}}</ref> At that time, [[office automation]] was the major use of data storage systems, which resulted in the proposal of many NF<sup>2</sup> data models like the Schek model, Jaeschke models (non-recursive and recursive algebra), and the Nested Table Data (NTD) model.<ref name=":0" /> [[IBM]] organized the first international workshop exclusively on this topic in 1987 which was held in [[Darmstadt|Darmstadt, Germany]].<ref name=":0" /> Moreover, a lot of research has been done and journals have been published to address the shortcomings of the [[relational model]]. Since the turn of the century, [[NoSQL]] databases have become popular owing to the demands of [[Web 2.0]].


== Relational form ==
== As a prerequisite to normalization ==


The definition of [[first normal form]] implies that the normalization happen on [[Relation_(database)|relations]] <ref>Codd, E.F. (1970). A Relational Model of Data for. Large Shared Data Banks. IBM Research Laboratory, San Jose, California.</ref>. In database systems relations are represented as tables. The relation view implies some constraints on the tables:
Normalization to [[first normal form]] requires the initial data to be viewed as [[Relation_(database)|relations]] <ref>Codd, E.F. (1970). A Relational Model of Data for. Large Shared Data Banks. IBM Research Laboratory, San Jose, California.</ref>. In database systems relations are represented as tables. The relation view implies some constraints on the tables:
* No duplicate rows. In practice this is ensured by defining one or more columns as [[primary keys]].
* No duplicate rows. In practice this is ensured by defining one or more columns as [[primary keys]].
* Rows does not have an intrinsic order. While tables have to be stored and presented in ''some'' order, this is unstable and implementation dependent. If a specific ordering need to be represented, it has to be in the form of data, e.g. a "number" column.
* Rows does not have an intrinsic order. While tables have to be stored and presented in ''some'' order, this is unstable and implementation dependent. If a specific ordering need to be represented, it has to be in the form of data, e.g. a "number" column.
* Columns have unique names within the same table.
* Columns have unique names within the same table.
* Each column have a domain (or data type) which defines the allowed values in the column.
* Each column have a domain (or data type) which defines the allowed values in the column.
* All rows in a table have the same set of columns.


This definition does not preclude columns having sets or relations as values, e.g. nested tables. This is the major difference to [[first normal form]].
This definition does not preclude columns having sets or relations as values, e.g. nested tables. This is the major difference to [[first normal form]].


[[No sql]] databases like [[Document database|Document databases]] typically does not conform to the relational view. For example an JSON or XML database might support duplicate records and intrinsic ordering. Such database can be described as non-relational. But there are also database models which support the relational view, but does not embrace [[first normal form]].<ref>Operations and the Properties on Non-First-Normal-Form Relational Databases H. Arisawa, K. Moriya, T. Miura
Published in VLDB 1983</ref> Such models are called non-first normal form relations (abbreviated NFR, NFNF or NF<sup>2</sup>).


== Example ==
== Example ==

Revision as of 12:44, 9 June 2021

In database normalization Unnormalized form (UNF), also known as an unnormalized relation or non first normal form (NF2),[1] is a database data model (organization of data in a database) which does meet any of the conditions of database normalization defined by the relational model. Database systems which support unnormalized data is sometimes called non-relational or NoSQL databases. In the relational model, unnormalized relations can be considered the starting point for a process of normalization. Should not be confused with denormalization where normalization is deliberately compromised for selected tables in a relational database.

History

In 1970, E.F. Codd proposed the relational data model, now widely accepted as the standard data model.[2] At that time, office automation was the major use of data storage systems, which resulted in the proposal of many NF2 data models like the Schek model, Jaeschke models (non-recursive and recursive algebra), and the Nested Table Data (NTD) model.[1] IBM organized the first international workshop exclusively on this topic in 1987 which was held in Darmstadt, Germany.[1] Moreover, a lot of research has been done and journals have been published to address the shortcomings of the relational model. Since the turn of the century, NoSQL databases have become popular owing to the demands of Web 2.0.

Relational form

Normalization to first normal form requires the initial data to be viewed as relations [3]. In database systems relations are represented as tables. The relation view implies some constraints on the tables:

  • No duplicate rows. In practice this is ensured by defining one or more columns as primary keys.
  • Rows does not have an intrinsic order. While tables have to be stored and presented in some order, this is unstable and implementation dependent. If a specific ordering need to be represented, it has to be in the form of data, e.g. a "number" column.
  • Columns have unique names within the same table.
  • Each column have a domain (or data type) which defines the allowed values in the column.
  • All rows in a table have the same set of columns.

This definition does not preclude columns having sets or relations as values, e.g. nested tables. This is the major difference to first normal form.

No sql databases like Document databases typically does not conform to the relational view. For example an JSON or XML database might support duplicate records and intrinsic ordering. Such database can be described as non-relational. But there are also database models which support the relational view, but does not embrace first normal form.[4] Such models are called non-first normal form relations (abbreviated NFR, NFNF or NF2).

Example

Students take courses:

Id Name Course
1. Jack
  • Mathematics
  • Chemistry
2. Tim Chemistry
3. Ana
  • Physics
  • Chemistry

The above table represent data in unnormalized form as more than one value is stored in a single attribute within a row/tuple. This lack of defined atomicity means that the table has not reached the level of first normal form.[5] Here, one student can take several courses and that data will be reflected in a single row in the database. One can see that relating entities based on 'course' is more problematic, as is ensuring consistency of data (being either impossible or very difficult in a systematic way).

Modern applications

Today, companies like Google, Amazon and Facebook deal with large amounts of data that are difficult to store efficiently. They use NoSQL databases, which are based on the principles of the unnormalized relational model, to deal with the storage issue.[6] Some examples of NoSQL databases are MongoDB, Apache Cassandra and Redis. These databases are more scalable and easier to query with as they do not involve expensive operations like JOIN.[citation needed]

See also

References

  1. ^ a b c Kitagawa, Hiroyuki; Kunii, Tosiyasu L. (1990-02-06). The Unnormalized Relational Data Model. pp. 1, 5, 7, 10. ISBN 978-4-431-70049-4.
  2. ^ "IBM Archives: Edgar F. Codd". April 23, 2003.
  3. ^ Codd, E.F. (1970). A Relational Model of Data for. Large Shared Data Banks. IBM Research Laboratory, San Jose, California.
  4. ^ Operations and the Properties on Non-First-Normal-Form Relational Databases H. Arisawa, K. Moriya, T. Miura Published in VLDB 1983
  5. ^ "Ridiculously Unnormalized Database Schemas – Part One". 2011-01-19. Retrieved 2016-09-14.
  6. ^ Moniruzzaman, A B M; Hossain, Syed Akhter (2013). "NoSQL Database: New Era of Databases for Big data Analytics - Classification, Characteristics and Comparison". International Journal of Database Theory and Application. 6.