Jump to content

Unnormalized form

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by 80.62.117.218 (talk) at 10:27, 9 June 2021 (Added a section about UNF as a prerequisite stage to normalization, so the UNF link from the normalization page makes sense. (But I wonder if this should be a separate page?)). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

In database normalization Unnormalized form (UNF), also known as an unnormalized relation or non first normal form (NF2),[1] is a database data model (organization of data in a database) which does meet any of the conditions of database normalization defined by the relational model. Database systems which support unnormalized data is sometimes called non-relational or NoSQL databases. In the relational model, unnormalized relations can be considered the starting point for a process of normalization. Should not be confused with denormalization where normalization is deliberately compromised for selected tables in a relational database.

History

In 1970, E.F. Codd proposed the relational data model, now widely accepted as the standard data model.[2] At that time, office automation was the major use of data storage systems, which resulted in the proposal of many NF2 data models like the Schek model, Jaeschke models (non-recursive and recursive algebra), and the Nested Table Data (NTD) model.[1] IBM organized the first international workshop exclusively on this topic in 1987 which was held in Darmstadt, Germany.[1] Moreover, a lot of research has been done and journals have been published to address the shortcomings of the relational model. Since the turn of the century, NoSQL databases have become popular owing to the demands of Web 2.0.

As a prerequisite to normalization

The definition of first normal form implies that the normalization happen on relations [3]. In database systems relations are represented as tables. The relation view implies some constraints on the tables:

  • No duplicate rows. In practice this is ensured by defining one or more columns as primary keys.
  • Rows does not have an intrinsic order. While tables have to be stored and presented in some order, this is unstable and implementation dependent. If a specific ordering need to be represented, it has to be in the form of data, e.g. a "number" column.
  • Columns have unique names within the same table.
  • Each column have a domain (or data type) which defines the allowed values in the column.

This definition does not preclude columns having sets or relations as values, e.g. nested tables. This is the major difference to first normal form.


Example

Students take courses:

Id Name Course
1. Jack
  • Mathematics
  • Chemistry
2. Tim Chemistry
3. Ana
  • Physics
  • Chemistry

The above table represent data in unnormalized form as more than one value is stored in a single attribute within a row/tuple. This lack of defined atomicity means that the table has not reached the level of first normal form.[4] Here, one student can take several courses and that data will be reflected in a single row in the database. One can see that relating entities based on 'course' is more problematic, as is ensuring consistency of data (being either impossible or very difficult in a systematic way).

Modern applications

Today, companies like Google, Amazon and Facebook deal with large amounts of data that are difficult to store efficiently. They use NoSQL databases, which are based on the principles of the unnormalized relational model, to deal with the storage issue.[5] Some examples of NoSQL databases are MongoDB, Apache Cassandra and Redis. These databases are more scalable and easier to query with as they do not involve expensive operations like JOIN.[citation needed]

See also

References

  1. ^ a b c Kitagawa, Hiroyuki; Kunii, Tosiyasu L. (1990-02-06). The Unnormalized Relational Data Model. pp. 1, 5, 7, 10. ISBN 978-4-431-70049-4.
  2. ^ "IBM Archives: Edgar F. Codd". April 23, 2003.
  3. ^ Codd, E.F. (1970). A Relational Model of Data for. Large Shared Data Banks. IBM Research Laboratory, San Jose, California.
  4. ^ "Ridiculously Unnormalized Database Schemas – Part One". 2011-01-19. Retrieved 2016-09-14.
  5. ^ Moniruzzaman, A B M; Hossain, Syed Akhter (2013). "NoSQL Database: New Era of Databases for Big data Analytics - Classification, Characteristics and Comparison". International Journal of Database Theory and Application. 6.