Column (database)

From Wikipedia, the free encyclopedia
An example of output columns from a Postgres database.

In a relational database, a column is a set of data values of a particular simple type, one value for each row of the table.[1] A column may contain text values, numbers, or even pointers to files in the operating system.[2] Some relational database systems allow columns to contain more complex data types, such as whole documents, images, or even video clips.[3] A column can also be called an attribute.

Each row provides a data value for each column and can be understood as a single structured data value. For example, a database that represents company contact information might have the following columns: ID, Company Name, Address Line 1, Address Line 2, City, and Postal Code. More formally, each row can be interpreted as a tuple composed of a set of column-value pairs, with each pair consisting of the relevant column and its value, for example, the pair ('Address Line 1', '12345 West Example Street').
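As a minimal sketch of this view, a row can be modelled in Python as a mapping from column names to values. The column names follow the contact example above; the sample values and the names `COLUMNS`, `row_values`, and `row` are made up for illustration.

```python
# Column names taken from the company contact example in the text.
COLUMNS = ("ID", "Company Name", "Address Line 1", "Address Line 2", "City", "Postal Code")

# One row supplies exactly one value per column (the values are invented).
row_values = (1, "Example Corp", "12345 West Example Street", "Suite 100", "Springfield", "00000")

# The row viewed as a set of (column, value) pairs.
row = dict(zip(COLUMNS, row_values))

print(row["Address Line 1"])  # 12345 West Example Street
```

Each key of `row` corresponds to one column, so looking up a key is the equivalent of reading one cell of that row.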

Field

The word 'field' is normally used interchangeably with 'column'.[4] However, database purists tend to reserve 'field' for a specific cell of a given row, that is, the value at the intersection of one row and one column.[citation needed]

Row database vs column database

Relational databases mainly use row-based data storage, but column-based storage can be more useful for many business applications. For example, a column database can read only the columns that a query actually needs while scanning a range of rows, which gives it faster access to that data. In addition, any column can serve as an index.

In contrast, row-based applications typically process one record at a time and normally need to access a complete record or two. Column databases also compress better, because most columns contain only a few distinct values compared to the number of rows.[5]
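One way to see why few distinct values help compression is dictionary encoding, in which each distinct value is stored once and the column itself holds only small integer codes. The sketch below is illustrative only: the function names and the sample `city` column are assumptions, and real column stores use more elaborate schemes.

```python
def dictionary_encode(column):
    """Replace each value with a small integer code; return (dictionary, codes)."""
    dictionary = []  # distinct values, in order of first appearance
    index = {}       # value -> code
    codes = []
    for value in column:
        if value not in index:
            index[value] = len(dictionary)
            dictionary.append(value)
        codes.append(index[value])
    return dictionary, codes


def dictionary_decode(dictionary, codes):
    """Rebuild the original column from the dictionary and the codes."""
    return [dictionary[code] for code in codes]


# Hypothetical column with many rows but only three distinct values.
city = ["Berlin", "Paris", "Berlin", "Berlin", "Paris", "Rome", "Berlin", "Paris"]

dictionary, codes = dictionary_encode(city)
print(dictionary)  # ['Berlin', 'Paris', 'Rome']
print(codes)       # [0, 1, 0, 0, 1, 2, 0, 1]
```

The longer the column and the fewer its distinct values, the more space the integer codes save relative to storing every string in full.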

Furthermore, in a column store, data is already vertically divided. This vertical organization allows operations on different columns to be processed in parallel. If multiple items need to be searched or aggregated, each of these operations can be assigned to a different processor core. A row-based database, by contrast, must read through entire rows even when a query only needs data from a few columns. Therefore, requests on a large amount of data can take a lot of time, whereas in column database tables the values of a column are kept physically next to each other, significantly increasing the speed of certain data queries.[6]
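The idea of assigning one operation per column to its own worker can be sketched with Python's standard `concurrent.futures` module. The `columns` data and the `aggregate` helper are hypothetical, and threads in CPython do not necessarily run CPU-bound work on separate cores, so this only illustrates how the work is divided, not the actual speed-up.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical column store: each column is kept as its own list.
columns = {
    "age": [34, 27, 45, 52, 27],
    "salary": [50000, 42000, 61000, 75000, 39000],
}


def aggregate(name, func):
    """Apply one aggregate function to one column, independently of the others."""
    return name, func(columns[name])


# One task per column; the tasks share no data and can run side by side.
tasks = [("age", max), ("salary", sum)]

with ThreadPoolExecutor() as pool:
    results = dict(pool.map(lambda task: aggregate(*task), tasks))

print(results)  # {'age': 52, 'salary': 267000}
```

Because each task touches only its own column, no coordination between the workers is needed until the results are collected.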

Advantages

The main benefit of keeping data in a column database is that some queries can run very quickly. For instance, to find the average age of all users, the database can jump directly to the area where the 'age' data is stored and read just that data, instead of looking up the age in each record row by row. During querying, columnar storage avoids reading irrelevant data. Therefore, aggregation queries that only need a subset of the total data run faster than in row-oriented databases.[7]
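The average-age example can be sketched by holding the same data in both layouts. The sample records and function names below are made up for illustration, and in-memory Python lists only mimic the access pattern; they do not reproduce the disk behaviour of a real database.

```python
# Row-oriented layout: one complete record per entry (sample data is invented).
rows = [
    {"id": 1, "name": "Ann", "age": 30, "city": "Berlin"},
    {"id": 2, "name": "Bob", "age": 40, "city": "Paris"},
    {"id": 3, "name": "Cy", "age": 50, "city": "Rome"},
]

# Column-oriented layout of the same data: one list per column.
columns = {key: [record[key] for record in rows] for key in rows[0]}


def avg_age_row_store(rows):
    """Visit every record and pick out its 'age' value."""
    total = 0
    for record in rows:
        total += record["age"]
    return total / len(rows)


def avg_age_column_store(columns):
    """Read only the 'age' column; the other columns are never touched."""
    ages = columns["age"]
    return sum(ages) / len(ages)


print(avg_age_row_store(rows))        # 40.0
print(avg_age_column_store(columns))  # 40.0
```

Both functions return the same result; the difference is that the column version never loads `id`, `name`, or `city`.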

Also, because all values in a column share the same data type, compression algorithms achieve better results when run on each column separately, which helps queries return results more quickly.[8]

Disadvantages

There are many situations where multiple fields from each row are needed. Column databases are usually not the best option for these types of queries. The more fields that need reading per record, the fewer benefits there are in storing data in a column-oriented fashion. If a query only looks up the values belonging to a specific user, a row-oriented database usually performs it faster.

Secondly, writing new data can take more time in columnar storage.[9] For instance, inserting a new record into a row-oriented database can be done in a single write. Inserting a new record into a column database, however, requires writing to each column one by one. As a result, loading new data or updating many values takes longer in a columnar database.[10]
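The difference in write cost can be illustrated with two toy stores that count how many separate writes an insert needs. The `RowStore` and `ColumnStore` classes and their `writes` counters are hypothetical simplifications, not how any particular database is implemented.

```python
class RowStore:
    """Toy row-oriented store: a record is appended in one step."""

    def __init__(self):
        self.rows = []
        self.writes = 0

    def insert(self, record):
        self.rows.append(record)
        self.writes += 1  # the whole record is written at once


class ColumnStore:
    """Toy column-oriented store: a record is split across all columns."""

    def __init__(self, column_names):
        self.columns = {name: [] for name in column_names}
        self.writes = 0

    def insert(self, record):
        for name, values in self.columns.items():
            values.append(record[name])
            self.writes += 1  # one write per column


record = {"id": 1, "name": "Ann", "age": 30}

row_store = RowStore()
row_store.insert(record)

column_store = ColumnStore(record.keys())
column_store.insert(record)

print(row_store.writes)     # 1
print(column_store.writes)  # 3
```

The number of writes per insert in the column store grows with the number of columns, which is why wide tables are slower to load in this layout.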

Popular databases

Some examples of popular databases include:

See also

References

  1. ^ The term "column" also has equivalent applications in other, more generic contexts. See e.g., Flat file database, Table (information).
  2. ^ "Columnar databases in a big data environment". dummies.com (Big dummies book). Retrieved 2015-11-05. 
  3. ^ "What is Database Column? - Definition from Techopedia". Techopedia.com. Retrieved 2015-11-05. 
  4. ^ "An introduction to databases". www.ucl.ac.uk. Retrieved 2015-11-05. 
  5. ^ "Introduction to column-oriented databases". 2012-11-30. 
  6. ^ "SAP HANA Tutorial". saphanatutorial.com. Retrieved 2015-11-05.
  7. ^ "What's Unique About a Columnar Database? | FlyData". FlyData. Retrieved 2015-11-05. 
  8. ^ "What's So Unique About a Columnar Database?". 2015-02-06. 
  9. ^ "Column-Oriented Database Technologies | DB Best Chronicles". www.dbbest.com. Retrieved 2015-11-05. 
  10. ^ "The Database Decision: A Guide". Data Informed. Retrieved 2015-11-05.