Pivot table

From Wikipedia, the free encyclopedia
Jump to: navigation, search

In data processing, a pivot table is a data summarization tool found in data visualization programs such as spreadsheets or business intelligence software. Among other functions, a pivot-table can automatically sort, count total or give the average of the data stored in one table or spreadsheet. It displays the results in a second table (called a "pivot table") showing the summarized data. Pivot tables are also useful for quickly creating unweighted cross tabulations. The user sets up and changes the summary's structure by dragging and dropping fields graphically. This "rotation" or pivoting of the summary table gives the concept its name.

The term pivot table is a generic phrase used by multiple vendors. In the United States, Microsoft Corporation has trademarked the specific form PivotTable.[1]

Pivot tables can be seen as a simplification of the more complete and complex OLAP concepts.

History[edit]

In their book Pivot Table Data Crunching,[2] authors Bill Jelen and Mike Alexander refer to Pito Salas as the "father of pivot tables". While working on a concept for a new program which would eventually become Lotus Improv, Salas noted that spreadsheets have patterns of data. A tool that could help the user recognize these patterns would help to build advanced data models quickly. With Improv, users could define and store sets of categories, then change views by dragging category names with the mouse. This core functionality would provide the model for pivot tables.

Lotus Development released Improv in 1991 on the NeXT platform. A few months after the release of Improv, Brio Technology published a standalone Mac implementation called DataPivot (with technology eventually patented in 1999[3]). Borland purchased the DataPivot technology in 1992 and implemented it in their own spreadsheet application Quattro Pro.

In 1993, at the time when the Windows version of Improv appeared, Microsoft Excel 5 was already on the market with a new functionality called a “PivotTable”. This functionality was further improved in later Excel versions:

  • Excel 97 included a new and improved PivotTable Wizard, the ability to create calculated fields, and new pivot cache objects that allow developers to write Visual Basic for Applications macros to create and modify pivot tables.
  • Excel 2000 introduced “Pivot Charts” to represent pivot table data graphically .

Explanation of a pivot table[edit]

For typical data entry and storage, data usually appear in flat tables, meaning that it consists of only columns and rows, as in the following example showing data on shirt types:

Pivottable-Flatdata.PNG

While tables such as these can contain a lot of data, it can be difficult to get summarized information from them. A pivot table can help quickly summarize the data and highlight the desired information. The usage of a pivot table is extremely broad and depends on the situation. The first question to ask is, "What am I looking for?" In the example here, let us ask, "How many Units did we sell in each Region for every Ship Date?":

Pivottable-Pivoted.PNG

A pivot table usually consists of row, column and data (or fact) fields. In this case, the column is Ship Date, the row is Region and the datum we would like to see is (sum of) Units. These fields allow several kinds of aggregations including: sum, average, standard deviation, count etc. In this case, the total number of units shipped is displayed here using a sum aggregation.

How a pivot table works[edit]

Using the example above, software will find all distinct values for Region. In this case, they are: North, South, East, West. Furthermore, it will find all distinct values for Ship Date. Based on the aggregation type, sum, it will summarize the fact and display them in a multidimensional chart. In the example above, the first datum is 66. This number was obtained by finding all records where both Region was East and Ship Date was 1/31/2005, and adding the Units of that collection of records (i.e. cells E2 to E7) together to get a final result.

Pivot tables are not created automatically but one has to first select the entire data in the original table (in MS Excel for example) and then go to Insert -> Pivot table. This will create a new worksheet containing a Pivot Table Field List that will list all the column headers present in the data. For instance, if we have a table that represents sales data of a company, it might include Date of Sale, Sales Person, Item Sold, Color of Item, Units Sold, Per Unit Price, and Total Price. This makes the data easy to access.

Date of Sale Sales Person Item Sold Color of Item Units Sold Per Unit Price Total Price
10/01/13 Harry Notebook Black 8 25000 200000
10/02/13 Larry Laptop Red 4 35000 140000
10/03/13 Harry Mouse Red 6 850 5100
10/04/13 Larry Notebook White 10 27000 270000
10/05/13 Larry Mouse Black 4 800 3200

The Fields that would be created will be visible on the right hand side of the worksheet. By default, below this list, will appear pivot table layout design.

Each of the fields from the list can be dragged on to this layout which has four options :

  1. Report Filter
  2. Column Labels
  3. Row Labels
  4. Summation Values

Report Filter[edit]

Report Filter is used to apply an Excel filter to an entire table. For example, if you drag the "Color of Item" field to this area, then the table constructed will have a report filter inserted above the table. This report filter will have drop-down options (Black, Red, and White in the example above). When you choose an option from this drop-down list ("Black" in this example), then the table that would be visible will contain only the data from those rows that have the "Color of Item = Black".

Column Labels[edit]

Column Labels is used to apply an Excel filter to one or more columns that have to be shown in the pivot table. For instance if we drag the Field "Sales Person" in this area then the table constructed will have a values from the column "Sales Person" i.e. we will have number of columns equal to the number of "Sales Person". There will also be one added column of Total. In the example above, this instruction will create 3 columns in the table - Harry, Larry, and Grand Total. There will be a filter above the data - Column Labels, from which you can select or deselect a particular sales person for the Pivot table.

This table will not have any numerical values as no numerical field is selected but when it is selected, the values will automatically get updated in the column of "Grand Total".

Row Labels[edit]

Row Labels is used to apply an Excel filter to one or more rows that have to be shown in the pivot table. For instance if we drag the Field "Sales Person" in this area then the table constructed will have a values from the column "Sales Person" i.e. we will have number of rows equal to the number of "Sales Person". There will also be one added row of "Grand Total". In the example above, this instruction will create 3 rows in the table - Harry, Larry, and Grand Total. There will be a filter above the data - Row Labels, from which you can select or deselect a particular sales person for the Pivot table.

This table will not have any numerical values as no numerical field is selected but when it is selected, the values will automatically get updated in the Row of "Grand Total"

Summation values[edit]

This usually takes a field that has numerical values that can be used for different types of calculations. However, using text values would also not be wrong and instead of Sum it will give a count. So in the example above, if we drag "Units Sold" to this area then this will along with Row label of "Sales Person" then the instruction will add a new column "Sum of Units Sold" which will have values against each sales person

Row Labels Sum of Units Sold
Harry 14
Larry 18
Grand Total 32

Application support[edit]

Pivot tables are an integral part of a spreadsheet application. In addition to Microsoft Excel, competing software programs such as Apache OpenOffice Calc provide similar functionality; the implementation in Apache OpenOffice and LibreOffice up to release 3.3 is called DataPilot. In version 3.4 of both LibreOffice and OpenOffice, DataPilot is renamed 'Pivot Table'. Other companies such as Quantrix and numberGo provide similar implementations.

Pivot functionality can also be found in other data visualization tools, including business intelligence packages.

Google Docs initially allowed the creation of basic pivot tables via the pivot table gadget from Panorama called Panorama Analytics, but as of 2011 this gadget provided limited functionality and was extremely slow with large amounts of data. In May 2011, Google announced the roll-out of a natively hosted pivot table feature in the Google spreadsheets editor.[4]

ZK, an Ajax framework, also allows the embedding of pivot tables in Web applications.

PostgreSQL, an object-relational database management system, allows the creation of pivot tables using the tablefunc module.[5]

As an OLAP client[edit]

Excel Pivot Tables includes the feature to directly query an OLAP server for retrieving data instead of getting the data from an Excel spreadsheet. On this configuration a pivot table is a simple client of an OLAP server. Excel's Pivot Table not only allows for connecting to Microsoft's solution (Analysis Service), but to any XML for Analysis (XMLA) (OLAP standard) compliant server.

Other OLAP clients are JPivot, Dundas, IcCube (Client Library).

See also[edit]

References[edit]

  1. ^ "United States Trademark Serial Number 74472929". 1994-12-27. Retrieved 2013-02-17. 
  2. ^ Jelen, Bill; Alexander, Michael (2006). Pivot table data crunching. Indianapolis: Que. p. 274. ISBN 0-7897-3435-4. 
  3. ^ Patent #5915257, retrieved 2010-02-16 
  4. ^ Docs Blog: Summarize your data with pivot tables
  5. ^ PostgreSQL: Documentation: 9.2: tablefunc

Further reading[edit]

External links[edit]