Cross-sectional data

From Wikipedia, the free encyclopedia
Jump to navigation Jump to search

Cross-sectional data, or a cross section of a study population, in statistics and econometrics, is a type of data collected by observing many subjects (such as individuals, firms, countries, or regions) at the one point or period of time. The analysis might also have no regard to differences in time. Analysis of cross-sectional data usually consists of comparing the differences among selected subjects.

For example, if we want to measure current obesity levels in a population, we could draw a sample of 1,000 people randomly from that population (also known as a cross section of that population), measure their weight and height, and calculate what percentage of that sample is categorized as obese. This cross-sectional sample provides us with a snapshot of that population, at that one point in time. Note that we do not know based on one cross-sectional sample if obesity is increasing or decreasing; we can only describe the current proportion.

Cross-sectional data differs from time series data, in which the same small-scale or aggregate entity is observed at various points in time. Another type of data, panel data (or longitudinal data), combines both cross-sectional and time series data ideas and looks at how the subjects (firms, individuals, etc.) change over a time series. Panel data differs from pooled cross-sectional data across time, because it deals with the observations on the same subjects in different times whereas the latter observes different subjects in different time periods. Panel analysis uses panel data to examine changes in variables over time and its differences in variables between selected subjects.

In a rolling cross-section, both the presence of an individual in the sample and the time at which the individual is included in the sample are determined randomly. For example, a political poll may decide to interview 1000 individuals. It first selects these individuals randomly from the entire population. It then assigns a random date to each individual. This is the random date that the individual will be interviewed, and thus included in the survey.[1]

Cross-sectional data can be used in cross-sectional regression, which is regression analysis of cross-sectional data. For example, the consumption expenditures of various individuals in a fixed month could be regressed on their incomes, accumulated wealth levels, and their various demographic features to find out how differences in those features lead to differences in consumers’ behavior.


  1. ^ Brady, Henry E.; Johnston, Richard (2008). "The Rolling Cross Section and Causal Distribution" (PDF). University of Michigan Press. Retrieved July 13, 2008.

Further reading[edit]

  • Gujarati, Damodar N.; Porter, Dawn C. (2009). "The Nature and Sources of Data for Economic Analysis". Basic Econometrics (Fifth international ed.). New York: McGraw-Hill. pp. 22–28. ISBN 978-007-127625-2.