In statistics and econometrics, cross-sectional data, or a cross section of a study population, is a type of data collected by observing many subjects (such as individuals, firms, countries, or regions) at the same point in time, or without regard to differences in time. Analysis of cross-sectional data usually consists of comparing differences among the subjects.
For example, if we want to measure current obesity levels in a population, we could draw a sample of 1,000 people randomly from that population (also known as a cross section of that population), measure their weight and height, and calculate what percentage of that sample is categorized as obese. This cross-sectional sample provides us with a snapshot of that population, at that one point in time. Note that we do not know based on one cross-sectional sample if obesity is increasing or decreasing; we can only describe the current proportion.
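The calculation in this example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the population is synthetic, the distribution parameters are invented, and obesity is taken to mean a body mass index (weight in kilograms divided by height in metres squared) of 30 or more. The function names `bmi` and `obesity_prevalence` are hypothetical.

```python
import random

def bmi(weight_kg, height_m):
    """Body mass index: weight in kg divided by height in metres squared."""
    return weight_kg / height_m ** 2

def obesity_prevalence(sample, threshold=30.0):
    """Share of (weight, height) pairs whose BMI is at or above the threshold."""
    obese = sum(1 for weight, height in sample if bmi(weight, height) >= threshold)
    return obese / len(sample)

# Synthetic population (assumed values, for illustration only).
rng = random.Random(0)
population = [(rng.gauss(78, 16), rng.gauss(1.70, 0.09)) for _ in range(50_000)]

# One cross section: 1,000 people drawn at random at a single point in time.
sample = rng.sample(population, 1000)
prevalence = obesity_prevalence(sample)
print(f"Estimated share classified as obese: {prevalence:.1%}")
```

The result is a single proportion for a single date. Estimating a trend would require repeating the draw at later dates.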
Cross-sectional data differs from time series data, in which the same small-scale or aggregate entity is observed at various points in time. Longitudinal data, for example, follows one subject's changes over the course of time. Another variant, panel data (or time-series cross-sectional (TSCS) data), combines both approaches and looks at multiple subjects and how they change over the course of time. Panel analysis uses panel data to examine changes in variables over time and differences in variables between subjects.
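The relationship between the three data shapes can be shown with a small sketch. The subjects, years, and values below are invented, and the helper names `cross_section` and `time_series` are hypothetical. A panel is stored as a mapping from (subject, period) to a value; fixing the period gives a cross section, and fixing the subject gives a time series.

```python
# Hypothetical panel: one observation per (subject, year) pair.
panel = {
    ("A", 2020): 10.0, ("A", 2021): 11.0, ("A", 2022): 12.5,
    ("B", 2020): 7.0,  ("B", 2021): 7.4,  ("B", 2022): 8.1,
    ("C", 2020): 3.2,  ("C", 2021): 3.5,  ("C", 2022): 3.9,
}

def cross_section(panel, period):
    """All subjects observed in one period."""
    return {subject: value for (subject, t), value in panel.items() if t == period}

def time_series(panel, subject):
    """One subject observed across all periods."""
    return {t: value for (s, t), value in panel.items() if s == subject}

print(cross_section(panel, 2021))  # many subjects, one point in time
print(time_series(panel, "B"))     # one subject, many points in time
```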
In a rolling cross-section, both the presence of an individual in the sample and the time at which the individual is included in the sample are determined randomly. For example, a political poll may decide to interview 1,000 individuals. It first selects these individuals randomly from the entire population. It then assigns a random date to each individual. This is the date on which the individual will be interviewed, and thus included in the survey.
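The two randomization steps can be sketched as follows. This is a minimal illustration, not a description of any particular survey's procedure: the population identifiers, the 30-day field period, the start date, and the function name `rolling_cross_section` are all assumptions.

```python
import random
from datetime import date, timedelta

def rolling_cross_section(population_ids, n, start, days, seed=None):
    """Pick n respondents at random, then give each a random interview date."""
    rng = random.Random(seed)
    # Step 1: random selection from the whole population, without replacement.
    respondents = rng.sample(population_ids, n)
    # Step 2: an independent random date within the field period for each person.
    schedule = {}
    for person in respondents:
        offset = rng.randrange(days)  # 0 .. days - 1
        schedule[person] = start + timedelta(days=offset)
    return schedule

# Hypothetical poll: 1,000 interviews spread over a 30-day field period.
schedule = rolling_cross_section(
    population_ids=list(range(100_000)),
    n=1000,
    start=date(2024, 1, 1),
    days=30,
    seed=1,
)
```

Because dates are assigned at random, the respondents interviewed on any given day are themselves a random subsample of the full sample.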
Cross-sectional data can be used in cross-sectional regression, that is, regression analysis in which every observation is a different subject measured in the same period. For example, the consumption expenditures of various individuals in a fixed month could be regressed on their incomes, accumulated wealth levels, and their various demographic features to find out how differences in those features relate to differences in consumer behavior.
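A minimal sketch of such a regression is given below. For brevity it uses a single regressor (income) and ordinary least squares in closed form; the full example would add wealth and demographic variables as further regressors. The data are synthetic, generated under the assumption that monthly consumption equals 200 plus 0.6 times income plus random noise, and the function name `fit_ols` is hypothetical.

```python
import random

def fit_ols(x, y):
    """Simple OLS of y on x; returns (intercept, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Synthetic cross section: 500 individuals observed in the same month.
rng = random.Random(42)
income = [rng.uniform(1000, 6000) for _ in range(500)]
# Assumed data-generating process, for illustration only.
consumption = [200 + 0.6 * inc + rng.gauss(0, 150) for inc in income]

intercept, slope = fit_ols(income, consumption)
print(f"consumption = {intercept:.1f} + {slope:.3f} * income")
```

The slope estimates how much consumption differs, on average, between individuals whose incomes differ by one unit in that month. It says nothing by itself about how one individual's consumption changes over time.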
- Brady, Henry E.; Johnston, Richard (2008). "The Rolling Cross Section and Causal Attribution" (PDF). University of Michigan Press. Retrieved July 13, 2008.
- Gujarati, Damodar N.; Porter, Dawn C. (2009). "The Nature and Sources of Data for Economic Analysis". Basic Econometrics (Fifth international ed.). New York: McGraw-Hill. pp. 22–28. ISBN 978-007-127625-2.