Integrated Public Use Microdata Series (IPUMS) is the world's largest individual-level population database. IPUMS consists of microdata samples from United States and international census records. The records are converted into a consistent format and made available to researchers through a web-based data dissemination system.
IPUMS provides census and survey data from around the world integrated across time and space. IPUMS integration and documentation make it easy to study change, conduct comparative research, merge information across data types, and analyze individuals within family and community context.
Over the past 25 years, IPUMS has received 70 federal grants and contracts totaling over $140 million to curate, integrate, and disseminate government-produced data collections. Major funding for these projects has come from the National Institutes of Health, the National Science Foundation, and the Food and Drug Administration. IPUMS includes data produced by a broad range of agencies, including the Census Bureau, the Bureau of Labor Statistics, the National Science Foundation, the National Center for Health Statistics, the Centers for Disease Control, and the National Aeronautics and Space Administration.
In collaboration with 105 national statistical agencies, nine national archives, and three genealogical organizations, IPUMS has created the world’s largest accessible database of census microdata. IPUMS includes almost a billion records from U.S. censuses from 1790 to the present and over a billion records from the international censuses of over 100 countries. IPUMS has also harmonized survey data with over 30,000 integrated variables and 150 million records, including the Current Population Survey, the American Community Survey, the National Health Interview Survey, the Demographic and Health Surveys, and an expanding collection of labor force, health, and education surveys. In total, IPUMS currently disseminates integrated microdata describing 1.4 billion individuals drawn from over 750 censuses and surveys.
In addition to census and survey microdata, IPUMS integrates and disseminates the most comprehensive U.S. database of area-level census data and electronic boundaries describing census geography from 1790 to the present. IPUMS NHGIS includes 366 billion data points and 28 million map polygons describing U.S. Census geographic units. IPUMS Terra archives and disseminates a third class of data: raster data derived from satellite imagery, climate models, and other sources.
The distinctive service IPUMS provides is harmonization: variable codes and documentation are made fully consistent across datasets. This work rests on an extensive technical infrastructure developed over more than two decades, including the first structured metadata system for integrating disparate datasets. Using a data warehousing approach, IPUMS extracts, transforms, and loads data from diverse sources into a single-view schema, so that data from different sources become compatible. This large-scale integration makes thousands of population datasets interoperable. IPUMS has also created software for consistency checking, automated data cleaning and editing, sampling, disclosure control, database harmonization, metadata creation, and parsing.
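The idea of harmonizing variable codes during an extract-transform-load step can be illustrated with a minimal Python sketch. It is not IPUMS software: the source names (`survey_a`, `survey_b`), the marital-status codes, the variable name `marst_h`, and the helper functions are all hypothetical, chosen only to show how a per-source crosswalk can map differently coded inputs onto one shared coding scheme.

```python
# Hypothetical crosswalks: each source dataset codes marital status its own way.
# None of these codes or names come from actual IPUMS data or documentation.
CROSSWALKS = {
    "survey_a": {"M": 1, "S": 2, "W": 3},   # letter codes
    "survey_b": {1: 2, 2: 1, 3: 3},         # numeric codes in a different order
}

# Shared (harmonized) coding scheme that every source is mapped onto.
HARMONIZED_LABELS = {1: "married", 2: "never married", 3: "widowed", 9: "unknown"}
UNKNOWN = 9


def harmonize(source, raw_code):
    """Translate one source-specific code into the shared code."""
    return CROSSWALKS[source].get(raw_code, UNKNOWN)


def load(records):
    """Transform (source, person_id, raw_code) tuples into one common schema."""
    return [
        {"source": source, "person_id": person_id, "marst_h": harmonize(source, raw)}
        for source, person_id, raw in records
    ]


if __name__ == "__main__":
    raw_records = [
        ("survey_a", 101, "M"),
        ("survey_b", 202, 2),
        ("survey_a", 103, "X"),  # code missing from the crosswalk
    ]
    for row in load(raw_records):
        print(row, HARMONIZED_LABELS[row["marst_h"]])
```

In this sketch the crosswalk tables act as metadata: supporting another source means adding a mapping, not changing the transformation code, and unmapped codes fall back to an explicit "unknown" value instead of being dropped silently.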
IPUMS is a part of the Minnesota Population Center at the University of Minnesota and is directed by Regents Professor Steven Ruggles.
Nine data projects operate under the IPUMS name: IPUMS USA, IPUMS CPS, IPUMS International, IPUMS DHS, IPUMS NHGIS, IPUMS Terra, IPUMS Time Use, IPUMS Health Surveys, and IPUMS Higher Ed. All IPUMS data and documentation are available online free of charge.
The Journal of American History described the effort as "One of the great archival projects of the past two decades." The official motto of IPUMS is "use it for good, never for evil."