Protected health information
Protected health information (PHI) under the US law is any information about health status, provision of health care, or payment for health care that is created or collected by a Covered Entity (or a Business Associate of a Covered Entity), and can be linked to a specific individual. This is interpreted rather broadly and includes any part of a patient's medical record or payment history.
PHI is often sought out in datasets for de-identification before researchers share the dataset publicly. Researchers remove individually identifiable PHI from a dataset to preserve privacy for research participants.
Under the US Health Insurance Portability and Accountability Act (HIPAA), PHI that is linked based on the following list of 18 identifiers must be treated with special care:
- All geographical identifiers smaller than a state, except for the initial three digits of a zip code if, according to the current publicly available data from the U.S. Bureau of the Census: the geographic unit formed by combining all zip codes with the same three initial digits contains more than 20,000 people; and the initial three digits of a zip code for all such geographic units containing 20,000 or fewer people is changed to 000
- Dates (other than year) directly related to an individual
- Phone Numbers
- Fax numbers
- Email addresses
- Social Security numbers
- Medical record numbers
- Health insurance beneficiary numbers
- Account numbers
- Certificate/license numbers
- Vehicle identifiers and serial numbers, including license plate numbers;
- Device identifiers and serial numbers;
- Web Uniform Resource Locators (URLs)
- Internet Protocol (IP) address numbers
- Biometric identifiers, including finger, retinal and voice prints
- Full face photographic images and any comparable images
- Any other unique identifying number, characteristic, or code except the unique code assigned by the investigator to code the data
De-identification versus anonymization
Anonymization is a process in which PHI elements are eliminated or manipulated with the purpose of hindering the possibility of going back to the original data set. This involves removing all identifying data to create unlinkable data. De-identification under the HIPAA Privacy Rule occurs when data has been stripped of common identifiers by two methods:
- 1. The removal of 18 specific identifiers listed above (Safe Harbor Method)
- 2. Obtain the expertise of an experienced statistical expert to validate and document the statistical risk of re-identification is very small (Statistical Method).
De-identified data is coded, with a link to the original, fully identified data set kept by an honest broker. Links exist in coded de-identified data making the data considered indirectly identifiable and not anonymized. Coded de-identified data is not protected by the HIPAA Privacy Rule, but is protected under the Common Rule. The purpose of de-identification and anonymization is to use health care data in larger increments, for research purposes. Universities, government agencies, and private health care entities use such data for research, development and marketing purposes.
In general, US law governing PHI applies to data collected in the course of providing and paying for health care. Privacy and security regulations govern how healthcare professionals, hospitals, health insurers and other Covered Entities use and protect the data they collect. It is important to understand that the source of the data is as relevant as the data itself when determining if information is PHI under U.S. law. For example, sharing information about someone on the street with an obvious medical condition such as an amputation is not restricted by US law. However, obtaining information about the amputation exclusively from a protected source, such as from an electronic medical record, would breach HIPAA regulations.
Covered Entities often use third parties to provide certain health and business services. If they need to share PHI with those third parties, it is the responsibility of the Covered Entity to put in place a Business Associate Agreement that holds the third party to the same standards of privacy and confidentiality as the Covered Entity.
- Definitions of a Covered Entity
- "De-identification of Protected Heath Information". HIPAA Journal. 2018.
- Ohm, Paul (August 2010). "Broken Promises of Privacy: Responding to the Surprising Failure of Anonymization". UCLA Law Review. 57 (6): 1701–1777.
- "Encouraging the Use of, and Rethinking Protections for De-Identified (and "Anonymized") Health Data" (PDF). Center for Democracy and Technology. June 2009. Retrieved June 12, 2014.
- "HIPAA: What? De-identification of Protected Health Information (PHI)". HIPAA Research Guide. University of Wisconsin-Madison. August 26, 2003. Retrieved June 12, 2014.