The Five Safes is a framework for making decisions about the effective use of confidential or sensitive data. It is mainly used to describe or design research access to statistical data held by government agencies and by data archives such as the UK Data Service.
Two of the Five Safes refer to statistical disclosure control, and so the Five Safes is usually used to contrast statistical and non-statistical controls when comparing data management options.
The Five Safes proposes that data management decisions be considered as solving problems in five 'dimensions': projects, people, settings, data and outputs. These are most commonly expressed as questions, for example:
|Safe projects|Is this use of the data appropriate?|
|Safe people|Can the users be trusted to use it in an appropriate manner?|
|Safe settings|Does the access facility limit unauthorised use?|
|Safe data|Is there a disclosure risk in the data itself?|
|Safe outputs|Are the statistical results non-disclosive?|
These dimensions are scales, not limits. That is, solutions can have a mix of more or fewer controls in each dimension, but the overall solution is 'safe' independent of the particular mix. For example, a public use file available for open download cannot control who uses it, where or for what purpose, and so all the control (protection) must be in the data itself. In contrast, a file which is only accessed through a secure environment with certified users can contain very sensitive information: the non-statistical controls allow the data to be 'unsafe'. One academic likened the process to a graphic equalizer, where bass and treble can be combined independently to produce a sound the listener likes.
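The idea that control can be concentrated in one dimension or spread across several can be illustrated with a toy model. The sketch below is not part of the Five Safes framework, which is qualitative: the class `AccessSolution`, the 0-to-1 control levels, the multiplicative `residual_risk` rule, and the `tolerance` value are all hypothetical assumptions chosen only to make the trade-off concrete.

```python
import math
from dataclasses import dataclass, fields
from typing import Dict


@dataclass(frozen=True)
class AccessSolution:
    """Hypothetical control level per dimension: 0.0 (no control) to 1.0 (full control)."""
    name: str
    projects: float
    people: float
    settings: float
    data: float
    outputs: float

    def controls(self) -> Dict[str, float]:
        # Collect the five dimension scores, skipping the descriptive name.
        return {f.name: getattr(self, f.name) for f in fields(self) if f.name != "name"}


def residual_risk(solution: AccessSolution) -> float:
    # Toy assumption: each dimension independently removes a share of the risk,
    # so the remaining risk is the product of what each dimension lets through.
    return math.prod(1.0 - level for level in solution.controls().values())


def is_acceptable(solution: AccessSolution, tolerance: float = 0.1) -> bool:
    # Hypothetical tolerance; a real assessment is a judgment, not a number.
    return residual_risk(solution) <= tolerance


# Open download: no control over project, people, setting or outputs,
# so almost all protection sits in the data itself.
public_use_file = AccessSolution("public use file", 0.0, 0.0, 0.0, 0.95, 0.0)

# Secure environment with certified users: the data can stay detailed.
secure_lab = AccessSolution("secure lab", 0.8, 0.8, 0.9, 0.1, 0.8)

# Detailed data offered for open download: no dimension carries the protection.
detailed_open_download = AccessSolution("detailed open download", 0.0, 0.0, 0.0, 0.1, 0.0)

for option in (public_use_file, secure_lab, detailed_open_download):
    print(option.name, "acceptable" if is_acceptable(option) else "not acceptable")
```

Under these assumed numbers the first two options pass with very different mixes of controls, while the third fails because no dimension compensates for the detailed data.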
The Five Safes concept is associated with other topics which developed from the same programme at ONS, although these are not necessarily implemented. Safe people is associated with 'active researcher management', while safe outputs is linked with principles-based output statistical disclosure control. The Five Safes is associated with the 'Data Access Spectrum': as the non-data controls tend to work together, these are contrasted with data detail reduction to present a linear representation of data access options.
History and terminology
The Five Safes was devised in the winter of 2002/2003 by Felix Ritchie at the UK Office for National Statistics (ONS) to describe its secure remote-access Virtual Microdata Laboratory (VML). It was described at this time as the 'VML Security Model'. This was adopted by the NORC data enclave, and more widely in the US, as the 'portfolio model' (although this is now also used to refer to a slightly different legal/statistical/educational breakdown). In 2012 the framework was still being referred to as the 'VML security model', but its increasing use among non-UK organisations led to the adoption of the more general and informative phrase 'Five Safes'.
The original framework had only four safes (projects, people, settings and outputs): it was used to describe highly detailed data access through a secure environment, and so the 'data' dimension was irrelevant. From 2007 onwards, 'safe data' was included as the framework was used to describe a wider range of ONS activities. As the US version was based upon the 2005 specification, some US iterations have the original four dimensions.
Some discussions, such as those by the OECD, use the term 'secure' instead of 'safe'.
The framework has had three uses: pedagogical, descriptive, and design. The last of these is a relatively recent development.
The first significant use of the framework, other than internal administrative use, was to structure researcher training courses at the UK Office for National Statistics from 2003. UK Data Archive, Administrative Data Research Network, Eurostat, Statistics New Zealand, the Mexican National Institute of Statistics and Geography, NORC and the Australian Bureau of Statistics, amongst others, have also used this framework. Most of these courses are for researchers using restricted-access facilities; the Eurostat courses are unusual in that they are designed for all users of sensitive data.
The framework is often used to describe existing data access solutions (e.g. UK HMRC Data Lab, UK Data Service, Statistics New Zealand) or planned/conceptualised ones (e.g. Eurostat in 2011). An early use was to help identify areas where ONS still had 'irreducible risks' in its provision of secure remote access.
The framework is mostly used for confidential social science data. To date it appears to have made little impact on medical research planning, although it is now included in the revised guidelines on implementing HIPAA regulations in the US, and by Cancer Research UK and the Health Foundation in the UK. It has also been used to describe a security model for the Scottish Health Informatics Programme.
In general the Five Safes has been used to describe solutions post-factum, and to explain/justify choices made, but an increasing number of organisations have used the framework to design data access solutions. For example, the Hellenic Statistical Agency developed a data strategy built around the Five Safes in 2016; the UK Health Foundation used the Five Safes to design its data management and training programmes.
The major design use is in Australia: both the Australian Bureau of Statistics and the Australian Department of Social Services used the Five Safes as an ex ante design tool. In 2017 the Australian Productivity Commission recommended adopting a version of the framework to support cross-government data sharing and re-use.
In 2015 the UK Data Service organized a workshop to encourage data users from the academic and private sectors to think about how to manage confidential research data, using the Five Safes to demonstrate alternative options and best practice.
The UK Data Service has produced a blog and video for the general public about the use of Five Safes in re-using administrative data. Statistics New Zealand produced a non-technical description, as did ONS for Data Privacy Day 2017.