Business continuity planning
Business continuity planning (or business continuity and resiliency planning) is the process of creating systems of prevention and recovery to deal with potential threats to a company. In addition to prevention, the goal is to permit ongoing operation, before and during execution of disaster recovery.
An organization's resistance to failure is "the ability ... to withstand changes in its environment and still function". Often called resilience, it is a capability that enables an organization either to endure environmental changes without having to adapt permanently or, when forced to, to adopt a new way of working that better suits the new environmental conditions.
Overview
Any event that could negatively impact operations is included in the plan, such as supply chain interruption or loss of or damage to critical infrastructure (major machinery or computing/network resources). As such, BCP is a subset of risk management. In the US, government entities refer to the process as continuity of operations planning (COOP). A business continuity plan outlines a range of disaster scenarios and the steps the business will take in any particular scenario to return to regular trade. BCPs are written ahead of time and can also include precautions to be put in place. Usually created with the input of key staff as well as stakeholders, a BCP is a set of contingencies to minimize potential harm to businesses during adverse scenarios.
A 2005 analysis extended the business continuity planning practices then in common use by examining how disruptions can adversely affect the operations of corporations and how investments in resilience can give a competitive advantage over entities that are not prepared for various contingencies. Business organizations such as the Council on Competitiveness embraced this resilience goal.
Adapting to change in an apparently slower, more evolutionary manner, sometimes over many years or decades, has been described as being more resilient. The term "strategic resilience" is now used for going beyond resisting a one-time crisis to continuously anticipating and adjusting, "before the case for change becomes desperately obvious."
Business continuity is the intended outcome of proper execution of business continuity planning and disaster recovery. It is the payoff for cost-effective buying of spare machines and servers, performing backups and bringing them off-site, assigning responsibility, performing drills, educating employees and being vigilant.
A major cost in planning for this is the preparation of audit compliance management documents; automation tools are available to reduce the time and cost associated with manually producing this information.
Inventory
Planners must have information about:
- Supplies and suppliers
- Documents and documentation, including which have off-site backup copies:
  - Business documents
  - Procedure documentation
Analysis
The analysis phase consists of:
- impact analysis,
- threat analysis, and
- impact scenarios.
Quantification of loss ratios must also include "dollars to defend a lawsuit." It has been estimated that a dollar spent on loss prevention can prevent "seven dollars of disaster-related economic loss."
Business impact analysis (BIA)
A Business impact analysis (BIA) differentiates critical (urgent) and non-critical (non-urgent) organization functions/activities. A function may be considered critical if dictated by law.
For each function, two values are assigned:
- Recovery Point Objective (RPO) – the acceptable latency of data that will not be recovered. For example, is it acceptable for the company to lose 2 days of data? The recovery point objective must ensure that the maximum tolerable data loss for each activity is not exceeded.
- Recovery Time Objective (RTO) – the acceptable amount of time to restore the function.
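As an illustration of how these two values can be used, the following minimal Python sketch checks a backup schedule and an estimated restore time against a function's RPO and RTO. The class name, the function names and the payroll figures are hypothetical and chosen only for the example; the sketch also assumes that the worst-case data loss equals one full backup interval.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass
class BusinessFunction:
    """A business function with the two values assigned during the BIA."""
    name: str
    rpo: timedelta  # largest acceptable window of unrecovered data
    rto: timedelta  # largest acceptable time to restore the function


def meets_rpo(function: BusinessFunction, backup_interval: timedelta) -> bool:
    # Assumption: a failure may strike just before the next backup runs,
    # so the worst-case data loss is one full backup interval.
    return backup_interval <= function.rpo


def meets_rto(function: BusinessFunction, estimated_restore: timedelta) -> bool:
    # The estimated restore time must not exceed the agreed RTO.
    return estimated_restore <= function.rto


# Hypothetical example: payroll can lose up to 2 days of data
# and must be restored within 8 hours.
payroll = BusinessFunction("payroll", rpo=timedelta(days=2), rto=timedelta(hours=8))

print(meets_rpo(payroll, backup_interval=timedelta(days=1)))
print(meets_rto(payroll, estimated_restore=timedelta(hours=12)))
```

In this example a daily backup satisfies the two-day RPO, while a 12-hour restore estimate exceeds the 8-hour RTO, which would signal that a faster recovery solution is needed for that function.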
Maximum time constraints for how long an enterprise's key products or services can be unavailable or undeliverable before stakeholders perceive unacceptable consequences have been named as:
- Maximum Tolerable Period of Disruption (MTPoD)
- Maximum Tolerable Downtime (MTD)
- Maximum Tolerable Outage (MTO)
- Maximum Allowable Outage (MAO)
According to ISO 22301, the terms maximum acceptable outage and maximum tolerable period of disruption mean the same thing and are defined using exactly the same words.
Threat and risk analysis (TRA)
After recovery requirements have been defined, each potential threat may require unique recovery steps. Common threats include:
- Cyber attack
- Sabotage (insider or external threat)
- Hurricane or other major storm
- Power outage
- Water outage (supply interruption, contamination)
- Telecommunications outage
- IT outage
- War/civil disorder
- Theft (insider or external threat, vital information or material)
- Random failure of mission-critical systems
- Single point dependency
- Supplier failure
The above areas can cascade: Responders can stumble. Supplies may become depleted. During the 2002-2003 SARS outbreak, some organizations compartmentalized and rotated teams to match the incubation period of the disease. They also banned in-person contact during both business and non-business hours. This increased resiliency against the threat.
Impact scenarios are identified and documented:
- need for medical supplies
- need for transportation options
- civilian impact of nuclear disasters
- need for business and data processing supplies
These should reflect the widest possible damage.
Tiers of preparedness
- Tier 0 - Nothing off-site... "recovery time .. unpredictable ..." - recovery may not be possible at all.
- Tier 1 - What IBM calls "PTAM (Pickup Truck Access Method)": backups are physically transported off-site, but there is no hot site (backup hardware).
- Tier 2 - Hot site - will require hours or even days to load the most recent backup tapes.
- Tier 3 - Transaction data at the off-site is kept relatively current via an ongoing high-speed data link (electronic vaulting) and "an automated tape library at the remote site."
- Tier 4 - "Point-in-time copies" so that less reprocessing of transactions will be needed.
- Tier 5 - "Transaction integrity" - the hot site is kept as up-to-the-moment as possible.
- Tier 6 - "Zero or Near-zero data loss"
- Tier 7 - "Highly automated" recovery - few if any manual steps following a main site failure; rollover to running at the hot site is automatic.
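The tiers can be treated as an ordered scale when comparing recovery requirements. The Python sketch below is only an illustration: the enum member names and the `minimum_tier` helper are hypothetical, and it assumes that each tier includes the capabilities of the tiers below it.

```python
from enum import IntEnum


class Tier(IntEnum):
    """The eight tiers listed above, ordered from least to most prepared."""
    NOTHING_OFF_SITE = 0
    PTAM = 1
    HOT_SITE = 2
    ELECTRONIC_VAULTING = 3
    POINT_IN_TIME_COPIES = 4
    TRANSACTION_INTEGRITY = 5
    NEAR_ZERO_DATA_LOSS = 6
    HIGHLY_AUTOMATED = 7


def minimum_tier(needs_hot_site: bool = False,
                 needs_near_zero_data_loss: bool = False,
                 needs_automatic_failover: bool = False) -> Tier:
    """Return the lowest tier that satisfies the stated requirements.

    Assumption: tiers are cumulative, so the most demanding
    requirement alone decides the answer.
    """
    if needs_automatic_failover:
        return Tier.HIGHLY_AUTOMATED
    if needs_near_zero_data_loss:
        return Tier.NEAR_ZERO_DATA_LOSS
    if needs_hot_site:
        return Tier.HOT_SITE
    # With no further requirements, off-site backups (Tier 1) are taken
    # as the baseline for any recoverable function.
    return Tier.PTAM


print(minimum_tier(needs_hot_site=True).name)
print(minimum_tier(needs_hot_site=True, needs_near_zero_data_loss=True).name)
```

Using an ordered enumeration makes comparisons such as `current_tier >= minimum_tier(...)` straightforward when reviewing whether an existing arrangement is sufficient.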
Solution design
Two main requirements from the impact analysis stage are:
- For IT: the minimum application and data requirements and the time in which they must be available.
- Outside IT: preservation of hard copy (such as contracts). A process plant must consider skilled staff and embedded technology.
This phase overlaps with disaster recovery planning.
The solution phase determines:
- crisis management command structure
- telecommunication architecture between primary and secondary work sites
- data replication methodology between primary and secondary work sites
- Backup site - applications, data and work space required at the secondary work site
Current British standards
The British Standards Institution (BSI) released a series of standards:
- BS 7799, peripherally addressed information security procedures.
- 2006: BS 25999-1, "Business Continuity Management Part 1: Code of practice".
- 2007: BS 25999-2 "Specification for Business Continuity Management", which specifies requirements for implementing, operating and improving a documented business continuity management system (BCMS).
- 2008: BS 25777, specifically to align computer continuity with business continuity (withdrawn March 2011).
- 2011: ISO/IEC 27031 - Security techniques — Guidelines for information and communication technology readiness for business continuity.
- July 2014: BS EN ISO 22301:2014, the current standard for business continuity planning.
Within the UK, BS 25999-2:2007 and BS 25999-1:2006 were used for business continuity management across all organizations, industries and sectors. These documents give a practical plan to deal with most eventualities, from extreme weather conditions to terrorism, IT system failure, and staff sickness.
Civil Contingencies Act
In 2004, following crises in the preceding years, the UK government passed the Civil Contingencies Act 2004. Under it, businesses must have continuity planning measures in place so that they can survive an incident and continue to thrive while keeping its impact as small as possible.
The Act was separated into two parts:
- Part 1: civil protection, covering roles & responsibilities for local responders
- Part 2: emergency powers
Australia and New Zealand
In New Zealand, the Canterbury University Resilient Organisations programme developed an assessment tool for benchmarking the Resilience of Organisations. It covers 11 categories, each having 5 to 7 questions. A Resilience Ratio summarizes this evaluation.
Implementation and testing
The implementation phase involves policy changes, material acquisitions, staffing and testing.
Testing and organizational acceptance
The 2008 book Exercising for Excellence, published by the British Standards Institution, identified three types of exercises that can be employed when testing business continuity plans:
- Tabletop exercises - a small number of people concentrate on a specific aspect of a BCP. Another form involves a single representative from each of several teams.
- Medium exercises - Several departments, teams or disciplines concentrate on multiple BCP aspects; the scope can range from a few teams from one building to multiple teams operating across dispersed locations. Pre-scripted "surprises" are added.
- Complex exercises - All aspects of a medium exercise remain, but for maximum realism, no-notice activation, actual evacuation and actual invocation of a disaster recovery site are added.
While start and stop times are pre-agreed, the actual duration might be unknown if events are allowed to run their course.
Maintenance
Biannual or annual maintenance of a BCP manual is broken down into three periodic activities:
- Confirmation of information in the manual, roll out to staff for awareness and specific training for critical individuals.
- Testing and verification of technical solutions established for recovery operations.
- Testing and verification of organization recovery procedures.
Issues found during the testing phase often must be reintroduced to the analysis phase.
The BCP manual must evolve with the organization and maintain information about who has to know what:
- a series of checklists
- job descriptions, skillsets needed, training requirements
- documentation and document management
- definitions of terminology to facilitate timely communication during disaster recovery
- distribution lists (staff, important clients, vendors/suppliers)
- information about communication and transportation infrastructure (roads, bridges)
Specialized technical resources must be maintained. Checks include:
- Virus definition distribution
- Application security and service patch distribution
- Hardware operability
- Application operability
- Data verification
- Data application
Testing and verification of recovery procedures
Software and work process changes must be (re)documented and validated, including verification that documented work process recovery tasks and supporting disaster recovery infrastructure allow staff to recover within the predetermined recovery time objective.
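One way to picture this validation is to compare the task timings recorded during a recovery drill with the recovery time objective. The following Python sketch is illustrative only: the function name, the task names and the durations are hypothetical, and it assumes that the recovery tasks are carried out one after another rather than in parallel.

```python
from datetime import timedelta


def drill_within_rto(task_durations, rto):
    """Return (passed, total) for a recovery drill.

    Assumption: tasks run sequentially, so the total recovery time
    is the sum of the individual task durations.
    """
    total = sum(task_durations.values(), timedelta())
    return total <= rto, total


# Hypothetical timings recorded during a drill.
drill = {
    "restore database from backup": timedelta(hours=3),
    "redeploy application": timedelta(hours=1, minutes=30),
    "verify data and reopen service": timedelta(minutes=45),
}

passed, total = drill_within_rto(drill, rto=timedelta(hours=6))
print(passed, total)
```

A failed check would indicate that either the documented recovery tasks or the supporting infrastructure need to be revised before the plan can be considered validated.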
See also
- Catastrophe modeling
- Crisis management
- Cyber resilience
- Digital continuity
- Disaster recovery
- Disaster recovery and business continuity auditing
- Disaster risk reduction
- Emergency management
- Man-made hazards
- Natural hazards
- Risk management
- Scenario planning
- Systems engineering
- System lifecycle