Business continuity planning
Business continuity planning (BCP, also called business continuity and resiliency planning, BCRP) identifies an organization's exposure to internal and external threats and synthesizes hard and soft assets to provide effective prevention and recovery for the organization, while maintaining competitive advantage and value system integrity (Elliot et al., 1999).
A business continuity plan is a plan to continue operations if a place of business (e.g., an office, work site or data center) is affected by adverse physical conditions, such as a storm, fire or crime. Such a plan typically explains how the business would recover its operations or move operations to another location. For example, if a fire destroys an office building or data center, the people and business or data center operations would relocate to a recovery site.
The plan can cover different levels of disaster, ranging from short-term, localized incidents, through building-wide problems lasting days, to the permanent loss of a building.
Examples of office building disasters: if a roof leaks, move the people to another floor in the same building; if the building loses electrical power or is flooded, have people work from home during the outage; if the building burns and is completely destroyed, relocate the people to a recovery site until a new building is acquired.
Examples of computer center disasters: if a server breaks, recover to a backup server; if the electrical power is out, start an emergency generator; if the data center is destroyed by a fire, recover to a disaster recovery data center.
In the US, government entities refer to the process as continuity of operations planning (COOP).
Any event that could negatively impact operations is included in the plan, such as supply chain interruption or the loss of or damage to critical infrastructure (major machinery or computing and network resources). As such, risk management must be incorporated as part of BCP.
In December 2006, the British Standards Institution (BSI) released an independent standard for BCP — BS 25999-1. Prior to the introduction of BS 25999, BCP professionals relied on information security standard BS 7799, which only peripherally addressed BCP to improve an organization's information security procedures. BS 25999's applicability extends to all organizations. In 2007, the BSI published BS 25999-2 "Specification for Business Continuity Management", which specifies requirements for implementing, operating and improving a documented business continuity management system (BCMS).
Business continuity management is standardised across the UK by British Standards (BS) through BS 25999-1:2006 and BS 25999-2:2007, which apply to all organizations, including industry and its sectors. The standard provides a best-practice framework to minimize disruption during unexpected events that could bring business to a standstill, and offers practical guidance for dealing with most eventualities, from extreme weather conditions to terrorism, IT system failure and staff sickness.
The standard was superseded in November 2012 by the British Standard BS ISO 22301:2012.
In 2004, following crises in the preceding years, the UK government passed the Civil Contingencies Act 2004 (The Act). This provides the legislation for civil protection in the UK.
The Act is separated into two distinct parts. Part 1 focuses on local arrangements for civil protection, establishing a statutory framework of roles and responsibilities for local responders. Part 2 focuses on emergency powers, establishing a modern framework for the use of special legislative measures that might be necessary to deal with the effects of the most serious emergencies.
The Act tells responders and planners that businesses need continuity planning measures in place in order to survive and continue to thrive, while working to keep the impact of an incident as small as possible.
Analysis
The analysis phase consists of impact analysis, threat analysis and impact scenarios.
Business impact analysis (BIA)
A business impact analysis (BIA) differentiates critical (urgent) and non-critical (non-urgent) organization functions and activities. Critical functions are those whose disruption is regarded as unacceptable. Perceptions of acceptability are affected by the cost of recovery solutions. A function may also be considered critical if dictated by law. For each critical (in scope) function, two values are then assigned:
- Recovery Point Objective (RPO) – the maximum acceptable age of the most recent recoverable data, that is, how much recent data may be lost
- Recovery Time Objective (RTO) – the acceptable amount of time to restore the function
The recovery point objective must ensure that the maximum tolerable data loss for each activity is not exceeded. The recovery time objective must ensure that the Maximum Tolerable Period of Disruption (MTPoD) for each activity is not exceeded.
Next, the impact analysis results in the recovery requirements for each critical function. Recovery requirements consist of the following information:
- The business requirements for recovery of the critical function, and/or
- The technical requirements for recovery of the critical function
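The relationship between the two objectives and their tolerances can be illustrated with a short Python sketch. It is only an illustration: the class, the field names and the figures for the "payroll" function are hypothetical and are not taken from any standard.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import List


@dataclass
class CriticalFunction:
    """One in-scope function from the BIA (field names are illustrative)."""
    name: str
    rpo: timedelta            # recovery point objective
    rto: timedelta            # recovery time objective
    max_data_loss: timedelta  # maximum tolerable data loss
    mtpod: timedelta          # maximum tolerable period of disruption


def objective_violations(fn: CriticalFunction) -> List[str]:
    """Describe each objective that exceeds its tolerance."""
    problems = []
    if fn.rpo > fn.max_data_loss:
        problems.append(f"{fn.name}: RPO exceeds maximum tolerable data loss")
    if fn.rto > fn.mtpod:
        problems.append(f"{fn.name}: RTO exceeds MTPoD")
    return problems


# Hypothetical figures: the RPO is acceptable, the RTO is not.
payroll = CriticalFunction(
    name="payroll",
    rpo=timedelta(hours=4),
    rto=timedelta(hours=24),
    max_data_loss=timedelta(hours=8),
    mtpod=timedelta(hours=12),
)
violations = objective_violations(payroll)
print(violations)
```

In this example the RTO of 24 hours exceeds the 12-hour MTPoD, so either the objective or the recovery solution would need to be revisited.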
Threat and risk analysis (TRA)
After defining recovery requirements, each potential threat may require unique recovery steps. Common threats include fire, flooding, extreme weather, power outages, IT system failure, terrorism and epidemics.
The impact of an epidemic can be regarded as purely human, and may be alleviated with technical and business solutions. However, if the people behind these plans are affected by the disease, then the process can stumble.
During the 2002–2003 SARS outbreak, some organizations grouped staff into separate teams, and rotated the teams between primary and secondary work sites, with a rotation frequency equal to the incubation period of the disease. The organizations also banned face-to-face intergroup contact during business and non-business hours. The split increased resiliency against the threat of quarantine measures if one person in a team was exposed to the disease.
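A split-team rotation of this kind could be laid out as in the following Python sketch. The two-team setup, the team and site labels, the start date and the 10-day incubation period are all assumptions made for illustration; they are not figures from the organizations described above.

```python
from datetime import date, timedelta


def rotation_schedule(start, incubation_days, periods):
    """Swap two teams between sites once per incubation period."""
    schedule = []
    for i in range(periods):
        period_start = start + timedelta(days=i * incubation_days)
        if i % 2 == 0:
            sites = {"team_a": "primary", "team_b": "secondary"}
        else:
            sites = {"team_a": "secondary", "team_b": "primary"}
        schedule.append((period_start, sites))
    return schedule


# Illustrative values only: 10-day incubation period, three rotation periods.
schedule = rotation_schedule(date(2024, 1, 1), incubation_days=10, periods=3)
for period_start, sites in schedule:
    print(period_start, sites)
```

Because the two teams never share a site during the same period, exposure of one person affects at most one team.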
After defining threats, impact scenarios form the basis of the business recovery plan. In general, planning for the most wide-reaching impact is preferable. A typical impact scenario such as "building loss" encompasses most critical business functions. A BCP may document scenarios for each building. More localized impact scenarios – for example loss of a specific floor in a building – may also be documented.
The business and technical recovery requirements established in the analysis phase feed into the solution design phase. Asset inventories allow for quick identification of deployable resources. For an office-based, IT-intensive business, the plan requirements may cover desks, human resources, applications, data, manual workarounds, computers and peripherals.
Other business environments, such as production, distribution and warehousing, need to cover these elements but are likely to have additional issues.
Solution design
The solution design phase identifies the most cost-effective disaster recovery solution that meets two main requirements from the impact analysis stage. For IT purposes, this is commonly expressed as the minimum application and data requirements and the time in which the minimum application and application data must be available.
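The selection described above can be sketched in Python as picking the cheapest option that satisfies both requirements. The option names, costs and recovery times below are hypothetical, and a real evaluation would weigh many more factors than these two.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RecoveryOption:
    """A candidate recovery solution (all values are illustrative)."""
    name: str
    annual_cost: float
    recovery_hours: float      # time until minimum applications and data are available
    covers_minimum_apps: bool  # meets the minimum application and data requirements


def cheapest_adequate(options: List[RecoveryOption],
                      max_hours: float) -> Optional[RecoveryOption]:
    """Return the lowest-cost option meeting both requirements, or None."""
    adequate = [o for o in options
                if o.covers_minimum_apps and o.recovery_hours <= max_hours]
    return min(adequate, key=lambda o: o.annual_cost, default=None)


options = [
    RecoveryOption("hot site", 500_000, 1, True),
    RecoveryOption("warm site", 120_000, 24, True),
    RecoveryOption("cold site", 30_000, 120, True),
    RecoveryOption("offsite backups only", 10_000, 72, False),
]
choice = cheapest_adequate(options, max_hours=48)
print(choice.name if choice else "no adequate option")
```

With a 48-hour limit the sketch selects the warm site: the cold site is cheaper but too slow, and the backup-only option does not cover the minimum applications.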
Outside the IT domain, the preservation of hard copy information (such as contracts) and of skilled staff, or the restoration of embedded technology in a process plant, must be considered. This phase overlaps with disaster recovery planning methodology. The solution phase determines:
- crisis management command structure
- secondary work sites
- telecommunication architecture between primary and secondary work sites
- data replication methodology between primary and secondary work sites
- applications and data required at the secondary work site, and
- physical data requirements at the secondary work site.
Implementation
The implementation phase involves policy changes, material acquisitions, staffing and testing.
Testing and organizational acceptance
The purpose of testing is to achieve organizational acceptance that the solution satisfies the recovery requirements. Plans may fail to meet expectations due to insufficient or inaccurate recovery requirements, solution design flaws or solution implementation errors. Testing may include:
- Crisis command team call-out testing
- Technical swing test from primary to secondary work locations
- Technical swing test from secondary to primary work locations
- Application test
- Business process test
At minimum, testing is conducted on a biannual schedule.
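A testing schedule of this kind can be tracked with a simple overdue check, sketched below in Python. The 182-day interval is an assumption (it reads "biannual" as roughly twice a year), and the test names and dates are illustrative.

```python
from datetime import date, timedelta

# Assumption: "biannual" is read as roughly every six months.
TEST_INTERVAL = timedelta(days=182)


def overdue_tests(last_run, today, interval=TEST_INTERVAL):
    """Return the names of tests whose last run is older than the interval."""
    return sorted(name for name, ran in last_run.items()
                  if today - ran > interval)


# Illustrative records of when each test was last carried out.
last_run = {
    "crisis command team call-out": date(2024, 1, 15),
    "swing test primary to secondary": date(2023, 5, 1),
    "application test": date(2023, 11, 20),
}
overdue = overdue_tests(last_run, today=date(2024, 3, 1))
print(overdue)
```

An organization on an annual cycle would pass a longer `interval` instead of the default.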
The 2008 book Exercising for Excellence, published by the British Standards Institution, identified three types of exercise that can be employed when testing business continuity plans: tabletop, medium and complex.
Tabletop exercises typically involve a small number of people and concentrate on a specific aspect of a BCP. They can easily accommodate complete teams from a specific area of a business.
Another form involves a single representative from each of several teams. Typically, participants work through a simple scenario, for example a fire discovered outside working hours, and then discuss specific aspects of the plan.
The exercise consumes only a few hours and is often split into two or three sessions, each concentrating on a different theme.
A medium exercise is conducted within a "Virtual World" and brings together several departments, teams or disciplines. It typically concentrates on multiple BCP aspects, prompting interaction between teams. The scope of a medium exercise can range from a few teams from one organisation co-located in one building to multiple teams operating across dispersed locations. The environment needs to be as realistic as practicable and team sizes should reflect a realistic situation. Realism may extend to simulated news broadcasts and websites.
A medium exercise typically lasts a few hours, though it can extend over several days. It typically involves a "Scenario Cell" that adds pre-scripted "surprises" throughout the exercise.
A complex exercise aims to have as few boundaries as possible. It incorporates all the aspects of a medium exercise. The exercise remains within a virtual world, but maximum realism is essential. This might include no-notice activation, actual evacuation and actual invocation of a disaster recovery site.
While start and stop times are pre-agreed, the actual duration might be unknown if events are allowed to run their course.
Maintenance
Maintenance of a BCP manual, on a biannual or annual cycle, is broken down into three periodic activities.
- Confirmation of information in the manual, roll out to staff for awareness and specific training for critical individuals.
- Testing and verification of technical solutions established for recovery operations.
- Testing and verification of organization recovery procedures.
Issues found during the testing phase often must be reintroduced to the analysis phase.
The BCP manual must evolve with the organization. Activating the call tree verifies the notification plan's efficiency as well as contact data accuracy. Types of changes that should be identified and updated in the manual include:
- Important clients
- Organization structure changes
- Company investment portfolio and mission statement
- Communication and transportation infrastructure such as roads and bridges
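The call tree activation mentioned above can be modelled as a breadth-first walk in which each person who is reached calls their own contacts. The Python sketch below is a simplified illustration: the roles, the tree layout and the set of reachable people are hypothetical, and the person at the root is assumed to be reachable.

```python
from collections import deque


def activate_call_tree(tree, root, reachable):
    """Walk the call tree breadth-first and record who could not be reached."""
    notified, failed = [], []
    queue = deque([root])
    seen = {root}
    while queue:
        caller = queue.popleft()
        notified.append(caller)
        for contact in tree.get(caller, []):
            if contact in seen:
                continue
            seen.add(contact)
            if contact in reachable:
                queue.append(contact)   # this person will call their own contacts
            else:
                failed.append(contact)  # stale contact data or no answer
    return notified, failed


# Hypothetical call tree: each key calls the people in its list.
tree = {
    "crisis lead": ["it manager", "facilities manager"],
    "it manager": ["dba", "network engineer"],
    "facilities manager": ["security desk"],
}
reachable = {"crisis lead", "it manager", "dba", "network engineer", "security desk"}
notified, failed = activate_call_tree(tree, "crisis lead", reachable)
print(notified)
print(failed)
```

In this run the facilities manager cannot be reached, so the security desk is never called even though its contact data is valid, which is the kind of gap an activation exercise is meant to expose.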
Specialized technical resources must be maintained. Checks include:
- Virus definition distribution
- Application security and service patch distribution
- Hardware operability
- Application operability
- Data verification
- Data application
Testing and verification of recovery procedures
As work processes change, previous recovery procedures may no longer be suitable. Checks include:
- Are all work processes for critical functions documented?
- Have the systems used for critical functions changed?
- Are the documented work checklists meaningful and accurate?
- Do the documented work process recovery tasks and supporting disaster recovery infrastructure allow staff to recover within the predetermined recovery time objective?
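The last check in the list can be approximated by adding up the estimated durations of the documented recovery tasks and comparing the total with the recovery time objective. The Python sketch below assumes the tasks run one after another; the task names, durations and the 4-hour objective are illustrative.

```python
from datetime import timedelta


def fits_within_rto(task_durations, rto):
    """Return (fits, total), assuming the tasks are performed sequentially."""
    total = sum(task_durations.values(), timedelta())
    return total <= rto, total


# Illustrative recovery tasks for one critical function.
tasks = {
    "restore database from backup": timedelta(hours=3),
    "redirect network traffic": timedelta(hours=1),
    "verify application": timedelta(hours=1, minutes=30),
}
fits, total = fits_within_rto(tasks, rto=timedelta(hours=4))
print(fits, total)
```

Here the tasks total five and a half hours against a four-hour objective, so the procedure or the supporting infrastructure would need to change.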
See also
- Catastrophe modeling
- Disaster recovery
- Emergency management
- Natural hazards
- Man-made hazards
- Space accidents and incidents
- Risk management
- Disaster recovery and business continuity auditing
- Systems engineering
- Systems engineering process
- System lifecycle
- Systems thinking
- Resilience (organizational)
- Seven tiers of disaster recovery
References
- Elliot, D.; Swartz, E.; Herbane, B. (1999) Just waiting for the next big bang: business continuity planning in the UK finance sector. Journal of Applied Management Studies, Vol. 8, No, pp. 43–60. Here: p. 48.
- Intrieri, Charles (10 September 2013). "Business Continuity Planning". Flevy. Retrieved 29 September 2013.
- British Standards Institution (2006). Business continuity management – Part 1: Code of practice. London.
- British Standards Institution (2012). Societal security – Business continuity management systems – Requirements. London.
- Cabinet Office. (2004). overview of the Act. In: Civil Contingencies Secretariat Civil Contingencies Act 2004: a short. London: Civil Contingencies Secretariat
- Business Continuity Planning, FEMA, Retrieved: June 16, 2012
- Continuity of Operations Planning (no date). U.S. Department of Homeland Security. Retrieved July 26, 2006.
- Purpose of Standard Checklist Criteria For Business Recovery (no date). Federal Emergency Management Agency. Retrieved July 26, 2006.
- NFPA 1600 Standard on Disaster/Emergency Management and Business Continuity Programs — PDF (2010). National Fire Protection Association.
- United States General Accounting Office Y2k BCP Guide (August 1998). United States Government Accountability Office.
International Organization for Standardization
- ISO/IEC 27001:2005 (formerly BS 7799-2:2002) Information Security Management System
- ISO/IEC 27002:2005 (renumbered from ISO/IEC 17799:2005) Information Security Management – Code of Practice
- ISO/IEC 27031:2011 Information technology - Security techniques - Guidelines for information and communication technology readiness for business continuity
- ISO/PAS 22399:2007 Guideline for incident preparedness and operational continuity management
- ISO/IEC 24762:2008 Guidelines for information and communications technology disaster recovery services
- IWA 5:2006 Emergency Preparedness
- ISO 22301:2012 Societal security - Business continuity management systems - Requirements
- ISO 22313:2012 Societal security - Business continuity management systems - Guidance
British Standards Institution
- BS 25999-1:2006 Business Continuity Management Part 1: Code of practice
- BS 25999-2:2007 Business Continuity Management Part 2: Specification
- "A Guide to Business Continuity Planning" by James C. Barnes
- "Business Continuity Planning", A Step-by-Step Guide with Planning Forms on CDROM by Kenneth L Fulmer
- "Business Continuity Plan Design, 8 Steps for Getting Started Designing a Plan" By Richard Kepenach
- "Disaster Survival Planning: A Practical Guide for Businesses" by Judy Bell
- ICE Data Management (In Case of Emergency) made simple – by MyriadOptima.com
- Harney, J. (2004). Business continuity and disaster recovery: Back up or shut down. AIIM E-Doc Magazine, 18(4), 42–48.
- Dimattia, S. (November 15, 2001). Planning for Continuity. Library Journal, 32–34.
- Exercising for Excellence (Delivering successful business continuity management exercises) by Crisis Solutions