Disaster recovery and business continuity auditing

From Wikipedia, the free encyclopedia

Disaster recovery (DR) and business continuity refer to an organization’s ability to recover from a disaster or other unexpected event and resume operations. Organizations often have a plan in place (usually referred to as a "Disaster Recovery Plan" or "Business Continuity Plan") that outlines how a recovery will be accomplished. The key to successful disaster recovery is to have a plan (emergency plan, disaster recovery plan, continuity plan) in place well before disaster strikes.

Because business objectives change constantly, a common need in disaster recovery is to audit an organization's disaster recovery capacity. The purpose of such an audit is to discover how closely the organization's disaster recovery readiness aligns with its actual objectives. An audit of a disaster recovery plan considers factors such as alternate site designation, training of personnel, and insurance issues. The individual or team performing the audit uses a number of procedures and processes to achieve the audit's objectives. Successful disaster recovery audits clearly state their objectives in an audit plan.

Metrics

Two of the key metrics in a disaster recovery environment are the Recovery Time Objective (RTO) and the Recovery Point Objective (RPO). RTO is the target time within which a system must be completely up and running again after a disaster. RPO is the point in time to which data must be recoverable from the backup copy; it therefore bounds how much recent data the organization can afford to lose.
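As an illustration, an auditor reviewing a drill or a real incident could compare the observed recovery against both objectives. The following is a minimal sketch, not a standard tool: the function name `meets_objectives`, its parameters, and the sample timestamps and targets are all hypothetical.

```python
from datetime import datetime, timedelta

def meets_objectives(last_backup, outage_start, service_restored, rto, rpo):
    """Return (rto_met, rpo_met) for a single recovery event."""
    downtime = service_restored - outage_start      # how long the system was down
    data_loss_window = outage_start - last_backup   # age of the newest recoverable data
    return downtime <= rto, data_loss_window <= rpo

# Hypothetical incident: backup at 02:00, outage at 09:30, service back at 12:30.
rto_met, rpo_met = meets_objectives(
    last_backup=datetime(2024, 1, 10, 2, 0),
    outage_start=datetime(2024, 1, 10, 9, 30),
    service_restored=datetime(2024, 1, 10, 12, 30),
    rto=timedelta(hours=4),   # assumed target
    rpo=timedelta(hours=6),   # assumed target
)
print(rto_met, rpo_met)  # prints: True False
```

In this example the three-hour downtime is within the assumed four-hour RTO, but the newest backup was seven and a half hours old at the time of the outage, which exceeds the assumed six-hour RPO.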

Mission statement

A disaster recovery mission statement identifies the purpose and goals of the disaster recovery plan. The mission statement can also help an auditor obtain a better understanding of the organization’s environment. An auditor examines the mission statement to determine the objectives, priorities, and goals of the disaster recovery plan.

The DR committee and auditor

The organization appoints individuals responsible for designing the disaster recovery plan and for implementing it when needed. Generally, this is a team headed by a project manager, with a deputy manager who can take over the manager's responsibilities if needed. The qualities needed for this position vary depending upon the organization. A good disaster recovery project manager is often someone with good leadership abilities, strong knowledge of the company's business and of management processes, experience and knowledge in information technology and security, and good project management skills. Other members of the team need a clear understanding of the requisite procedures and the ability to perform them.

An auditor is assigned to examine and assess the training, experience, and abilities of the project manager and deputy project manager, to analyze the capabilities of the team members to complete their assigned tasks, and to confirm that more than one individual is trained and capable of performing each function. Tests and inquiries of personnel can help achieve this objective.
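The check that more than one individual can perform each function lends itself to a simple test over training records. The sketch below assumes, purely for illustration, that the records are available as (person, function) pairs; the function name `understaffed_functions`, the record layout, and the sample data are hypothetical.

```python
from collections import defaultdict

def understaffed_functions(required_functions, training_records, minimum=2):
    """Return the required functions that fewer than `minimum` people are trained for."""
    trained = defaultdict(set)
    for person, function in training_records:
        trained[function].add(person)
    # Functions nobody is trained for have an empty set and are reported too.
    return sorted(f for f in required_functions if len(trained[f]) < minimum)

# Hypothetical recovery functions and training records.
required = ["restore database", "activate alternate site", "notify vendors"]
records = [
    ("Avery", "restore database"),
    ("Blake", "restore database"),
    ("Avery", "activate alternate site"),
]
print(understaffed_functions(required, records))
# prints: ['activate alternate site', 'notify vendors']
```

Passing the list of required functions separately means a function with no trained staff at all is still flagged, rather than silently omitted.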

Organizations, particularly large ones, ordinarily assign a specific individual the ongoing task of determining whether the procedures stated in the disaster recovery plan are consistent with actual practice. This individual may be referred to as the disaster recovery officer, the disaster recovery liaison, the DR coordinator, or some similar title. Techniques used to determine such consistency include direct observation of procedures, examination of the disaster recovery plan, and inquiries of personnel.

Documentation

To maximize their effectiveness, disaster recovery plans are documented in written form and in a manner that is easily understood by those who will need to use them. The plan must also be readily available, since searching for a hard-to-find or misplaced disaster recovery plan during a disaster can compound the disaster's effects. Furthermore, because of the constant changes that occur in the modern business environment, disaster recovery plans are most effective when updated frequently, so that they cover new threats as well as existing ones.

Adequate records need to be retained by the organization. The auditor examines records, billings, and contracts to verify that records are being kept. One such record is a current list of the organization's hardware and software vendors. Such a list is made and periodically updated to reflect changing business practice. Copies of it are stored on and off site and are made accessible to those who require them. An auditor tests the procedures used to meet this objective and determines their effectiveness.

Strategies

Site designation

A hot or cold site is a location that an organization can move to after a disaster if its current facility is unusable. The difference between the two is that a hot site is fully equipped to resume operations while a cold site is not. There is also the warm site, which can resume some, but not all, operations. The type of site a company establishes often hinges on the results of a cost-benefit analysis as well as on the needs of the organization.

A disaster recovery plan spells out how relocation to a new facility is to be conducted. Companies perform occasional tests and trials to verify the viability and effectiveness of the plan, to determine whether any deficiencies exist, and to decide how they can be dealt with. An audit of a company's disaster recovery plan primarily looks into the probability that the organization's operations can be sustained at the level assumed in the plan, as well as the ability of the entity to actually establish operations at the site. A review of the disaster recovery plan generally involves examining and testing the procedures included, conducting outside research relating to disaster recovery, determining reasonable standards for implementation, and touring, examining, and researching the outside facility.

The auditor can verify these points through paper and paperless documentation and through physical observation of the site. Backups and procedures are also tested to confirm data integrity and effective processes, and the security of the storage site is confirmed.

Data backup

Data backups are central to any disaster recovery plan. An audit of backup processes determines whether they are (a) effective and (b) actually being carried out by the personnel involved. Techniques used to accomplish this include direct observation of the processes in question, analysis and research of the backup equipment used, computer-assisted audit techniques and tests, and examination of paper and paperless records.
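One example of a computer-assisted test is to restore a backup to a scratch location and compare it, file by file, with the source data. The sketch below does this with SHA-256 checksums from the Python standard library. It assumes both copies are reachable as ordinary directories and that the source has not changed since the backup was taken; the function names and the directory layout are hypothetical.

```python
import hashlib
from pathlib import Path

def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so that large files are not loaded into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def backup_mismatches(source_dir, restored_dir):
    """List relative paths that are missing from, or differ in, the restored copy."""
    source_dir, restored_dir = Path(source_dir), Path(restored_dir)
    problems = []
    for src in sorted(source_dir.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(source_dir)
        restored = restored_dir / rel
        if not restored.is_file() or sha256_of(src) != sha256_of(restored):
            problems.append(rel.as_posix())
    return problems
```

An empty result from `backup_mismatches` indicates that every source file was restored with identical content. The sketch does not report extra files that exist only in the restored copy.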

The continual backing up of data and systems can help minimize the impact of threats. Even so, the disaster recovery plan also includes information on how best to recover any data that has not been copied. Controls and protections are put in place to ensure that data is not damaged, altered, or destroyed during this process. The plan needs to identify the information technology experts and the procedures that can accomplish this. Vendor manuals can also assist in determining how best to proceed.

Drills

Practice drills are conducted periodically to determine how effective the plan is and what changes may be necessary. The auditor’s primary concern here is verifying that these drills are being conducted properly, that problems uncovered during the drills are addressed, and that procedures designed to deal with these deficiencies are implemented and tested to determine their effectiveness.
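Where drill results are logged in a structured form, part of this verification can be automated. The following is a minimal sketch under stated assumptions: the `Drill` record, the `drill_exceptions` function, and the 180-day maximum interval are all hypothetical, and an organization would substitute the interval required by its own plan.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta

@dataclass
class Drill:
    held_on: date
    open_issues: list = field(default_factory=list)  # problems found but not yet resolved

def drill_exceptions(drills, today, max_interval=timedelta(days=180)):
    """Return audit exceptions for a drill log: overdue drills and unresolved issues."""
    if not drills:
        return ["no drills on record"]
    exceptions = []
    ordered = sorted(drills, key=lambda d: d.held_on)
    latest = ordered[-1].held_on
    if today - latest > max_interval:
        exceptions.append(f"last drill ({latest}) is more than {max_interval.days} days old")
    for drill in ordered:
        for issue in drill.open_issues:
            exceptions.append(f"unresolved issue from {drill.held_on}: {issue}")
    return exceptions

# Hypothetical drill log.
log = [Drill(date(2023, 1, 15), ["call tree outdated"]), Drill(date(2023, 6, 1))]
for line in drill_exceptions(log, today=date(2024, 3, 1)):
    print(line)
```

A check like this only shows that drills were held and issues were closed on paper; whether the drills were conducted properly still requires observation and inquiry.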

Backup of key personnel

A disaster recovery plan includes clearly written policies and specific communication with employees to ensure that both regular and replacement personnel are selected, documented, and informed should a disaster occur. There must also be confirmation that the replacement personnel can actually perform the duties assigned to them in the event of an emergency. Periodic training and cross-training are often used to accomplish this. The training includes updates to existing job positions and testing to confirm proficiency. Audit work in this area verifies that (1) policies are being enforced, (2) testing is effective, and (3) training is adequate.

Other considerations

Insurance issues

The auditor determines the adequacy of the company's insurance coverage (particularly property and casualty insurance) through a review of the company's insurance policies and other research. Among the items that the auditor needs to verify are: the scope of the policy (including any stated exclusions), that the amount of coverage is sufficient to cover the organization’s needs, and that the policy is current and in force. The auditor also ascertains, through a review of the ratings assigned by independent rating agencies, that the insurance company or companies providing the coverage have the financial viability to cover the losses in the event of a disaster.

Effective DR plans take into account the extent of a company's responsibilities to other entities and its ability to fulfill those commitments despite a major disaster. A good DR audit includes a review of existing memoranda of agreement (MOAs) and contracts to ensure that the organization's legal liability for lack of performance in the event of a disaster or other unusual circumstance is minimized. Agreements for establishing support and assisting the entity with recovery are also outlined. Techniques used for evaluating this area include an examination of the reasonableness of the plan, a determination of whether the plan takes all factors into account, and a verification of the reasonableness of the contracts and agreements through documentation and outside research.

Communication issues

Good disaster recovery planning ensures that both management and the recovery team have procedures that allow for effective communication. This can be accomplished by ensuring that contact information is easily accessible and that drills test communication capabilities. A good disaster recovery plan addresses not only internal communication but external communication as well, that is, communication between the organization and outside individuals and organizations, such as business partners. Procedures to test this capability generally mirror those used internally. The disaster recovery auditor evaluates these procedures and assumptions to determine whether they are reasonable and likely to be effective. Techniques used by a DR auditor in evaluating readiness include testing procedures, interviewing employees, comparing the plan against the DR plans of other companies and against industry standards, and examining company manuals and other written procedures. The auditor can verify through direct observation that emergency telephone numbers are listed and easily accessible in the event of a disaster.

Emergency procedures

Procedures to sustain staff during a round-the-clock disaster recovery effort are included in any good disaster recovery plan. Procedures for stocking food and water, administering CPR and first aid, and dealing with family emergencies are clearly written and tested. The company can generally accomplish this through good training programs and a clear definition of job responsibilities. A review of a plan's readiness in this area often includes inquiries of personnel, direct physical observation, and examination of training records and any certifications.

Environmental issues

Disaster recovery plans may also include procedures that take into account the possibility of power failures or other situations of a non-IT nature. Such a plan indicates what procedures are to be used in these situations and also includes information on the storage of flashlights and candles, as well as additional safety procedures in case of gas leaks, fires, or other such events. Trial runs are conducted to test the procedures' effectiveness and viability. The readiness of an organization in this regard can be assessed by examining and testing procedures for reasonableness, making inquiries of personnel, and conducting outside research.
