ABSTRACT
A disaster is an event that causes damage or loss, and it can arise from natural causes or from human error. Natural disasters include earthquakes, floods, and hurricanes. Because natural disasters cannot realistically be prevented, risk management treats them by avoiding disaster-prone situations where possible, and good planning reduces their impact. About 60% of disasters are attributed to human error, for example hazardous material spills, infrastructure failure, bioterrorism, and catastrophic IT bugs or failed software updates. Recovery means getting lost data and systems back, which is difficult once they are gone unless preparations were made in advance. This need is what led to the development of disaster recovery as a discipline in the late 1970s.
Keywords: Disaster Recovery Plan, Risk, Business Continuity Planning
INTRODUCTION
Disaster recovery comprises a set of policies and procedures that enable the recovery of critical technology and infrastructure after a disaster. It focuses mainly on the information technology systems that support critical business functions. There are several recovery strategies. One is to replicate the business’s production data to a disaster recovery site. Another is to copy the organization’s infrastructure, such as servers, files, and data, to the disaster recovery site from backups taken on a regular schedule, for example weekly, monthly, or yearly. The overall process of restoring systems from these copies is called the recovery procedure.
Disaster recovery also includes planning, design, and testing, and may involve a separate physical site for restoring files and data. Disaster recovery planning provides a structured approach for responding to unplanned events that endanger a company’s IT infrastructure, which includes software, hardware, internal networks, procedures, and employees. Testing plays a vital role in disaster recovery planning and in change management, which is a systematic approach to dealing with change from a business perspective. Testing helps uncover gaps in the plan and gives the organization an opportunity to correct them before a real event. Because DR planning has many moving parts, testing also helps IT organizations understand what their employees actually do while a disaster recovery scenario unfolds.
IMPORTANCE OF DISASTER RECOVERY AND BUSINESS CONTINUITY PLAN
Disaster recovery and business continuity planning are two essential parts of an organization’s risk management. Because it is very difficult to eliminate all risks, organizations implement disaster recovery and business continuity plans to cope with disruptive events. Both processes are equally important because they analyze risks and provide detailed strategies for how the business will continue after severe interruptions and disasters. Disasters take different forms. Some affect individuals, such as a hard drive failure, while others have a broad collective impact, such as fires, storms, and power outages. Each of these can cause short-term disruptions to normal business operations, but recovering from a major disaster can take much longer if the organization has not planned ahead.
Detailed disaster recovery plans can prevent many problems when a disaster strikes an organization. With well-practiced plans, covering not only network recovery but also the precise steps each person involved in the recovery effort should take, an organization can shorten its recovery time. Several options are available to organizations once they decide to create a disaster recovery plan. The first and often most accessible is to have experienced managers within the organization draw on their knowledge to craft a plan that fits the organization’s specific recovery needs. Organizations that lack this expertise in house can call on outside options such as trained consultants and specially designed software.
In the past, when a business looked for ways to prepare for disaster and ensure business continuity should one strike, the bulk of its time, money, and effort was spent on ways to avoid disasters altogether. In searching for ways to protect their most critical business applications, organizations found that they could potentially avoid harm through the use of redundant data lines. As this idea spread, it did not take long before the words "disaster" and "recovery" were replaced by "continuity" and "resumption." While a small percentage of private organizations remained dedicated to disaster recovery as one way of maintaining business continuity, most of the focus was placed on disaster avoidance. In recent years, however, that paradigm has shifted, and a new approach to disaster preparation has replaced it. Avoidance is a great idea in theory, but it cannot always be achieved in real life.
Paul Kirvan notes several other reasons for the importance of business continuity and disaster recovery planning:
• Results of the business impact analysis (BIA) identify opportunities for process improvement and ways the organization can use technology better.
• Information in the plan serves as an alternate source of documentation.
• The plan provides a single source of key contact information.
• The plan serves as a reference document for use in product planning and design, service design and delivery, and other activities.
TYPES OF DISASTER RECOVERY PLAN
A disaster recovery plan, abbreviated DRP, is a documented process or set of procedures for protecting and recovering business IT infrastructure in the event of a disaster, whether natural or man-made. Disaster recovery plans can be divided into two categories, because the appropriate solution depends on the type of problem:
1. Man-made disasters: arson, electrical fires, riots or civil unrest, hacks, and malware.
2. Natural disasters: tornadoes, hurricanes, earthquakes, and ice storms.
These disasters can instantly disable the transmission of data or cause irreparable damage to IT systems. Damage to the IT infrastructure means that the organization cannot access the information stored in the affected locations.
While preparing a disaster recovery plan, the planner needs to consider all possible man-made and natural disasters. The following are typical disasters and the plans generally used for them.
Arson: Arson is the deliberate act of setting fire to a building. It may be committed by one of the company’s own employees with ill intent or by a rival. The target may be an entire site, the IT infrastructure, or a specific area such as a single office or a document room.
Solution: Install a reliable fire alarm system. Make sure the electronics are protected from fire without being drenched in water. Fire extinguishers are a good solution, but they require human assistance. A more recent and highly effective approach is to automatically remove oxygen from the server rooms. The real danger is that anyone trapped in a room during a fire faces a life-threatening lack of oxygen; one remedy is to keep oxygen cylinders in the rooms, although this is expensive. The most common approach in use is still the fire alarm system.
Electrical Fire: Unlike arson, an electrical fire originates in electrical components and may be caused by employee negligence, old wiring, or faulty connections. Because the cause is different from arson, prevention requires a different solution as well.
Solution: Frequent scheduled inspection of electrical wiring and components helps to a great extent. The company needs to replace its IT infrastructure and components every few years. Well-insulated, fire-resistant wiring is a big help, and spending a little more on good products pays off in safety. Maintenance staff should check wiring and infrastructure regularly and make sure no wiring passes through heated areas where it could catch fire. Rodents often chew through wires and have been found to cause short circuits that eventually lead to fires, so they need to be kept away from wiring, and traps should be set around important wiring areas.
Ice Storms: Ice storms are a significant problem in the colder regions of the northern hemisphere, where temperatures can fall below the operating range of electrical components.
Solution: The most reliable solution is to maintain a backup center at another location far from the area affected by ice storms, so that normal operations keep running and data remains unaffected. Ice storms often cause power failures at the site. A backup power generator can help, but it still sits in the danger zone. Moreover, staffing the site during an ice storm is critical but puts employees’ lives at risk.
Power outage: A power outage may have many causes, ranging from a hurricane to something as minor as maintenance work scheduled by the county.
Solution: Most companies now keep extra power sources on hand to avoid disruption in such events. In the US, some companies maintain two power supplies to their site and make sure they do not come from the same source or from related vendors. Again, keeping a backup at another location helps in this event.
Riots or Civil Unrest and Vandalism: These events are not uncommon today, and they are a man-made threat to the company.
Solution: This is a serious threat, if not to the infrastructure, then to the employees. In such an event, employees cannot be asked to come to work. Again, the best solution is to maintain multiple sites, each holding backup data.
The disasters discussed above are some general cases with their recovery or precautionary plans. More generally, for businesses of all kinds and regardless of disaster type, the approaches are as follows:
No recovery plan at all: Although it seems impractical today, a large number of organizations have no disaster recovery plan at all, because of:
a) financial constraints;
b) lack of awareness or knowledge;
c) a lack of professional experts to execute a DRP;
d) a very basic, product-based business in which computer use is limited.
No disaster plan but a good backup plan: Some companies prefer not to maintain an expensive disaster plan but keep a good backup plan instead. These are mainly data-driven companies whose operations depend on their data, and data centers and cloud computing are a great help to them. In the event of a hardware failure, the company’s operations do not halt. The company also needs to check regularly how the recovery database responds; otherwise, when it is needed, it causes more chaos than comfort.
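One way to check a backup regularly is to compare checksums of the live files against their backup copies. The sketch below is only an illustration under assumptions that are not in the source: backups are plain file copies in a mirrored directory tree, and the function names `sha256_of` and `verify_backup` are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_backup(source_dir: Path, backup_dir: Path) -> list[str]:
    """Return relative paths that are missing from, or differ in, the backup.

    Assumption: the backup mirrors the source directory layout.
    """
    problems = []
    for source_file in sorted(source_dir.rglob("*")):
        if not source_file.is_file():
            continue
        relative = source_file.relative_to(source_dir)
        backup_file = backup_dir / relative
        if not backup_file.is_file() or sha256_of(source_file) != sha256_of(backup_file):
            problems.append(relative.as_posix())
    return problems
```

An empty result means every source file has an identical copy in the backup. A scheduled job could run this check and alert staff when the list is not empty, although a full recovery test would still be needed to confirm that the data can actually be restored.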
Split-site disaster recovery solution: A large organization can set up two or more locations and keep data backups at all of them. For example, large companies such as Facebook operate data centers in several locations, each with different geographic and weather conditions.
Risk assessment plays a vital role in a DRP because it anticipates the critical situations that could occur, so that precautions can be taken accordingly.
TYPES OF DISASTERS
Disasters are of two types, natural and man-made. They can take a wide range of forms, and their duration can range from an hour-long interruption to days or weeks of ongoing destruction. The following is a list of the various types of disasters, both natural and man-made or technological, that can affect a community.
Natural Disaster:
A natural disaster is major damage resulting from the earth’s natural hazards.
Man-made disasters:
A man-made disaster may be intentional (for example, an act of terrorism) or unintentional (that is, accidental, for example, the failure of a man-made dam).
Man-made disasters:
For organizations across the healthcare, financial services, and education industries, where highly sensitive information is handled, maintaining the availability and integrity of data is essential in order to meet stringent compliance requirements. However, one of the primary causes of data availability and integrity problems is disasters.
Bioterrorism:
The deliberate release or spread of biological agents as a means of coercion.
Cyber Attacks:
Cybersecurity involves preventing, detecting, and responding to cyber incidents that can have far-reaching effects on individuals, organizations, the community, and the nation.
Power Failure:
Caused by summer or winter storms, lightning, or construction equipment digging in the wrong location.
Explosions:
Explosive devices can be highly portable, using vehicles and people as a means of transport. They are easily detonated from remote locations or by suicide bombers. There are steps you can take to prepare for the unexpected.
Nuclear Blast:
A nuclear blast is an explosion with intense light and heat, a damaging pressure wave, and widespread radioactive material that can contaminate the air, water, and ground surfaces for miles around. A nuclear device can range from a weapon carried by an intercontinental missile to a small portable nuclear device transported by an individual. Every nuclear device causes deadly effects when exploded.
Natural Disasters:
Natural disasters such as tornadoes, storms, or earthquakes can instantly disable the transmission of data or cause irreparable damage to the IT systems used to maintain this capability, as well as to the data they store. Often, however, organizations overlook other kinds of disasters that could affect their enterprise, including man-made disasters and less common weather-related events.
Earthquakes:
An earthquake is the sudden, rapid shaking of the earth, caused by the breaking and shifting of underground rock as it releases strain that has accumulated over a long time. Initially mild shaking can become extremely violent within seconds. Additional earthquakes, called aftershocks, may follow the initial earthquake. Most are smaller than the initial earthquake, but larger-magnitude aftershocks also occur. Earthquakes may cause household items to become dangerous projectiles; cause buildings to move off their foundations or collapse; damage utilities, roads, and structures such as bridges and dams; or cause fires and explosions. They may also trigger landslides, avalanches, and tsunamis.
Hurricanes:
Hurricanes are massive storm systems that form over water and move toward land. Threats from hurricanes include high winds, storm surge, coastal and inland flooding, rip currents, and tornadoes. These large storms are called typhoons in the North Pacific Ocean and cyclones in other parts of the world.
Tsunami:
A series of water waves caused by the displacement of a large volume of a body of water, usually an ocean or a large lake, typically as a result of earthquakes, volcanic eruptions, underwater explosions, landslides, glacier calvings, meteorite impacts, and other disturbances above or below the water.
Heat Wave:
A prolonged period of excessively hot weather relative to the usual weather pattern of an area and to normal temperatures for the season.
Avalanche:
The sudden, rapid flow of snow down a slope, occurring when either natural triggers, such as loading from new snow or rain, or artificial triggers, such as explosives or backcountry skiers, overload the snowpack.
DISASTER RECOVERY PLAN METHODOLOGY
The United Nations Office for Disaster Risk Reduction promoted the International Strategy for Disaster Reduction, which reflects a philosophy of preventing disasters rather than the conventional approach of responding to them. Although hazards and disasters cannot be avoided entirely, their effects can be minimized by executing proper precautionary strategies. The strategy therefore promoted a shift from protecting against hazards to managing risk by integrating preventive measures. (UNISDR, 1999)
According to Neil Rosenberg and Geoffrey Wold, the disaster recovery plan methodology provides a framework that outlines the procedure and steps needed to implement the plan successfully. The points below describe the steps involved in the methodology.
1. Obtaining the go/no-go signal from top management
To get the most out of the recovery plan, the top-level management team should be fully committed to the strategy and planning effort. Effective disaster planning requires time and resources, which will be allocated only if top-level management is invested in the disaster recovery and risk reduction plans. (Wold, 1997)
2. Appointing a Committee to plan and implement the recovery strategies
A planning committee needs to be established to initiate the planning process and allocate resources for the recovery planning effort. The committee should include members from every department that may be affected by a disaster. The key members of the committee are usually the operations manager and the data processing manager. (Wold, 1997)
3. Defining the assets under threat and assessment of risk
The planning committee starts the recovery plan by identifying the assets that could incur losses in a disaster. The team analyzes the business impact of each asset, the value it provides, the risk associated with losing it, and the dollar amount of loss that a disaster could cause (Wold, 1997). For example, the accounting/ERP system, CRM systems, files, documents, cash, and valuables are assets whose value must be evaluated (Rosenberg, 2006).
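The risk and dollar-loss analysis described above can be sketched as a simple ranking. The source gives no formula, so this sketch assumes a common risk-assessment convention: expected annual loss equals the loss per incident multiplied by the expected number of incidents per year. The `Asset` class, its field names, and all figures are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Asset:
    name: str
    loss_per_incident: float   # dollars lost if a disaster hits this asset
    incidents_per_year: float  # estimated frequency (assumed, not measured)

    def expected_annual_loss(self) -> float:
        """Assumed convention: impact multiplied by yearly frequency."""
        return self.loss_per_incident * self.incidents_per_year


def rank_by_risk(assets: list[Asset]) -> list[Asset]:
    """Order assets from highest to lowest expected annual loss."""
    return sorted(assets, key=lambda a: a.expected_annual_loss(), reverse=True)


# Hypothetical figures, for illustration only.
assets = [
    Asset("Accounting/ERP system", 200_000, 0.25),
    Asset("CRM system", 80_000, 1.0),
    Asset("Paper documents", 10_000, 0.5),
]
for asset in rank_by_risk(assets):
    print(f"{asset.name}: ${asset.expected_annual_loss():,.0f} per year")
```

A ranking like this is only as good as the estimates behind it, so the committee would normally revisit the figures whenever the business or its threat landscape changes.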
4. Ascertain the recovery window
Once the team has identified the assets that need a recovery plan, it must determine how long each resource can be unavailable because of a disaster. It is therefore of utmost importance to identify the mission-critical assets and evaluate the business value of every asset under the plan. Based on business value, the team should set a recovery window for every asset; in other words, this analysis establishes the length of time for which each system may be down and unavailable for use. (Rosenberg, 2006).
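As a rough illustration of setting recovery windows from business value, the sketch below maps each asset's value to a window through a small tier table. The tiers, thresholds, and function names are hypothetical and not taken from Rosenberg; a real plan would derive them from the business impact analysis.

```python
# Hypothetical tiers: (minimum business value in dollars per hour of downtime,
# recovery window in hours). Ordered from most to least critical.
RECOVERY_TIERS = [
    (10_000, 1),   # mission-critical
    (1_000, 8),    # important
    (0, 72),       # deferrable
]


def recovery_window_hours(value_per_hour: float) -> int:
    """Return the recovery window for an asset of the given business value."""
    for minimum_value, window in RECOVERY_TIERS:
        if value_per_hour >= minimum_value:
            return window
    raise ValueError("business value must not be negative")


def recovery_order(assets: dict[str, float]) -> list[tuple[str, int]]:
    """Return (asset name, window in hours) pairs, shortest window first."""
    windows = [(name, recovery_window_hours(value)) for name, value in assets.items()]
    return sorted(windows, key=lambda pair: pair[1])
```

Calling `recovery_order` with each asset's estimated hourly value yields the order in which systems should be brought back, with mission-critical assets first.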
5. Determine the solution to facilitate recovery
Once the recovery window is determined, the next step is to choose the solution that will recover the data if information is lost in a disaster. The type of backup may vary by asset type, for example tape backup, disk backup, or data replication to a hot failover server at an offsite location. Like the recovery window, the recovery solution depends on the business value tied to each asset. (Rosenberg, 2006).
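The choice among the three backup types named above can be sketched as a lookup on the recovery window. The cut-off values of 1 hour and 24 hours are assumptions made for illustration, as is the function name; the source states only that the solution depends on business value.

```python
def choose_recovery_solution(window_hours: float) -> str:
    """Pick a backup type from the recovery window (thresholds are hypothetical)."""
    if window_hours <= 0:
        raise ValueError("recovery window must be positive")
    if window_hours <= 1:
        # Near-immediate recovery needs a live copy at another site.
        return "data replication to a hot failover server"
    if window_hours <= 24:
        # Same-day recovery can usually be served from disk.
        return "disk backup"
    # Longer windows tolerate the slower restore of tape.
    return "tape backup"
```

The general design idea is that shorter windows justify costlier solutions, so an organization would tune the thresholds to its own budget and restore-time measurements.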
6. Preparing the Disaster Recovery Plan draft
Based on the preceding steps, a disaster recovery plan is drafted that specifies the logistical considerations in the event of a mishap. The document names an emergency operations location where the employees and managers responsible for carrying out the disaster recovery plan can come together to execute the tasks laid down in it. (Rosenberg, 2006).
7. Planning regarding the Disaster Recovery Site
Planning for a disaster recovery site addresses the situation in which the actual data center or main facility is unavailable, so a backup site must be selected where work can continue without interruption. There are three types of recovery site: 1) Hot site: a site where data is readily available for use in case of failover; 2) Warm site: a site where hardware and data are readily available but software still has to be installed; 3) Cold site: a facility where employees can take shelter when a disaster is declared. (Rosenberg, 2006).
8. Documenting the detailed final Disaster Recovery Plan
Once the above steps are completed, the draft disaster recovery plan should be revisited and updated with any necessary changes so that it serves as the guideline to follow in a disaster. The document should be clear enough to leave no room for interpretation during a crisis. (Rosenberg, 2006).
9. Testing the viability of the Disaster Recovery Plan
To confirm the viability of the disaster recovery plan, it is important to check whether the steps laid down in it actually work in the situations they are meant for. A checklist is created to test all the important points. The person chosen to test the plan is usually someone who was not involved in preparing the document, so that the tester can judge it neutrally. Finally, testing should be repeated annually, because the business changes continually and regular testing helps find loopholes. (Rosenberg, 2006).
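A test checklist of this kind can be sketched as a set of named checks that are run together, with the failures reported as gaps in the plan. The checklist items and function names below are hypothetical; real checks would probe actual systems instead of returning fixed values.

```python
from typing import Callable


def run_dr_checklist(checks: dict[str, Callable[[], bool]]) -> dict[str, bool]:
    """Run every check and record pass or fail; a check that raises counts as failed."""
    results = {}
    for description, check in checks.items():
        try:
            results[description] = bool(check())
        except Exception:
            results[description] = False
    return results


def failed_items(results: dict[str, bool]) -> list[str]:
    """Return the descriptions of the checks that did not pass."""
    return [description for description, passed in results.items() if not passed]


# Hypothetical checklist with placeholder checks.
checklist = {
    "Backup restores within the recovery window": lambda: True,
    "Emergency contact list is up to date": lambda: False,
    "Failover site is reachable": lambda: 1 / 0 > 0,  # simulates a check that crashes
}
gaps = failed_items(run_dr_checklist(checklist))
```

Treating a crashing check as a failure keeps one broken probe from hiding the remaining results, which matters when the tester is not the person who wrote the plan.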
CONCLUSION
In conclusion, disaster recovery planning is not just another task; it is a necessary strategy for success in today’s IT business environment. A recovery system helps not only to recover lost data but also to save and restore software, hardware, operating systems, files, and customized settings. This is how businesses recover from malfunctions that affect their hardware infrastructure. Disaster recovery planning helps restore the functionality of business software and hardware quickly and easily, and it saves the business valuable time and reduces costs.