Information system (IS) events that make data unreachable can easily drive customers away and erode a business's hard-earned market position and reputation. Information management studies preceding this paper identify data readiness as critically important, and the literature on disaster recovery and business continuity describes ways of preparing for and mitigating IT incidents (Järveläinen, 2013). This paper looks to expand on the multiple schemas and perspectives that make up the business continuity process and on its influence on the participants involved. As will be described throughout this text, no definitive framework yet exists for evaluating the effectiveness of continuity programs or their return on investment. On the practical level, social factors such as committed managers and employees are influential in reducing poor business decisions. However, analytical support for such assertions rests largely on experience with loosely connected policies and procedures that emphasize a business continuity mindset. Much of the IS continuity discussion remains qualitative to this day, despite attempts to change that.
The initial public offering of Facebook falters because of a “technical error” in the NASDAQ trading system, which lacks the capacity to handle all the trades. The result: thousands of dissatisfied traders, an initial stock price set too high, and lawsuits pending (Pepitone, 2012).
Tens of thousands of travelers book their airfare through an online broker. A glitch brings the system down. Unable to escape the negative reviews of its patrons, the business is held in bad press for weeks, if not months. Ultimately, it receives the lowest ranking among its industry peers for the year (Heiskanen, 2012).
Information technology (IT) and information system (IS) incidents affect business operations and, as many examples show, can also have stark business impacts. Businesses know that IS continuity is a cornerstone of effective and stable marketing operations (Luftman & Zadeh, 2011). Keeping these complex systems running is the goal of IT management, but failures are to be expected regardless of one’s watchfulness: complex systems do not always perform perfectly (Butler & Gray, 2006; Mithas, Ramasubbu, & Sambamurthy, 2011). Guaranteeing nonstop IT and IS operations is just one of the many responsibilities of information security management teams (Fink, 1994; Gerber & Von Solms, 2005), though day-to-day operations can also be disrupted by events that have nothing to do with security threats. Before continuity functions emerged within IS and IT, the original process was disaster recovery planning (DRP). Businesses have since become convinced that they should make contingency plans for coping with IT and IS incidents and ensure that mitigating courses of action are in place (Chow & Ha, 2009; Turetken, 2008).
How can we tell whether what we do is helping? Are we really stopping IS incidents from occurring? The business continuity management framework developed by Herbane, Elliott, and Swartz (2004) describes an active process that works well for establishing an IS and IT continuity effort (hereafter, IS will include the technological and embedded resources found within IT). Business continuity management has moved well beyond its roots in disaster recovery planning without being wholly separated from them, joining multiple practices from “risk, crisis and supply chain management” (Herbane et al., 2004).
Disaster recovery means recovering after a damaging event has occurred. Herbane et al. (2004) assert that “This is broadened in business continuity management to include the identification and avoidance of potentially damaging incidents that would have a severe business impact”. Additionally, business continuity management offers a convenient framing for executives concerned with recent business consequences that do not necessarily involve strict security concerns.
Continuity management encompasses a broad range of business operations, but it is reasoned here that proactive work from the IS perspective could foster its evolution and help support the role of IT resources. Each business will have its own methodology for implementing IS continuity management in order to reduce the likelihood of negative incidents, and context will always be a major factor. This paper discusses some of the technical and social dynamics that minimize the impact of such incidents. Technical factors include plans and alternative arrangements that enhance organizational alertness and preparedness. Social factors, such as the commitment of personnel and the policies set forth by management, embed continuity practices in the organization.
A Brief Literature Review
The ability to successfully transmit and receive data is indicative of a strong organizational IS management team (Mithas et al., 2011). However, IS research has a tendency to assume that business operations encounter little beyond the feel-good scenarios seen so often in studies. Simply put, contemporary research is built around reliable and infallible information systems, but reality tells a different story. As past research has revealed, information availability and the related continuity of information systems are upheld by those in charge of information security (Von Solms & Von Solms, 2004), although IS incidents can occur for reasons that have nothing to do with security. Many IS studies focus on financial security (Benaroch, Lichtenstein, & Robinson, 2006), IS projects (Barki, Rivard, & Talbot, 2001), outsourcing (Bahli & Rivard, 2003) and security (Straub & Welke, 1998). Technology has a penchant for failing, and its potential to do so is well known (Sherer & Alter, 2004); however, the pertinent details are described elsewhere, leaving IS management without much help.
Theoretical Background: Ensuring Continuity in Information Systems
When business enterprises began to rely on computer infrastructure for organizational processes and things went wrong, disaster recovery planning (DRP) was created in response (Herbane, 2010a). According to Ernst & Young’s Global Information Security Survey (Ernst & Young, 2011), business continuity management is and will remain at the forefront of business investment for the foreseeable future. Multiple agencies and industry groups have picked up on the research, and multiple guidelines are now available that aim to cure businesses of their IS continuity woes. Unfortunately, most (if not all) of these guidelines are seen as too vague and generalist (Siponen & Willison, 2009). Disaster recovery undoubtedly has its purpose, but a new process is needed for evolving IS and IT markets. The underlying idea is that damage should be halted before it has a chance to ruin a business. Not surprisingly, processes like business process continuity and IS management are being implemented by concerned organizations (Luftman & Ben-Zvi, 2011; Luftman & Zadeh, 2011). Business continuity management (BCM) covers a wide array of business considerations, all of which require identifying the sources of threat to the company (Herbane et al., 2004). Despite its origins in disaster recovery planning, it is also useful for mitigating issues such as supplier problems in manufacturing and major pandemics affecting operations (Herbane et al., 2004). As with other risk assessment approaches, organizations using risk and information system management must first identify the business assets involved. Following that, management must identify risks, assess their probability of occurring, determine their impact on the company, and consider whether they overlap with other business operations (Gibb & Buchanan, 2006).
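The assessment steps just described (identify assets, identify risks, estimate probability and impact) can be illustrated with a small risk register. The following is only a minimal sketch: the asset names, probabilities, and loss figures are hypothetical, and ranking by probability multiplied by impact is a common heuristic assumed here for illustration, not a method prescribed by the sources cited above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Risk:
    asset: str          # business asset the risk threatens (hypothetical)
    description: str    # what could go wrong
    probability: float  # estimated likelihood per year, between 0.0 and 1.0
    impact: float       # estimated loss if the risk materializes

    @property
    def exposure(self) -> float:
        # Assumed heuristic: expected annual loss = probability * impact.
        return self.probability * self.impact


def rank_risks(risks):
    """Return the risks ordered from highest to lowest exposure."""
    return sorted(risks, key=lambda r: r.exposure, reverse=True)


# Illustrative register; every figure below is invented for the example.
register = [
    Risk("email server", "power loss", 0.05, 10_000),
    Risk("public website", "ISP outage", 0.30, 50_000),
    Risk("order database", "disk failure", 0.10, 200_000),
]

ranked = rank_risks(register)
for risk in ranked:
    print(f"{risk.asset}: {risk.description} (exposure {risk.exposure:,.0f})")
```

A ranking of this kind only orders the register; deciding which risks overlap with other business operations still requires the judgment described above.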
Being able to determine which systems are most prone to failure and most critical to mission success is thus an important skill, although other processes must be considered as well. The following discussion focuses on how business organizations maintain certain degrees of IS continuity. Herbane et al. (2004) developed a framework for information systems continuity implementation, but it has not yet been tested in practice. They hypothesized that “recovery speed, resilience, the embeddedness of BCM practices and external obligations serve the organization in terms of preserving its value”. As a result, they concluded that an institution which can recover from IS events faster than its peers maintains a distinct competitive advantage (Herbane et al., 2004). Our discussion touches on this framework and adds further business operations.
It goes without saying that certain crises push a company’s public relations to the forefront before a hostile public. Similarly, this author argues that a disruption in service is equally destructive to a company’s image. After all, earlier investments in goodwill and customer value are wasted if willing customers cannot receive their product because of IS downtime. If the incident involves a public-facing portion of the company, it could easily result in a damaged brand or reputation. According to a study conducted in Norway, “service disruptions have significant negative effects on customer loyalty”. As an example from the same literature, “One Nordic bank lost 30,000 customers because of a long-drawn-out incident during an IS merger” (Luoma-aho & Paloviita, 2010; Wang, Wu, Lin, & Wang, 2010). Under this customer-centric model, the implication for IS continuity is clear: service interruptions are bad for business. The best practice for fighting IS emergencies is to counter them before they grow into something larger that threatens company operations. For example, in many countries the finance and health-care sectors are required by governmental regulation to ensure continuity in IS operations (Elliott, Swartz, & Herbane, 2010). For better or worse, this expectation has spread far beyond such urgency-driven industries. Customers anywhere in the world expect instant satisfaction as companies compete for the largest market share. After all, who would want to waste an extra half-minute on a web page that doesn’t load when another website is a click away (Parasuraman, Zeithaml, & Malhotra, 2005)? Business-to-business customers aren’t immune from the need for always-on IS either (Choudhuri, Maguire, & Ojiako, 2009). As globalization advances, more and more businesses require constant connectivity and continuity with other organizations.
Government authorities and government customers also encourage IS management to protect continuity of information systems and technology (Herbane et al., 2004). Since top management is often held accountable for the actions of their operations, they tend to adhere to regulation without much additional persuasion.
Maintaining Information System Continuity
Having an outside force that demands IS continuity is extremely important. Worker and management dedication to the processes that integrate and reinforce continuity throughout daily operations supports strong leadership and proficient organizational governance (Elliott et al., 2010). Adopting international or national standards related to business and IS continuity is one such external influence that motivates and focuses management on IS continuity goals. However, external guidelines and regulations tend to be based on previously established governmental frameworks (e.g., National Institute of Standards and Technology, NIST) or at least appear similar (Järveläinen, 2012). Adhering to guidelines and regulations set forth by standardizing agencies means that a certain degree of oversight must be performed, including highly visible audits and inspections, which creates more employee involvement with continuity processes (Bernard, 2007; Gibb & Buchanan, 2006). Alesi (2008) argues in her research that IS continuity must be integrated at all levels of business and unified into a resilient structure that supports all business practices. Ultimately, IS continuity becomes as much a part of business culture as any other business process. Other useful techniques for strengthening employee commitment include reward systems, training, exercises and tailored communication (Alesi, 2008; Herbane et al., 2004; Puhakainen, 2006). External requirements thus integrate continuity into organizations in a more comprehensive way.
Exploitation of Technical and Social Factors
Technical Factors: Plans and Alternative Arrangements
Information system continuity is an active process, and as such requires the continued input of people and organizations. However, the components identified and analyzed as critical to business operations will always include the information technology that underlies day-to-day operations. Without it, employees will fail to deliver on-time services to customers while simultaneously spending time on avoidable recovery processes. The context of IS operations and the specifics of each IS incident produce a dynamic continuity requirement, and every business will approach its IT requirements differently. It is the hope of this author, supported by the research, that businesses will build failover backups for all IS resources tied to their core business competencies so that they are ready in the event of an incident or emergency.
Failover is a backup arrangement in which the functions of a system (such as a processor, server, network, or database) are replicated in their entirety and can be transferred to a secondary system when the primary system becomes unavailable because of downtime or complete failure. Useful when planning redundant, continuity-based systems, failover is typically a necessary part of business-critical operations that require always-on infrastructure. Procedurally, failover is completely transparent to the end user and occurs automatically when the primary system fails. Failover can cover every aspect of an organization’s IT continuity plan: within a personal computer, for example, failover might be a mechanism to protect against a failed processor; within a network, failover can apply to any network component or system of components, such as a connection path, storage device, or Web server.
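The behavior described in this definition can be sketched in a few lines of code. The example below is a simplified illustration under stated assumptions: the `Backend` and `FailoverService` classes are hypothetical names invented for this sketch, and a real failover product would also handle health checks, state replication, and failback, none of which are modeled here.

```python
class ServiceUnavailable(Exception):
    """Raised when a backend cannot serve a request."""


class Backend:
    """A hypothetical system (server, database, link) that may be down."""

    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy

    def handle(self, request):
        if not self.healthy:
            raise ServiceUnavailable(self.name)
        return f"{self.name} served {request}"


class FailoverService:
    """Sends each request to the primary and falls back to the secondary."""

    def __init__(self, primary, secondary):
        self.backends = [primary, secondary]

    def handle(self, request):
        # Try the backends in priority order; the caller never has to
        # know which one answered, which is what makes failover transparent.
        for backend in self.backends:
            try:
                return backend.handle(request)
            except ServiceUnavailable:
                continue
        raise ServiceUnavailable("all backends are down")


primary = Backend("primary")
secondary = Backend("secondary")
service = FailoverService(primary, secondary)

print(service.handle("order-1"))  # primary served order-1
primary.healthy = False           # simulate a primary failure
print(service.handle("order-2"))  # secondary served order-2
```

The caller uses the same `handle` call before and after the simulated failure, which mirrors the point above that the switch should be invisible to the end user.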
Prior to effective failover solutions (circa the 1980s and 1990s), stored data was connected to primary servers using basic arrangements, usually point-to-point or cross-coupled. When a single server was used in this kind of configuration, any type of incident or failure (sometimes even maintenance) interrupted IS continuity and made customer access nearly impossible for anywhere from a few minutes to weeks. The problem was not resolved until the server was brought back online. Newer technologies like storage area networks (SANs) have shifted toward any-to-any connectivity, which includes failover routing and redundant connection methods and processes. Generally speaking, storage networks today use several paths between the server and the storage system, each made up of a complete set of the components involved. Because there are multiple routes the data can travel, and the connection paths are built with redundant components, the connection remains accessible even if one or more paths fail. Automatic failover thus allows normal operations to be sustained despite the inevitable incidents and interruptions caused by problems with IT hardware or software.
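The benefit of multiple paths can be made concrete with a short calculation. The sketch below assumes that paths fail independently of one another, which is a simplifying assumption of this example and not a claim made by the sources above; in practice, shared components or shared causes of failure reduce the benefit. Under that assumption, the connection is lost only when every path is down at once, so the combined availability is one minus the product of the individual unavailabilities.

```python
def combined_availability(path_availabilities):
    """Availability of a connection that works if at least one path works.

    Assumes independent path failures (a simplification for illustration).
    Each availability is a fraction between 0.0 and 1.0.
    """
    all_down = 1.0
    for availability in path_availabilities:
        if not 0.0 <= availability <= 1.0:
            raise ValueError("availability must be between 0.0 and 1.0")
        all_down *= 1.0 - availability  # probability this path is also down
    return 1.0 - all_down


# Hypothetical figures: one path at 99% versus two such paths in parallel.
single = combined_availability([0.99])
dual = combined_availability([0.99, 0.99])
print(f"single path: {single:.4%}")
print(f"two paths:   {dual:.4%}")
```

With these illustrative numbers, adding a second 99% path raises the combined figure to roughly 99.99%, which is why redundant paths are attractive even when each individual path is only moderately reliable.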
Consequently, it’s easy to see why IS continuity may prove difficult to achieve within an organization. Establishing an organizationally sound IS continuity program requires redundant technologies, which translate into additional capital investment, sometimes more than can be afforded. Even today, determining the return on investment for such activities is an arduous task that rests on risk management and opportunity costs. Continuity requires the operation of multiple interrelated systems, which can be costly, and embedded continuity processes can add even more to overhead. However, with motivated leadership working deliberately with an effective IT department, most IS continuity costs can be justified on the grounds of customer satisfaction and increased productivity.
Embedded Continuity Practices
An organization’s ability to identify IS incidents, and how readily it is able to handle them, defines the recovery speed after an incident (Herbane et al., 2004). Herbane et al. argued that “Organizational alertness and preparedness are easily improved if managers allocate resources and decide to implement back-up plans and form crisis teams”. Additionally, IS management and security need to take significant responsibility for continuity incidents and exercise deliberate oversight of them (Ivancevich, Hermanson, & Smith, 1998; Seow, 2009; Wong, Monaco, & Sellaro, 1994). Delegating IS continuity responsibilities to the IT department, for example, instead of to the CIO, can undermine the vertically integrated IS continuity that supports organizational alertness and readiness. At its best, IS continuity starts at the top and is enforced throughout the breadth of operations.
An additional practice is to create and implement a rigorous set of IS continuity policies and organizational processes, and to ensure that everyone across the business knows and understands their importance (Alesi, 2008; Morwood, 1998). For example, Alesi (2008) states that “Instead of giving the sales director responsibility for every information system in the IT department, it might be more beneficial to make him or her responsible for the customer relationship management system and its continuity”. Implementing this idea across different departments would alleviate pressure on the IT department while simultaneously emphasizing IS continuity among the organization’s entire workforce (Herbane et al., 2004). This will only come to fruition if top management supports embedding IS continuity practices throughout the company; with that support, heightened awareness and commitment become part of the organizational culture for everyone (Alesi, 2008).
IS continuity is founded on the idea that a company can avoid, or at least recover quickly from, business incidents. A firm that can recognize potential hazards and rally its crisis team posthaste is said to be high in organizational alertness (Herbane et al., 2004). Preparedness refers to the understanding of recovery methods and the avoidance of risks, for example by implementing IS continuity plans, creating crisis teams, and building a patchwork of failover systems and redundant staffing (Ahmad, Hadgkiss, & Ruighaver, 2012; Chow & Ha, 2009; Lindström, Samuelsson, & Hägerfors, 2010). IS continuity plans, much like their business counterparts, need to be tested regularly and revised as environments change, including after large incidents occur (Gibb & Buchanan, 2006). Preparedness improves dramatically if critical business processes and systems can be recovered by more than one person (Conlon & Smith, 2010). Identifying critical systems and their impact on larger business processes is accomplished by conducting a business impact analysis, identifying the relationships between internal and external systems, and requiring IS personnel (third party or otherwise) to follow IS continuity processes (Blos, Hui-Ming, & Yang, 2010; Herbane et al., 2004). Plans, analyses and continuity processes involve employees other than IT experts, and thus preparedness affects the embeddedness of the practices (Herbane et al., 2004).
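One part of a business impact analysis, tracing which business processes depend on a given system, lends itself to a small illustration. The sketch below is an assumption-laden example rather than a method taken from the cited studies: the dependency map and every process and system name in it are hypothetical, and the traversal simply follows "depends on" relationships in reverse to list everything affected by a single failure.

```python
from collections import deque

# Hypothetical dependency map: each item lists what it directly depends on.
DEPENDS_ON = {
    "online ordering": ["web server", "order database"],
    "invoicing": ["order database", "email server"],
    "web server": ["ISP link"],
    "email server": ["ISP link"],
    "order database": ["storage array"],
}


def affected_by(failed, depends_on):
    """Return everything that directly or indirectly depends on `failed`."""
    # Invert the map so each item points to the items that rely on it.
    dependents = {}
    for item, requirements in depends_on.items():
        for requirement in requirements:
            dependents.setdefault(requirement, set()).add(item)

    # Breadth-first walk outward from the failed item.
    affected = set()
    queue = deque([failed])
    while queue:
        current = queue.popleft()
        for item in dependents.get(current, ()):
            if item not in affected:
                affected.add(item)
                queue.append(item)
    return affected


print(sorted(affected_by("ISP link", DEPENDS_ON)))
print(sorted(affected_by("storage array", DEPENDS_ON)))
```

In this invented map, a failure of the ISP link reaches both business processes through two different servers, which is the kind of relationship between internal and external systems that an impact analysis is meant to surface.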
When tasked with handling millions of dollars of capital for the betterment of their customers, investment institutions stand at a unique information systems crossroads. A failure of their IT policies, procedures, and infrastructure could very easily ruin the livelihoods of their customers while simultaneously destroying their own profits.
In this example, Accessor Capital Management LP needed an IS continuity plan that would allow for automatic Internet Service Provider (ISP) failover and real-time data recovery in the event of a primary IS crisis. The firm needed to be able to resume revenue-generating operations immediately, without customer interruption. Its requirements were well defined:
• Data backups had to be kept in a separate geographic location in case of disaster
• Data had to be encrypted and backed up in real time
• SEC regulations required that customers see no interruption to their investment needs if an IS failure were to occur
• Sensitive data had to be kept in-house and not shared with third-party vendors
Accessor mulled over several alternatives. A “hot site” was considered, which would house backup hardware separate from its primary facilities but would also require full-time staffing in case of an emergency. However, this would leave open the possibility of lost time and lost data if an incident were to occur. Alternative contracts required a tighter degree of integration than Accessor was comfortable with. In the end, Accessor would neither share confidential client data nor pay the prohibitive costs of a hot site.
Table 1. Types of Backup Sites
Site      | Cost        | Hardware Equipment | Telecommunications | Setup Time | Location
Cold Site | Low         | None               | None               | Long       | Fixed
Warm Site | Medium      | Partial            | Partial/Full       | Medium     | Fixed
Hot Site  | Medium/High | Full               | Full               | Short      | Fixed
Instead, Accessor decided to contract support for what is known as a “cold site”, in which information system equipment (not office hardware) and applications are stored in a low-maintenance, unmanned, and (in this case) nearby facility. It outsourced implementation of a secondary server system that would stand in automatically for the primary server. If something were to happen at the home office, operations would resume with minimal interruption. Additionally, by placing the backup server within a 45-minute drive of its Seattle headquarters, Accessor kept it accessible to IT staff while keeping it far enough away to add another layer of data protection and operational security.
Backup equipment is one thing, but the SEC requirements also mandated a seamless continuity solution for as long as Accessor was involved with investment funds. Accessor needed not only to protect its data in real time but also to uphold data integrity and offer transaction capability if its primary systems failed. “We’ve always had tape backup,” said Shawn Huey, Systems Engineer. “But tapes don’t run continuously, so they can’t provide up-to-the-minute data. In addition, restoring from tape is an uncertain and time-consuming task.” Working with third-party IT vendors, Accessor eventually settled on a software package that met its data monitoring and integrity requirements. The final implementation monitors changes to open files as they occur on the primary system and replicates only the modifications to the secondary system. Thanks in large part to the reduced bandwidth usage, this could be accomplished in real time for a fraction of the cost of continuous full backups. Furthermore, because the main server continues processing at full capacity, users notice no interruption. As a result, Accessor increased its data protection capabilities and gained a safe backup in case the primary system failed.
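The idea of replicating only the modifications can be illustrated with a small block-level comparison. This is a hedged sketch and not the vendor software Accessor used, whose internals the source does not describe: it compares two snapshots of a file rather than intercepting writes as they happen, and the block size, function names, and sample data are all assumptions made for illustration.

```python
import hashlib

BLOCK_SIZE = 4  # deliberately tiny so the example is easy to follow


def split_blocks(data, size=BLOCK_SIZE):
    """Split a byte string into fixed-size blocks (the last may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def changed_blocks(old, new, size=BLOCK_SIZE):
    """Return {block index: block bytes} for blocks of `new` that differ."""
    old_hashes = [hashlib.sha256(b).digest() for b in split_blocks(old, size)]
    delta = {}
    for index, block in enumerate(split_blocks(new, size)):
        is_new = index >= len(old_hashes)
        if is_new or hashlib.sha256(block).digest() != old_hashes[index]:
            delta[index] = block
    return delta


def apply_delta(replica, delta, new_length, size=BLOCK_SIZE):
    """Patch the replica with the changed blocks and trim to the new length."""
    blocks = split_blocks(replica, size)
    for index, block in sorted(delta.items()):
        if index < len(blocks):
            blocks[index] = block
        else:
            blocks.append(block)
    return b"".join(blocks)[:new_length]


# Hypothetical record on the primary system, before and after an update.
old = b"balance=100;owner=ann"
new = b"balance=250;owner=ann"

delta = changed_blocks(old, new)
print(delta)  # only the block holding the changed digits is sent
assert apply_delta(old, delta, len(new)) == new
```

Only one four-byte block crosses the network in this example instead of the whole record, which is the bandwidth saving the case study attributes to replicating modifications rather than making continuous full backups.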
Post-implementation, Accessor Capital Management maintained 99.99% system accessibility. During testing, failover was seamless and unnoticeable to end users. Accessor can boast less than fifty minutes of downtime a year. System users (both internal and external) continue working without any awareness that failovers occur.
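The relationship between an availability percentage and annual downtime is simple arithmetic, sketched below. The only assumption is a 365-day year; the function names are invented for this example. The calculation shows that 99.99% availability allows about 52.6 minutes of downtime per year, and that fifty minutes of downtime corresponds to roughly 99.9905% availability, so the two figures reported for Accessor agree once rounded to two decimal places.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def annual_downtime_minutes(availability):
    """Minutes of downtime per year allowed by an availability fraction."""
    return (1.0 - availability) * MINUTES_PER_YEAR


def availability_from_downtime(minutes):
    """Availability fraction implied by a number of downtime minutes per year."""
    return 1.0 - minutes / MINUTES_PER_YEAR


print(f"{annual_downtime_minutes(0.9999):.2f} minutes per year at 99.99%")
print(f"{availability_from_downtime(50):.4%} availability at 50 minutes per year")
```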
An Atlanta Electrical Company
In the following case study, a large number of customers are left in the dark due to a lack of information system continuity planning and development. Information Technology can be expensive when implemented incorrectly or in a hasty manner, but when done correctly can cost pennies per customer for a large return on investment.
An electric company supplying reliable energy to more than 170,000 consumers in five north metro Atlanta counties completed a successful implementation of cost-saving initiatives that increased the company’s productivity and delivered better value for its customers. After completing a business impact assessment, the company determined that it needed an upgraded wide area network (WAN) agreement and a new IS disaster recovery policy.
The company resolved its WAN congestion and customer service bottleneck by bringing in two separate WAN connections, with room for one more. “We’re about ready to add a third T1,” said the Director, referring to the company’s growing WAN infrastructure needs.
For increased data integrity and safety, critical IT equipment, including the company’s most important servers, is mirrored at a hot site, where every resource found at headquarters is replicated in a distant off-site location. In the event of a total disaster, a seamless transition to the disaster site can be made for all operations and employees.
Each office is connected via new high-speed modems, which bond lines from separate ISPs connected by a fiber loop. This loop is a continuous failover solution that works across all attached locations. In the event of a failure at one office location, another office can be used as a point of data entry. A total failover solution is maintained when all offices are interconnected with the disaster hot site, where every bit of operational data is mirrored.
“The flexibility of the product and the fact that we’re not locked into one ISP and do not require ISP participation in the setup, makes it an easy implementation,” said the Director.
“It gives you super protection if one ISP goes down it automatically fails over to the second ISP. It is also a lot less expensive than getting a router – which can accommodate only two lines – to hold all the ARP tables needed for load balancing between ISPs,” he concluded. “We’re not going down anymore and we didn’t have to purchase that expensive router.” The router and implementation programming costs were estimated at $75,000.
Information systems continuity planning is an incredibly important function of every business, and it requires every employee, every process, and every department to work together. From the business impact assessment to the final implementation of an IS continuity program, a finely tuned set of procedures and policies needs to be put in place. Although difficult to quantify, the resulting return on investment involves not only additional revenue but also increased quality and value, happier customers, and a safer environment across the entire organization. The business continuity program should include participation from all levels of an organization, including its board of directors, senior management, business and technology managers, and staff. While the focus of this paper has been on the IT and IS aspects of continuity in the event of an incident, it’s easy to forget that this is not a technology-only problem. Without business owners’ involvement in the IS continuity program, the effectiveness of plans is weakened, and the recovery time during an event can be greatly extended or recovery halted altogether.
Järveläinen, Jonna. (2013). IT incidents and business impacts: Validating a framework for continuity management in information systems. International Journal of Information Management, 33(3), 583.
Herbane, Brahim, Elliott, Dominic, & Swartz, Ethné M. (2004). Business Continuity Management: Time for a Strategic Role? Long Range Planning, 37(5), 386.
Hines, A., & NetLibrary, Inc. (2002). Planning for Survivable Networks: Ensuring Business Continuity.