Task 1A
Introduction
This report explores the Happy Cruise Line case and evaluates the advantages and disadvantages of a distributed database system for the cruise company. After examining a range of data distribution methods, the report proposes a combined solution, presented as a data distribution scheme.
Definition of Distributed Database:
A distributed database is a single logical database that is spread across computers in various locations connected over a computer network. [1]
A database is a collection of data of various types that is organised and stored on a computer for quick searching and retrieval. [2 S.K Rahimi]
Classification of Distributed Database:
There are primarily two types of distributed database, classified by the database management system (DBMS) software that the nodes run:
1. Homogeneous and
2. Heterogeneous
Homogeneous: every node runs identical software under the same DBMS, for example Oracle DBMS on all the nodes (figure 1).
Heterogeneous: the nodes run two or more different kinds of DBMS, for example MySQL on some nodes and Oracle on others (figure 2).
Figure 1 Homogeneous Distributed Database
[https://www.google.co.uk/search?q=homogeneous+distributed+database+system&biw=1366&bih=638&source=lnms&tbm=isch&sa=X&sqi=2&ved=0ahUKEwioqryFn8bQAhVpBsAKHWcRDtQQ_AUIBigB#imgrc=Nv4kvfxrL23ARM%3A]
Figure 2 Heterogeneous Distributed Database [https://www.google.co.uk/search?q=heterogeneous+distributed+database+system&biw=1366&bih=589&source=lnms&tbm=isch&sa=X&ved=0ahUKEwjE-I-WoMbQAhWDIcAKHaP_DNQQ_AUIBigB#imgrc=6LT7RcjllwNNFM%3A]
Methods of Data Distribution:
Distributed data relies on a set of methods for dividing the database into logical sections. These sections, or logical elements, can be placed at different locations for data storage. [4 R Elmasri]
Three principal approaches to data distribution are discussed below:
A. Data Replication
Replication is one of the most widely studied phenomena in distributed environments. It is an approach in which multiple copies of data are stored at various sites (Bernstein 1987). It has attracted wide attention because it offers increased availability, increased performance, and enhanced reliability.
Because the data is stored at more than one site, if a data system fails for any reason, a site can keep operating by using the replicated data, which increases availability and fault tolerance. Data replication uses both synchronous and asynchronous methods to propagate data to all the nodes; of the two, asynchronous replication is the one most commonly used in industry [J.A. Hoffer]. The three key approaches to data replication are:
i. Snapshot replication
This technique generates a copy of the data at scheduled intervals, for example hourly, daily, or weekly. As a result, the copy may be out of date at times. Each site needs to maintain a complete copy of the whole database.
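The behaviour described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Site` class, the `take_snapshot` function, and the sample voyage record are hypothetical names introduced here, and a real DBMS would schedule the refresh itself.

```python
import copy


class Site:
    """A hypothetical site that holds a complete copy of the database."""

    def __init__(self, name):
        self.name = name
        self.data = {}


def take_snapshot(source, targets):
    """Overwrite every target with a full copy of the source database."""
    for target in targets:
        target.data = copy.deepcopy(source.data)


primary = Site("primary")
replica = Site("replica")

primary.data["voyage_1"] = {"ship": "A", "days": 7}
take_snapshot(primary, [replica])          # scheduled refresh, e.g. nightly

primary.data["voyage_1"]["days"] = 10      # update made after the snapshot
stale = replica.data["voyage_1"]["days"]   # replica has not seen the change

take_snapshot(primary, [replica])          # next scheduled refresh
fresh = replica.data["voyage_1"]["days"]   # replica is up to date again
```

Between the two refreshes the replica still holds the old value, which is the out-of-date window mentioned above.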
ii. Near-Real-Time replication
Whenever an update is made to the local database, the active remote databases are updated at nearly the same time. If communication is broken for any reason, the required update is placed in a queue and executed when the link recovers. [3 J.A. Hoffer]
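A minimal sketch of this push-with-queue behaviour follows. The class names `RemoteReplica` and `LocalDatabase`, the `link_up` flag, and the sample passenger keys are assumptions made for illustration; they do not come from any particular DBMS.

```python
from collections import deque


class RemoteReplica:
    """A hypothetical remote copy that is reachable only while link_up is True."""

    def __init__(self):
        self.data = {}
        self.link_up = True

    def apply(self, key, value):
        if not self.link_up:
            raise ConnectionError("link down")
        self.data[key] = value


class LocalDatabase:
    """Pushes each local update to the replicas and queues it on link failure."""

    def __init__(self, replicas):
        self.data = {}
        # One queue of pending updates per remote replica.
        self.links = [(replica, deque()) for replica in replicas]

    def update(self, key, value):
        self.data[key] = value
        for replica, queue in self.links:
            queue.append((key, value))
            self._flush(replica, queue)

    def _flush(self, replica, queue):
        # Send queued updates in order; stop at the first failure.
        while queue:
            key, value = queue[0]
            try:
                replica.apply(key, value)
            except ConnectionError:
                return
            queue.popleft()

    def retry(self):
        """Called when a link is believed to have recovered."""
        for replica, queue in self.links:
            self._flush(replica, queue)


remote = RemoteReplica()
local = LocalDatabase([remote])

local.update("passenger_1", "Sebastian")   # link is up, pushed immediately
remote.link_up = False
local.update("passenger_2", "Symon")       # link is down, update is queued
queued = len(local.links[0][1])

remote.link_up = True
local.retry()                              # queued update is delivered
```

Keeping the queue in arrival order means the remote copy receives updates in the same sequence in which they were made locally.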
iii. Pull Replication
The methods explored above are all push strategies; however, there is a pull strategy as well. Here the update is initiated by the target node rather than the source node. The local node decides when to request updates and when to process the update queue. The benefit is that pull synchronisation is less disruptive and takes place only when it is necessary. [3 J.A. Hoffer]
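The difference from the push strategies can be illustrated with a short sketch in which the target decides when to synchronise. The names `SourceNode`, `TargetNode`, `changes_since`, and `pull` are hypothetical, and the update log is simplified to an in-memory list.

```python
class SourceNode:
    """A hypothetical source that records updates but never pushes them."""

    def __init__(self):
        self.log = []  # ordered list of (key, value) updates

    def write(self, key, value):
        self.log.append((key, value))

    def changes_since(self, position):
        return self.log[position:]


class TargetNode:
    """A hypothetical target that requests updates when it chooses to."""

    def __init__(self, source):
        self.source = source
        self.data = {}
        self.position = 0  # how much of the source log has been applied

    def pull(self):
        changes = self.source.changes_since(self.position)
        for key, value in changes:
            self.data[key] = value
        self.position += len(changes)
        return len(changes)


source = SourceNode()
target = TargetNode(source)

source.write("visit_1", "New York")
source.write("visit_2", "Miami")
before = dict(target.data)   # nothing has arrived, because nothing is pushed

applied = target.pull()      # target asks for everything it has missed
second = target.pull()       # no new changes, so nothing is transferred
```

Because the target tracks its own position in the log, a pull when nothing has changed transfers no data.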
B. Data Fragmentation
Data fragmentation occurs when a collection of data in memory is broken up into many pieces that are no longer close together. [wiki]
In a distributed DBMS, a global relation can be split into multiple non-overlapping relations, called fragments, which are allocated to various nodes. This procedure is known as data fragmentation. [7 C. Ray]
Two types of data fragmentation are described here:
1) Horizontal Partitioning
Some of the rows of a table or relation are placed at one site, and other rows are placed at a different site [3]. For instance, consider the table below:
Table 1 Customer Table
Branch ID Customer Name Branch
1 Sebastian London
1 Symon London
2 Nicoleta Birmingham
3 Sandra Manchester
3 Adil Manchester
2 Sunny Birmingham
In Table 1, the data can be fragmented by the Branch attribute, so that the DBMS stores each branch's rows separately for faster and more secure access [Figure 3].
Original Customer table:
Branch ID Customer Name Branch
1 Sebastian London
1 Symon London
2 Nicoleta Birmingham
3 Sandra Manchester
3 Adil Manchester
2 Sunny Birmingham
Fragment for Branch ID 3 (Manchester):
Branch ID Customer Name Branch
3 Sandra Manchester
3 Adil Manchester
Fragment for Branch ID 1 (London):
Branch ID Customer Name Branch
1 Sebastian London
1 Symon London
Figure 3: Horizontal Partitioning
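The partitioning of Table 1 can also be expressed as a short Python sketch. The rows are those of Table 1; the function name `horizontal_partition` and the dictionary keys are assumptions chosen for illustration.

```python
from collections import defaultdict

# Rows of Table 1 (Customer Table).
customers = [
    {"branch_id": 1, "customer_name": "Sebastian", "branch": "London"},
    {"branch_id": 1, "customer_name": "Symon", "branch": "London"},
    {"branch_id": 2, "customer_name": "Nicoleta", "branch": "Birmingham"},
    {"branch_id": 3, "customer_name": "Sandra", "branch": "Manchester"},
    {"branch_id": 3, "customer_name": "Adil", "branch": "Manchester"},
    {"branch_id": 2, "customer_name": "Sunny", "branch": "Birmingham"},
]


def horizontal_partition(rows, attribute):
    """Group whole rows into fragments by the value of one attribute."""
    fragments = defaultdict(list)
    for row in rows:
        fragments[row[attribute]].append(row)
    return dict(fragments)


fragments = horizontal_partition(customers, "branch")

# The fragments do not overlap, so their union restores the original table.
reconstructed = [row for fragment in fragments.values() for row in fragment]
```

Each fragment keeps every column of the original table; only the set of rows differs from node to node.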
2) Vertical Partitioning
Vertical partitioning splits a table into several tables that each contain fewer columns. The selected columns are divided according to appropriate attributes, and the remaining columns are stored in a relation at another node or site. For both horizontal and vertical partitioning, the data is kept close to the node that accesses it most frequently. [3]
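A small sketch may help to show how the columns are divided and later recombined. It assumes a hypothetical `customer_id` key column, which is not part of Table 1, because each vertical fragment must repeat a key so that the original rows can be rebuilt. The function names are also illustrative.

```python
# Sample rows; customer_id is an assumed key that Table 1 does not contain.
customers = [
    {"customer_id": 101, "customer_name": "Sebastian", "branch": "London"},
    {"customer_id": 102, "customer_name": "Nicoleta", "branch": "Birmingham"},
]


def vertical_partition(rows, key, columns):
    """Keep the key plus the chosen columns from every row."""
    return [{key: row[key], **{c: row[c] for c in columns}} for row in rows]


def rejoin(left, right, key):
    """Rebuild full rows by matching the two fragments on the key."""
    right_by_key = {row[key]: row for row in right}
    return [{**row, **right_by_key[row[key]]} for row in left]


names = vertical_partition(customers, "customer_id", ["customer_name"])
branches = vertical_partition(customers, "customer_id", ["branch"])

rebuilt = rejoin(names, branches, "customer_id")
```

Repeating the key in both fragments costs a little storage, but without it the fragments could not be joined back together.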
C. Combination of Methods
Depending on business needs, the technologies described above can be combined in various ways. Depending on the business needs and the type of data, the process can be implemented either synchronously or asynchronously. [3] A comparison of the various distributed database strategies follows:
Table 2: Assessment of Distributed Database Design Strategies [3]
Solution for given scenario:
From a study of the Happy Cruise Line scenario, the following entity relationship diagram (ERD) can be derived:
Figure 4: ERD for the Happy Cruise Line
After careful examination of the entities, a combination of horizontal partitioning and asynchronous pull replication is recommended. The scenario makes clear that the Visit and Passenger data are used primarily by the New York and Los Angeles offices. These tables can therefore be partitioned horizontally at each of those locations, rather than being replicated at every branch or held centrally. When necessary, the Miami and Houston branches can access this data from either the New York or the Los Angeles branch over the network through pull replication.
The Port table has different levels of access: requests depend on which port data is being searched for. For example, the New York branch mostly searches records for Atlantic Ocean ports. Therefore, horizontal partitioning of the Port table on the country attribute is likely to be a good fit.
The Voyage table is searched from all the branches concurrently; hence asynchronous snapshot replication of this table to all nodes is recommended. This is illustrated in Figure 5 below:
Task 1B
Data Distribution Scheme:
Based on the solution described above, the following diagram has been created for Happy Cruise Line:
The scheme is explained as follows:
New York Node: this holds horizontal partitions of both the Passenger and Visit tables. It also contains port data for the Atlantic Ocean. Other branches can access this data through asynchronous snapshot replication.
Los Angeles Node: this also holds horizontal partitions of both the Passenger and Visit tables. Only port data for the Pacific Ocean is available in this node.
Miami Node: this holds port data for the Caribbean Sea and for the Atlantic Ocean as well.
Houston Node: this holds only Caribbean Sea port data.
Voyage, Ships and Cruise will be replicated to all the nodes asynchronously at intervals.
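The scheme above can be summarised as a small lookup structure. The following Python sketch is an illustration only: the table and node names follow the scheme, while the constant names and the `nodes_holding` function are assumptions introduced here.

```python
# Port data held at each node, by region, as listed in the scheme.
PORT_REGIONS = {
    "New York": {"Atlantic Ocean"},
    "Los Angeles": {"Pacific Ocean"},
    "Miami": {"Caribbean Sea", "Atlantic Ocean"},
    "Houston": {"Caribbean Sea"},
}

# Tables that are horizontally partitioned across specific nodes.
PARTITIONED_TABLES = {
    "Passenger": {"New York", "Los Angeles"},
    "Visit": {"New York", "Los Angeles"},
}

# Tables that are replicated to every node.
REPLICATED_TABLES = {"Voyage", "Ships", "Cruise"}


def nodes_holding(table, region=None):
    """Return the nodes that hold a table, optionally for one port region."""
    if table in REPLICATED_TABLES:
        return sorted(PORT_REGIONS)
    if table == "Port":
        return sorted(
            node
            for node, regions in PORT_REGIONS.items()
            if region is None or region in regions
        )
    return sorted(PARTITIONED_TABLES.get(table, set()))


caribbean_nodes = nodes_holding("Port", "Caribbean Sea")
voyage_nodes = nodes_holding("Voyage")
passenger_nodes = nodes_holding("Passenger")
```

A lookup of this kind shows at a glance which node a query should be routed to, for example that Caribbean Sea port data is available at Houston and Miami.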
Conclusion:
This report has described distributed data and the methods used to build a distributed database system. It has also produced a brief data distribution scheme for the given scenario, along with a comprehensive diagram.