Performance with a Distributed Database to Happy Cruise Line

Task 1A


Introduction

This report examines the case of Happy Cruise Line and evaluates the advantages and disadvantages of a distributed database system for the cruise company. After reviewing a range of data distribution methods, the report proposes a combined solution, presented as a data distribution scheme.

Definition of Distributed Database:

A distributed database is a single logical database that is spread across computers in various locations, which are connected by a computer network. [1]

A database is a collection of data of various types that is organised and stored on a computer for quick search and retrieval. [2, S. K. Rahimi]

Classification of Distributed Database:

Distributed database management systems (DBMS) are primarily of two types:

1. Homogeneous

2. Heterogeneous

Homogeneous: every node runs identical software under the same DBMS, for example Oracle DBMS on all nodes (Figure 1).

Heterogeneous: the nodes run two or more different kinds of DBMS, for example MySQL on some nodes and Oracle on others (Figure 2).

Figure 1: Homogeneous Distributed Database [https://www.google.co.uk/search?q=homogeneous+distributed+database+system&biw=1366&bih=638&source=lnms&tbm=isch&sa=X&sqi=2&ved=0ahUKEwioqryFn8bQAhVpBsAKHWcRDtQQ_AUIBigB#imgrc=Nv4kvfxrL23ARM%3A]

Figure 2: Heterogeneous Distributed Database [https://www.google.co.uk/search?q=heterogeneous+distributed+database+system&biw=1366&bih=589&source=lnms&tbm=isch&sa=X&ved=0ahUKEwjE-I-WoMbQAhWDIcAKHaP_DNQQ_AUIBigB#imgrc=6LT7RcjllwNNFM%3A]
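The distinction between the two classes can be sketched in a few lines of Python. This is only an illustration: the node names and the `classify_cluster` helper are hypothetical, and a real system would compare far more than the product name.

```python
def classify_cluster(node_dbms):
    """Return 'homogeneous' if every node runs the same DBMS, else 'heterogeneous'."""
    if not node_dbms:
        raise ValueError("at least one node is required")
    # Compare product names case-insensitively.
    products = {name.strip().lower() for name in node_dbms.values()}
    return "homogeneous" if len(products) == 1 else "heterogeneous"


# Hypothetical node-to-DBMS assignments, for illustration only.
all_oracle = {"node_a": "Oracle", "node_b": "Oracle", "node_c": "Oracle"}
mixed = {"node_a": "Oracle", "node_b": "MySQL"}

print(classify_cluster(all_oracle))
print(classify_cluster(mixed))
```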

Methods of Data Distribution:

Data distribution is a set of methods for dividing a database into logical sections. These sections, or logical units, can be placed at different locations for storage. [4 R Elmasri]

The three principal approaches to data distribution are discussed below:

A. Data Replication

Replication is one of the most widely studied topics in distributed environments. It is an approach in which multiple copies of the data are stored at various sites (Bernstein 1987). It has attracted wide attention because it offers increased availability, increased performance, and enhanced reliability.

Because the data is stored at more than one site, a site can keep operating on the replicated data if another site fails for any reason, which increases availability and fault tolerance. Data replication uses both synchronous and asynchronous methods to propagate data to all nodes; of the two, asynchronous replication is the more commonly used standard in industry [J.A. Hoffer]. The three key approaches to data replication are:

i. Snapshot replication

This technique generates a copy of the data at scheduled intervals, for example hourly, daily, or weekly. As a result, the copy may be out of date at times. Each site needs to maintain a complete copy of the whole database.
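A minimal sketch of snapshot replication, assuming the database can be modelled as an in-memory dictionary. The `SnapshotReplica` class and the voyage entries are hypothetical and only illustrate why a replica can be stale between refreshes.

```python
import copy


class SnapshotReplica:
    """A site that holds a full copy of the source, refreshed only at intervals."""

    def __init__(self):
        self.data = {}
        self.refreshed_at = None

    def refresh(self, source, now):
        # Copy the whole database; nothing is sent between refreshes.
        self.data = copy.deepcopy(source)
        self.refreshed_at = now


# Hypothetical source data, for illustration only.
source = {"voyage_1": "Miami to Nassau"}
replica = SnapshotReplica()
replica.refresh(source, now=0)

source["voyage_2"] = "New York to Bermuda"  # change made after the snapshot
stale = "voyage_2" not in replica.data      # replica lags until the next refresh

replica.refresh(source, now=3600)           # next scheduled refresh (hourly here)
print(stale, sorted(replica.data))
```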

ii. Near-Real-Time replication

Whenever an update is made to the local database, the remote databases are updated at nearly the same time, typically by using triggers. If communication is broken for any reason, the required update is placed in a queue and executed when the link recovers. [3 J.A. Hoffer]
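The queue-and-retry behaviour can be sketched as follows. This is a simplified model under stated assumptions: the `Source` and `Remote` classes are hypothetical, the network link is reduced to a boolean flag, and the cabin keys are made up.

```python
from collections import deque


class Remote:
    """A remote site; link_up models whether the network link is working."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.link_up = True


class Source:
    """The local site: pushes every update to each remote as soon as it is made."""

    def __init__(self, remotes):
        self.data = {}
        self.remotes = remotes
        self.queues = {remote.name: deque() for remote in remotes}

    def update(self, key, value):
        self.data[key] = value
        for remote in self.remotes:
            self.queues[remote.name].append((key, value))
            self.flush(remote)

    def flush(self, remote):
        # Deliver queued updates in order while the link is available.
        queue = self.queues[remote.name]
        while remote.link_up and queue:
            key, value = queue.popleft()
            remote.data[key] = value


miami = Remote("miami")
source = Source([miami])
source.update("cabin_12", "booked")    # delivered immediately
miami.link_up = False
source.update("cabin_14", "booked")    # link is down, so the update is queued
queued = len(source.queues["miami"])
miami.link_up = True
source.flush(miami)                    # link recovered, queue is drained
print(queued, miami.data)
```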

iii. Pull Replication

The methods described above are all push strategies; however, there is also a pull strategy. Here the update is initiated and controlled by the target node rather than the source node. The local node decides when to request updates or process the update queue. The benefit is that pull synchronisation is less disruptive and takes place only when it is needed. [3 J.A. Hoffer]
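The difference from push replication can be sketched with a target node that asks for changes when it chooses to. The `SourceNode` and `TargetNode` classes and the change-log position are hypothetical simplifications, not a description of any particular DBMS.

```python
class SourceNode:
    """Keeps an ordered log of changes but never sends them on its own."""

    def __init__(self):
        self.log = []

    def write(self, key, value):
        self.log.append((key, value))

    def changes_since(self, position):
        return self.log[position:]


class TargetNode:
    """Decides for itself when to fetch and apply outstanding changes."""

    def __init__(self, source):
        self.source = source
        self.data = {}
        self.position = 0  # how much of the source log has been applied

    def pull(self):
        changes = self.source.changes_since(self.position)
        for key, value in changes:
            self.data[key] = value
        self.position += len(changes)
        return len(changes)


source_node = SourceNode()
target_node = TargetNode(source_node)
source_node.write("passenger_7", "checked in")
source_node.write("passenger_9", "checked in")

before_pull = dict(target_node.data)  # still empty: nothing was pushed
applied = target_node.pull()          # the target asks for updates when it needs them
applied_again = target_node.pull()    # nothing new since the last pull
print(before_pull, applied, applied_again)
```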

B. Data Fragmentation

In storage terms, data fragmentation occurs when a body of data in memory is broken up into many pieces that are no longer close together. [wiki]

In a distributed DBMS, a global relation may be divided into multiple non-overlapping relations, called fragments, which are allocated to various nodes. This procedure is known as data fragmentation. [7 C. Ray]

Two types of data fragmentation are described here:

1) Horizontal Partitioning

Some rows of a table or relation are placed in a base relation at one site, and other rows are placed in a base relation at another site [3]. For instance, consider the table below:

Table 1: Customer Table

Branch ID   Customer Name   Branch
1           Sebastian       London
1           Symon           London
2           Nicoleta        Birmingham
3           Sandra          Manchester
3           Adil            Manchester
2           Sunny           Birmingham

In Table 1, the Branch attribute can be used to fragment the data so that the DBMS stores each branch's rows separately, which allows faster and more secure access [Figure 3].

Customer table (all branches):

Branch ID   Customer Name   Branch
1           Sebastian       London
1           Symon           London
2           Nicoleta        Birmingham
3           Sandra          Manchester
3           Adil            Manchester
2           Sunny           Birmingham

Fragment for Branch = Manchester:

Branch ID   Customer Name   Branch
3           Sandra          Manchester
3           Adil            Manchester

Fragment for Branch = London:

Branch ID   Customer Name   Branch
1           Sebastian       London
1           Symon           London

Figure 3: Horizontal Partitioning
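Horizontal partitioning of the customer table can be sketched in Python as follows. The rows are those of Table 1; the `partition_by_branch` helper is a hypothetical illustration, not part of any DBMS.

```python
from collections import defaultdict

# Rows of Table 1: (branch_id, customer_name, branch)
customers = [
    (1, "Sebastian", "London"),
    (1, "Symon", "London"),
    (2, "Nicoleta", "Birmingham"),
    (3, "Sandra", "Manchester"),
    (3, "Adil", "Manchester"),
    (2, "Sunny", "Birmingham"),
]


def partition_by_branch(rows):
    """Split rows into non-overlapping fragments, one per branch."""
    fragments = defaultdict(list)
    for row in rows:
        branch = row[2]
        fragments[branch].append(row)
    return dict(fragments)


fragments = partition_by_branch(customers)

# The union of all fragments gives back the original relation.
reassembled = [row for fragment in fragments.values() for row in fragment]

for branch, rows in fragments.items():
    print(branch, rows)
```

Each row lands in exactly one fragment, so no data is duplicated and the full table can always be rebuilt from the fragments.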

2) Vertical Partitioning

Vertical partitioning splits a table into several tables, each containing fewer columns. The columns are divided according to the attributes each site needs, and each group of columns is stored as a separate relation at the appropriate node. With both horizontal and vertical partitioning, the data is kept close to the node that accesses it most frequently. [3]
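A minimal sketch of vertical partitioning follows. It assumes a `customer_id` key column, which is not in Table 1 and is added here purely for illustration; repeating the key in every fragment is what allows the original rows to be rebuilt by a join. The helper names are hypothetical.

```python
# Hypothetical rows; customer_id is an assumed key column.
customers = [
    {"customer_id": 1, "name": "Sebastian", "branch": "London"},
    {"customer_id": 2, "name": "Nicoleta", "branch": "Birmingham"},
]


def vertical_fragment(rows, key, columns):
    """Keep only the key plus the chosen columns of every row."""
    return [{key: row[key], **{col: row[col] for col in columns}} for row in rows]


def rejoin(left, right, key):
    """Rebuild full rows by joining two fragments on the shared key."""
    right_by_key = {row[key]: row for row in right}
    return [{**row, **right_by_key[row[key]]} for row in left]


names = vertical_fragment(customers, "customer_id", ["name"])
branches = vertical_fragment(customers, "customer_id", ["branch"])
rebuilt = rejoin(names, branches, "customer_id")

print(names)
print(branches)
```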

C. Combination of Approaches

Depending on business needs, the techniques described above can be combined in various ways. Depending on the business requirements and the type of data, the combined approach can be implemented either synchronously or asynchronously. [3] A comparison of the various distributed database strategies follows:

Table 2: Assessment of Distributed Database Design Strategies [3]

Solution for the given scenario:

From a study of the Happy Cruise Line scenario, the following entity relationship diagram (ERD) can be derived:

Figure 4: ERD diagram for the Happy Cruise Line

After careful examination of the entities, a combination of horizontal partitioning and asynchronous pull replication is recommended. From the scenario it is clear that the Visit and Passenger data are used primarily by the New York and Los Angeles offices. These tables can therefore be partitioned horizontally at each of those sites rather than replicated to every branch or stored centrally. When necessary, the Miami and Houston branches can access this data from either the New York or the Los Angeles branch over the network through pull replication.

The Port table has different access patterns: requests depend on which ports are being searched for. For example, the New York branch mostly searches records for Atlantic Ocean ports. Horizontal partitioning of the Port table on the country attribute is therefore likely to be a good fit.

The Voyage table is queried from all branches concurrently, so asynchronous snapshot replication of this table to all nodes is recommended. This is illustrated in Figure 5 below:

Task 1B

Data Distribution Scheme:

Based on the solution described above, the following diagram has been created for Happy Cruise Line:

The scheme is explained as follows:

• New York node: holds horizontal fragments of both the Passenger and Visit tables, together with port data for the Atlantic Ocean. Other branches can access this data through asynchronous snapshot replication.

• Los Angeles node: also holds horizontal fragments of both the Passenger and Visit tables. Only port data for the Pacific Ocean is available at this node.

• Miami node: holds port data for the Caribbean Sea and for the Atlantic Ocean.

• Houston node: holds only port data for the Caribbean Sea.

• Voyage, Ships and Cruise are replicated to all nodes periodically through asynchronous replication.
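The scheme can also be written down as a small lookup structure. The sketch below simply restates the node assignments listed above; the dictionary layout and the helper functions are hypothetical and are not a configuration format of any real DBMS.

```python
# Tables held as local horizontal fragments at each node.
HORIZONTAL_FRAGMENTS = {
    "New York": {"Passenger", "Visit"},
    "Los Angeles": {"Passenger", "Visit"},
}

# Port data held at each node, by region.
PORT_REGIONS = {
    "New York": {"Atlantic Ocean"},
    "Los Angeles": {"Pacific Ocean"},
    "Miami": {"Caribbean Sea", "Atlantic Ocean"},
    "Houston": {"Caribbean Sea"},
}

# Tables replicated asynchronously to every node.
REPLICATED_EVERYWHERE = {"Voyage", "Ships", "Cruise"}


def nodes_for_port_region(region):
    """Return the nodes that hold port data for the given region."""
    return sorted(node for node, regions in PORT_REGIONS.items() if region in regions)


def is_local(node, table):
    """True if the node can answer queries on the table without a remote request."""
    return table in REPLICATED_EVERYWHERE or table in HORIZONTAL_FRAGMENTS.get(node, set())


print(nodes_for_port_region("Caribbean Sea"))
print(is_local("Houston", "Voyage"), is_local("Houston", "Passenger"))
```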

Conclusion:

This report has described distributed data and its methods, and how they are used to build a distributed database system. It has produced a brief data distribution scheme for the given scenario, together with a comprehensive diagram.

Task 2A

Introduction: This part describes dimensional modelling, together with the star schema and the snowflake schema. It also produces a star schema for the given scenario and normalises it into a snowflake schema.

Dimensional Modelling:

Dimensional modelling is a data warehouse design method in which the database structure is optimised for online queries and data warehousing tools. It is built from “fact” and “dimension” tables. [http://data-warehouses.net/glossary/dimensionalmodel.html]

A “fact” is a numeric value that the business wishes to count or sum. A “dimension” is an entry point for accessing the facts.

Every dimensional model consists of one table with a multi-part key, known as the fact table, and a set of smaller tables known as dimension tables. Each dimension table has a single-part primary key that corresponds to one of the components of the multi-part key in the fact table. [8 R. Kimball]

Figure : Dimensional Model (Star schema)[google]

Data Warehouse and Dimensional Modelling:

A data warehouse is a relational database that is designed for query and analysis rather than for transaction processing. It usually contains historical data derived from transactional data, but it can also hold data from other sources. It separates the analysis workload from the transaction workload and makes it possible to consolidate data from several sources. [https://docs.oracle.com/cd/B10500_01/server.920/a96520/concept.htm]

In addition to a relational database, a data warehouse environment includes an extraction, transportation, transformation and loading (ETL) solution, an online analytical processing (OLAP) engine, client analysis tools, and other applications that manage the process of gathering data and delivering it to business users.

Conventional online transaction processing (OLTP) systems let users and administrators meet their day-to-day transactional and retrieval requirements for operational data. They are not capable of processing a vast number of past transactions. Multidimensional data in a DBMS cannot easily be processed by conventional SQL because of the complex queries needed to retrieve the expected data. An example is “find the total sales of each product across all areas and show the corresponding totals purchased online and in physical stores”. For complex real-world queries like this, an OLAP system is preferable [9 S.A Rahim]. A data warehouse implemented with an OLAP system, on the other hand, can process and analyse a large number of past transactions or data entries, ranging from megabytes to petabytes [10 C. Pravu].

Much data is multidimensional by nature, which is the key driver for online analytical processing (OLAP) technology, and OLAP is central to data warehousing. With a multidimensional OLAP system, data from a warehouse can be manipulated dynamically. OLAP systems and data warehouses both depend on a multidimensional model consisting of hierarchies, dimensions and measures [11 S O].

Star Schema:

It is the simplest form of dimensional model and consists of facts and dimensions. A fact is something that can be counted or measured, e.g. a login or a sale. A dimension holds reference information about the fact, such as date or product. A star schema is diagrammed by surrounding each fact with its associated dimensions. [http://searchdatamanagement.techtarget.com/definition/star-schema]

The star schema is based on the relational database model and is predominantly used in data warehousing [12 S O]. It is called a star because its diagram closely resembles a star, with points radiating from its centre. It is considered the simplest schema in data warehousing. A star schema is given below:

In this schema, the central fact table contains the quantitative measures of a single type (e.g. sales value or number of units sold, as in the figure above) and is associated with more than one dimension table. This lets the system avoid the inefficiency that is common with pairwise joins when complex queries must link multiple tables to gather all the data needed to answer them [14 S O].
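A small star schema can be sketched with Python's built-in `sqlite3` module. The table names, column names and sample rows below are hypothetical and chosen only to mirror the sales example; the point is that every dimension joins directly to the fact table, so an aggregate query needs only one join per dimension.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, product_name TEXT);
CREATE TABLE dim_date    (date_id INTEGER PRIMARY KEY, year INTEGER);

-- The fact table holds the measures plus one foreign key per dimension.
CREATE TABLE fact_sales (
    product_id  INTEGER REFERENCES dim_product(product_id),
    date_id     INTEGER REFERENCES dim_date(date_id),
    units_sold  INTEGER,
    sales_value REAL
);

INSERT INTO dim_product VALUES (1, 'Widget'), (2, 'Gadget');
INSERT INTO dim_date    VALUES (10, 2015), (11, 2016);
INSERT INTO fact_sales  VALUES (1, 10, 3, 30.0), (1, 11, 2, 20.0), (2, 11, 5, 100.0);
""")

# Total units and value per product: a single join from fact to dimension.
rows = conn.execute("""
    SELECT p.product_name, SUM(f.units_sold), SUM(f.sales_value)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_id = f.product_id
    GROUP BY p.product_name
    ORDER BY p.product_name
""").fetchall()

for row in rows:
    print(row)
```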

Snowflake Schema:

A snowflake schema is a logical arrangement of tables in a multidimensional database in which the entity relationship diagram resembles a snowflake. The schema is represented by a centralised fact table connected to multiple dimensions [wiki]. In a snowflake schema the dimension tables are normalised to avoid redundancy [13 S0]. The dimension tables of a snowflake schema hold the same information as those of a star schema, but the snowflake schema normalises them to third normal form [15 SO].
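The normalisation step can be sketched with `sqlite3`. The product and category tables here are hypothetical: a star-style dimension that repeats the category name on every row is split into a product table and a separate category table, and a join recovers the original rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Star-style dimension: the category name is repeated on every product row.
CREATE TABLE dim_product_star (
    product_id    INTEGER PRIMARY KEY,
    product_name  TEXT,
    category_name TEXT
);
INSERT INTO dim_product_star VALUES
    (1, 'Widget', 'Hardware'), (2, 'Gadget', 'Hardware'), (3, 'Manual', 'Books');

-- Snowflake-style: the category moves to its own table and is referenced by key.
CREATE TABLE dim_category (
    category_id   INTEGER PRIMARY KEY,
    category_name TEXT UNIQUE
);
CREATE TABLE dim_product (
    product_id   INTEGER PRIMARY KEY,
    product_name TEXT,
    category_id  INTEGER REFERENCES dim_category(category_id)
);

INSERT INTO dim_category (category_name)
    SELECT DISTINCT category_name FROM dim_product_star;
INSERT INTO dim_product
    SELECT s.product_id, s.product_name, c.category_id
    FROM dim_product_star AS s
    JOIN dim_category AS c ON c.category_name = s.category_name;
""")

category_count = conn.execute("SELECT COUNT(*) FROM dim_category").fetchone()[0]

# Joining the normalised tables gives back the original dimension rows.
rebuilt = conn.execute("""
    SELECT p.product_id, p.product_name, c.category_name
    FROM dim_product AS p
    JOIN dim_category AS c ON c.category_id = p.category_id
    ORDER BY p.product_id
""").fetchall()

print(category_count, rebuilt)
```

The trade-off is the usual one: each category name is stored once, at the cost of an extra join whenever the category is needed.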

Task 2B

Millennium College with the Star Schema

From the given attributes, dimensions and case study, the schema and its details are as follows:

CourseSection: CourseID, CourseName, Units, SectionNumber, RoomID, RoomCapacity.

Professor: ProfID, ProfName, Title, DepartmentID, DepartmentName.

Student: StudentID, StudentName, Major.

Period: SemesterID,

Star Schema for Millennium College:

The fact table below, with its attributes, is added to the schema to complete the dimensional model:

Name: Fact StudentRecords

Attributes:

• StudentID

• SemesterID

• CourseID

• ProfID

• Grade
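The star schema above can be sketched as table definitions using Python's `sqlite3` module. Several details are assumptions made for illustration only: the column types, the choice of `CourseID` as the key of CourseSection, the numeric `Grade`, and all sample rows. The Period dimension lists only `SemesterID` because that is the only attribute given.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables (column types are assumptions).
CREATE TABLE CourseSection (
    CourseID TEXT PRIMARY KEY, CourseName TEXT, Units INTEGER,
    SectionNumber INTEGER, RoomID TEXT, RoomCapacity INTEGER
);
CREATE TABLE Professor (
    ProfID TEXT PRIMARY KEY, ProfName TEXT, Title TEXT,
    DepartmentID TEXT, DepartmentName TEXT
);
CREATE TABLE Student (StudentID TEXT PRIMARY KEY, StudentName TEXT, Major TEXT);
CREATE TABLE Period  (SemesterID TEXT PRIMARY KEY);

-- Fact table: one foreign key per dimension plus the Grade measure.
CREATE TABLE FactStudentRecords (
    StudentID  TEXT REFERENCES Student(StudentID),
    SemesterID TEXT REFERENCES Period(SemesterID),
    CourseID   TEXT REFERENCES CourseSection(CourseID),
    ProfID     TEXT REFERENCES Professor(ProfID),
    Grade      REAL
);

-- Hypothetical sample rows.
INSERT INTO Student VALUES ('S1', 'Ana', 'Physics'), ('S2', 'Ben', 'Physics');
INSERT INTO Period VALUES ('F16');
INSERT INTO CourseSection VALUES ('C1', 'Mechanics', 4, 1, 'R10', 40);
INSERT INTO Professor VALUES ('P1', 'Dr Lee', 'Professor', 'D1', 'Physics');
INSERT INTO FactStudentRecords VALUES
    ('S1', 'F16', 'C1', 'P1', 3.0), ('S2', 'F16', 'C1', 'P1', 4.0);
""")

# Example analysis query: average grade per major.
avg_by_major = conn.execute("""
    SELECT s.Major, AVG(f.Grade)
    FROM FactStudentRecords AS f
    JOIN Student AS s ON s.StudentID = f.StudentID
    GROUP BY s.Major
""").fetchall()

print(avg_by_major)
```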

Task 2C
