Home > Sample essays > Data Warehousing Solutions for McDonnel Clothing: A Design Study

Essay: Data Warehousing Solutions for McDonnel Clothing: A Design Study

Essay details and download:

  • Subject area(s): Sample essays
  • Reading time: 9 minutes
  • Price: Free download
  • Published: 1 April 2019*
  • Last Modified: 23 July 2024
  • File format: Text
  • Words: 2,770 (approx)
  • Number of pages: 12 (approx)

Text preview of this essay:

This page of the essay has 2,770 words.



Table of Contents

1. Executive Summary 3

2. Data Warehousing Solutions 4

3. McDonnel Data Warehouse Design 5

3.1. Star Schema 5

3.2. Sales Fact Table 6

3.3. Dimensions 6

3.3.1. Store Dimension 6

3.3.2. Time 7

3.3.3. Customer 7

3.3.4. Product 8

4. Data Warehouse Adaption to Business Problems 9

4.1. Key Customers 9

4.2. Most Profitable Products 9

4.3. Most Profitable Locations 10

4.4. Most Profitable Time Periods 10

5. References 11

Appendix 1 – Data Dictionary 12

Appendix 2 – SQL Code 16

Appendix 3 – Work Breakdown 18

Executive Summary

This report documents a study into Data Warehousing and its application to providing management assistance for the Sales business process for McDonnel Clothing, a clothing retailer based in Melbourne, Australia. The report includes a discussion of the benefits of data warehousing, description of the design of the data warehouse and how it can be used to address the key business issues.

Kimball’s four step approach has been used in the design of the data warehouse and a star schema design has been proposed. Section 3 of the report contains descriptions of the fact and dimension tables used in the star schema. The data dictionary is provided in Appendix 1 in addition to how data will be sourced from existing databases. SQL statements used in the creation of the data warehouse are shown in Appendix 2.

Key Objectives:

Design a data warehouse system that is aligned with deriving insights for decision-making in the business

Outline integration of existing information systems

Describe how the data warehouse can be used to answer key business issues

The existing information systems can easily be integrated into creating the data warehouse. In addition, the following improvements to the existing database systems have been recommended:

customer database system to include a customer version number and a ‘Date of First Order’ attribute

Product order database system to include a product version number

We have found that the proposed star schema design for the data warehouse can be used to answer the key business questions for McDonnel Clothing. It will help managers understand key customers, the most profitable products, store locations and time periods which and make decision regarding the future growth of the company. We recommend the implementation of the data warehouse for McDonnel Clothing to enable future growth.

Data Warehousing Solutions

In the era of competitive business environment, large amounts of data are being generated from each branch of the business. Intellectual capital is the main source of added market value to the company, created over time. Hence, it is crucial to discovery techniques for developing and leveraging intellectual capital. Data warehousing is the best way to leverage intellectual capital in today’s era. It allows the formatting of data from different fragments of business to one unified structure which is easy to understand (Katal, 2013, August).

According to (Inmon, 1993), a data warehouse is a subject-oriented (it analyses a particular subject area), integrated (collection of data from multiple data sources), time-variant (warehouses provide access to more detailed information over a longer period (Zeng, 2003)) and non-volatile (historical data is not altered) collection of data in support of management's decision-making process.

From the above definition, we know that a data warehouse presents complex data in a uniform way; giving a single integrated view of the whole organization, allowing data to be accessed quickly. This makes it easier to be understood by managers by minimizing the number of tables resulting in simplified queries leading to improved decision-making. This is possible because the data warehousing is based on the principle of chunking (human brain has limited capacity to process information) and hierarchical structuring (Moody, 2003).

A common goal for all businesses is to make better decisions in comparison to their competitors. That is only possible if a company has a data warehouse as it provides a standardized data from different sources which reduce the risk of error in interpretation (Herzing University, n.d.), reducing time wasted in accumulating data which leads to quick and accurate decision making, giving the company a competitive edge (Minerra, 2017).

Functions that become streamlined, efficient and qualitatively improved include:

Reporting: managers may analyse data and make reports more easily

Presentation of data:  It’s easier to make dashboards for critical management data.

Analysis and Implementation: Drilling down is easier and helps to distinguish problems and areas where reforms have to be implemented

Uniformity in Data Design: In communication, it is not just the content that matters. Equally important is the form. Data Warehouse and Data Warehouse managers can create this uniformity which increases ease of understanding and comfort of familiarity.

Security: it gives an organization a secure way to provide specific data to people who have a legitimate need and to exclude others. (Serra, 2013)

The creation of a data warehouse brings about a new orientation in the organization. It creates a different way of relating to and analysing the business problems. Data Warehouse emerges as a new well-informed member of the organization which pulls all deliberations to an empirical focus. It has an impact on the whole working environment.

McDonnel Data Warehouse Design

This section details the design of Star Schema created for McDonnel Clothing

Star Schema

The following design is based on the process stated by Kimball (Kimball, 2002).

The business process identified for the design of the star schema is Sales. Thus, we have specified the Sales Fact Table at the centre of the star schema – it is clear from the description, that to expand the business, increased understanding of sales, products and customers over different time periods is required.

For this business process, the definition of grain is transactional, capturing a line item on a sales order. That is, we shall use each transaction and the individual line items with those transactions.

The Dimensions chosen are Customer, Store, Product and Time.

In understanding Sales, we have chosen Unit Sales, Revenue and Margin as the business measures. The star schema for the Sales Process is given below.

Sales Fact Table

The Sales Fact Table is defined by the following attributes: Grain, Type of Fact Table and Type of Business Measures.

The grain of the Sales Fact Table is a line item on a Sales Order. The Sales Fact Table is a transactional table, listing the individual line items from sales of products made at a date, by a customer, from a particular store. The facts in the fact table are additive business measures since they can be summed across all dimensions. The business measures are:

Unit Sales for each line item is available from the Sales Order Table in the Sales System as “Qty”.

Revenue (or Dollar Sales) is a calculated value available by multiplying the Unit Sales (as above) by the Sale Price, given in the Sales Order Table of the Sales System.

Margin is a calculated value given as Unit Sales (given above), multiplied by the difference between Sale Price (Sales Order Table in sales System) and the Unit Cost given as ‘Cost per item $’ in the Product Order Table, within McDonnel’s Inventory System. Here, the chosen ‘Cost per item $’ shall be the amount which corresponds to the most recent transaction performed before the sale.

An example row of the Sales Fact table is shown below.

StoreID

ProductID

Prod

Version

TimeID

CustomerID

Cust

Version

Unit

Sales

Revenue

Margin

1

100018

1

12 May 2013

C478Carl

1

3

148.71

203.13

Dimensions

This section discusses each of the dimensions given in the Star Schema.

Store Dimension

A Store dimension has been chosen since the business would like to understand which store locations perform better than others. This is a simple table where a storeID identifies the address and PostCode of the store. Here the storeID is a surrogate key as it has not been included in the original data – it shall allow for efficient querying.

StoreID

Address

PostCode

1

Lygon Street, Carlton

3053

2

South Melbourne

3205

3

Williamstown

3016

Time

The Time dimension has been chosen because the company needs to know which time period is the most profitable. Thus, we have specified the time into year, quarter, month, day of month, and day of the week to better understand the sales performance under different time scales. Time ID is the unique date of that day.

TimeID

Year

Quarter

Month

Day

Day_of_the_week

19 Jan 2013

2013

1

Jan

19

Sat

13 May 2013

2013

2

May

13

Mon

Customer

The Customer dimension has been chosen to record the details of our customers. We decided to have CustomerID, version, names, date of birth, age group and postcode as the basic information. Customer is a Slowly Changing Dimension (SCD) and for this reason version has been included in the Customer Table.

The version points to the different suburb for a particular customer, because each customer may change their residential address over time. As such this is classified as a Type 2 SCD. In this fashion, the PostCode shall be updated at the time of the first order for the new version of the customer.  For this design of customer dimension, we would recommend that McDonell update their customer database system to include a customer version number and a date_of_first_order. To start with, all existing customers shall be assigned version 1 and any subsequent change of address would be updated as a new version, together with a corresponding date_of_first_order.

In order to understand how the age of customers have an effect on the business, we have segregated them into various age groups. Age group is a calculated attribute in the Customer dimension, based on the date of birth attribute.

CustomerID

Cust

Version

Name

DOB

Age_Group

PostCode

Date_of_First_Order

C478Carl

1

Brian Jones

10 Jan 1980

31-50

3053

12/5/13

C478Carl

2

Brian Jones

10 Jan 1980

31-50

3000

12/5/14

C478SthMelb

1

Nancy Smith

12 Dec 1959

Over 50

3205

15/5/14

Product

The Product dimension has been chosen to record the details of the products the company has sold.  As for the Customer dimension, the Product dimension has been identified as a Type 2 SCD.  As the Unit Cost of the product changes over time, the Unit Costs at various time periods can be tracked. This table contains productID, ProdVersion, product description, product group, order date and unit cost. The version shall be revised as the unit cost of a particular product is reordered at a new unit cost.

ProductID

ProdVersion

Description

Group

OrderDate

UnitCost

100011

1

6 Panel Sandwich Peak Cap (HW004) (Navy/Stone)

Headware

19 Jan 2013

6.5

101659

1

Ladies Maitre D Blouse (LL-MA) (White)

Lounge

7 Dec 2013

78

Whilst the value of Margin is calculated using the Unit Cost taken directly from the company’s existing databases, it is believed that tracking the Product Unit Cost over time will allow the company to understand why a product has become less profitable. That is, once the most profitable products have been identified, the manager may use the the Product Version in the Product dimension to further analyse the situation and perhaps change suppliers if suitable.

Data Warehouse Adaption to Business Problems

Based on the proposed data warehouse design, the following business questions can be answered by the managers of McDonnel Clothing.

Key Customers

Combing the customer dimension table and the sales fact table by Customer ID and joining the time dimension table with Time ID and product dimension table with Product ID from our new data warehouse, we can get all the orders and sales for each year, month and quarter for each customer. By summing up the Unit Sales grouped by Customer ID, we can calculate the total unit sales for each customer. And summing up the Revenue grouped by Customer ID, we can get the total dollar sales for each customer. Retrieving the margin from the sales fact table grouped by Customer ID, we can obtain the total margin for each customer. Combined with the time table with matching Time ID, we can view the unit sales, dollar sales and margin for each customer in each year, month and quarter. By sorting, we can show the managers the performance of sales for each customer in different time period and find the most profitable customers.

Combing the postcodes, which are retrieved from customer table, and sorting the margin in the order from largest to smallest, we can get the suburbs in which most profitable customers live. When a customer move his/her address, the customer table will update a new version of that customer without changing Customer ID and with the first order date of the new address. So, every sale will be associated with a particular postcode depending on when the order is being purchased. By comparing the Order Date from sales table with the First Order Date in customer table, we can know that the order made under which version of address.

By subtracting the year of today and year of birth, we can compute the age of each customer. By selecting and grouping the customers with ages less than 30, from 30 to 50 or over 50, we can summarize the unit sales, dollar sales and margins for each age group.

Most Profitable Products

For analysis of products with respect to various time periods, we need to join the sales fact table with the product table; we can do that by using the composite key which includes ProductID and ProdVersion. Similarly, we can join the sales fact table with the time dimension using TimeID. This will provide us with sales data across all products and time which can be used for analysis purpose. Next, we can group the transactions by ProductID and the chosen time interval. McDonnel can choose to group by individual months or years by specifying the month or year in a query.  For seasonal analysis, McDonnel can query the data by specifying the months corresponding to the chosen season.  For example, if McDonnel wishes to know for the Unit Sales, Revenue, and Margin for the summer season, a query can be run including the months December, January, and February.  In the same fashion, September, October, and November can be used to determine the business metrics for Spring.  Finally, sorting the sum of margin for each product will help McDonnel Clothes determine the most profitable products in different time periods.

In addition, we have included the unit cost of each product in the product dimension. This will allow the manager to analyse how changes in unit cost of products affect the margin over time. For example, if the margin of a particular product was decreasing over time, it can be compared to the change in unit cost to find out if that was the cause for the decrease in margin.

Most Profitable Locations

By combining the sales fact table and store dimension table using Store ID and joining the time dimension table using Time ID, the sales data for all stores across all time periods can be obtained. The sales data can be grouped by store ID and sorting by attributes such as unit sales, dollar sales and margin for each month can be analysed to determine which store location was the most profitable. Similarly, this analysis can be conducted over different time periods.

Most Profitable Time Periods

By combining the sales fact table and time dimension table using Time ID, the sales data across all time periods can be obtained. The joined table will contain data about the unit sales, dollar sales, and margins for various time periods such as daily, monthly, quarterly and yearly. This will allow the manager to analyse sales data across various time periods. For example, sum of dollar sales grouped by year will show how the total revenue has changed over the years. Also, sales data can be grouped for each day of the week (e.g. Sunday, Monday) to see how the performance varies over different days. This may help McDonell to identify if certain days of the week (e.g. Sundays) are more profitable and extend opening hours of the other stores.

References

1 Key Data. (n.d.). Data Warehouse Definition. Retrieved from https://www.1keydata.com/datawarehousing/data-warehouse-definition.html

Herzing University. (n.d.). What is Data Warehousing and Why Is It Important. Retrieved from Herzing University: https://www.herzing.edu/blog/what-data-warehousing-and-why-it-important

Inmon, W. (1993). Building the Data Warehouse,. New York: A Wiley QED publication.

Katal, A. W. (2013, August). Big data: issues, challenges, tools and good practices. In Contemporary Computing (IC3), Sixth International Conference on (pp. pp. 404-409). IEEE.

Kimball, R. R. (2002). The Data Warehouse Toolkit: The Complete Guide to Dimensional Modeling, 2nd Edition. New York: John Wiley and Sons, Inc.

Minerra. (2017, August 15). What Is A Data Warehouse And How Does It Deliver Value? Retrieved from https://www.minerra.net/business-analytics/what-data-warehouse-how-deliver-value/

Moody, D. a. (2003). From ER models to dimensional models: bridging the gap between OLTP and OLAP design, Part I. Business Intelligence Journal, 7-24.

Serra, J. (2013, July 18). Why You Need a Data Warehouse. Retrieved from http://www.jamesserra.com/archive/2013/07/why-you-need-a-data-warehouse/

TechAdvisory.org. (2015, March 31). Retrieved from http://www.techadvisory.org/2015/03/benefits-of-data-warehouses-for-business/

Zeng, Y. C. (2003). Enterprise integration with advanced information technologies: ERP and data warehousing. Information Management & Computer Security.

About this essay:

If you use part of this page in your own work, you need to provide a citation, as follows:

Essay Sauce, Data Warehousing Solutions for McDonnel Clothing: A Design Study. Available from:<https://www.essaysauce.com/sample-essays/2018-4-13-1523598101/> [Accessed 15-04-26].

These Sample essays have been submitted to us by students in order to help you with your studies.

* This essay may have been previously published on EssaySauce.com and/or Essay.uk.com at an earlier date than indicated.