Essay: Data analysis

Essay details:

  • Subject area(s): Engineering essays
  • Published on: September 4, 2019



3.1 Introduction

This chapter discusses the methodology of the project. The process that leads from data processing to the final outcome must be clearly understood, so that the most effective approach can be chosen to give the best performance and results.

This study involves four phases. The first phase is the identification and selection of the study area; Klang, Selangor was chosen because of the rapid industrial development in the city. The second phase is data pre-processing and processing, the third phase is data analysis and the last phase is map production.

3.2 Methodology


Figure 3.1 Detailed methodology (flowchart of Phase 1, Phase 2, Phase 3 and Phase 4)

3.3 Selection of the Study Area

The study area is Klang, Selangor, a royal city. Klang is located on the west coast of Peninsular Malaysia, with the Straits of Malacca to the west and the Titiwangsa Mountains to the north and east. It is about 32 km west of Kuala Lumpur and 6 km east of Port Klang. Port Klang, located in Klang District, is the 13th busiest transshipment port and the 16th busiest container port in the world.

3.4 Planning and Data Collection

Once the aim and objectives of the study have been established, the types of data needed to solve the problem and achieve the objectives must be identified. The appropriate data for this study must then be acquired, as listed below.

3.4.1 Meteorology data

a. Wind

b. Rainfall

c. Temperature

3.4.2 Satellite Imagery

a. Landsat images for five periods (different years)

3.4.3 Ancillary data

a. Population

b. Topography map

3.5 Hardware and Software

Hardware is the physical equipment used to perform tasks. One criterion for the hardware is the ability to run multiple tasks at the same time without crashing. This study uses ERDAS Imagine 2013, ENVI and ArcGIS, which need at least 512 MB of random access memory (RAM) to run. Hardware and software depend on each other to complete a task: more powerful hardware enables the software to process and transfer data faster than usual.

3.5.1 ERDAS Imagine 2013

ERDAS IMAGINE is promoted as the world's leading geospatial data authoring system. It combines geospatial image processing and analysis, remote sensing and GIS capabilities in a single package. ERDAS IMAGINE enables users to create value-added products such as 2D and 3D images, 3D fly-through movies, and cartographic-quality map compositions from geospatial data.

IMAGINE is easy to learn and use for consumers and professionals alike. The 2013 version of ERDAS IMAGINE adds sophisticated tools for expert users and prepares the way for a less complex version in future releases. ERDAS IMAGINE also provides advanced tools for parallel batch processing, spatial modeling, map production, mosaicking and change detection.

ERDAS IMAGINE can provide strategic value to a variety of industries, including data providers, agriculture, forestry, natural resource management, telecommunications, environmental engineering and extractive industries.

3.5.2 ENVI

Easily process large data sets. The data collected by today's sensors contain more information than ever before. To read and extract information from these large data sets effectively, a software solution without file size limitations is needed. ENVI works with data sets of any size and has automated tools to quickly prepare large and small imagery for viewing or further analysis.

Read and analyze different data formats. ENVI supports over 70 data formats, including scientific formats such as HDF and CDF and image types such as GeoTIFF, and it also provides JITC-compliant NITF support. ENVI delivers enterprise capabilities that provide quick access to imagery from OGC- and JPIP-compliant servers within an organization or over the internet.

Exploit information from different sensor types. ENVI supports imagery gathered from today’s popular satellite and airborne sensors, including panchromatic, multispectral, hyperspectral, radar, thermal and LiDAR. These sensors include ASTER, AVIRIS, AVHRR, Landsat, Quickbird, RadarSat, SPOT, TMS, DTED, WorldView and more.

3.5.3 ArcGIS 10.2

At version 10.2, the technology behind ArcGIS Online becomes a full part of the on-premises infrastructure of ArcGIS. Over the preceding four years, Esri had been developing ArcGIS Online as a service deployed in the public cloud. At 10.2, this technology has been engineered into a fully supported product that can be deployed on premises and integrated with users' existing technology.

This technology supports many capabilities:

1. Enterprise geospatial content management

2. Simple mapping

3. Esri Maps for Office

4. Integration with enterprise security

5. Sharing of data, maps, and apps

6. Groups

7. Dozens of apps

8. Open enterprise integration with Office, SharePoint, SAP, and others

3.6 Procedure

The procedure in this project consists of layer stacking, reprojection, image subsetting, radiometric correction, supervised classification and application of the algorithm, running from pre-processing through to data processing. Pre-processing is a preparation phase that improves image quality as a basis for further analysis. Before data analysis, initial processing must be carried out to correct any distortion caused by the characteristics of the imaging system and the imaging conditions.

3.6.1 Layer Stacking

After the images are downloaded from the USGS, they are saved in TIFF format. Layer stacking combines the individual band layers into a single output file. In ERDAS IMAGINE, go to Image Interpreter, choose Utilities, then select Layer Stack.

Figure 3.2 Image in band 1 before layer stacking (left) and image in all bands after layer stacking (right)
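The idea behind layer stacking can be sketched outside ERDAS IMAGINE as well. The following is a minimal Python illustration, not the ERDAS implementation: each band is assumed to be a small grid of digital numbers held as a list of rows, and the function name layer_stack and the sample values are hypothetical.

```python
def layer_stack(bands):
    """Combine single-band rasters (lists of rows) into one multi-band raster.

    Returns a rows x cols grid in which each cell is a tuple of band values.
    """
    if not bands:
        raise ValueError("at least one band is required")
    rows, cols = len(bands[0]), len(bands[0][0])
    for band in bands:
        # All bands must cover the same pixel grid before they can be stacked.
        if len(band) != rows or any(len(row) != cols for row in band):
            raise ValueError("all bands must have the same dimensions")
    return [
        [tuple(band[r][c] for band in bands) for c in range(cols)]
        for r in range(rows)
    ]


# Hypothetical 2 x 2 digital numbers for three bands.
band1 = [[10, 11], [12, 13]]
band2 = [[20, 21], [22, 23]]
band3 = [[30, 31], [32, 33]]
stacked = layer_stack([band1, band2, band3])
```

Each pixel of the stacked result then carries one value per band, which is the form needed by the later classification and reflectance steps.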

3.6.2 Reprojection

Reprojection transforms a georeferenced, orthorectified or geometrically calibrated image from its current projection system to a new projection system. In this project, the images are transformed from WGS84 to RSO (Kertau). Reprojection is also performed to change the satellite image from a false projection to the true projection. The datum for RSO is Kertau 1984 and the spheroid is Modified Everest. In ERDAS IMAGINE, go to Tools, then Image Command Tool.

Figure 3.3 Image in WGS 84 before reprojection (left) and image in RSO Malaysia after reprojection (right)

3.6.3 Subset Image

A subset is a smaller part cut from an image. The purpose of subsetting is to select the region of interest from the larger image, so that the subset focuses on the study area and on areas free of cloud cover. Subsetting also saves processing time, disk space and paper. The subset images are used in the next processing step. The area to be subset is checked against a topographic map or Google Earth to ensure that the correct place is selected. In ERDAS IMAGINE, go to Data Prep, then click Subset Image.

Figure 3.4 Image before subset (left) and image after subset (right)
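As an illustration of what subsetting does to the pixel grid, the sketch below cuts a rectangular window out of an image. It assumes the region of interest has already been converted to row and column indices; the function name subset and the sample scene are hypothetical, and ERDAS IMAGINE normally works from map coordinates or an AOI layer instead.

```python
def subset(image, row_start, row_stop, col_start, col_stop):
    """Return the window of rows row_start..row_stop-1 and columns col_start..col_stop-1."""
    n_rows, n_cols = len(image), len(image[0])
    if not (0 <= row_start < row_stop <= n_rows):
        raise ValueError("row window lies outside the image")
    if not (0 <= col_start < col_stop <= n_cols):
        raise ValueError("column window lies outside the image")
    return [row[col_start:col_stop] for row in image[row_start:row_stop]]


# Hypothetical 4 x 5 scene in which each value encodes its row and column.
scene = [[r * 10 + c for c in range(5)] for r in range(4)]
aoi = subset(scene, 1, 3, 2, 5)
```

Only the 2 x 3 window is carried forward, which is why subsetting reduces processing time and disk space.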

3.6.4 Supervised Classification

Supervised classification is guided by a human analyst. In a supervised classification, the spectral characteristics of several known areas of each land cover type are extracted from the image. These areas are known as training areas. Each pixel in the image is then assigned to one of the classes, depending on how close its spectral characteristics are to those of the training areas.
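The per-pixel assignment can be sketched with a minimum-distance-to-means rule. This is only one possible decision rule and is used here as a stand-in, since the chapter does not state which rule was selected in ERDAS IMAGINE. The class names, training pixels and function names are hypothetical.

```python
import math


def class_means(training):
    """Average the training pixels of each class, band by band."""
    means = {}
    for label, pixels in training.items():
        n_bands = len(pixels[0])
        means[label] = tuple(
            sum(p[b] for p in pixels) / len(pixels) for b in range(n_bands)
        )
    return means


def classify(pixel, means):
    """Assign the pixel to the class whose mean spectrum is nearest (Euclidean)."""
    return min(means, key=lambda label: math.dist(pixel, means[label]))


# Hypothetical three-band training pixels picked from known areas.
training = {
    "clear water": [(12, 9, 5), (14, 11, 7)],
    "dense forest": [(30, 45, 25), (34, 47, 27)],
    "built-up": [(90, 85, 80), (94, 89, 86)],
}
means = class_means(training)
labels = [classify(p, means) for p in [(13, 10, 6), (33, 44, 28), (88, 90, 82)]]
```

The quality of the result depends on how representative the training areas are, which is why they are drawn from locations with known land cover.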

Table 3.1 Class number and land cover type (USGS, 2014)

Class No.   Colour in Map   Land Cover Type
1           black           Clear water
2           green           Dense forest with closed canopy
3           yellow          Shrubs, less dense forest
4           orange          Grass
5           cyan            Bare soil, built-up areas
6           blue            Turbid water, bare soil, built-up areas
7           red             Bare soil, built-up areas
8           white           Bare soil, built-up areas

Figure 3.7 Image used for supervised classification (left) and definition of training data (right)

3.6.5 Accuracy Assessment

Accuracy assessment is the comparison of a classification with ground truth data to evaluate how well the classification represents the real world. Classification error occurs when a pixel (or feature) belonging to one category is assigned to another category. Accuracy assessment is performed by comparing the map created by remote sensing analysis to a reference map based on a different information source. In order to be compared, both the map to be evaluated and the reference map must be accurately registered geometrically to each other. They must also use the same classification scheme. Accuracy of image classification is most often reported as a percentage correct.
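The percentage correct can be sketched as follows. The example assumes that the reference and classified labels have already been sampled at the same locations; the labels and function names are hypothetical and do not reproduce the accuracy report of this study.

```python
from collections import Counter


def confusion_matrix(reference, classified):
    """Count (reference class, classified class) pairs."""
    if len(reference) != len(classified):
        raise ValueError("both label lists must have the same length")
    return Counter(zip(reference, classified))


def overall_accuracy(matrix):
    """Percentage of samples whose classified label matches the reference."""
    total = sum(matrix.values())
    correct = sum(n for (ref, cls), n in matrix.items() if ref == cls)
    return 100.0 * correct / total


# Hypothetical check points: reference labels versus labels read from the map.
reference = ["water", "water", "forest", "forest",
             "grass", "built-up", "built-up", "built-up"]
classified = ["water", "water", "forest", "grass",
              "grass", "built-up", "built-up", "forest"]
matrix = confusion_matrix(reference, classified)
accuracy = overall_accuracy(matrix)
```

The off-diagonal counts of the matrix show which classes are confused with each other, which the single percentage does not reveal.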

Figure Classification accuracy assessment report

Figure Overall classification accuracy

3.7 Algorithm Model

After pre-processing, the data are processed in two steps using a model built in the ERDAS IMAGINE modeler, which applies an algorithm to detect particulate matter (PM10). The results are classified as high, medium or low. The remaining pixels are classified as cloud, because the model cannot separate land from cloud.

The digital numbers (DN) of the four visible bands (Band 1, Band 2, Band 3 and Band 4) of Landsat 8 OLI and the three visible bands (Band 1, Band 2 and Band 3) of Landsat 5 TM were extracted at the locations of the in-situ PM10 measurements and converted into radiance and then into reflectance. The algorithms for determining PM10 differ between Landsat 5 TM and Landsat 8 OLI.
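The conversion from DN to reflectance can be sketched with the standard USGS rescaling formulas. This is a sketch under stated assumptions, not the model built in ERDAS IMAGINE: the function names are hypothetical, and the gain, bias, rescaling factors, solar irradiance and sun elevation below are placeholder values that should be read from the metadata (MTL) file of each scene.

```python
import math


def oli_toa_reflectance(dn, mult, add, sun_elevation_deg):
    """Landsat 8 OLI: rescale DN to reflectance, then correct for the sun angle."""
    return (mult * dn + add) / math.sin(math.radians(sun_elevation_deg))


def tm_radiance(dn, gain, bias):
    """Landsat 5 TM: rescale DN to at-sensor spectral radiance."""
    return gain * dn + bias


def tm_toa_reflectance(radiance, esun, earth_sun_dist_au, sun_elevation_deg):
    """Landsat 5 TM: convert radiance to top-of-atmosphere reflectance."""
    zenith = math.radians(90.0 - sun_elevation_deg)
    return math.pi * radiance * earth_sun_dist_au ** 2 / (esun * math.cos(zenith))


# Placeholder inputs; real values come from the scene metadata.
oli_rho = oli_toa_reflectance(dn=10000, mult=2.0e-5, add=-0.1, sun_elevation_deg=60.0)
tm_l = tm_radiance(dn=100, gain=0.7658, bias=-2.29)
tm_rho = tm_toa_reflectance(tm_l, esun=1983.0, earth_sun_dist_au=1.0,
                            sun_elevation_deg=60.0)
```

Landsat 8 products allow DN to be rescaled directly to reflectance, whereas the Landsat 5 TM route goes through radiance first, as described above.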

3.7.1 Landsat 5 TM

For Landsat 7 ETM+, the atmospheric reflectance due to molecules, Rr, was given by Liu et al. (1996) as

Rr = τr Pr(θ) / (4 μs μv)

where

τr = optical thickness (molecule)

Pr(θ) = Rayleigh scattering phase function

μv = cosine of viewing angle

μs = cosine of solar zenith angle

The atmospheric reflectance due to particles, Ra, was assumed to be linear in the particle optical thickness, τa, with a factor K0. This assumption is reasonable because Liu et al. (1996) also found a linear relationship for both aerosol and molecule scattering.

The atmospheric reflectance, Ratm, is the sum of the particle reflectance and the molecule reflectance (Vermote et al., 1997):

Ratm = Ra + Rr

where

Ratm = atmospheric reflectance

Ra = particle reflectance

Rr = molecule reflectance

The optical depth was given by Camagni and Sandroni (1983) as

τ = σρs

where

τ = optical depth

σ = absorption

ρ = density

s = finite path

From this equation, the optical depths for particles and molecules can be written as τa = σa ρa s and τr = σr ρr s respectively.

The result was extended to a three-band algorithm:

A = e0 Ratm1 + e1 Ratm2 + e2 Ratm3

where

A = particle concentration (PM10)

Ratmi = atmospheric reflectance, where i = 1, 2 and 3 is the band number

ej = algorithm coefficients, j = 0, 1, 2, …, which are determined empirically

From the equation, PM10 was found to be linearly related to the reflectance for band 1 and band 2. The algorithm is based on the linear relationship between τ and reflectance.
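The empirical determination of the coefficients ej can be sketched as an ordinary least-squares fit. The sketch assumes that the three-band algorithm is a linear combination of the band reflectances without an intercept, as the symbol definitions suggest. The sample reflectances, PM10 values and function names are hypothetical, and the data are constructed so that the fit is exact.

```python
def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: samples do not determine the fit")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x


def fit_coefficients(reflectances, pm10):
    """Least-squares e_j for PM10 = sum over j of e_j * Ratm(band j+1)."""
    k = len(reflectances[0])
    # Normal equations: (X^T X) e = X^T y.
    xtx = [[sum(row[i] * row[j] for row in reflectances) for j in range(k)]
           for i in range(k)]
    xty = [sum(row[i] * y for row, y in zip(reflectances, pm10)) for i in range(k)]
    return solve(xtx, xty)


def predict(coefficients, reflectance):
    """Apply the fitted coefficients to the band reflectances of one pixel."""
    return sum(e * r for e, r in zip(coefficients, reflectance))


# Hypothetical atmospheric reflectances (bands 1 to 3) at four stations,
# with PM10 values generated from the coefficients (200, -100, 300).
samples = [(0.10, 0.05, 0.02), (0.04, 0.12, 0.03),
           (0.03, 0.06, 0.15), (0.08, 0.08, 0.08)]
pm10 = [21.0, 5.0, 45.0, 32.0]
coefficients = fit_coefficients(samples, pm10)
estimate = predict(coefficients, (0.08, 0.08, 0.08))
```

With real station data the fit would not be exact, and the residuals would be summarised with the correlation coefficient and RMSE.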

3.7.2 Landsat 8 OLI

Algorithm number 13 was selected as the proposed algorithm because it has the highest correlation coefficient (0.834) and the lowest root mean square error (RMSE) value (11.836) between the measured and calculated PM10 values. The accuracy of the proposed algorithm was validated by comparing PM10 ground measurements with the PM10 values calculated by the algorithm. The relationship between the spectral reflectance extracted from the Landsat 8 OLI satellite image and the PM10 ground measurements was examined through correlation analysis.

Table 3.2 Regression results (R and RMSE) for different forms of the algorithm. (*) PM10 calculated by the algorithm; (**) b1, b2, b3 and b4 are the reflectance values for band 1, band 2, band 3 and band 4 (Geophys Remote Sens, 2014)

Algorithm No.   Algorithm                                                    R       RMSE
1               PM10(*) = 2.26 b1(**) – 2.267                                0.799   12.87
2               PM10 = 2.04 b2 – 4.406                                       0.785   13.263
3               PM10 = 1.81 b3 – 17.728                                      0.802   20.986
4               PM10 = 1.39 b4 – 13.099                                      0.77    18.772
5               PM10 = 3.56 b1 – 1.17 b2 – 0.255                             0.79    13.127
6               PM10 = 1.21 b1 + 0.98 b3 – 12.903                            0.83    20.986
7               PM10 = 1.56 b1 + 0.51 b4 – 9.458                             0.789   13.156
8               PM10 = 0.64 b2 + 1.27 b3 – 15.226                            0.805   20.986
9               PM10 = 1.24 b2 + 0.60 b4 – 11.170                            0.832   11.879
10              PM10 = 4.36 b1 – 3.50 b2 + 1.51 b3 – 12.615                  0.81    20.986
11              PM10 = 0.41 b2 + 2.27 b3 – 0.66 b4 – 16.174                  0.802   20.985
12              PM10 = 0.99 b1 + 1.62 b3 – 0.45 b4 – 13.481                  0.817   20.986
13              PM10 = 4.72 b1 – 4.19 b2 + 3.07 b3 – 1.02 b4 – 13.871        0.834   11.836
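Algorithm 13 and the two evaluation measures can be written out as a short sketch. The coefficients are those listed for algorithm 13; the function names and input values are hypothetical, and the scaling of the reflectance inputs (fraction or percent) is an assumption that must match the one used when the coefficients were derived.

```python
import math


def pm10_algorithm_13(b1, b2, b3, b4):
    """PM10 from the band 1 to band 4 reflectances, using the algorithm 13 coefficients."""
    return 4.72 * b1 - 4.19 * b2 + 3.07 * b3 - 1.02 * b4 - 13.871


def pearson_r(x, y):
    """Correlation coefficient between two equally long value lists."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def rmse(measured, calculated):
    """Root mean square error between measured and calculated values."""
    squared = [(m - c) ** 2 for m, c in zip(measured, calculated)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical reflectance values for one pixel.
pm10_value = pm10_algorithm_13(10.0, 10.0, 10.0, 10.0)
```

In practice, pearson_r and rmse would be applied to the measured PM10 at the ground stations and the values calculated at the same locations.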

3.8 Data analysis

The final stage produces the analysis from the processed data. The analysis shows the land use changes over three years in the study area, Klang, Selangor. The final PM10 maps are then generated using the algorithm for three periods (different years). In addition, the correlation between land use development and PM10 over the three years is shown in graph form, relating the PM10 level to the land use development in each year.
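The land use change between two classified maps can be sketched by counting pixels per class and converting the counts to area. The sketch assumes the 30 m pixel size of the Landsat multispectral bands; the small maps, class names and function names are hypothetical.

```python
from collections import Counter

# One 30 m x 30 m Landsat pixel expressed in square kilometres.
PIXEL_AREA_KM2 = (30 * 30) / 1_000_000


def class_areas(classified):
    """Area in km2 covered by each class in a classified map (list of rows)."""
    counts = Counter(label for row in classified for label in row)
    return {label: n * PIXEL_AREA_KM2 for label, n in counts.items()}


def area_change(earlier, later):
    """Change in area per class between two dates (later minus earlier)."""
    before, after = class_areas(earlier), class_areas(later)
    return {
        label: after.get(label, 0.0) - before.get(label, 0.0)
        for label in sorted(set(before) | set(after))
    }


# Hypothetical 2 x 2 classified maps for two different years.
map_earlier = [["forest", "forest"], ["grass", "built-up"]]
map_later = [["forest", "built-up"], ["built-up", "built-up"]]
changes = area_change(map_earlier, map_later)
```

The per-class changes for each period can then be plotted against the mean PM10 of the same period to examine the relationship described above.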

3.9 Summary

This chapter has explained the research workflow from the early stages of the methodology. Many alternatives are available for managing the data collection and analysis tasks. In this project, the PM10 data and information are analysed according to the aim and objectives of the research using ERDAS Imagine software. Planning is very important in this research: because of the time constraint, the work needs to be completed in an orderly manner.
