Geographic information systems (GIS) are computer-based systems for the integration and analysis of geographic data. Geographic data are spatial data that result from observation and measurement of earth phenomena referenced to their locations on the earth’s surface.
Whenever public health professionals or epidemiologists use disease registries with address information, consider the locations of toxic waste disposal sites, or look at air quality and water quality reports from monitoring stations, they are working with geographic data.
1.1 Definitions of GIS
In part because GIS is an enabling technology, a consensus definition of GIS has been difficult to achieve. The acronym has several usages: as a technology, as a research field, and as a community.
In the 1990s, the term “GI Science” was coined to distinguish the research field from the technology of geographic information systems. Developments in GIS technology have clearly built upon and revived interest in theories and techniques of spatial analysis and cartography that were relevant long before the innovations in digital computing that made GIS possible.
Remote sensing is the analysis and interpretation of data gathered by means that do not require direct contact with the earth. Aerial photography of the earth’s surface, taken with an aircraft-mounted camera, is an important source of up-to-date data. Aerial photographs can be scanned and rectified for printing or viewing on a display screen.
Digital processing for geographic data collection involves the use of satellites with sensors capable of detecting electromagnetic energy reflected or emitted from objects on the earth’s surface. The data are then enhanced for viewing and analysis.
Scanning is a technique for capturing map data in digital form. Scanners use an optical laser or other electronic device to “read” a map and convert its features to a computer database of dark and light values.
With scanning, the conversion of the map into digital format happens relatively quickly, but the processing of the scanned image to recognize cartographic objects takes much more time.
Digitizing requires use of a tablet and cursor to record coordinate locations of map features from a map placed on a digitizing tablet. It is also possible to construct a GIS data layer by screen digitizing. This process is also known as heads-up digitizing.
Computer-aided design (CAD) systems, as used with cartographic data, provide support for drafting and producing map-like displays of features of interest like roads or land parcels. As originally developed, CAD systems, like other computer graphics software, made use of (X, Y) coordinates.
Spatial data representation and spatial analyses like map overlay have been implemented without computer-based systems like GIS.
Review of Literature on Definition of GIS
Musa, Chiang, Sylk, Bavley, Keating, Lakew, Tsou and Hoven (2013) defined GIS as a computer system with the capacity to capture, store, analyze, and display geographically-referenced information. In other words, it is an informatics tool for storing and managing data that have been identified according to location.
Fradelos, Papathanasiou, Mitsi, Tsaras, Kleisiaris, and Kourkouta (2014) defined the GIS as spatial, digital data management systems that can integrate, store, adjust, analyze, and arrange geographically-referenced information. They added that these systems can be described as smart maps that offer a simulation of the real world to their users and which can generate interactive spatial or descriptive questions, analyze spatial data, and adapt and adopt them in analogue (printed maps and diagrams) or digital media (records of spatial data and interactive maps on the Internet).
Akeh and Mshelia (2016) defined the Geographic Information System (GIS) as a configuration of computer hardware, software, and data specifically designed to capture, store, analyze, manipulate, edit, retrieve, and display spatially-referenced data.
1.2 GIS Functions
Spatial Database Management
One important function of GIS is managing spatial data. Database management systems (DBMS) are used to store, retrieve, and manipulate data in a database. A GIS software product, like other computer software systems, is built on an underlying data model.
A data model is a detailed model that captures the overall structure of the data, independent of database management or implementation considerations. A data model includes the relevant entities, relationships, and attributes, as well as constraints defining how the data are used.
Spatial data embody complex and often hierarchical relationships that are not easily expressed in tables.
Spatial Analysis
GIS software systems enable public health analysts to do more than simply manage and map data. GIS support a range of spatial analysis functions. Spatial analysis refers to “a general ability to manipulate spatial data into different forms and extract additional meaning as a result”.
Specifically, spatial analysis comprises a body of techniques “requiring access to both the locations and the attributes of objects”. The results of a spatial analysis are “not invariant” when locations of the objects being analyzed are changed. As such, spatial analysis covers a broad range of numerical methods.
The measurement functions of GIS allow the user to calculate straight-line distances between points, distances along curved paths, and areas. Although the measurement functions are relatively few in number, they are extremely important. Distance as a measure of separation in space is a key variable used in many other kinds of spatial analysis.
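The three measurement functions named above can be illustrated with a minimal Python sketch. It is not taken from any particular GIS package: the function names are hypothetical, the great-circle distance assumes a spherical earth with a mean radius of 6,371 km, and the path length and area calculations assume planar (projected) coordinates.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean radius of a spherical earth

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two latitude/longitude points (degrees)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def path_length(points):
    """Length of a path given as planar (x, y) vertices, in coordinate units."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))

def polygon_area(vertices):
    """Area of a simple planar polygon using the shoelace formula."""
    n = len(vertices)
    twice_area = sum(
        vertices[i][0] * vertices[(i + 1) % n][1]
        - vertices[(i + 1) % n][0] * vertices[i][1]
        for i in range(n)
    )
    return abs(twice_area) / 2
```

A curved path is approximated here by a sequence of short straight segments, which is also how vector GIS data typically store lines.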
Topological analysis functions include the software functions used to describe and analyze the spatial relationships among units of observation. This category also includes spatial database overlay and assessment of spatial relationships across databases, including map comparison analysis.
A public health analyst could identify the area within a specified distance of a public drinking water well or surface source, for example, and overlay the footprint of a building to determine whether or not the building would be located far enough from the source to meet legal requirements.
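The well-and-building example can be sketched as a simple buffer test. This is an illustrative simplification, not a GIS vendor's overlay routine: coordinates are assumed to be planar and in meters, the well is assumed to lie outside the building footprint, and the function names and the setback parameter are hypothetical.

```python
import math

def point_segment_distance(p, a, b):
    """Shortest planar distance from point p to the segment a-b."""
    px, py = p
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:  # degenerate segment: a and b coincide
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its endpoints
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def distance_to_footprint(well, footprint):
    """Shortest distance from the well to the boundary of a footprint polygon."""
    n = len(footprint)
    return min(
        point_segment_distance(well, footprint[i], footprint[(i + 1) % n])
        for i in range(n)
    )

def meets_setback(well, footprint, setback_m):
    """True if no part of the footprint falls inside the buffer around the well."""
    return distance_to_footprint(well, footprint) >= setback_m
```

For a well at (0, 0) and a rectangular footprint whose nearest wall is 40 m away, `meets_setback` returns True for a 30 m setback and False for a 50 m setback.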
Network analysis is a branch of spatial analysis that investigates flows through a network. The network is modeled as a set of nodes and the links that connect the nodes. Network analysis functions are also used to model service areas of facilities and to locate facilities.
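As a hedged illustration of the node-and-link model, the sketch below finds the least-cost route between two nodes using Dijkstra's algorithm, one common technique for this task. The small network, its node labels, and its link costs are invented for the example.

```python
import heapq

# Hypothetical network: each node maps to a list of (neighbor, link cost) pairs
sample_links = {
    "A": [("B", 4.0), ("C", 2.0)],
    "C": [("B", 1.0), ("D", 8.0)],
    "B": [("D", 5.0)],
    "D": [],
}

def shortest_path_cost(links, origin, destination):
    """Least total link cost from origin to destination (Dijkstra's algorithm)."""
    best = {origin: 0.0}
    queue = [(0.0, origin)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node == destination:
            return cost
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry; a cheaper route was already found
        for neighbor, link_cost in links.get(node, []):
            new_cost = cost + link_cost
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                heapq.heappush(queue, (new_cost, neighbor))
    return float("inf")  # destination not reachable
```

A service area for a facility could be approximated in the same way by collecting every node whose least cost from the facility falls below a chosen threshold.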
Surface analysis techniques are often used to analyze terrain or other data that represent a continuous surface. Filtering techniques include smoothing and edge enhancement. Smoothing removes “noise” from the data to reveal broader trends.
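One simple way to smooth a gridded surface is a moving-average (mean) filter. The sketch below assumes a 3 x 3 window and a surface stored as a list of rows; both choices are illustrative, and cells on the edge of the grid are averaged over whichever neighbors exist.

```python
def smooth(grid):
    """Replace each cell with the mean of the cells in its 3 x 3 neighborhood."""
    rows, cols = len(grid), len(grid[0])
    result = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            # Collect the cell and its neighbors, clipped at the grid edges
            window = [
                grid[i][j]
                for i in range(max(0, r - 1), min(rows, r + 2))
                for j in range(max(0, c - 1), min(cols, c + 2))
            ]
            out_row.append(sum(window) / len(window))
        result.append(out_row)
    return result
```

Applied to a flat surface of 10s with a single spike of 100 in the center, the filter pulls the spike down to 20, which shows how isolated noise is damped while the general level of the surface is kept.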
Review of Literature on Functions of GIS
Vinodhkumar, Sinha and Singh (2016) investigated the use of GIS in veterinary science. They highlighted that it is a combination of computerized mapping technology and database management systems (DBMS). They also briefly discussed the varied functions of GIS and pointed out that the difference between GIS and other databases or information systems is its spatial analysis functions. Most standard GIS software packages come with basic analytical tools that permit overlays of thematic maps (topological analysis), creation of buffers, etc., in addition to calculations of lengths and areas. Proximity analysis can be done through buffering, i.e., identifying a zone of interest around a point, line, or polygon. The model-building capability of GIS is very helpful for decision makers. Network analysis is another important analysis done through GIS. For example, access routes can be determined by examining the entire field or attribute data linked to road map/spatial data. Furthermore, these researchers underlined the GIS function of disease mapping, which can be used to produce maps of disease incidence, prevalence, mortality, and morbidity at the scale of a farm, region, or country. Moreover, they discussed other GIS functions like geo-referencing, statistical analysis (uni-, bi-, and multi-variate analysis), clustering of disease, and development of the Generalized Additive Model (GAM), which extends the Generalized Linear Model (GLM) by adding a smoothing function to account for geographical space.
Kirby, Delmelle and Eberth (2017) discussed the basic functions of GIS with reference to epidemiologic applications in Human Immunodeficiency Virus (HIV) research. The major related GIS functions which these researchers listed were storage and measurement of spatial relationships; geocoding; display of spatial relationships; analysis of attribute and feature data simultaneously; management of data from multiple sources; identification of spatial patterns; spatial modeling; and explanation of spatial patterns.
1.3 Public Health Applications of GIS
The hardware, software, and database developments that have brought other new users to GIS partly explain this diffusion into the public health sphere. Like many other organizations, public health agencies and public and private providers or insurers of medical care services manage large databases that contain geographic information that can be meaningfully integrated based on location.
The slower development of GIS applications for health data reflects, in part, lags in the availability of geocoded health data compared to other health-related GIS databases.
As part of its Healthy People 2010 initiative, the U.S. Department of Health and Human Services set objective 23-3 to increase the proportion of major national, state, and local health agencies that use geocoding, which would in turn promote the use of GIS at all levels.
The target level for meeting this objective was 90% of all public health data systems. At the time of the midcourse review in 2004, the use of geocoding in major health data systems had not increased significantly.
Not all health agencies or organizations have the trained staff, software, and hardware necessary to apply GIS technology. Organizations developing a GIS for health analysis do not necessarily require access to the full range of spatial data or even GIS functionality. With Internet tools and software, organizations with limited hardware can still develop public health applications.
Review of Literature on Public Health Applications of GIS
Fletcher-Lartey and Caprarelli (2016) reviewed the application of GIS technology in public health and identified its successes and challenges. They summarized the common uses of GIS technology in the public health sector, emphasizing applications related to mapping and understanding of parasitic diseases. They confirmed that geographical analysis has allowed researchers to interlink health, population, and environmental data, thus enabling them to evaluate and quantify relationships between health-related variables and environmental risk factors at different geographical scales. GIS plays a major role in health care, surveillance of infectious diseases, and mapping and monitoring of the spatial and temporal distributions of vectors of infection, hence allowing researchers to study, at all scales, the relationships between spatial and temporal trends and risk and between environmental factors and health. Successful examples include (i) the rapid epidemiological mapping of onchocerciasis in over 20 African countries; and (ii) the epidemiological application of data obtained from climate-based forecast systems that include observation of oceans, land elevations, land cover, land use, surface temperatures, and rainfall for disease surveillance and early-warning systems. GIS has been successfully applied to the monitoring and prediction of parasitic diseases. The application of GIS in disease studies has furthered the understanding of the intersection between person, place, and time in infectious disease outbreaks and the underlying social and cultural factors.
1.4 GIS and the Internet
Maps and geographic data have been an important category of online content since the growth in popularity in the early 1990s of the World Wide Web (WWW) as a tool for accessing the Internet. In addition to the spatial database management, visualization and mapping, and spatial analysis functions that in-house GIS users call upon, publication and distribution of spatial data are increasingly important.
Distributed GIS support four main applications:
Data sharing
Information sharing
Data processing
Location-based services
Review of Literature on GIS and the Internet
Gong, Simwanda and Murayama (2017) highlighted that, with the development of computational capability, mobile devices, and Internet technology, the method and carrier for GIS computing have advanced over the past half-century from traditional mainframe and desktop (stand-alone) GIS systems to Internet-based (distributed) GIS services. They cited a definition for Internet GIS stating that it is a network-centric GIS tool that uses the Internet as a primary means of providing access to distributed data and other information, disseminating spatial information, and conducting GIS analysis. They pointed out that Internet-based GIS has been widely recognized, both in public and private organizations, as a fundamental tool for storage and distribution of data to targeted audiences (e.g., general users, decision makers, local communities, governments, and researchers). In this respect, various platforms have been developed in the fields of natural resource management and conservation (e.g., in enhancing public participation in wind farm planning and iceland conservation; fauna, flora, and plant landscape data management; and civil protection and emergency management). Other platforms focused on decision support (e.g., environmental sustainability; and natural hazards and risk management). These researchers designed an open Internet-based GIS platform called MEGA-WEB for data sharing, visualization, and spatial analysis of urbanization in 42 major Asian and African cities. This platform provides a geodatabase of indicators for analyzing urbanization through the physical land surface conditions and their relationships with population, energy use, and the environment. Its targeted audience is mainly urban planners, decision makers, local communities, governments, and researchers.
Chapter 2
Spatial Data
Spatial data are observations with explicit locations. For geographers, and most people interested in studying human health problems, the relevant space is the surface of the earth. Geospatial data are obtained by observation and measurement of events or objects referenced to their location in that space.
GIS implementation requires access to geographic data. “The database is the foundation of a GIS.”
2.1 Field and Object Data
There are two main approaches to modeling geographic information. From one perspective, we can think of phenomena that are continuously distributed just above, on, or below the surface. Precipitation, surface elevation, and soil are examples of continuous data or field data.
It is possible to visit any location on the earth’s surface and ask “what is the elevation here?” or “what is the environmental quality here?” In practice, values are observed at a limited set of sampling sites, and this network of sites creates a spatial framework for describing the distribution of the phenomenon.
Tessellation is the geometric process of partitioning an area into smaller units that do not overlap but completely fill the entire area. Squares, triangles, and hexagons can be used as the basic units of a tessellation. When the units are the same shape and size, the tessellation is regular.
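A regular square tessellation can be sketched in a few lines of Python. The example assumes planar coordinates, a grid origin at the lower-left corner of the study area, and half-open cells, so that every point falls in exactly one cell; the function names are hypothetical.

```python
import math

def cell_index(x, y, origin_x, origin_y, cell_size):
    """Return the (row, col) of the square cell that contains the point (x, y)."""
    col = math.floor((x - origin_x) / cell_size)
    row = math.floor((y - origin_y) / cell_size)
    return row, col

def count_by_cell(points, origin_x, origin_y, cell_size):
    """Count how many points fall in each cell of the tessellation."""
    counts = {}
    for x, y in points:
        key = cell_index(x, y, origin_x, origin_y, cell_size)
        counts[key] = counts.get(key, 0) + 1
    return counts
```

Because the cells neither overlap nor leave gaps, the counts over all cells always sum to the number of input points, which is what makes a tessellation a convenient framework for aggregating observations.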
From the other perspective, we can think of discrete objects that may be found on the earth’s surface. A person, a hospital, and a public drinking water reservoir are examples of discrete data or object data. Many health databases contain object data.
Objects are entities that are identifiable, relevant to the particular public health problem at hand, and describable. It is possible to consider any object and ask “where is this object located?”
Each object of interest that can be located on the earth’s surface has different types of attributes that are important for modeling purposes: textual/numeric, spatial, temporal, and graphical.
Review of Literature on Field and Object Data
Adams and Gahegan (2014) pointed out that despite recent movement toward semantically-described services for spatial data infrastructures (SDIs), the scope and range of descriptive dimensions for geospatial data are underspecified. These researchers presented a diverse set of important dimensions that point to a series of challenges for data integration and then described how both traditional and emergent datasets can be characterized within these dimensions. They underscored that within the spatio-temporal frameworks of the model space, the harmonization of data faces the challenges of tessellation of the space, which can be continuous, or a regular grid, or an irregular grid. Other important spatiotemporal dimensions concern the projection, scale, and resolution of the data. On the other hand, representation of time can be continuous or discrete. In this respect, every state change is modeled as a binary property change in the data model, e.g., transforming from continuous value/data to categorical value/data. The dashed connections between the discrete space/field models and continuous space/object models have a step in between like discrete space/object model.
Ye (2016) investigated the geography of healthcare access and disparity issues in the USA using geospatial methods. He integrated spatial modeling, geo-statistics, and location problems in the GIS environment to investigate healthcare access. He underscored that the general assumption for use of location problems in health research is that the region of interest can be represented as a set of discrete spatial objects (points and polygons). This study has the contribution of introducing a hexagon tessellation (i.e., the basic units of tessellation were hexagons) to represent the study area and use their centroids as candidate facility sites to ensure more equal distribution of new facilities in space, which is a method that was not widely applied in previous location problem literature. The researcher was able to generate a population density surface and re-aggregate population information to designated hexagons. Regular hexagon tessellation ensures a more equal spatial representation of the study area and provides an alternative way of spatial representation.
2.2 Tessellation and Vector Data Models
The field and object views are expressed in the two main data models used to implement GIS: tessellation and vector. These data models have important implications for the storage and processing of data.
Perhaps the most commonly used irregular tessellation is the triangulated irregular network (TIN). A data value like elevation is observed at a set of sampling points on the surface. These points form the vertices of triangles in the TIN.
Network data are a type of vector data that model space as a set of connected links and nodes. Network databases consist of nodes and arcs.
A node is a point of connection for two or more arcs or an endpoint of an arc.
An arc is a link that connects two nodes. The nodes and the arcs of the network comprise the entire space of interest and locations of interest exist only on the network.
In a planar network, a node exists whenever two arcs intersect. In a nonplanar network, arcs may cross each other without resulting in a node where arcs are connected. Ground transportation networks with overpasses and underpasses are examples of nonplanar networks.
Transportation network databases often include a turn table and a reference address table. The turn table includes a tuple or row for each direction of travel through an intersection from one segment to another segment.
An impedance measure is associated with each “turn” indicating whether or not it is possible to move from one street segment to another in a particular direction.
Turn and arc impedances provide some measure of the travel cost associated with travel in a particular direction. A reference address table stores information about address ranges for street segments.
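The role of arc and turn impedances can be shown with a small Python sketch. The street segment names and impedance values are invented, and two conventions are assumed for illustration: a prohibited turn is stored as infinite impedance, and a turn that is missing from the turn table is treated as prohibited.

```python
import math

# Hypothetical impedances (e.g., travel time in seconds) for three street segments
arc_impedance = {"Park_1": 30.0, "Broad_1": 45.0, "Broad_2": 20.0}

# Hypothetical turn table: one row per movement from one segment onto another
turn_impedance = {
    ("Park_1", "Broad_1"): 10.0,      # permitted turn with a delay
    ("Park_1", "Broad_2"): math.inf,  # prohibited turn
    ("Broad_1", "Broad_2"): 0.0,      # straight through, no delay
}

def route_cost(route, arcs, turns):
    """Total impedance of a route given as an ordered list of segment names."""
    total = 0.0
    for i, arc in enumerate(route):
        total += arcs[arc]
        if i + 1 < len(route):
            # A turn absent from the table is assumed to be prohibited
            total += turns.get((arc, route[i + 1]), math.inf)
    return total
```

With these values, the route Park_1, Broad_1, Broad_2 costs 105, while any route that uses the prohibited turn from Park_1 onto Broad_2 has infinite cost and would never be selected by a routing algorithm.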
Vector data that are not topological are sometimes referred to as “spaghetti” data. In spaghetti data, lines and areas are independent features, like strands of spaghetti on a plate, and need not correspond to any actual spatial object.
Review of Literature on Tessellation and Vector Data Models
Zandi (2013) reviewed the geo-computational methods for surface and field data interpolation. She stated that the key types of geographic data models are raster/grid, vector/geo-relational topological, network, Triangulated Irregular Network (TIN), and object data models. Surfaces can be represented using vector models, such as contours or iso-lines and TINs, or raster models. Raster data are applied in at least four ways: (i) models describing the real world; (ii) digital scans of existing maps; (iii) compiling digital satellite and image data; and (iv) automatic drawing driven by raster output devices. Raster data models are also commonly known as grid models. The vector data model is associated with the discrete object view, and each object in the real world is classified into one of three geometric types: (i) a point, recorded as a single coordinate pair; (ii) a line, recorded as a set of ordered coordinates; or (iii) a polygon, recorded as one or more line segments that enclose an area. The coordinates may have two (x, y: latitude and longitude), three (x, y, z: latitude, longitude, and elevation) or four (x, y, z, m: latitude, longitude, elevation, and time) dimensions. The TIN is a vector-based representation of the physical surface that is made up of contiguous, non-overlapping triangles formed from irregularly distributed nodes and lines with 3-D coordinates (x, y, and z). TINs are often derived from the elevation data of a rasterized DEM. Tessellation, or tiling, of the surface is a collection of plane figures that fill the surface with no overlaps and no gaps.
Tiwari (2017) conducted a simulation of floods over an urban area using geo-visualization in a 4-D environment. He indicated that GIS has two main kinds of spatial data formats: (i) vector data, which are a collection of geometries and consist of points, polylines, and polygons, and (ii) raster data, which contain information at the pixel level, e.g., the Digital Elevation Model (DEM), which contains the elevation data of a region on the surface of the earth. All GIS systems support visualization for these two data formats as 2-D visualizations. For raster data, either a grid-based or a TIN-based (Triangulated Irregular Network) approach is used to create 3-D surfaces. For example, natural terrain surfaces can be rendered using TIN models from DEMs. This researcher used the Virtual Terrain Project (VTP). The VTP provides a platform for the development of tools for the easy construction of any part of the world in interactive and 3-D forms. He also employed the “vtlib” library, which is a C++ library that can render terrains from geospatial elevation data and add other 3-D models on them. It stores the DEM data in its “vtdata” model and uses “vtEngine” to tessellate the elevation data to terrains.
2.3 Measuring Location
Location means position in space. Location is the basis for integrating geographic data in a GIS. Absolute location refers to position with respect to an arbitrary grid system like the geographic grid of parallels and meridians. Absolute location gives the position of a point so that its unique position on the earth is clear.
Relative location refers to position with respect to other objects in the geographic space. “The burner stack is 300′ northwest of the intersection of Park and Broad Streets” is a statement of relative location.
Positional data in a GIS can come from several sources. Geodetic, photogrammetric, and digital image processing data are all primary sources for positional data because positions are determined by direct or indirect measurement of the earth’s surface.
Digital image processing of geographic data relies on satellites with sensors capable of detecting electromagnetic energy reflected or emitted from objects on the earth’s surface. The energy detected is converted into a data value for a specific location and transmitted to receiving stations either directly or via tracking data and relay satellites.
The data are then enhanced for viewing and subsequent analysis by using digital image processing algorithms. It is possible to obtain a variety of data from a single flight. Acquisition of photogrammetric and digital image data usually involves purchasing a database from a government or quasi-governmental agency that has the means to produce these data.
In many GIS applications, positions are estimated from an existing map of the earth’s surface created at a particular scale. For example, we could estimate the coordinates of a hospital by digitizing from an appropriately annotated topographic map.
Digitizing requires the use of a tablet and cursor to record coordinate locations of map features from a map placed on the digitizing tablet, or the use of a cursor to screen digitize from a visual display.
Review of Literature on Measuring Location
Canavosio-Zuzelski, Agouris and Doucette (2013) adopted a rigorous photogrammetric approach to determining the positional accuracy of OpenStreetMap (OSM) road features, using aerial stereo imagery, ground control points (GCPs), and a vector adjustment model to determine real-world geo-locations for OSM shape points. As to the feature attributes, the truth vector represents a location that is spatially accurate, in an absolute sense, in relationship to the actual features on the ground because it was derived by high-order geodetic surveying techniques. Photogrammetry has traditionally been used to: (i) determine 3-D real-world object space coordinates from stereo imagery; (ii) propagate the error present in the sensor imaging system to a ground location; (iii) derive three-dimensional terrain surfaces, e.g., DEMs; or (iv) extend highly accurate GCPs to adjacent image strips or blocks to facilitate the accurate mapping of the terrain and infrastructure from aerial imagery. Among the most common ways to build vector datasets are digitizing, tracing, and vectorizing a raster aerial image (from both satellite and airborne platforms). Since the resulting vector features will take on the coordinate reference frame of the aerial image, it is important to ensure the imagery is georeferenced properly. The positional accuracy of spatial objects can be defined through measures of the differences between the apparent location of the feature recorded in the database and its true location.
Mesas-Carrascosa, García, de Larriva, & García-Ferrer (2016) underlined that the algorithms used in traditional photogrammetry process overlapping images acquired from multiple viewpoints. Mainly, these techniques are based on imaging techniques called structure from motion (SfM). Photogrammetric processing is divided into four phases: (i) aerial triangulation; (ii) generation of a Digital Surface Model (DSM); (iii) rectification of individual images; and (iv) orthomosaicking. These researchers defined spatial accuracy as the accuracy of the position of features relative to the Earth. It can be described in absolute or relative terms. Absolute positional accuracy is defined as closeness of the reported coordinate values to the values accepted as, or being, true. Relative positional accuracy is defined as closeness of the relative spatial positions of the features in a dataset to their corresponding relative spatial positions accepted as, or being, true. These researchers calculated the geolocation via aerial triangulation. To improve the spatial quality of the results, several Ground Control Points (GCPs) were distributed over the study area. All check point locations were digitized on screen via the produced orthomosaics. In this way, the positional data used in measuring location in the GIS environment made use of the absolute location data to calibrate the digital image processing and the relative location data and, consequently, to improve the positional accuracy.
It is noticed that the aforementioned studies derived their positional data from one or more of three primary sources of data: geodetic, photogrammetric, and digital image processing data, whereby positions are determined by direct or indirect measurement of the surface of the earth.
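Absolute positional accuracy, as defined above, is often summarized by a root-mean-square error (RMSE) over a set of check points. The Python sketch below shows that calculation for horizontal coordinates; the use of RMSE here is an illustrative assumption and is not claimed to be the exact measure used in the studies reviewed, and the function name is hypothetical.

```python
import math

def horizontal_rmse(reported, reference):
    """Horizontal RMSE between reported (x, y) coordinates and values accepted as true."""
    if len(reported) != len(reference) or not reported:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared_errors = [
        (xr - xt) ** 2 + (yr - yt) ** 2
        for (xr, yr), (xt, yt) in zip(reported, reference)
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))
```

The coordinates must share the same planar reference system and units, so a result of 3.5 would mean an error of 3.5 meters only if both lists are in meters.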
2.4 Scale, Projection, and Symbols of Cartographic Data Sources
As models, maps are generalized representations of reality. Maps distort reality by simplifying the complex, three-dimensional surface of the earth for representation on a flat sheet of paper or video screen.
Map scale tells the user how much smaller the map is than the reality it represents. Map scale can be stated as a ratio, a simple bar graph, or a phrase.
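When scale is stated as a ratio (for example 1:24,000), converting between map and ground distance is simple arithmetic. The sketch below assumes map distances in centimeters and ground distances in meters; the function names are hypothetical.

```python
def map_to_ground_m(map_cm, scale_denominator):
    """Ground distance in meters for a map distance in cm at scale 1:scale_denominator."""
    return map_cm * scale_denominator / 100.0

def ground_to_map_cm(ground_m, scale_denominator):
    """Map distance in cm for a ground distance in meters at scale 1:scale_denominator."""
    return ground_m * 100.0 / scale_denominator
```

At 1:24,000, a 5 cm line on the map represents 1,200 m on the ground, and the same 1,200 m would be drawn only 1.2 cm long at 1:100,000.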
The usefulness of the conformal projections is evident in the widely used State Plane Coordinate System in the United States. The State Plane Coordinate System provides a convenient means of locating mapping positions on a two dimensional plane. The grids permit the methods of plane surveying to be extended over great distances at high precision.
There are two true-scale lines running north and south in each transverse Mercator zone. Between these lines of secancy, the distances are less than true scale. Outside of these lines, the distances are greater than true scale.
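The behavior of scale on either side of the true-scale lines can be sketched with the spherical form of the transverse Mercator scale factor, k = k0 cosh(x / (k0 R)), where x is the distance east or west of the central meridian. This is a simplified illustration: it assumes a spherical earth of radius 6,371 km, whereas actual State Plane zones use ellipsoidal formulas and zone-specific parameters, and the central scale factor k0 used in any example is an assumed value.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # assumed spherical earth radius

def scale_factor(x_m, k0):
    """Spherical transverse Mercator scale factor at x_m meters from the central meridian."""
    return k0 * math.cosh(x_m / (k0 * EARTH_RADIUS_M))

def true_scale_offset_m(k0):
    """Distance from the central meridian at which the scale factor equals exactly 1."""
    return k0 * EARTH_RADIUS_M * math.acosh(1.0 / k0)
```

With an assumed central scale factor of 0.9996, the scale factor is below 1 on the central meridian, reaches 1 at roughly 180 km on either side, and exceeds 1 beyond that distance, which matches the pattern described above.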
Symbolization
The visualization and mapping functions of GIS require data objects to be represented with some kind of graphical symbol. Bertin (1979) identified six dimensions of visual variability of map symbols: size, shape, value, texture, orientation, and color. These aspects of symbolization can be and are manipulated to achieve certain objectives in cartographic communication.
The range of symbols supported will vary from system to system, depending on software and hardware configurations. Standard cartography texts provide useful guidelines for map compilation and design.
Geographic Data Quality
Because the database is the foundation of any GIS, the quality of the geographic data that goes into the system is paramount.
Its draft report identified five important aspects of spatial data quality; these dimensions have since been accorded a degree of national consensus. To an extent, lineage is not so much a measure of data quality as information needed to assess data quality based on other factors. The contents of a lineage describe data at various stages in its existence.
Review of Literature on Scale, Projection, and Symbols
Araújo, Sluter and Camboim (2016) addressed visual variability of map symbols and proposed a new set of symbols for large-scale base maps. They explained that, to achieve an efficient result in proposing the map symbols, they developed a map design based on the theory of cartography. Their first step was to set some premises as follows: (i) the set of map features and their related symbols must be defined by the theory of topographic mapping; (ii) the large-scale maps must be totally integrated with the local (Brazilian) standards for topographic mapping; and (iii) the decisions about symbol design must agree with the Brazilian standards for reference map symbols. They defined the steps of the methodology by map design theory for generating topographic maps as follows: (i) defining the cartographic features that must be in a large-scale (1:2000) reference mapping of an urban area; (ii) establishing the meaning of every cartographic feature based on the theory of topographic mapping; (iii) grouping the features into classes by their meaning and by the EDGV conceptual model; (iv) creating symbols for each kind of feature; and (v) applying the symbols to urban areas of the country/city of interest. These steps can serve as guidelines for researchers in other parts of the world.
Santos, Medeiros, dos Santos and Filho (2017) discussed accuracy assessment of geospatial data. They indicated that the concepts of quality, the kinds of uncertainty to depict, and the application of these principles are essential in evaluating a spatial dataset. They also cited, and elaborated on, the six elements of spatial data quality in the ISO 19157:2013 standard. These elements and their associated sub-elements are (i) positional accuracy (absolute accuracy, relative accuracy, and gridded data positional accuracy); (ii) thematic accuracy (classification correctness, non-quantitative attribute correctness, and quantitative attribute accuracy); (iii) temporal accuracy (accuracy of a time measurement, temporal consistency, and temporal validity); (iv) completeness (commission and omission); (v) logical consistency (conceptual consistency, domain consistency, format consistency, and topological consistency); and (vi) usability. Therefore, to deliver mapping products of satisfactory quality and to keep pace with technological developments, these and similar standards and minimum parameters should be adopted.
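The completeness element can be illustrated with a small sketch. It assumes that each feature carries an identifier and that a trusted reference list exists; both the identifiers and the function name are hypothetical. Commission is taken to mean features present in the dataset but absent from the reference, and omission the reverse.

```python
def completeness(dataset_ids, reference_ids):
    """Compare a dataset against a reference list of feature identifiers.

    commission: features in the dataset that should not be there (excess).
    omission:   features in the reference that the dataset lacks (missing).
    """
    dataset, reference = set(dataset_ids), set(reference_ids)
    return {
        "commission": sorted(dataset - reference),
        "omission": sorted(reference - dataset),
    }

# Hypothetical identifiers of health facilities in a dataset and in a reference survey.
report = completeness([1, 2, 3, 9], [1, 2, 3, 4])
print(report)
```

Reporting the two lists separately keeps excess and missing features from cancelling each other out in a single count.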
Chapter 3
Spatial Databases for Public Health
Spatial data sets are fundamental components of GIS. The success of health-related GIS projects depends critically on having access to accurate, timely, and compatible spatial data. For organizations embarking on GIS projects, spatial data can be viewed as both a cost and a resource.
Developing spatial data sets is expensive; it is estimated that well over half the cost of GIS projects goes to database creation, updating, and improvement.
Spatial data sets are often useful for addressing a wide range of policy and planning issues. Their value extends well beyond the scope of the original projects for which they were created, and it increases as the data sets are used.
3.1 Foundation Spatial Data
Geodetic Control
Geodetic control is a system for registering location information to a set of well-defined points on the earth’s surface. It includes a set of survey monuments on the ground and a reference datum that gives geographic coordinates for those monuments based on our knowledge of the size and shape of the earth.
Digital Line Graphs
Vector data also provide a foundation for regional-scale GIS development. Digital Line Graphs (DLGs) are vector databases that show transportation lines, water bodies, political boundaries, and elevation contour lines. Unlike imagery, DLGs include attribute information. Attribute codes describe the physical and cultural characteristics of points, lines, and areas on the DLG.
DLGs are derived from the large- and intermediate-scale topographic maps created by the U.S. Geological Survey. Large-scale DLGs, generated from the 7.5-minute topographic maps, have been created for many areas of the United States.
The Geological Survey has updated its topographic map series through a procedure known as “limited update”, focusing on features that are most likely to have changed, such as roads and hydrography. DOQQs from photography are the basis for limited update revisions.
TIGER/Line Data
Another form of vector foundation data, compiled at 1:100,000 scale, is TIGER/Line data. The Topologically Integrated Geographic Encoding and Referencing (TIGER) data set was developed for the 1990 census.
Choosing a foundation database
The foundation data sets described in this section each offer a unique set of advantages and disadvantages for public health GIS. They differ in scale, positional accuracy, and display of features, as well as in their raster or vector structure. They are also evolving over time.
The choice among foundation data sets depends on the scale and scope of the project, the resources available for data creation, and the types and scales of other data sets to which the foundation data will be linked. Projects that are national or regional in scope are more likely to utilize intermediate scale foundation data such as satellite imagery, DLGs, and TIGER/Line data.
In contrast, studies of single communities or neighborhoods can take advantage of the detail and positional accuracy of cadastral data and DOQQs.
Review of Literature on Spatial Databases for Public Health
Cruz, Ganesh, Caletti and Reddy (2013) emphasized that the availability of a wide variety of geospatial datasets demands new mechanisms for their integrated analysis and visualization. They developed a semantic framework, GIVA, for geospatial and temporal data integration, visualization, and analytics. The geographic component of such datasets uses geodetic systems such as WGS84 and geometric objects (e.g., polygon, polyline). Before geospatial mappings between datasets were created, the data were translated into a common spatial data format. Geocoding is the process of assigning appropriate geographic coordinates, while geoparsing is the process of identifying a geographic context. GIVA comprises two components, one for visualization and the other to support analytic methods. The analytics component aims to provide scientists with a suite of statistical models for spatial data exploration and multivariate analysis. Given a geographic region and a time interval, GIVA addresses the problems of simultaneously (i) accessing several datasets and (ii) establishing mappings between the underlying concepts and instances using automatic methods. Suitable methods must address challenges such as heterogeneous formats, lack of metadata, and multiple spatial and temporal data resolutions. At the core of GIVA is its capability to deal with data, metadata, and their heterogeneity by addressing (i) the wide variety of formats, both standardized (e.g., GML, KML, Shapefile, MapInfo TAB) and non-standardized (e.g., HTML tables and flat files); (ii) lack of metadata, which stems in great part from non-standardized formats; (iii) multiple spatial and temporal resolutions and geocoding schemes due to different data acquisition techniques (e.g., surveys for census data and sensing methods); (iv) different vocabularies and schemas, which are created by diverse organizations; and (v) data uncertainty.
Mushonga, Banda and Mulolwa (2017) reviewed recent literature on GIS in health care, with particular emphasis on web GIS technologies and how they can aid in analyzing health care needs, access, and utilization to support the planning and evaluation of new service locations, as well as the use of GIS in disease surveillance. Their study aimed to produce a web-based GIS that can be used to collect data from health facilities and, in turn, provide these data to public health administrators to support decision making. The study also focused on creating a portal for public interaction with spatial information about health facilities. It demonstrates the development of spatial datasets and shows how foundation spatial data can be managed and utilized in the healthcare sector.
3.2 Population Data
Foundation data create a platform for integrating spatial data layers that contain population and related health, social, and environmental information frequently used in health applications of GIS.
Review of Literature on Population Data
Lloyd, Sorichetta and Tatem (2017) underlined that recent years have seen substantial growth in openly available satellite and other geospatial data layers, which represent a range of metrics relevant to global human population mapping at fine spatial scales. They also highlighted that detailed and contemporary spatial datasets that accurately describe human population distribution can support the measurement of the impacts of population growth, the monitoring of changes, environmental and health applications, and the planning of interventions. The specifications of such data differ widely, and therefore the harmonization of data layers is a prerequisite to constructing such datasets. They described the production methodology of WorldPop, a project that has produced an open access archive of 3 and 30 arc-second resolution gridded data. Four tiled raster datasets form the basis of the archive: (i) Viewfinder Panoramas topography clipped to Global Administrative Areas (GADM) coastlines; (ii) a matching ISO 3166 country identification grid; (iii) country area; and (iv) slope. Further layers include transport networks, land cover, nightlights, precipitation, travel time to major cities, and waterways.
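To relate the 3 and 30 arc-second resolutions to distances on the ground, the following sketch converts an angular cell size to approximate meters. It assumes a spherical earth with a mean radius of 6,371 km, which is a simplification introduced here and not part of the cited study; the function name is hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # assumed mean spherical radius, not from the source

def arcsec_to_meters(arcsec, latitude_deg=0.0):
    """Approximate (north-south, east-west) cell size in meters.

    The east-west size shrinks with the cosine of the latitude because
    meridians converge toward the poles.
    """
    angle_rad = math.radians(arcsec / 3600.0)
    north_south = EARTH_RADIUS_M * angle_rad
    east_west = north_south * math.cos(math.radians(latitude_deg))
    return north_south, east_west

for resolution in (3, 30):
    ns, ew = arcsec_to_meters(resolution)
    print(f"{resolution} arc-seconds at the equator: about {ns:.0f} m")
```

Under these assumptions, 3 arc-seconds corresponds to cells a little under 100 m across at the equator and 30 arc-seconds to cells a little under 1 km across.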
3.3 Health Data
This section describes some of the major types of health information that can be incorporated in GIS for health planning, evaluation, and research. The aim of this section is to introduce these data sets and highlight geographical issues that affect data use and integration in GIS.
Registration System Data
VITAL STATISTICS
Local governments in the United States and other countries routinely collect information on all births and deaths that occur in their jurisdictions. These vital records are an important source of spatial data for public health GIS.
Birth records document a wide range of conditions that affect newborn infants, including birthweight, gestational age, congenital malformations and obstetric procedures, along with the mother’s demographic and social characteristics and her use of prenatal services.
Review of Literature on Health Data
Ruiz and Sharma (2016) reviewed application of GIS in public health in India, which accounts for about 17% of the world population, stressing that the implementation of geospatial technologies and methods for improving health has become widespread in many nations. As regards the public health domain and the type of associated data, these researchers reported use of the following categories of data in the GIS environment, besides the demographic information: infectious diseases, bacterial and parasitic infections, influenza, sexually-transmitted infections, vector-borne disease, diarrhea, food-related disease, water quality, non-infectious diseases, pulmonary diseases, air quality, maternal and child health, nutrition, obesity, diabetes, trauma, and cancer. Some researchers used multiple data overlays and multivariate statistical analysis and demonstrated usefulness of a multilevel approach for geographical analysis of data at multiple spatial scales. One study even illustrated the use of visualization, data integration, spatial analysis, and spatial modeling for cancer research.
3.4 Making Population and Health Data Mappable
In order to use population, health, and health care data sets in GIS, the data sets must first be captured and linked to a foundation spatial database. Data capture is a complex process that draws on an ever-increasing array of tools, including scanning, downloading from the Internet, and entering data directly from the field via the Global Positioning System. This section focuses on the procedures typically used for capturing health information: address matching and joining.
Address Matching to Locate Health Events as Points
Health information is often georeferenced by street address. For example, we might have information on the residential addresses of people who died of breast cancer, or the addresses of hospitals, health clinics, schools, or workplaces.
Using the process of address-match geocoding, we can convert each address to a point on a map. At its simplest, address matching involves comparison of two data sets, one containing the addresses of health events and the other a foundation database with its own address information.
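The comparison step can be sketched in a few lines. The example below is a simplified illustration: the street segments, their address ranges, and their coordinates are hypothetical, and placing the point by linear interpolation along the address range is one common approach assumed here, not a procedure described in the text.

```python
def normalize(street):
    """Uppercase, drop periods, collapse whitespace, and standardize a few suffixes."""
    suffixes = {"STREET": "ST", "AVENUE": "AVE", "ROAD": "RD"}
    words = street.upper().replace(".", "").split()
    return " ".join(suffixes.get(word, word) for word in words)

# Hypothetical foundation records: street name, address range, end-point coordinates.
SEGMENTS = [
    {"street": "MAIN ST", "low": 100, "high": 198, "start": (0.0, 0.0), "end": (100.0, 0.0)},
    {"street": "ELM AVE", "low": 1, "high": 99, "start": (0.0, 0.0), "end": (0.0, 50.0)},
]

def address_match(address):
    """Return an interpolated (x, y) point for the address, or None if unmatched."""
    number_text, _, street_text = address.strip().partition(" ")
    if not number_text.isdigit():
        return None
    number = int(number_text)
    street = normalize(street_text)
    for seg in SEGMENTS:
        if seg["street"] == street and seg["low"] <= number <= seg["high"]:
            # Fraction of the way along the segment's address range.
            t = (number - seg["low"]) / (seg["high"] - seg["low"])
            (x0, y0), (x1, y1) = seg["start"], seg["end"]
            return (x0 + t * (x1 - x0), y0 + t * (y1 - y0))
    return None

print(address_match("149 Main Street"))  # matched and interpolated
print(address_match("500 Main St."))     # outside every address range: unmatched
```

Addresses that return None would need manual review, which is why match rates are usually reported alongside geocoded results.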
Review of Literature on Making Population and Health Data Mappable
Photis (2016) showed that through utilization of basic GIS mapping capabilities, it is possible to create maps and spatial interpretations that are not complex and do not require special data, personnel or systems, yet that provide significant insight into the investigation, assessment, and improvement of health-related policy and planning issues. Health Geography can provide a spatial understanding of a population’s health, the distribution of disease in an area, and the environment’s effect on health and disease. Thematic mapping involves mapping of feature attribute characteristics (e.g., census variables like total population, number of hospitals, number of hospital beds, number of patients per prefecture, and median household income). Census data combined with GIS analysis makes it possible for policy officials to better understand unmet needs.
Chapter 4
Mapping Health Information
Preparing and displaying maps of health information are among the most important functions of public health GIS. GIS offer a flexible, computerized environment that facilitates new forms of data exploration and analysis. One can easily pan across a map, zoom in on areas of interest, or query a database to examine areas or events of special concern.
Health information can be linked with social and environmental features to examine geographical associations. The map, then, is just one product of a process of exploring, viewing, and analyzing spatial information. There is no perfect map; rather, each map is one of an almost infinite array of possible representations of spatial information.
4.1 The Mapping Process
Advances in computer technology and GIS have fundamentally changed the process of mapping. Traditionally, maps were viewed primarily as tools of communication. The main goal was to communicate information most effectively by carefully preparing a “finished” map.
Review of Literature on the Mapping Process
Juarez et al. (2014) demonstrated use of GIS to (i) establish a core that supports analysis of the complex interactions between health outcomes, disparities, and the environment; (ii) promote the use of trans-disciplinary models and analyses to increase knowledge about the complex relationships between health disparities and the environment; and (iii) use public participatory geographic information systems (PPGIS) to engage community stakeholders in the use of spatial data and interactive mapping to reduce health disparities. Their study showed that mapping health information can be used to visualize geographic patterns and temporal trends at a county level, generate hypotheses, and identify “hot spots” to guide further data collection efforts and the targeting of public health interventions. In addition to generating traditional static maps, the GIS supports data visualization tools for modeling spatial and temporal associations and relationships.
4.2 Representing Health Information
Representing Area Data
Health information is often available for areas (ZIP codes, states, or countries) that form a template for representation. Area health data are spatially “filtered” with respect to predefined zones and thus are dependent on the zoning system used.
CHOROPLETH MAPPING
More commonly, area health data refer to rates or ratios or other statistics that apply to areas. In these situations, choropleth mapping is the preferred approach.
In a choropleth map, the data values that fall within a specific class interval are assigned a unique color, shade, or pattern. Differences in intensity are visible in the varying colors or patterns across the map.
Class Interval Selection
A key issue in choropleth mapping is the choice of class intervals. Changing the class interval scheme can fundamentally change how the map looks and the message it sends. Most GIS offer the mapmaker a range of options for defining class intervals. A common one is equal interval classification, in which the range of the data values (maximum value minus minimum value) is divided into a fixed number of classes.
Each class represents an equal interval of possible data values. Although this method works well for some data distributions, it performs poorly if there are extreme values in a highly skewed data distribution.
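Equal interval classification can be sketched as follows. The rates are hypothetical and the function names are illustrative; the example is chosen to show how a single extreme value pushes nearly every area into the lowest class.

```python
def equal_interval_breaks(values, n_classes):
    """Upper bounds of n_classes equal-width classes spanning min..max."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_classes
    return [lo + width * i for i in range(1, n_classes + 1)]

def classify(value, breaks):
    """Index of the first class whose upper bound is >= value."""
    for i, upper in enumerate(breaks):
        if value <= upper:
            return i
    return len(breaks) - 1  # guard against floating-point rounding at the maximum

rates = [2.0, 3.0, 4.0, 5.0, 6.0, 50.0]  # hypothetical area rates with one extreme value
breaks = equal_interval_breaks(rates, 4)
classes = [classify(r, breaks) for r in rates]
print(breaks)   # [14.0, 26.0, 38.0, 50.0]
print(classes)  # [0, 0, 0, 0, 0, 3]
```

Five of the six areas share one shade and two classes are empty, so the map would hide the variation among the areas with lower rates.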
Choropleth maps and analytical results also vary with the number and sizes of areal units, that is, the spatial scale of the data. This is known as the scale effect. Small areas capture the underlying pattern of health events, showing fine-grained variation over space.
In contrast, large areas mask local differences, reducing the variation in values over space. A county-scale map cannot show differences among towns and neighborhoods, for example, and state-level data hide disparities across counties.
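A small numerical sketch of the scale effect, using hypothetical counts: four small areas are aggregated into two larger areas, and the spread of rates is compared at each scale.

```python
# Hypothetical small areas: (large_area_id, cases, population)
small_areas = [
    ("A", 2, 1000), ("A", 18, 1000),
    ("B", 9, 1000), ("B", 11, 1000),
]

def rate_per_1000(cases, population):
    return 1000.0 * cases / population

# Rates at the fine scale, one per small area.
small_rates = [rate_per_1000(c, p) for _, c, p in small_areas]

# Aggregate cases and population to the large areas, then recompute rates.
totals = {}
for area_id, cases, population in small_areas:
    c, p = totals.get(area_id, (0, 0))
    totals[area_id] = (c + cases, p + population)
large_rates = {k: rate_per_1000(c, p) for k, (c, p) in totals.items()}

print(small_rates)  # [2.0, 18.0, 9.0, 11.0]
print(large_rates)  # {'A': 10.0, 'B': 10.0}
```

At the fine scale the rates range from 2 to 18 per 1,000, while at the coarse scale both areas show 10 per 1,000, so the contrast inside area A disappears from the map.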
Review of Literature on Representation of Health Information
Wei, Tong and Phillips (2017) noted that choropleth mapping is an important exploratory spatial data analysis technique that has been extensively used to visually explore the spatial pattern of attribute distributions across a region. It is one of the most widely used methods for visually exploring spatial distributions of demographic and socio-economic data. An essential procedure in choropleth mapping is determining class intervals to suitably group spatial units, and a variety of classification methods have been developed for this purpose. Examples include interchange heuristic, class separability, natural breaks, equal interval, and quantiles. The idea behind statistically optimal classification methods is to identify class breaks that give the highest within-class homogeneity so that spatial patterns can be best highlighted in choropleth mapping, though the criteria used to define homogeneity may vary in different studies. New classification schemes have also been developed for data with specific characteristics, such as the head/tail breaks method for data with a heavy-tailed distribution and a concentration-based classification scheme for rate data.
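Of the methods listed, quantile classification is simple enough to sketch directly. The version below is a minimal illustration with hypothetical rates, not the procedure of any particular GIS package; it ranks the values and assigns (nearly) equal numbers of areas to each class, and it does not treat tied values specially.

```python
def quantile_classes(values, n_classes):
    """Assign each value a class index so that classes hold nearly equal counts."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    classes = [0] * len(values)
    for rank, i in enumerate(order):
        classes[i] = rank * n_classes // len(values)
    return classes

rates = [2.0, 3.0, 4.0, 5.0, 6.0, 50.0]  # hypothetical area rates with one extreme value
print(quantile_classes(rates, 3))  # [0, 0, 1, 1, 2, 2]
```

Each class holds two areas regardless of the extreme value, which keeps every shade in use but places quite different rates (6 and 50) in the same class.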
4.3 Viewing Health Information
The ability to visualize and explore health data interactively is a main advantage of GIS in public health analysis. A view is a graphic representation of data. It is the part of the computer display board that one can see on the computer screen.
The extent of the view is always less than or equal to the addressable space in a data set. Simply put, one cannot display a larger geographic area than one has data for.
In GIS, the spatial objects in the view and the tables of attributes describing them can be directly linked. The analyst can access the two together and explore the relationships among attribute data in the table and the spatial representation of that data in the view.
Public health analysts will typically approach a database with two types of questions. What are the health problems of interest, and where do they occur? Where are the places of interest, and what kinds of health problems occur there?
Many public health organizations are organized functionally to address specific kinds of health problems, such as maternal and child health, infectious disease, or injury. GIS users in these settings will have already established the health problems of interest.
Viewing by Attribute
Viewing by attribute starts with the characteristics of events and identifies those events on a map. In the simplest case, we identify the location of a single event. For example, assume that we have a database showing the locations of all motor vehicle collisions in Connecticut. One particular collision that resulted in a fatality is of interest.
A more complex operation is to select by attribute, identifying multiple records based on their common attributes.
We select events that have particular characteristics and display their locations on the view. For example, policymakers may want to know the locations of all motor vehicle collisions that involve pedestrians. We query the table for all collisions involving pedestrians, select those collisions, and their location will be identified on the view.
It is also possible to select events by both location and attribute. For example, we might want to identify all motor vehicle collisions that occurred in Hartford, Connecticut, and involved pedestrians. The geographical query is to select the town of Hartford and the attribute query is to select only collisions involving pedestrians.
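The collision queries described above can be sketched with a small in-memory table. The records and field names are hypothetical, and the town name stands in for the geographical query; in a real GIS that step would normally be a spatial test against the town boundary.

```python
# Hypothetical collision records; field names are illustrative only.
collisions = [
    {"id": 1, "town": "Hartford", "pedestrian": True},
    {"id": 2, "town": "Hartford", "pedestrian": False},
    {"id": 3, "town": "New Haven", "pedestrian": True},
    {"id": 4, "town": "Hartford", "pedestrian": True},
]

def select(records, **criteria):
    """Return the records whose fields equal every given criterion."""
    return [r for r in records
            if all(r.get(field) == value for field, value in criteria.items())]

# Attribute query only: all collisions involving pedestrians.
pedestrian_ids = [r["id"] for r in select(collisions, pedestrian=True)]

# Combined query: collisions in Hartford that involved pedestrians.
hartford_pedestrian_ids = [
    r["id"] for r in select(collisions, town="Hartford", pedestrian=True)
]

print(pedestrian_ids)           # [1, 3, 4]
print(hartford_pedestrian_ids)  # [1, 4]
```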
Geographical Viewing
Geographical viewing starts with geographic areas of interest and asks about the attributes of events located within those areas. In a GIS view, the analyst can select locations, or pan and zoom to particular locations, and then examine the attribute data in the table for those selected events.
The map is the starting point, and the analyst links back to attribute information in the table. Using standard query tools in GIS, one can select features according to their point or area locations.
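As a rough illustration of selecting features by location and then linking back to their attributes, the sketch below selects point events that fall inside a rectangular area of interest. The events, coordinates, and field names are hypothetical, and a rectangle is used for simplicity; selecting within an irregular area would require a point-in-polygon test instead.

```python
# Hypothetical point events with coordinates and one attribute.
events = [
    {"id": "a", "x": 2.0, "y": 3.0, "type": "fall"},
    {"id": "b", "x": 8.0, "y": 1.0, "type": "collision"},
    {"id": "c", "x": 4.5, "y": 4.0, "type": "collision"},
]

def select_in_box(records, xmin, ymin, xmax, ymax):
    """Return the records whose point location lies inside the rectangle."""
    return [r for r in records
            if xmin <= r["x"] <= xmax and ymin <= r["y"] <= ymax]

# Start from the map (the area of interest), then read the linked attributes.
selected = select_in_box(events, 0.0, 0.0, 5.0, 5.0)
selected_ids = [r["id"] for r in selected]
selected_types = [r["type"] for r in selected]

print(selected_ids)    # ['a', 'c']
print(selected_types)  # ['fall', 'collision']
```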
Changing the View
Views are not static. Within the limits defined by the scale and extent of the data set, one can change the view, moving across the map or focusing on areas of interest. Pan refers to movement across a map, bringing new areas into the view.
Often the view includes just part of the geographic extent of a data set. Using the pan function, we move across that geographic extent to bring another part of the map into the view. We can also change the view by zooming in or out.
When we zoom in to an area, we move toward it, keeping it in focus in the view. Zooming in is useful for getting a closer look at areas of special interest. By zooming in to a cluster of health events, the detailed geography of the disease cluster becomes apparent. Drawing upon other data layers, we can observe the concentration of events along roads, or in relation to parks, landfills, and other features.
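Pan and zoom can be thought of as operations on the rectangle that defines the view. The sketch below represents a view as (xmin, ymin, xmax, ymax) in map units; this representation and the function names are assumptions made for illustration, not the interface of any particular GIS.

```python
def pan(view, dx, dy):
    """Shift the view by (dx, dy) map units without changing its size."""
    xmin, ymin, xmax, ymax = view
    return (xmin + dx, ymin + dy, xmax + dx, ymax + dy)

def zoom(view, factor):
    """Zoom about the view center; factor > 1 zooms in, factor < 1 zooms out."""
    xmin, ymin, xmax, ymax = view
    cx, cy = (xmin + xmax) / 2, (ymin + ymax) / 2
    half_w = (xmax - xmin) / (2 * factor)
    half_h = (ymax - ymin) / (2 * factor)
    return (cx - half_w, cy - half_h, cx + half_w, cy + half_h)

def clamp_to_data(view, data_extent):
    """Shift the view back inside the data extent (view assumed no larger than it)."""
    xmin, ymin, xmax, ymax = view
    dxmin, dymin, dxmax, dymax = data_extent
    shift_x = max(dxmin - xmin, 0) + min(dxmax - xmax, 0)
    shift_y = max(dymin - ymin, 0) + min(dymax - ymax, 0)
    return pan(view, shift_x, shift_y)

view = (0, 0, 100, 100)
print(pan(view, 50, 0))                                      # (50, 0, 150, 100)
print(zoom(view, 2))                                         # (25.0, 25.0, 75.0, 75.0)
print(clamp_to_data(pan(view, 50, 0), (0, 0, 120, 100)))     # (20, 0, 120, 100)
```

The clamp step reflects the limit noted above: the view can move and resize only within the extent of the data set.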
Web mapping greatly expands the range of cartographic variables and map symbols beyond the set of visual variables. Animated map sequences showing spatial and temporal change are easily created in a web environment. The web also provides an ideal platform for three-dimensional mapping of health data.
Although web mapping has many advantages, map production and distribution may be limited by technological barriers. Despite increases in the size and resolution of computer screens, the screen still limits the portion of the map that can be viewed and its level of detail. That portion will be determined by the size of the screen and the area inside the web browser. If a map is larger than the available space, viewers will have to scroll to see every part of the map.
Computer screens are raster devices. Screen resolution is generally much lower than the resolution of many desktop printers, which are capable of at least 1,200 dots per inch (dpi). Viewers’ monitors may have resolutions ranging from 60 to 100 dpi. The lower resolution also limits geographic detail, text size, and shade patterns.
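The gap between print and screen resolution can be made concrete with a little arithmetic. The sketch below uses the resolutions quoted above (1,200 dpi for a printer, 60 to 100 dpi for a monitor) and an assumed map width of 8 inches, which is a hypothetical figure chosen for illustration.

```python
def dots_across(width_inches, dpi):
    """Number of dots (or pixels) available across a map of the given width."""
    return round(width_inches * dpi)

MAP_WIDTH_INCHES = 8  # hypothetical map width

printed = dots_across(MAP_WIDTH_INCHES, 1200)
screen_low = dots_across(MAP_WIDTH_INCHES, 60)
screen_high = dots_across(MAP_WIDTH_INCHES, 100)

print(printed)                # 9600
print(screen_low)             # 480
print(screen_high)            # 800
print(printed / screen_high)  # 12.0
```

Even on the sharper monitor, the same 8-inch map has one twelfth as many dots across as the printed version, which is why fine lines, small text, and dense shade patterns that print cleanly may not survive on screen.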
Review of Literature on Viewing Health Information
Carroll, Au, Detwiler, Fu, Painter and Abernethy (2014) underlined that the development of increasingly sophisticated GISs has provided a new set of tools for public health professionals to monitor and respond to health challenges. They allow public health professionals and researchers to integrate, synthesize, and visualize information pertaining to disease surveillance, prevention, and control. These systems can help pinpoint cases and exposures, characterize spatial trends, identify disease clusters, correlate different sets of spatial data, and test statistical hypotheses. Often, these analyses are aided by visualization and mapping of data provided via web services or a user interface. There are many approaches to delivering GIS functions based on various sources of public health data, including geocoding, integration of data sources, and cluster detection. Mapping of health data is commonly achieved through dot maps, choropleth maps, and isopleth (or gradient) maps. With respect to the visualization methods for GIS in public health, the simplest visualizations plot or aggregate spatial data to deliver static point or choropleth maps of individual or aggregate data. Some GIS or spatial statistical methods seek to perform kernel-based smoothing to estimate risk maps, visualize disease risk according to a statistical model, or compare one feature with another. While the ability to zoom and pan to navigate maps is a common interactive feature enjoyed by users, more advanced systems contain interactive controls to enable users to retrieve information about selected items or regions, visualize the results of arbitrary queries, control visualization options, control temporal ranges of data returned, or link displays of data with alternate or comparative visualizations.
Bui and Pham (2016) noted that the use of spatial information has become popular with the integration of GIS and statistical packages in processing health data. In many cases, geo-referenced data are used in GIS in combination with attribute data that describe the characteristics of disease locations. On that basis, visualization, exploration, and modeling can be carried out to assist in health decision making. Spatial locations on maps can be defined with a geocoding tool that enriches a description of a location, most typically a postal address or place name, with coordinates, or by matching a column value in a statistical table to the attribute table of spatial data (shapefile). Geocoding provides the position of points as a coordinate pair, which facilitates spatial analysis in geographic information systems. Recently, efforts have been made to develop more active and dynamic systems and to make Web-based GIS more interactive for end users, for example through statistical packages for health data processing and health information dissemination. The Web-based system provides fundamental tools for visualization and query, and for spatial analysis in some cases.