Fish Species Classification using Semantic Web Approaches

Abstract— This paper collates data collected on fishes, using Lagos State as a case study. The collated data is stored in the database model associated with the Semantic Web, the triplestore. The data gathered on these fishes is processed by adding their locations in Lagos State, their binomial nomenclature, their common English names and their images. To implement the Semantic Web approach, an ontology holding all the data on the fishes is created in the Web Ontology Language (OWL) using the Protégé platform. The ontology can then be queried through an endpoint that accepts queries written in SPARQL (SPARQL Protocol and RDF Query Language). This project makes use of an offline SPARQL endpoint provided by the Apache Jena Fuseki server. The ontology is loaded into the endpoint for storage and querying, and several queries are run to determine the effectiveness and output of the entire system.

Keywords— fish; species; semantics; classification; Lagos

I. INTRODUCTION

Lagos State has many water bodies, which contain a wide variety of sea creatures; of these, fishes are considered in this paper. Because the state is surrounded by these water bodies, it can be described as a wetland. The state has major water bodies in four of its five divisions, namely Badagry, Epe, Ikorodu and Lagos Island.

Collecting data on fishes and storing them for both present usage and future purposes is important for the development and acquisition of knowledge. This paper will make use of the idea of the Semantic Web for proper processing, storage and presentation of the data collected on the fishes.

The term ‘Semantic Web’ was coined by Tim Berners-Lee, inventor of the World Wide Web and director of the World Wide Web Consortium (W3C), to mean a web of data that can be processed by machines.

The Semantic Web provides a common framework that allows data to be shared and reused across application, enterprise and community boundaries [1].


Problem statement

There are several issues that warrant the proposal of this paper’s topic and which the paper will address:

i. Improper classification methods are employed in the organisation of gathered fish data. Addressing this requires a good knowledge of fishes and of the steps involved in categorising or classifying them, formally known as fish taxonomy.

ii. Presently, there is no computer-based fish classification system for fishes found in Lagos waters.

The remainder of the paper is organised as follows. The rest of this section reviews the literature. Section II discusses the method used and presents the Semantic Web technologies. Development of the database for fish species is discussed in Section III. Section IV discusses the results, and Section V concludes the paper.

Literature review

Kutlu et al. [2] designed a multi-stage fish species classification system based on biometric points of the fishes. The system consists of data acquisition, pre-processing, feature extraction, classification and decision phases. The performance of the classification tasks is reported in terms of the recall, precision and specificity of each class, together with the average accuracy. The proposed method is able to classify fish into different species.

A drawback of this system is that it does not consider certain similarities among the classes.

Huang et al. [3] presented a Balance-Guaranteed Optimised Tree with a reject option for live fish recognition. The proposed system recognises the 15 most common species of fish and detects new classes in an unrestricted natural environment recorded by underwater cameras, using features that combine colour, shape and texture properties. The method, which is based on hierarchical classification, achieves significant improvements over existing state-of-the-art techniques on a live fish image dataset from the literature.

In [4], the authors proposed a method that combines K-means and the Hue Saturation Value (HSV) colour space for image segmentation based on the body colour of koi fish. The study focuses on segmenting the body pattern of koi. The K-means method was used to separate the object from the background using two colour features. The method achieves an accuracy of 97% in testing.

Ogunlana et al. [5] proposed a support vector machine (SVM) based system that analyses and classifies fish species according to their characteristics by constructing an N-dimensional hyperplane that optimally separates the fish species into categories. The hyperplane is based on a predictor variable and a vector of predictor values, which are the set of values assigned to the different fields in the dataset. The proposed method performs better than existing methods such as Artificial Neural Network and K-means clustering-based algorithms.

II. METHOD

One of the problems of fish identification is inadequate and incorrect classification. Such inadequacy of classification suggests the need for a proper taxonomy of fish. Taxonomy is defined as the science of the description and classification of organisms, essential to the inventory of life on earth [6].

Taxonomy makes use of several ranks or levels and sub-levels to identify and classify organisms. This paper makes use of the following seven ranks of taxonomy:

i. Kingdom

ii. Phylum

iii. Class

iv. Order

v. Family

vi. Genus

vii. Species

These ranks help in the proper identification of fishes.

Semantic Web Technologies

As noted in the introduction, the term “Semantic Web” was coined by Tim Berners-Lee to mean a web of data that can be processed by machines. The Semantic Web, also known as Web 3.0, is deemed by many to be the future of the Internet. “The Semantic Web is not a separate Web but an extension of the current one, in which information is given well-defined meaning, better enabling computers and people to work in cooperation” [7]. It serves as a source from which information can be retrieved (for example, by web spiders reading RDF files) and accessed through Semantic Web agents or Semantic Web services.

The Semantic Web is implemented using several technologies:

• RDF – Resource Description Framework.

• OWL – Web Ontology Language.

• RDFS – Resource Description Framework Schema.

• SPARQL – SPARQL Protocol and RDF Query Language.

RDF is a framework created for representing data in the form of triples as shown in Figure 1.

As the name suggests, these triples consist of three parts: subject, predicate and object.

Figure 1. RDF triple
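The structure of a triple can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the namespace is the ontology URI given later in the paper and hasFamily is one of its object properties, but the individual names Tilapia and Cichlidae are hypothetical placeholders, not entries taken from the paper's dataset.

```python
from collections import namedtuple

# A triple is an ordered (subject, predicate, object) statement.
Triple = namedtuple("Triple", ["subject", "predicate", "object"])

# Namespace of the fish ontology as quoted in the paper.
FISH = "http://www.damiworks.bugs3.com/swt/fish.owl#"

statement = Triple(
    subject=FISH + "Tilapia",      # hypothetical fish individual
    predicate=FISH + "hasFamily",  # object property named in the paper
    object=FISH + "Cichlidae",     # hypothetical value individual
)

# Each part of the triple is a full URI.
print(statement.subject)
print(statement.predicate)
print(statement.object)
```

Read left to right, the statement says that the subject is related to the object by the predicate.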

RDF triples in file format are known as serialisations. Several serialisation syntaxes are:

i. RDF/XML

ii. Turtle (TTL)

iii. N3

RDF/XML was the first standardised serialisation syntax, but Turtle is gaining popularity because it is easier to read and write. RDFS and OWL are languages used to define ontologies, classes, subclasses, domains, ranges and several other semantic properties. An ontology is a formal naming and definition of the types, properties and interrelationships of the entities that exist in a particular domain of discourse [8]. SPARQL is the query language used to query the triples defined in serialisations.
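To give a feel for why Turtle is considered easy to read, the following sketch renders a small list of triples in Turtle syntax using only the Python standard library. The helper name to_turtle and the two example statements are assumptions made for illustration; they are not part of the paper's tooling or dataset, and a real project would more likely export the serialisation from Protégé.

```python
def to_turtle(triples, prefix, namespace):
    """Render (subject, predicate, object) local names as a Turtle document."""
    # The @prefix line binds a short label to the namespace URI.
    lines = [f"@prefix {prefix}: <{namespace}> .", ""]
    for s, p, o in triples:
        # One statement per line, terminated by a full stop.
        lines.append(f"{prefix}:{s} {prefix}:{p} {prefix}:{o} .")
    return "\n".join(lines)


# Hypothetical statements; the property names come from the paper.
triples = [
    ("Tilapia", "hasFamily", "Cichlidae"),
    ("Tilapia", "isLocated", "Epe"),
]

doc = to_turtle(triples, "fish", "http://www.damiworks.bugs3.com/swt/fish.owl#")
print(doc)
```

Each output line after the prefix declaration is one triple, which is what makes the format compact compared with RDF/XML.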

III. DEVELOPMENT OF DATABASE FOR FISHES

Developing the database for fishes requires three steps:

i) Gather data on fish species in Lagos State: The data must include the name, classification, location and image of the fish.

ii) Create an RDF serialisation using the gathered data: The Web Ontology Language (OWL) was used to create the ontology, which consists of the fish data and other data that help to link the records together. Writing the ontology by hand would require sufficient knowledge of Extensible Markup Language (XML), so this paper makes use of Stanford’s Protégé to create it. The software then generates an OWL serialisation in the selected syntax, in this case XML, as shown in Figure 2.

iii) Query the ontology: The standard query language for Semantic Web data is SPARQL (SPARQL Protocol and RDF Query Language). The database model used with the Semantic Web is the triplestore. The triplestore provides a SPARQL endpoint (an interface for submitting queries and obtaining results), which can be accessed either locally or remotely.

The fish ontology consists of two classes, both of which are subclasses of the class Thing. The two classes are:

i. Fish

ii. Value:

• Taxonomy

• Others

The Value class has two subclasses. The Taxonomy class contains the taxonomic description of the fishes, while the Others class contains other fish properties such as:

i. Habitat

ii. Location

iii. Image

Figure 2. Ontology class
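The class hierarchy described above can be sketched as a simple child-to-parent mapping. This is a minimal illustration, not the OWL file itself; the function name ancestors is hypothetical, and Habitat, Location and Image are left out because the text does not state whether they are modelled as subclasses.

```python
# Child class -> parent class, following the hierarchy described in the text.
SUBCLASS_OF = {
    "Fish": "Thing",
    "Value": "Thing",
    "Taxonomy": "Value",
    "Others": "Value",
}


def ancestors(cls):
    """Return the chain of superclasses of cls, nearest first."""
    chain = []
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        chain.append(cls)
    return chain


print(ancestors("Taxonomy"))
print(ancestors("Fish"))
```

Walking the mapping upwards shows that every class in the ontology eventually reaches Thing.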

The fish ontology also contains object properties, data properties and individuals. Object properties express the relationship between the subject and the object of a triple. In Figure 2, the object property of the triple is ‘hasFamily’. As with classes, the defined object properties are sub-properties of the top-level object property ‘topObjectProperty’.

The following object properties were defined to elucidate the relationships existing between the fishes and their specific data as shown in Figure 3.

i. hasKingdom

ii. hasPhylum

iii. hasClass

iv. hasOrder

v. hasFamily

vi. hasGenus

vii. hasSpecies

viii. isLocated

ix. Habitat

x. Image

Figure 3. Object Properties
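The link between the seven taxonomic ranks and the object properties can be sketched as follows. The property names are those listed above; the function taxonomy_triples, the individual name NileTilapia and the example record are assumptions for illustration only and are not taken from the paper's dataset.

```python
# Taxonomic rank -> object property, in rank order.
RANK_PROPERTIES = {
    "Kingdom": "hasKingdom",
    "Phylum": "hasPhylum",
    "Class": "hasClass",
    "Order": "hasOrder",
    "Family": "hasFamily",
    "Genus": "hasGenus",
    "Species": "hasSpecies",
}


def taxonomy_triples(individual, record):
    """Turn a rank -> value record into (subject, property, value) triples."""
    missing = [rank for rank in RANK_PROPERTIES if rank not in record]
    if missing:
        raise ValueError("missing ranks: " + ", ".join(missing))
    return [
        (individual, RANK_PROPERTIES[rank], record[rank])
        for rank in RANK_PROPERTIES
    ]


# Illustrative record only; not copied from the paper's fish data.
nile_tilapia = {
    "Kingdom": "Animalia",
    "Phylum": "Chordata",
    "Class": "Actinopterygii",
    "Order": "Cichliformes",
    "Family": "Cichlidae",
    "Genus": "Oreochromis",
    "Species": "Oreochromis_niloticus",
}

for triple in taxonomy_triples("NileTilapia", nile_tilapia):
    print(triple)
```

Rejecting records with missing ranks is one simple way to guard against the incomplete classification that the paper identifies as a problem.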

The individuals of the fish ontology are the entities that are assigned the above-mentioned object properties. The individuals are listed in Figure 4.

Figure 4. Ontology Individuals

IV. DISCUSSION

Some queries are run on the developed system to determine if it is working as it should.

A. Identifying All Fishes That Belong to Kingdom ‘Animalia’

The query below determines the fishes that belong to the kingdom Animalia.

PREFIX fish: <http://www.damiworks.bugs3.com/swt/fish.owl#>

SELECT ?fish

WHERE { ?fish fish:hasKingdom fish:Animalia }

Figure 5. Kingdom Query

The variable ?fish is bound to the fishes, identified by their English names, that have the object property ‘hasKingdom’ with the value Animalia. The hasKingdom property can be specified by stating its full URI, that is

http://www.damiworks.bugs3.com/swt/fish.owl#hasKingdom.

Writing full URIs is tedious, so a prefix is used instead. In this case the prefix ‘fish’ is assigned the URI

http://www.damiworks.bugs3.com/swt/fish.owl#

and is prepended to the desired terms, here hasKingdom and Animalia. The result for the above query is given in Figure 6.

Figure 6. Kingdom Query Result
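How a triple pattern such as the one in the kingdom query selects results can be sketched in plain Python. This is a simplified model, not the Fuseki implementation: None stands in for a SPARQL variable, and the fish names in the data are hypothetical.

```python
# A tiny in-memory stand-in for the triplestore (hypothetical data).
TRIPLES = [
    ("Tilapia", "hasKingdom", "Animalia"),
    ("Catfish", "hasKingdom", "Animalia"),
    ("Tilapia", "isLocated", "Epe"),
]


def match(triples, pattern):
    """Return the triples that agree with pattern; None matches anything."""
    return [
        t for t in triples
        if all(want is None or want == got for want, got in zip(pattern, t))
    ]


# Equivalent in spirit to: SELECT ?fish WHERE { ?fish fish:hasKingdom fish:Animalia }
fish = [s for s, _, _ in match(TRIPLES, (None, "hasKingdom", "Animalia"))]
print(fish)
```

The subject position is left open, so every subject whose predicate and object match the pattern is returned as a binding for ?fish.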

B. Identifying the Fishes Located in the Epe Local Government Region

The following query, shown in Figure 7, returns the fishes in Lagos State that are located in the Epe Local Government region.

PREFIX fish: <http://www.damiworks.bugs3.com/swt/fish.owl#>

SELECT ?fish

WHERE { ?fish fish:isLocated fish:Epe }

Figure 7. Epe Fish Query

The result of the query above is displayed in Figure 8.

Figure 8. Epe Fish Query Result
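Besides the Fuseki web interface, a query like this can be submitted to an endpoint as an HTTP GET request with a query parameter, as defined by the SPARQL protocol. The sketch below only builds the request URL and does not contact a server. The endpoint address is an assumption: it supposes a local Fuseki instance on its default port with a dataset named fish, which the paper does not specify.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Assumed address of a local Fuseki dataset called "fish" (hypothetical).
ENDPOINT = "http://localhost:3030/fish/sparql"

QUERY = """PREFIX fish: <http://www.damiworks.bugs3.com/swt/fish.owl#>
SELECT ?fish
WHERE { ?fish fish:isLocated fish:Epe }"""


def build_request_url(endpoint, query):
    """Encode the query text as the 'query' parameter of a GET request."""
    return endpoint + "?" + urlencode({"query": query})


url = build_request_url(ENDPOINT, QUERY)
print(url)

# Round trip: decoding the URL recovers the original query text.
decoded = parse_qs(urlsplit(url).query)["query"][0]
print(decoded == QUERY)
```

With a running server, the URL could be fetched using urllib.request.urlopen, typically with an Accept header such as application/sparql-results+json to request JSON results.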

V. CONCLUSION

The Semantic Web, or Web 3.0, is regarded as the next generation of the Web. It does not replace the current World Wide Web (Web 2.0); instead, it adds more functionality and meaning to it, hence the name ‘Semantic’ Web. This is achieved through several technologies such as RDF, RDFS, OWL, SPARQL and XML.

Semantic Web technologies were used to create a database that stores data about fishes in Lagos waters. The fish data were gathered from students of Fisheries Department, Lagos State University and from the work of [9].

The Web Ontology Language was used to create the fish ontology for the paper. The Apache Jena Fuseki server provided the triplestore into which the triples were loaded, stored and queried, as well as an interface for querying the data. The fish ontology has been made available online and can be used in the description of other data (triples), just like the Friend of a Friend (FOAF) and Dublin Core ontologies.

This work has some limitations. Several other fishes could not be fully processed and were therefore not included among the 20 distinct species of fishes in Lagos waters that were inserted into the triplestore.

The images of the fishes could not be displayed in the ontology created. They can only be viewed by visiting the link returned by the image query for each fish.

The fish database created has not yet been made public.

This work can be improved by addressing the limitations listed above. Triples for more fishes in Lagos waters can be added to the triplestore, making it comprehensive enough for use in the Fisheries Department of Lagos State University and in other universities in Lagos State, other states of Nigeria, and the world.

The SPARQL endpoint offered locally by the Jena Fuseki server can be made public, just as the ontology has been, to allow the fish data to be queried globally.

In conclusion, the Semantic Web is an advancement in technology that aims to make sharing of information a lot easier. The Semantic Web can be applied to any area of study.

It is the dream of Tim Berners-Lee that the Semantic Web be accepted and employed in the implementation and improvement of the World Wide Web [10]. This paper takes a step towards realising that dream.

References

[1] R. Peinl, “Semantic Web: State of the Art and Adoption in Corporations,” KI-Künstliche Intelligenz, vol. 30, pp. 131-138, 2016.

[2] Y. Kutlu, B. Iscimen, and C. Turan, “Multi-stage fish classification system using morphometry,” Fresenius Environmental Bulletin, vol. 26, pp. 1911-1917, 2017.

[3] P. X. Huang, B. J. Boom, and R. B. Fisher, “Hierarchical classification with reject option for live fish recognition,” Machine Vision and Applications, vol. 26, pp. 89-102, 2015.

[4] D. Kartika and D. Herumurti, “Koi fish classification based on HSV color space,” in Proceedings of Int’l Conf. on Information & Communication Technology and Systems (ICTS), Surabaya, Indonesia, 2016, pp. 96-100.

[5] S. Ogunlana, O. Olabode, S. Oluwadare, and G. Iwasokun, “Fish Classification Using Support Vector Machine,” African Journal of Computing & ICT, vol. 8, pp. 75-82, 2015.

[6] A. Hamilton, The Evolution of Phylogenetic Systematics, vol. 5. Univ. of California Press, 2013.

[7] T. Berners-Lee, J. Hendler, and O. Lassila, “The semantic web,” Scientific American, vol. 284, pp. 28-37, 2001.

[8] T. Gruber, “Ontology,” in Encyclopedia of Database Systems, L. Liu and M. T. Özsu, Eds. Springer-Verlag, 2008.

[9] B. Mbawuike and E. Ajado, “Assessment of ornamental fish species and fishing methods in Ibiajegbende, Lagos State, Nigeria,” African Journal of Applied Zoology and Environmental Biology, vol. 7, pp. 23-27.

[10] N. Shadbolt, T. Berners-Lee, and W. Hall, “The semantic web revisited,” IEEE Intelligent Systems, vol. 21, pp. 96-101, 2006.
