CLASSIFICATION OF POLYMER SAMPLES USING SVM

Abstract

In today's world most objects are processed, and many are also created, online. The rapid growth of technology has reduced manual work and is making production in many industries automated. One such automation requirement arises in the chemical industry, where automated software is needed to classify various kinds of plastics based on their absorbance values. One efficient approach to classification is the support vector machine, which provides a classification model that is trained and tested.

This paper presents a solution for automating the sorting of various kinds of plastic using a data set obtained by Near Infrared Spectroscopy (NIRS), with the Fisher Iris data set serving as a reference example of discriminant analysis. Plastics are non-biodegradable materials in everyday use which, when not disposed of properly, have adverse effects on the environment. To recycle plastics, the different types of plastics (polymers) need to be identified and separated. For economic reasons plastics must be identified and sorted instantly. The NIRS technique has been used for the instantaneous identification of plastics, and measurements made by NIRS are quite accurate and fast. The algorithm needed to process the NIRS data and obtain information on the polymer category is written in the general-purpose, high-level programming language Python as well as in MATLAB. To increase the efficiency of this process we also implement the Kennard-Stone (KS) algorithm.

1. INTRODUCTION

Plastics are omnipresent and are contaminating the environment. Disposal of plastics has become a technological and social issue that has attracted a lot of attention from researchers, business people, politicians, environmental activists and the general public. One way to reduce the environmental pollution caused by plastic waste, comprising disposables and durables, is to recycle it: that is, to recover the used plastics from municipal or industrial waste streams and convert them into new useful objects. Recycling of plastic waste is steadily gaining importance as a result of efforts to conserve oil resources and of the shortage of disposal sites. At present the plastic waste is separated into different material types by manual sorting, to obtain recyclable materials of high value. The purity of the sorting fractions obtained in this way is not sufficient for direct reuse as pure polymers. Moreover, manual sorting is uneconomical, and the working conditions are not only unpleasant but even dangerous to health. Considering these difficulties, an automatic plastic sorting approach, involving automatic identification of materials followed by mechanical sorting, appears to be an attractive alternative to manual sorting. NIR spectroscopy helps in the identification of individual plastic types and offers a promising approach to waste sorting. Online inspection technology ideally makes use of NIR spectrometry, which is capable of operating quickly. Spectroscopic methods are non-invasive and non-destructive measuring systems.

Spectroscopic methods offer spectral information from which meaningful data can be quickly extracted for analysis. The detection speed should, however, match the specific situation created by the type and size of the arriving parts. The identification is very accurate in terms of its ability to distinguish different plastics, and is reasonably precise.

At present, sorting is done either fully manually or using specialised machinery that is very expensive. Maintaining this method of sorting plastics also requires a lot of human resources. The manual process is neither effective nor efficient, and it is highly prone to error. Cross-verification of the manual process is, again, extremely difficult, and a great deal of time is consumed in the entire process. Hence an automated software package that can sort the plastics based on the values obtained from NIRS is to be developed.

1.1 OBJECTIVE

With this paper, we assist in the development of a plastic sorting technology using NIR (Near Infrared) spectroscopy. Different types of plastics, viz. PPT, PVC etc., show different behavior when subjected to near infrared rays. This difference in behavior can be analysed to sort various kinds of plastics at speed and with negligible error. The technology has high value in the plastic recycling industry along with several other similar industries.

In this paper, we use algorithms such as discriminant analysis and SVM to study various sorting techniques in machine learning. We then take the data set, which contains the data on the behavior of plastics under NIR rays, and develop a behavior pattern for each type. This is done using the SVM algorithm in MATLAB.

For the second part, we convert the SVM algorithm developed in MATLAB to the general-purpose, high-level programming language Python. The main idea of this project is to develop know-how on the sorting of various plastic types and then help transfer this method to industry.

1.2 PROBLEM DEFINITION

Plastic waste typically includes six types of materials, namely polyethylene (PE), polyethylene terephthalate (PET), polypropylene (PP), polyvinyl chloride (PVC), high density polyethylene (HDPE) and polystyrene (PS). The experimental lab model classifies them and sorts out PET alone. With the existing set-up, PET materials can be sorted with close to 100% accuracy at a throughput of up to 200 kg per hour. The maximum throughput is limited by the speed of the spectrograph used in the system. Higher throughputs of up to 1 tonne/hr can be achieved by using high-speed spectrographs and a faster sorting routine.

To increase the efficiency of sorting, an automatic system is to be developed for the polymer samples. The proposed method should be able to take an analysis of NIRS graphs as input and provide an output that classifies the polymers based on their absorbance values and other characteristics.

2. METHODOLOGY

The main idea of this paper was to develop know-how on the sorting of various plastic types and then help transfer this method to industry. NIR spectroscopy is used to obtain the data set containing the polymer samples.

Before discussing the method of classification, we give a brief account of NIR spectroscopy.

Near-infrared spectroscopy (NIRS) is a spectroscopic technique that uses the near-infrared region of the electromagnetic spectrum (from about 800 nm to 2500 nm). Typical applications include pharmaceuticals, medical diagnostics (including blood glucose and pulse oximetry), food and agrochemical quality control, and combustion research, as well as research in functional neuroimaging, sports medicine and science, elite sports training, ergonomics, rehabilitation, neonatal research, brain-computer interfaces, urology (bladder contraction), and neurology (neurovascular coupling).

Plastic resins are composed of a variety of polymer types. Similarities in the size and shape of the resins make them difficult to distinguish by sight alone. Here, near infrared (NIR) spectroscopy is used to sort and identify coloured resins composed of various polymers. Diffuse reflection measurements are made in the NIR region to capture the distinct spectral differences that result from the different polymer compositions, while avoiding the detection of spectral variations arising from resin colour.

Figure 2.1: NIRS Spectroscopy

2.1 SVM

In machine learning, support vector machines (SVMs, also called support vector networks) are supervised learning models with associated learning algorithms that analyze data and recognize patterns, and they are used for classification and regression analysis. Given a set of training examples, each marked as belonging to one of two categories, an SVM training algorithm builds a model that assigns new examples to one category or the other, making it a non-probabilistic binary linear classifier. An SVM model is a representation of the examples as points in space, mapped so that the examples of the separate categories are divided by a clear gap that is as wide as possible. New examples are then mapped into the same space and predicted to belong to a category based on which side of the gap they fall on.
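The decision rule described above can be sketched in a few lines of Python. This is only an illustration of a linear SVM that has already been trained: the weight vector `w`, the offset `b` and the sample points are hypothetical values chosen for the example, not parameters from this work.

```python
import math

def decision_value(w, b, x):
    """Signed score w.x + b of sample x relative to the separating hyperplane."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def predict(w, b, x):
    """Assign class +1 or -1 according to the side of the gap x falls on."""
    return 1 if decision_value(w, b, x) >= 0 else -1

def margin_width(w):
    """Width of the gap between the two classes, 2 / ||w||."""
    return 2.0 / math.sqrt(sum(wi * wi for wi in w))

# Hypothetical trained parameters (assumed for illustration only).
w = [1.0, 1.0]
b = -3.0

print(predict(w, b, [3.0, 2.0]))   # score 2.0, so class +1
print(predict(w, b, [0.5, 1.0]))   # score -1.5, so class -1
print(margin_width(w))             # 2 / sqrt(2)
```

Training an SVM amounts to choosing `w` and `b` so that this gap is as wide as possible while the training examples stay on their correct sides.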

2.2 DISCRIMINANT ANALYSIS TECHNIQUES

There are two types of these techniques which are described as follows:

2.2.1 LINEAR DISCRIMINANT ANALYSIS

Linear discriminant analysis (LDA) is a method used in statistics, pattern recognition and machine learning to find a linear combination of features that characterizes or separates two or more classes of objects or events. The resulting combination may be used as a linear classifier or, more commonly, for dimensionality reduction before later classification.

2.2.2 QUADRATIC DISCRIMINANT ANALYSIS (QDA)

Quadratic discriminant analysis (QDA) is closely related to linear discriminant analysis (LDA), in which it is assumed that the measurements from each class are normally distributed. Unlike LDA, however, QDA makes no assumption that the covariance of each of the classes is identical.
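The difference between the two techniques can be made concrete with the standard textbook discriminant functions for normally distributed classes. The symbols below are not defined in the text and are introduced here as assumptions: $\mu_k$ is the mean of class $k$, $\pi_k$ its prior probability, $\Sigma$ the covariance matrix shared by all classes in LDA, and $\Sigma_k$ the per-class covariance matrix in QDA. In both cases a sample $x$ is assigned to the class with the largest $\delta_k(x)$.

```latex
% LDA: one shared covariance matrix, so the score is linear in x
\delta_k^{\mathrm{LDA}}(x) = x^{T}\Sigma^{-1}\mu_k
    - \tfrac{1}{2}\,\mu_k^{T}\Sigma^{-1}\mu_k + \log \pi_k

% QDA: one covariance matrix per class, so the score is quadratic in x
\delta_k^{\mathrm{QDA}}(x) = -\tfrac{1}{2}\log\lvert\Sigma_k\rvert
    - \tfrac{1}{2}\,(x-\mu_k)^{T}\Sigma_k^{-1}(x-\mu_k) + \log \pi_k
```

Setting $\Sigma_k = \Sigma$ for every class makes the quadratic term $x^{T}\Sigma^{-1}x$ and the log-determinant identical across classes, so they cancel when scores are compared and the QDA rule reduces to the LDA rule.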

EXAMPLE OF FISHER IRIS

The Iris flower data set, or Fisher's Iris data set, is a multivariate data set introduced by Sir Ronald Fisher (1936) as an example of discriminant analysis. It is sometimes called Anderson's Iris data set because Edgar Anderson collected the data to quantify the morphologic variation of Iris flowers of three related species. Two of the three species were collected in the Gaspé Peninsula “all from the same pasture, and picked on the same day and measured at the same time by the same person with the same apparatus”. The data set consists of fifty samples from each of three species of Iris (Iris setosa, Iris virginica and Iris versicolor). Four features were measured from each sample: the length and the width of the sepals and petals, in centimeters. Based on the combination of these four features, Fisher developed a linear discriminant model to distinguish the species from one another.

Figure 2.2: Fisher Iris data set

Figure 2.3: Linear discrimination of Fisher Iris data set

Figure 2.4: Quadratic discrimination of Fisher Iris data set

2.3 TYPES OF CLASSIFIERS

We now discuss the various types of classifiers on which we test the data set.

Initially we implement the binary classifiers in Python.

The classifiers whose efficiencies we compare are Linear, Polynomial, RBF and Linear SVC.
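A comparison of these four classifiers might be set up as in the following sketch. It assumes scikit-learn is the Python library in use and substitutes the Fisher Iris data bundled with scikit-learn for the NIRS absorbance data, which is not reproduced here. The split ratio, random seed and polynomial degree are illustrative choices, not the settings used in this work.

```python
from sklearn import datasets, svm
from sklearn.model_selection import train_test_split

# Fisher Iris data stands in for the NIRS absorbance data (assumption).
X, y = datasets.load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y)

# The four classifiers named in the text; hyperparameters are illustrative.
classifiers = {
    "linear": svm.SVC(kernel="linear"),
    "polynomial": svm.SVC(kernel="poly", degree=3),
    "rbf": svm.SVC(kernel="rbf"),
    "linear_svc": svm.LinearSVC(max_iter=10000),
}

# Fit each classifier and record (training accuracy, testing accuracy).
scores = {}
for name, clf in classifiers.items():
    clf.fit(X_train, y_train)
    scores[name] = (clf.score(X_train, y_train), clf.score(X_test, y_test))

for name, (train_acc, test_acc) in scores.items():
    print(f"{name}: train={train_acc:.3f} test={test_acc:.3f}")
```

The Iris data has three classes; `SVC` and `LinearSVC` handle more than two classes internally by combining binary classifiers, so the same code covers both the binary and the multiclass case.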

2.3.1 Linear Classifier

In the field of machine learning, the goal of statistical classification is to use an object's characteristics to identify which class (or group) it belongs to. A linear classifier achieves this by making a classification decision based on the value of a linear combination of the characteristics. An object's characteristics are also known as feature values and are typically presented to the machine in a vector called a feature vector. Such classifiers work well for practical problems such as document classification, and more generally for problems with many variables (features), reaching accuracy levels comparable to non-linear classifiers while taking less time to train and use.

If the input feature vector to the classifier is a real vector $\vec{x}$, then the output score is

$$y = f(\vec{w} \cdot \vec{x}) = f\Big(\sum_j w_j x_j\Big),$$

where $\vec{w}$ is a real vector of weights and $f$ is a function that converts the dot product of the two vectors into the desired output.

A linear classifier is often used in situations where the speed of classification is an issue. Linear classifiers also tend to work well when the number of dimensions in the feature vector $\vec{x}$ is large, as in document classification, where each element of $\vec{x}$ is typically the number of occurrences of a word in a document. In such cases, the classifier should be well-regularized.
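The score of a linear classifier can be sketched directly from this description. The weights, the feature vector and the threshold function below are hypothetical values picked for illustration.

```python
def linear_score(w, x, f=lambda s: s):
    """Apply f to the dot product of the weight vector w and feature vector x."""
    return f(sum(wj * xj for wj, xj in zip(w, x)))

def threshold(s, theta=0.0):
    """A simple choice of f: class 1 if the dot product exceeds theta, else class 0."""
    return 1 if s > theta else 0

# Hypothetical weights and feature values (assumed for illustration only).
w = [0.5, -1.0, 2.0]
x = [2.0, 1.0, 0.5]

print(linear_score(w, x))             # raw dot product: 1.0
print(linear_score(w, x, threshold))  # thresholded class label: 1
```

Because the score is a single dot product, its cost grows only linearly with the number of features, which is why classification is fast even for large feature vectors.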

Figure 2.5: Linear Classifier

2.3.2 POLYNOMIAL CLASSIFIER

A quadratic classifier is used in machine learning and statistical classification to separate measurements of two or more classes of objects or events by a quadric surface. It is a more general version of the linear classifier.

Statistical classification considers a set of vectors of observations x of an object or event, each of which has a known type y. This set is referred to as the training set. The problem is then to determine, for a given new observation vector, what the best class should be. For a quadratic classifier, the correct solution is assumed to be quadratic in the measurements, so y is decided based on

$$x^{T} A x + b^{T} x + c$$
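The quadratic expression above can be evaluated as in the following sketch. The matrix `A`, vector `b` and constant `c` are hypothetical values, and assigning the class by the sign of the score is one common convention assumed here rather than a rule stated in the text.

```python
def quadratic_score(A, b, c, x):
    """Evaluate x^T A x + b^T x + c for a feature vector x."""
    n = len(x)
    quad = sum(x[i] * A[i][j] * x[j] for i in range(n) for j in range(n))
    lin = sum(bi * xi for bi, xi in zip(b, x))
    return quad + lin + c

# Hypothetical parameters (assumed): the boundary is the pair of lines |x1| = |x2|.
A = [[1.0, 0.0],
     [0.0, -1.0]]
b = [0.0, 0.0]
c = 0.0

print(quadratic_score(A, b, c, [2.0, 1.0]))   # 4 - 1 = 3.0, positive side
print(quadratic_score(A, b, c, [1.0, 2.0]))   # 1 - 4 = -3.0, negative side
```

Setting `A` to the zero matrix leaves only the linear term and the constant, which recovers the linear classifier as a special case.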

2.3.3 RBF CLASSIFIER

In the field of mathematical modelling, a radial basis function network is an artificial neural network that uses radial basis functions as activation functions. The output of the network is a linear combination of radial basis functions of the inputs and neuron parameters. Radial basis function networks have many uses, including function approximation, time series prediction, classification, and system control.

• RBFs represent local receptors, as illustrated below, where each green point is a stored vector used in one RBF.
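A minimal sketch of the idea, assuming the commonly used Gaussian form of the radial basis function: each stored vector acts as a centre, and the response falls off with the squared distance from that centre. The width parameter `gamma`, the centres and the weights are hypothetical values for illustration.

```python
import math

def gaussian_rbf(x, center, gamma=1.0):
    """Gaussian radial basis function exp(-gamma * ||x - center||^2)."""
    sq_dist = sum((xi - ci) ** 2 for xi, ci in zip(x, center))
    return math.exp(-gamma * sq_dist)

def rbf_network_output(x, centers, weights, gamma=1.0):
    """Linear combination of the RBF responses of all stored centres."""
    return sum(w * gaussian_rbf(x, c, gamma) for w, c in zip(weights, centers))

# Hypothetical stored vectors and output weights (assumed for illustration only).
centers = [[0.0, 0.0], [1.0, 0.0]]
weights = [2.0, 1.0]

print(gaussian_rbf([0.0, 0.0], [0.0, 0.0]))                    # at the centre: 1.0
print(rbf_network_output([0.0, 0.0], centers, weights))        # 2*1 + 1*exp(-1)
```

The same Gaussian function is what an SVM with an RBF kernel uses to measure similarity between a new sample and each support vector.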

Figure 2.6: RBF Classifier

2.3.4 KENNARD STONE ALGORITHM SIGNIFICANCE

All the classifiers defined above are implemented so that the data is split randomly into training and test sets, and the user has no control over how the split is made. The Kennard-Stone (KS) algorithm instead splits the data into training and test sets by ranking the samples. It ranks the data samples on the basis of their affinity to the support vectors and hence comes up with the best possible training set for the algorithm.
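For reference, the classic Kennard-Stone procedure from the literature can be sketched as follows. Note the assumption: this textbook form ranks samples by Euclidean distance between samples (start with the two farthest apart, then repeatedly add the sample farthest from everything already chosen), which may differ from the variant implemented in this work. The sample coordinates are made up.

```python
import math

def kennard_stone(samples, n_select):
    """Return indices of n_select samples chosen by the classic Kennard-Stone rule."""
    n = len(samples)
    if not 2 <= n_select <= n:
        raise ValueError("n_select must be between 2 and len(samples)")
    dist = [[math.dist(a, b) for b in samples] for a in samples]
    # Start with the two samples that are farthest apart.
    i0, j0 = max(((i, j) for i in range(n) for j in range(i + 1, n)),
                 key=lambda p: dist[p[0]][p[1]])
    selected = [i0, j0]
    remaining = [k for k in range(n) if k not in (i0, j0)]
    while len(selected) < n_select:
        # Add the candidate whose nearest selected sample is farthest away.
        nxt = max(remaining, key=lambda k: min(dist[k][s] for s in selected))
        selected.append(nxt)
        remaining.remove(nxt)
    return selected

# Made-up one-dimensional samples, written as 2-D points for generality.
samples = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (5.0, 0.0), (4.9, 0.0)]
train_idx = kennard_stone(samples, 3)
test_idx = [i for i in range(len(samples)) if i not in train_idx]
print(train_idx, test_idx)   # [0, 2, 3] [1, 4]
```

The selected indices form the training set and the rest form the test set, so the training samples cover the measured range evenly instead of depending on a random draw.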

3. RESULTS

3.1 Linear Classifier

Figure 3.1 shows that the linear classifier classifies the training data with an accuracy of 97% and the testing data with an accuracy of 88.8%. One sample of type 3 is wrongly classified as type 4 and another sample of type 4 is wrongly classified as type 3, as shown by the confusion matrix.
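How a confusion matrix and an accuracy figure relate to each other can be sketched as below. The label lists are invented for illustration and are not the data behind the reported results. Rows of the matrix are the true types and columns are the predicted types.

```python
def confusion_matrix(y_true, y_pred, labels):
    """Count samples per (true label, predicted label) pair; rows are true labels."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix

def accuracy(matrix):
    """Fraction of samples on the diagonal, i.e. correctly classified."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total

# Invented labels (not the paper's data): one type 3 and one type 4 are swapped.
labels = [1, 2, 3, 4, 5]
y_true = [1, 2, 3, 3, 4, 4, 5, 5, 5]
y_pred = [1, 2, 3, 4, 4, 3, 5, 5, 5]

cm = confusion_matrix(y_true, y_pred, labels)
for row in cm:
    print(row)
print(round(accuracy(cm), 3))   # 7 of 9 correct
```

Off-diagonal entries show exactly which types are confused with each other, which a single accuracy percentage cannot reveal.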

Figure 3.1: Efficiency with Linear Classifier

3.2 Polynomial Classifier

Figure 3.2 shows that the polynomial classifier classifies the training data with an accuracy of 26.4% and the testing data with an accuracy of 11.1%. In both cases it wrongly classifies all samples as type 1, as shown by the confusion matrix.

Figure 3.2: Efficiency with Polynomial classifier

3.3 RBF Classifier

Figure 3.3 shows that the RBF classifier classifies the training data with an accuracy of 95.5%, wrongly classifying one sample of type 4 as type 5, and the testing data with an accuracy of 77.7%, wrongly classifying three samples, as shown by the confusion matrix.

Figure 3.3: Efficiency with RBF classifier

3.4 Results For Implementation In MATLAB:

3.4.1 Cross Validation

Figure 3.4 shows the generation of testing data (66) and training data (20) using the k-fold technique.
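The k-fold technique itself can be sketched in plain Python, although the results in this section were obtained in MATLAB. This is a simplified illustration: the sample count and the number of folds are arbitrary, and the shuffling that is normally applied before splitting is left out.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) index lists for k contiguous folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train_idx, test_idx
        start += size

# Arbitrary example: 10 samples split into 5 folds.
folds = list(k_fold_indices(10, 5))
for train_idx, test_idx in folds:
    print("train:", train_idx, "test:", test_idx)
```

Each sample is used for testing exactly once and for training in the remaining folds, so the averaged accuracy is less dependent on one particular split.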

Figure 3.4: Cross validation with K-fold Technique

3.4.2 Multi Class Classification

Figure 3.5 shows that the multiclass classifier classifies the training data with an accuracy of 79.49%, wrongly classifying 4 samples of type 3 as type 2, and the testing data with an accuracy of 75.0%, wrongly classifying two samples.

Figure 3.5: Efficiency with Multiclass classifier

Figure 3.6: Minimum number of training samples for which accuracy is 100%

Figure 3.6 shows the ranking of the samples used to generate an efficient set of training and testing data, and hence displays the minimum number of training samples required for the accuracy to reach 1 (100%).

4. CONCLUSION AND FUTURE SCOPE

In this paper, we implemented the Support Vector Machine algorithm for the separation of different classes of polymers. The absorbance values of these polymers under NIR spectroscopy were collected to train and test the classifier. Binary classification was applied first; the data was then cross-validated and also subjected to the Kennard-Stone algorithm. The accuracy achieved was 100% without cross validation, and varied between 70% and 80% with cross validation and after the application of the KS algorithm.

A multiclass classifying algorithm was found to be fairly efficient when implemented in MATLAB as well as in Python. In MATLAB, the accuracy varied from 75% to 90%, whereas in Python the accuracy achieved with cross validation and the linear classifier was close to 95%.

There is scope for further improvement, such as adding the KS algorithm to the Python code and applying various pre-processing routines to cancel out noise in the data.

