AI Classifiers

Strategy for missing attribute values
Across 131 different entries there are 161 missing values among 5 attributes, which is 13.64% of the dataset. My strategy considered two options: replacing every missing value (recorded in the dataset as "?") with "0", or deleting the 131 affected rows and letting Weka handle the rest.
I used the true positive rate and false negative rate to compare the options. After analysing the results, the best option was to replace all the missing attribute values, as this gave the most consistent results. Deleting all the entries with missing values was not the best choice, since they represent 13.64% of the dataset. A sketch of both options follows the results table below.
Model | True Positive Rate (%) | False Negative Rate (%)
1     | 82.35                  | 17.65
2     | 80.24                  | 19.76
3     | 81.86                  | 18.14
4     | 79.25                  | 20.75
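
A minimal pandas sketch of the two options, assuming the UCI mammographic-mass file this essay appears to be based on (the file name and column names are assumptions; the actual preprocessing was done for Weka):

    import pandas as pd

    # Assumed file and column names; "?" marks a missing value in the raw data.
    cols = ["BI-RADS", "Age", "Shape", "Margin", "Density", "Severity"]
    df = pd.read_csv("mammographic_masses.data", names=cols, na_values="?")

    # Option A (the one kept): replace every missing value with 0.
    df_replaced = df.fillna(0)

    # Option B (rejected): drop the rows with at least one missing value.
    df_dropped = df.dropna()
    print(len(df) - len(df_dropped), "rows would be lost by deletion")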

Main experimental series
For all experiments, a 10-fold cross-validation strategy was used to validate the results.
I experimented with both a decision tree and an artificial neural network.
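
As a sketch of this validation setup (scikit-learn stands in for the Weka workflow here, and the data is a placeholder):

    import numpy as np
    from sklearn.model_selection import cross_val_predict
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.metrics import confusion_matrix

    # Placeholder data shaped like the mammographic-mass set
    # (961 instances, 5 attributes).
    rng = np.random.default_rng(0)
    X = rng.random((961, 5))
    y = rng.integers(0, 2, 961)      # 0 = benign, 1 = malignant

    # 10-fold cross-validation: each instance is predicted exactly once,
    # by a model that never saw it during training.
    clf = DecisionTreeClassifier(random_state=0)
    y_pred = cross_val_predict(clf, X, y, cv=10)
    print(confusion_matrix(y, y_pred))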

Decision tree
In this section I trained 16 models for this series of experiments. The experiments were performed to identify the influence of three crucial elements of the model:
• The best configuration of parameters.
• The BI-RADS assessment attribute.
• Whether or not to use a pruned tree.
I included the MDLCorrection, BinarySplits, CollapseTree and SubtreeRaising (when pruned) parameters. Starting from a model with every parameter set to false, I changed one parameter to true at a time and compared the result with the previous model's. If the result was better, the parameter was left as true; otherwise it was returned to false.
The same procedure was applied to evaluate the pruning parameter and the influence of the BI-RADS assessment attribute.
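
The greedy one-at-a-time search can be sketched generically as follows. The flag names come from the essay's J48 setup, but the evaluator is a stand-in: a real one would train the tree in Weka with those flags and return, for example, the true positive rate from 10-fold cross-validation.

    def greedy_flag_search(flags, evaluate):
        """Start with every Boolean flag off, switch one on at a time,
        and keep the switch only if the evaluation score improves."""
        config = {flag: False for flag in flags}
        best = evaluate(config)
        for flag in flags:
            config[flag] = True
            score = evaluate(config)
            if score > best:
                best = score              # improvement: keep the flag on
            else:
                config[flag] = False      # no improvement: switch it back off
        return config, best

    flags = ["MDLCorrection", "BinarySplits", "CollapseTree", "SubtreeRaising"]

    # Stand-in evaluator with made-up per-flag effects, for demonstration only.
    demo = {"MDLCorrection": -0.5, "BinarySplits": 1.2,
            "CollapseTree": 0.8, "SubtreeRaising": -0.1}

    def evaluate(config):
        return sum(demo[f] for f, on in config.items() if on)

    print(greedy_flag_search(flags, evaluate))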
Model # | BI-RADS | Pruned | MDL correction | Binary splits | Collapse tree | Subtree raising
1       | FALSE   | FALSE  | FALSE          | FALSE         | FALSE         |
2       | FALSE   | FALSE  | TRUE           | FALSE         | FALSE         |
3       | FALSE   | FALSE  | TRUE           | TRUE          | FALSE         |
4       | FALSE   | FALSE  | TRUE           | TRUE          | TRUE          |
5       | FALSE   | TRUE   | FALSE          | FALSE         | FALSE         |
6       | FALSE   | TRUE   | TRUE           | FALSE         | FALSE         |
7       | FALSE   | TRUE   | TRUE           | TRUE          | FALSE         |
8       | FALSE   | TRUE   | TRUE           | TRUE          | TRUE          |
9       | TRUE    | FALSE  | FALSE          | FALSE         | FALSE         | FALSE
10      | TRUE    | FALSE  | TRUE           | FALSE         | FALSE         | FALSE
11      | TRUE    | FALSE  | TRUE           | TRUE          | FALSE         | FALSE
12      | TRUE    | FALSE  | TRUE           | TRUE          | FALSE         | TRUE
13      | TRUE    | FALSE  | TRUE           | TRUE          | TRUE          | TRUE
14      | TRUE    | TRUE   | FALSE          | FALSE         | FALSE         | FALSE
15      | TRUE    | TRUE   | TRUE           | FALSE         | FALSE         | FALSE
16      | TRUE    | TRUE   | FALSE          | TRUE          | FALSE         | FALSE
The main purpose of this experiment is to reduce the number of benign masses wrongly classified as malignant, since it is necessary to perform an invasive biopsy whenever an instance is classified as a malignant mass.
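
For reference, the standard way to derive a false positive rate from predictions is FP / (FP + TN): benign masses wrongly flagged as malignant, as a fraction of all benign masses. The sketch below uses that textbook definition; the table below reports raw false positive and true positive counts next to its rates, so the printed percentages may follow a different arithmetic.

    from sklearn.metrics import confusion_matrix

    def false_positive_rate(y_true, y_pred):
        # For binary labels 0 = benign, 1 = malignant, ravel() yields
        # the counts in the order TN, FP, FN, TP.
        tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
        return fp / (fp + tn)

    # One of three benign masses is misclassified as malignant -> 0.333...
    print(false_positive_rate([0, 0, 1, 1, 0], [0, 1, 1, 1, 0]))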

Model # | False positive | True positive | False positive rate
1       | 108            | 300           | 26.37%
2       | 116            | 343           | 25.17%
3       | 116            | 354           | 24.58%
4       | 124            | 360           | 25.52%
5       | 110            | 346           | 24.02%
6       | 106            | 347           | 23.30%
7       | 106            | 362           | 22.65%
8       | 103            | 362           | 22.15%
9       | 73             | 314           | 18.86%
10      | 77             | 347           | 18.16%
11      | 65             | 334           | 16.29%
12      | 65             | 334           | 16.29%
13      | 81             | 344           | 19.06%
14      | 69             | 346           | 16.63%
15      | 77             | 357           | 17.74%
16      | 62             | 337           | 15.54%

Looking at the table above, model 16 achieved the lowest false positive rate, 15.54%; its configuration can be seen in the configuration table above.

Metric                  | Model 1 (after pre-processing) | Model 2 (before pre-processing)
Accuracy (%)            | 83.37                          | 81.06
False Positive Rate (%) | 15.0                           | 16.0
True Positive Rate (%)  | 81.6                           | 79.0

The MDLCorrection and SubtreeRaising parameters were deactivated, while the BinarySplits and CollapseTree parameters contributed to increasing the performance of the model.
It is crucial to compare the top model with the baseline prediction to decide whether this model is worth using. It would not be useful to implement these models, because the baseline false positive rate of 11.59% is lower than that of the best decision tree model, 15.54%.

Artificial Neural Network
16 models were created for the multilayer perceptron algorithm. The influence of three different elements was tested with it:
• The BI-RADS assessment attribute.
• The size of the network. Three configurations were evaluated here:
  ◦ A one-layer neural network.
  ◦ A deep network (many layers with few neurons in each).
  ◦ A big network (few layers with many neurons per layer).
• The values of the most important parameters of the neural network: the learning rate and the momentum.
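
A sketch of one of these configurations, with scikit-learn's MLPClassifier standing in for Weka's MultilayerPerceptron (the two implementations differ in detail, so this only mirrors the parameters being varied):

    from sklearn.neural_network import MLPClassifier

    # One hidden layer of 10 neurons, learning rate 0.3, momentum 0.4 and
    # 1000 training epochs, mirroring one row of the table below.
    mlp = MLPClassifier(
        hidden_layer_sizes=(10,),   # size of the network
        solver="sgd",               # gradient descent, so momentum applies
        learning_rate_init=0.3,
        momentum=0.4,
        max_iter=1000,              # number of epochs
        random_state=0,
    )
    # mlp.fit(X_train, y_train) would then train it as usual.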
Model # | BI-RADS | Number of layers | Number of neurons | Learning rate | Momentum | Number of epochs
1       | FALSE   | 1                | 10                | 0.3           | 0.2      | 500
2       | FALSE   | 1                | 10                | 0.3           | 0.2      | 1000
3       | FALSE   | 1                | 10                | 0.1           | 0.2      | 1000
4       | FALSE   | 1                | 10                | 0.1           | 0.2      | 2000
5       | FALSE   | 1                | 10                | 0.3           | 0.4      | 1000
6       | FALSE   | 1                | 10                | 0.3           | 0.7      | 1000
7       | FALSE   | 1                | 10                | 0.3           | 0.4      | 5000
8       | TRUE    | 1                | 10                | 0.3           | 0.2      | 500
9       | TRUE    | 1                | 10                | 0.3           | 0.2      | 1000
10      | TRUE    | 1                | 10                | 0.1           | 0.2      | 1000
11      | TRUE    | 1                | 10                | 0.1           | 0.2      | 2000
12      | TRUE    | 1                | 10                | 0.3           | 0.4      | 1000
13      | TRUE    | 1                | 10                | 0.3           | 0.7      | 1000
14      | TRUE    | 1                | 10                | 0.3           | 0.4      | 5000
15      | TRUE    | 3                | 10                | 0.3           | 0.4      | 1000
16      | TRUE    | 3                | 10                | 0.3           | 0.4      | 2000

It is important to note that the number of epochs can differ between models, but it is not considered a parameter, so it is not included as part of the experiments. If the momentum is increased or the learning rate is decreased, for example, it becomes necessary to increase the number of epochs, as the training process may be slower. If the results then improve, it is not because of the change to the number of epochs but because of the change to the learning rate or the momentum.

I used a different method to evaluate the results because the parameters of the multilayer perceptron algorithm are numeric rather than Boolean.

Model # | False positive | True positive | False positive rate
1       | 110            | 354           | 23.71%
2       | 114            | 353           | 24.41%
3       | 107            | 353           | 23.26%
4       | 115            | 353           | 24.57%
5       | 110            | 360           | 23.40%
6       | 123            | 358           | 25.57%
7       | 117            | 363           | 24.38%
8       | 66             | 344           | 16.10%
9       | 69             | 340           | 16.87%
10      | 79             | 343           | 18.72%
11      | 79             | 355           | 18.20%
12      | 59             | 335           | 14.97%
13      | 71             | 337           | 17.40%
14      | 67             | 333           | 16.75%
15      | 82             | 347           | 19.11%
16      | 84             | 342           | 19.72%
17      | 92             | 343           | 21.15%
18      | 102            | 82            | 55.43%

When the learning rate parameter was decreased there was no improvement in the outcome of the neural network, even when the number of epochs was increased as well, so the learning rate was left at 0.3. The best results were produced when the momentum was increased to 0.4, which suggests there were local minima that could not be overcome with the momentum at 0.2.
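
A minimal sketch of the textbook momentum update (variable names are illustrative) shows why the larger momentum helps: the velocity accumulates past gradients, so the weights keep moving through flat regions and shallow local minima where the current gradient is close to zero.

    import numpy as np

    def momentum_step(w, v, grad, lr=0.3, momentum=0.4):
        # Blend the previous update direction with the new gradient; with
        # momentum 0.2 the history decays quickly, with 0.4 it carries
        # the weights further past shallow local minima.
        v = momentum * v - lr * grad
        return w + v, v

    w, v = np.zeros(3), np.zeros(3)
    w, v = momentum_step(w, v, grad=np.array([0.1, -0.2, 0.0]))
    print(w, v)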

From the above it can be concluded that both algorithms reach almost the same level of accuracy: the best neural network configuration reached a false positive rate of 14.97%, a difference of less than a percentage point compared with the best decision tree model.

Advanced preprocessing

I implemented an advanced preprocessing approach that converts the dataset into a new one with a different representation. When I applied the principal component analysis algorithm to the dataset, all the attributes were redefined as numeric with a mean of 0, and the dataset went from 5 attributes to 11 attributes.
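
As a rough illustration of this transformation (scikit-learn's PCA rather than the Weka PrincipalComponents filter actually used; the placeholder matrix assumes the nominal attributes were already expanded to binary columns, which is what grows the attribute count):

    import numpy as np
    from sklearn.decomposition import PCA

    rng = np.random.default_rng(0)
    X = rng.random((961, 11))        # placeholder for the expanded attributes

    # PCA centres every attribute to mean 0 before projecting, matching
    # the observation that all output attributes are numeric with mean 0.
    pca = PCA()
    X_pca = pca.fit_transform(X)
    print(X_pca.shape, pca.explained_variance_ratio_.round(3))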

When evaluating the performance of this approach with both the decision tree and the neural network, I obtained a false positive rate of 18.53% for the decision tree and 18.66% for the artificial neural network. This means the performance did not improve on the results obtained with just the original 5 attributes.

Conclusion
• The best decision tree had a false positive rate of 15.54%.

[Decision tree diagram]

• 14.97% was the lowest false positive rate achieved by any neural network model. The configuration achieved this rate with just one hidden layer (of 10 neurons), a learning rate of 0.3 and a momentum of 0.4.

[Artificial neural network diagram]

• I recommend the neural network over the decision tree, as it has a 14.97% false positive rate compared to the decision tree's 15.54%. The difference is not massive, but the neural network had the better result.

