Aim: To implement and compare Bayes classifiers in the Weka GUI on three datasets (Diabetes, Car Evaluation, and Glass).
DataSet 1: Diabetes Data Set
Data Set Information:
Diabetes patient records were obtained
from two sources: an automatic electronic recording device and paper records.
The automatic device had an internal clock to timestamp events, whereas the
paper records only provided "logical time" slots (breakfast, lunch,
dinner, bedtime). For paper records, fixed times were assigned to breakfast
(08:00), lunch (12:00), dinner (18:00), and bedtime (22:00). Thus paper records
have fictitious uniform recording times whereas electronic records have more
realistic time stamps.
Diabetes files consist of four fields per record. Each field is separated by a tab and each record is separated by a newline.
File Names and format:
(1) Date in MM-DD-YYYY format
(2) Time in XX:YY format
(3) Code
(4) Value
The Code field is deciphered as follows:
33 = Regular insulin dose
34 = NPH insulin dose
35 = UltraLente insulin dose
48 = Unspecified blood glucose measurement
57 = Unspecified blood glucose measurement
58 = Pre-breakfast blood glucose measurement
59 = Post-breakfast blood glucose measurement
60 = Pre-lunch blood glucose measurement
61 = Post-lunch blood glucose measurement
62 = Pre-supper blood glucose measurement
63 = Post-supper blood glucose measurement
64 = Pre-snack blood glucose measurement
65 = Hypoglycemic symptoms
66 = Typical meal ingestion
67 = More-than-usual meal ingestion
68 = Less-than-usual meal ingestion
69 = Typical exercise activity
70 = More-than-usual exercise activity
71 = Less-than-usual exercise activity
72 = Unspecified special event
Attribute Information:
Each record consists of four tab-separated fields:
(1) Date in MM-DD-YYYY format
(2) Time in XX:YY format
(3) Code
(4) Value
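Since the raw diabetes files are plain tab-separated text rather than ARFF, a short parsing sketch may help make the format concrete. This is a minimal illustration, not part of the original assignment; the file name "data-01" is an assumption.

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

// Minimal sketch: read one tab-separated diabetes file and echo each record.
public class DiabetesReader {
    public static void main(String[] args) throws IOException {
        try (BufferedReader in = new BufferedReader(new FileReader("data-01"))) {
            String line;
            while ((line = in.readLine()) != null) {
                String[] f = line.split("\t");            // (1) date, (2) time, (3) code, (4) value
                if (f.length < 4) continue;               // skip malformed records
                int code = Integer.parseInt(f[2].trim()); // e.g. 33 = regular insulin dose
                System.out.printf("%s %s code=%d value=%s%n", f[0], f[1], code, f[3]);
            }
        }
    }
}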
DataSet 2: Car Data Set
Data Set Information:
The Car Evaluation Database was derived from a simple hierarchical decision model originally developed for the demonstration of DEX (M. Bohanec, V. Rajkovic: Expert system for decision making. Sistemica 1(1), pp. 145-157, 1990). The model evaluates cars according to the following concept structure:
CAR car acceptability
. PRICE overall price
. . buying buying price
. . maint price of the maintenance
. TECH technical characteristics
. . COMFORT comfort
. . . doors number of doors
. . . persons capacity in terms of persons to carry
. . . lug_boot the size of luggage boot
. . safety estimated safety of the car
Input attributes are printed in lowercase. Besides the target concept (CAR), the model includes three intermediate concepts: PRICE, TECH, COMFORT. Every concept is in the original model related to its lower level descendants by a set of examples (for these examples sets see [Web Link]).
The Car Evaluation Database contains examples with the structural information removed, i.e., directly relates CAR to the six input attributes: buying, maint, doors, persons, lug_boot, safety.
Because of known underlying concept structure, this database may be particularly useful for testing constructive induction and structure discovery methods.
Attribute Information:
Class Values:
unacc, acc, good, vgood
Attributes:
buying: vhigh, high, med, low.
maint: vhigh, high, med, low.
doors: 2, 3, 4, 5more.
persons: 2, 4, more.
lug_boot: small, med, big.
safety: low, med, high.
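For use in Weka, the class values and attributes above map directly onto an ARFF header. A minimal sketch (the relation name and file layout are assumptions; the attribute values follow the listing above, and the sample data row is the kind of record the database contains):

@relation car

@attribute buying   {vhigh, high, med, low}
@attribute maint    {vhigh, high, med, low}
@attribute doors    {2, 3, 4, 5more}
@attribute persons  {2, 4, more}
@attribute lug_boot {small, med, big}
@attribute safety   {low, med, high}
@attribute class    {unacc, acc, good, vgood}

@data
vhigh,vhigh,2,2,small,low,unacc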
DataSet 3: Glass Data Set
Data Set Information:
Vina conducted a comparison test of her
rule-based system, BEAGLE, the nearest-neighbor algorithm, and discriminant
analysis. BEAGLE is a product available through VRS Consulting, Inc.; 4676
Admiralty Way, Suite 206; Marina Del Ray, CA 90292 (213) 827-7890 and FAX:
-3189. In determining whether the glass was a type of "float" glass
or not, the following results were obtained (# incorrect answers):
Type of Sample -- Beagle -- NN -- DA
Windows that were float processed (87) -- 10 -- 12 -- 21
Windows that were not: (76) -- 19 -- 16 -- 22
The study of classification of types of glass was motivated by criminological investigation. At the scene of the crime, the glass left can be used as evidence...if it is correctly identified!
Attribute Information:
1. Id number: 1 to 214
2. RI: refractive index
3. Na: Sodium (unit measurement: weight percent in corresponding oxide, as are attributes 4-10)
4. Mg: Magnesium
5. Al: Aluminum
6. Si: Silicon
7. K: Potassium
8. Ca: Calcium
9. Ba: Barium
10. Fe: Iron
11. Type of glass: (class attribute)
-- 1 building_windows_float_processed
-- 2 building_windows_non_float_processed
-- 3 vehicle_windows_float_processed
-- 4 vehicle_windows_non_float_processed (none in this database)
-- 5 containers
-- 6 tableware
-- 7 headlamps
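The Id number (attribute 1) is just a row index, so it is normally excluded before training. A minimal Weka sketch, assuming the data has been converted to a glass.arff file that still contains the Id column:

import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;
import weka.filters.Filter;
import weka.filters.unsupervised.attribute.Remove;

public class GlassPrep {
    public static void main(String[] args) throws Exception {
        Instances raw = DataSource.read("glass.arff");    // assumed file name
        raw.setClassIndex(raw.numAttributes() - 1);       // attribute 11: Type of glass

        Remove remove = new Remove();
        remove.setAttributeIndices("1");                  // drop attribute 1, the Id number
        remove.setInputFormat(raw);
        Instances glass = Filter.useFilter(raw, remove);  // same data without the Id column
        System.out.println(glass.numAttributes() + " attributes remain after filtering");
    }
}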
Bayes Classifiers:
1. BayesNet:
weka.classifiers.bayes
Class BayesNet
java.lang.Object
  weka.classifiers.bayes.BayesNet
All Implemented Interfaces:
java.io.Serializable, java.lang.Cloneable, Classifier, AdditionalMeasureProducer, BatchPredictor, CapabilitiesHandler, CapabilitiesIgnorer, CommandlineRunnable, Drawable, OptionHandler, RevisionHandler, WeightedInstancesHandler
public class BayesNet
Bayes Network learning using various search algorithms and quality measures. Base class for a Bayes Network classifier. Provides data structures (network structure, conditional probability distributions, etc.) and facilities common to Bayes Network learning algorithms like K2 and B.
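A minimal sketch of how this classifier can be trained and evaluated from code. The use of 10-fold cross-validation and the file name diabetes.arff are assumptions about how the results below were obtained:

import java.util.Random;
import weka.classifiers.Evaluation;
import weka.classifiers.bayes.BayesNet;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class BayesNetDemo {
    public static void main(String[] args) throws Exception {
        Instances data = DataSource.read("diabetes.arff");  // assumed file name
        data.setClassIndex(data.numAttributes() - 1);       // class is the last attribute

        BayesNet classifier = new BayesNet();               // defaults: K2 search, SimpleEstimator
        Evaluation eval = new Evaluation(data);
        eval.crossValidateModel(classifier, data, 10, new Random(1)); // assumed 10-fold CV
        System.out.println(eval.toSummaryString());
        System.out.println(eval.toClassDetailsString());    // per-class precision/recall/F-measure
        System.out.println(eval.toMatrixString());          // confusion matrix
    }
}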
2. NaiveBayes:
Class NaiveBayes
java.lang.Object
  weka.classifiers.bayes.NaiveBayes
All Implemented Interfaces:
java.io.Serializable, java.lang.Cloneable, Classifier, Aggregateable<NaiveBayes>, BatchPredictor, CapabilitiesHandler, CapabilitiesIgnorer, CommandlineRunnable, OptionHandler, RevisionHandler, TechnicalInformationHandler, WeightedInstancesHandler
public class NaiveBayes implements OptionHandler, WeightedInstancesHandler, TechnicalInformationHandler, Aggregateable<NaiveBayes>
Class for a Naive Bayes classifier using estimator classes. Numeric estimator precision values are chosen based on analysis of the training data. For this reason, the classifier is not an UpdateableClassifier (which in typical usage is initialized with zero training instances); if you need the UpdateableClassifier functionality, use the NaiveBayesUpdateable classifier. The NaiveBayesUpdateable classifier will use a default precision of 0.1 for numeric attributes when buildClassifier is called with zero training instances.
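A minimal sketch of constructing this classifier; the kernel-estimator option is shown only to illustrate configuration and is not something the report relies on:

import weka.classifiers.bayes.NaiveBayes;

public class NaiveBayesDemo {
    public static void main(String[] args) throws Exception {
        NaiveBayes nb = new NaiveBayes();
        nb.setUseKernelEstimator(true); // optional: kernel density estimates for numeric attributes
        // nb can now be evaluated exactly like BayesNet in the earlier sketch.
    }
}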
3. NaiveBayesMultinomial:
Class NaiveBayesMultinomial
java.lang.Object
  weka.classifiers.bayes.NaiveBayesMultinomial
All Implemented Interfaces:
java.io.Serializable, java.lang.Cloneable, Classifier, BatchPredictor, CapabilitiesHandler, CapabilitiesIgnorer, CommandlineRunnable, OptionHandler, RevisionHandler, TechnicalInformationHandler, WeightedInstancesHandler
public class NaiveBayesMultinomial
Class for building and using a multinomial Naive Bayes classifier. The core equation for this classifier:
P[Ci|D] = (P[D|Ci] x P[Ci]) / P[D] (Bayes rule)
where Ci is class i and D is a document.
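The document likelihood P[D|Ci] is what makes the model multinomial. The excerpt above stops at Bayes rule; a sketch of the usual expansion with Laplace-smoothed word estimates:

P[D \mid C_i] \;\propto\; \prod_{w \in V} P[w \mid C_i]^{\,n_w}

P[w \mid C_i] \;=\; \frac{1 + \mathrm{count}(w, C_i)}{|V| + \sum_{w'} \mathrm{count}(w', C_i)}

where V is the vocabulary, n_w is the number of times word w occurs in document D, and count(w, Ci) is the total number of occurrences of w in training documents of class Ci.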
4. NaiveBayesMultinomialText:
Class NaiveBayesMultinomialText
java.lang.Object
  weka.classifiers.bayes.NaiveBayesMultinomialText
All Implemented Interfaces:
java.io.Serializable, java.lang.Cloneable, Classifier, UpdateableBatchProcessor, UpdateableClassifier, Aggregateable<NaiveBayesMultinomialText>, BatchPredictor, CapabilitiesHandler, CapabilitiesIgnorer, CommandlineRunnable, OptionHandler, RevisionHandler, WeightedInstancesHandler
public class NaiveBayesMultinomialText implements UpdateableClassifier, UpdateableBatchProcessor, WeightedInstancesHandler, Aggregateable<NaiveBayesMultinomialText>
Multinomial naive Bayes for text data. Operates directly (and only) on String attributes. Other types of input attributes are accepted but ignored during training and classification.
Valid options are:
-W
  Use word frequencies instead of binary bag of words.
-P <# instances>
  How often to prune the dictionary of low frequency words (default = 0, i.e. don't prune).
-M <double>
  Minimum word frequency. Words with less than this frequency are ignored. If periodic pruning is turned on then this is also used to determine which words to remove from the dictionary (default = 3).
-normalize
  Normalize document length (use in conjunction with -norm and -lnorm).
-norm <num>
  Specify the norm that each instance must have (default 1.0).
-lnorm <num>
  Specify L-norm to use (default 2.0).
-lowercase
  Convert all tokens to lowercase before adding to the dictionary.
-stopwords-handler
  The stopwords handler to use (default Null).
-tokenizer <spec>
  The tokenizing algorithm (classname plus parameters) to use. (default: weka.core.tokenizers.WordTokenizer)
-stemmer <spec>
  The stemming algorithm (classname plus parameters) to use.
-output-debug-info
  If set, classifier is run in debug mode and may output additional info to the console.
-do-not-check-capabilities
  If set, classifier capabilities are not checked before classifier is built (use with caution).
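A minimal sketch of setting these options programmatically; the particular option string is an arbitrary illustration, not a recommended configuration:

import weka.classifiers.bayes.NaiveBayesMultinomialText;
import weka.core.Utils;

public class OptionsDemo {
    public static void main(String[] args) throws Exception {
        NaiveBayesMultinomialText nbmt = new NaiveBayesMultinomialText();
        nbmt.setOptions(Utils.splitOptions("-W -lowercase"));      // same effect as the flags above
        System.out.println(Utils.joinOptions(nbmt.getOptions())); // inspect the resulting options
    }
}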
5. NaiveBayesMultinomialUpdateable:
Class NaiveBayesMultinomialUpdateable
java.lang.Object
  weka.classifiers.bayes.NaiveBayesMultinomialUpdateable
All Implemented Interfaces:
java.io.Serializable, java.lang.Cloneable, Classifier, UpdateableClassifier, BatchPredictor, CapabilitiesHandler, CapabilitiesIgnorer, CommandlineRunnable, OptionHandler, RevisionHandler, TechnicalInformationHandler, WeightedInstancesHandler
public class NaiveBayesMultinomialUpdateable
Class for building and using a multinomial Naive Bayes classifier. The core equation for this classifier:
P[Ci|D] = (P[D|Ci] x P[Ci]) / P[D] (Bayes rule)
where Ci is class i and D is a document. Incremental version of the algorithm.
6. NaiveBayesUpdateable:
Class NaiveBayesUpdateable
java.lang.Object
  weka.classifiers.bayes.NaiveBayesUpdateable
All Implemented Interfaces:
java.io.Serializable, java.lang.Cloneable, Classifier, UpdateableClassifier, Aggregateable<NaiveBayes>, BatchPredictor, CapabilitiesHandler, CapabilitiesIgnorer, CommandlineRunnable, OptionHandler, RevisionHandler, TechnicalInformationHandler, WeightedInstancesHandler
public class NaiveBayesUpdateable
Class for a Naive Bayes classifier using estimator classes. This is the updateable version of NaiveBayes. This classifier will use a default precision of 0.1 for numeric attributes when buildClassifier is called with zero training instances.
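A minimal sketch of the incremental workflow this class enables: the model is initialized from the header alone and then updated one instance at a time (the file name diabetes.arff is an assumption):

import java.io.File;
import weka.classifiers.bayes.NaiveBayesUpdateable;
import weka.core.Instance;
import weka.core.Instances;
import weka.core.converters.ArffLoader;

public class IncrementalDemo {
    public static void main(String[] args) throws Exception {
        ArffLoader loader = new ArffLoader();
        loader.setFile(new File("diabetes.arff"));           // assumed file name
        Instances structure = loader.getStructure();         // header only, no data rows yet
        structure.setClassIndex(structure.numAttributes() - 1);

        NaiveBayesUpdateable nb = new NaiveBayesUpdateable();
        nb.buildClassifier(structure);                       // initialized with zero instances
        Instance current;
        while ((current = loader.getNextInstance(structure)) != null) {
            nb.updateClassifier(current);                    // learn one record at a time
        }
        System.out.println(nb);
    }
}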
Results:
1. BayesNet:

DataSet 1 (Diabetes):
Class                | Precision | Recall | F-Measure
tested_negative      |   0.795   |  0.816 |   0.806
tested_positive      |   0.639   |  0.608 |   0.623
Weighted Avg.        |   0.741   |  0.743 |   0.742

DataSet 2 (Car):
Class                | Precision | Recall | F-Measure
unacc                |   0.918   |  0.959 |   0.938
acc                  |   0.676   |  0.706 |   0.690
good                 |   0.645   |  0.290 |   0.400
vgood                |   0.938   |  0.462 |   0.619
Weighted Avg.        |   0.854   |  0.857 |   0.849

DataSet 3 (Glass):
Class                | Precision | Recall | F-Measure
build wind float     |   0.653   |  0.886 |   0.752
build wind non-float |   0.767   |  0.605 |   0.676
vehic wind float     |   0.250   |  0.059 |   0.095
vehic wind non-float |   0.000   |  0.000 |   0.000
containers           |   0.750   |  0.692 |   0.720
tableware            |   0.538   |  0.778 |   0.636
headlamps            |   0.897   |  0.897 |   0.897
Weighted Avg.        |   0.695   |  0.706 |   0.686
Confusion Matrix:

DataSet 1 (Diabetes):
   a   b   <-- classified as
 408  92 | a = tested_negative
 105 163 | b = tested_positive

DataSet 2 (Car):
    a    b    c    d   <-- classified as
 1160   49    1    0 | a = unacc
  104  271    9    0 | b = acc
    0   47   20    2 | c = good
    0   34    1   30 | d = vgood

DataSet 3 (Glass):
   a   b   c   d   e   f   g   <-- classified as
  62   5   2   0   0   1   0 | a = build wind float
  21  46   1   0   3   4   1 | b = build wind non-float
  11   4   1   0   0   1   0 | c = vehic wind float
   0   0   0   0   0   0   0 | d = vehic wind non-float
   0   3   0   0   9   0   1 | e = containers
   0   1   0   0   0   7   1 | f = tableware
   1   1   0   1   0   0  26 | g = headlamps
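Every figure in these tables (per-class precision/recall/F-measure, the weighted averages, and the confusion matrix) can be read off a single weka.classifiers.Evaluation object. A minimal sketch that continues the BayesNetDemo example above, where eval and data are already built:

// Continues BayesNetDemo: `eval` and `data` come from the earlier sketch.
for (int i = 0; i < data.numClasses(); i++) {
    System.out.printf("%-22s P=%.3f R=%.3f F=%.3f%n",
            data.classAttribute().value(i),              // class label, e.g. tested_negative
            eval.precision(i), eval.recall(i), eval.fMeasure(i));
}
System.out.printf("Weighted Avg.          P=%.3f R=%.3f F=%.3f%n",
        eval.weightedPrecision(), eval.weightedRecall(), eval.weightedFMeasure());
System.out.println(eval.toMatrixString("Confusion Matrix:"));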
2. NaiveBayes:

DataSet 1 (Diabetes):
Class                | Precision | Recall | F-Measure
tested_negative      |   0.802   |  0.844 |   0.823
tested_positive      |   0.678   |  0.612 |   0.643
Weighted Avg.        |   0.759   |  0.763 |   0.760

DataSet 2 (Car):
Class                | Precision | Recall | F-Measure
unacc                |   0.917   |  0.960 |   0.938
acc                  |   0.672   |  0.706 |   0.689
good                 |   0.633   |  0.275 |   0.384
vgood                |   0.931   |  0.415 |   0.574
Weighted Avg.        |   0.852   |  0.855 |   0.847

DataSet 3 (Glass):
Class                | Precision | Recall | F-Measure
build wind float     |   0.455   |  0.729 |   0.560
build wind non-float |   0.481   |  0.171 |   0.252
vehic wind float     |   0.190   |  0.235 |   0.211
vehic wind non-float |   0.000   |  0.000 |   0.000
containers           |   0.333   |  0.308 |   0.320
tableware            |   0.571   |  0.889 |   0.696
headlamps            |   0.857   |  0.828 |   0.842
Weighted Avg.        |   0.496   |  0.486 |   0.453
Confusion Matrix:

DataSet 1 (Diabetes):
   a   b   <-- classified as
 422  78 | a = tested_negative
 104 164 | b = tested_positive

DataSet 2 (Car):
    a    b    c    d   <-- classified as
 1161   48    1    0 | a = unacc
  104  271    9    0 | b = acc
    1   47   19    2 | c = good
    0   37    1   27 | d = vgood

DataSet 3 (Glass):
   a   b   c   d   e   f   g   <-- classified as
  51   5  11   0   0   2   1 | a = build wind float
  48  13   6   0   5   3   1 | b = build wind non-float
  12   0   4   0   0   1   0 | c = vehic wind float
   0   0   0   0   0   0   0 | d = vehic wind non-float
   0   8   0   0   4   0   1 | e = containers
   0   0   0   0   0   8   1 | f = tableware
   1   1   0   0   3   0  24 | g = headlamps
3. NaiveBayesMultinomial:

DataSet 1 (Diabetes):
Class                | Precision | Recall | F-Measure
tested_negative      |   0.698   |  0.678 |   0.688
tested_positive      |   0.429   |  0.451 |   0.440
Weighted Avg.        |   0.604   |  0.599 |   0.601

DataSet 2 (Car): no results reported.

DataSet 3 (Glass):
Class                | Precision | Recall | F-Measure
build wind float     |   0.548   |  0.657 |   0.597
build wind non-float |   0.460   |  0.526 |   0.491
vehic wind float     |   0.000   |  0.000 |   0.000
vehic wind non-float |   0.000   |  0.000 |   0.000
containers           |   0.500   |  0.231 |   0.316
tableware            |   0.000   |  0.000 |   0.000
headlamps            |   0.657   |  0.793 |   0.719
Weighted Avg.        |   0.462   |  0.523 |   0.486
Confusion Matrix:

DataSet 1 (Diabetes):
   a   b   <-- classified as
 339 161 | a = tested_negative
 147 121 | b = tested_positive

DataSet 2 (Car): no results reported.

DataSet 3 (Glass):
   a   b   c   d   e   f   g   <-- classified as
  46  24   0   0   0   0   0 | a = build wind float
  28  40   0   0   2   2   4 | b = build wind non-float
   9   8   0   0   0   0   0 | c = vehic wind float
   0   0   0   0   0   0   0 | d = vehic wind non-float
   0   5   0   0   3   0   5 | e = containers
   0   6   0   0   0   0   3 | f = tableware
   1   4   0   0   1   0  23 | g = headlamps
4. NaiveBayesMultinomialUpdateable:

DataSet 1 (Diabetes):
Class                | Precision | Recall | F-Measure
tested_negative      |   0.694   |  0.686 |   0.690
tested_positive      |   0.427   |  0.437 |   0.432
Weighted Avg.        |   0.601   |  0.599 |   0.600

DataSet 2 (Car): no results reported.

DataSet 3 (Glass):
Class                | Precision | Recall | F-Measure
build wind float     |   1.000   |  0.029 |   0.056
build wind non-float |   0.387   |  0.987 |   0.556
vehic wind float     |   0.000   |  0.000 |   0.000
vehic wind non-float |   0.000   |  0.000 |   0.000
containers           |   1.000   |  0.154 |   0.267
tableware            |   0.000   |  0.000 |   0.000
headlamps            |   0.938   |  0.517 |   0.667
Weighted Avg.        |   0.652   |  0.439 |   0.322
Confusion Matrix:

DataSet 1 (Diabetes):
   a   b   <-- classified as
 343 157 | a = tested_negative
 151 117 | b = tested_positive

DataSet 2 (Car): no results reported.

DataSet 3 (Glass):
   a   b   c   d   e   f   g   <-- classified as
   2  68   0   0   0   0   0 | a = build wind float
   0  75   0   0   0   0   1 | b = build wind non-float
   0  17   0   0   0   0   0 | c = vehic wind float
   0   0   0   0   0   0   0 | d = vehic wind non-float
   0  11   0   0   2   0   0 | e = containers
   0   9   0   0   0   0   0 | f = tableware
   0  14   0   0   0   0  15 | g = headlamps
5. NaiveBayesUpdateable:
(The figures match NaiveBayes above, as expected: this is the same model trained incrementally.)

DataSet 1 (Diabetes):
Class                | Precision | Recall | F-Measure
tested_negative      |   0.802   |  0.844 |   0.823
tested_positive      |   0.678   |  0.612 |   0.643
Weighted Avg.        |   0.759   |  0.763 |   0.760

DataSet 2 (Car):
Class                | Precision | Recall | F-Measure
unacc                |   0.917   |  0.960 |   0.938
acc                  |   0.672   |  0.706 |   0.689
good                 |   0.633   |  0.275 |   0.384
vgood                |   0.931   |  0.415 |   0.574
Weighted Avg.        |   0.852   |  0.855 |   0.847

DataSet 3 (Glass):
Class                | Precision | Recall | F-Measure
build wind float     |   0.455   |  0.729 |   0.560
build wind non-float |   0.481   |  0.171 |   0.252
vehic wind float     |   0.190   |  0.235 |   0.211
vehic wind non-float |   0.000   |  0.000 |   0.000
containers           |   0.333   |  0.308 |   0.320
tableware            |   0.571   |  0.889 |   0.696
headlamps            |   0.857   |  0.828 |   0.842
Weighted Avg.        |   0.496   |  0.486 |   0.453
Confusion Matrix:

DataSet 1 (Diabetes):
   a   b   <-- classified as
 422  78 | a = tested_negative
 104 164 | b = tested_positive

DataSet 2 (Car):
    a    b    c    d   <-- classified as
 1161   48    1    0 | a = unacc
  104  271    9    0 | b = acc
    1   47   19    2 | c = good
    0   37    1   27 | d = vgood

DataSet 3 (Glass):
   a   b   c   d   e   f   g   <-- classified as
  51   5  11   0   0   2   1 | a = build wind float
  48  13   6   0   5   3   1 | b = build wind non-float
  12   0   4   0   0   1   0 | c = vehic wind float
   0   0   0   0   0   0   0 | d = vehic wind non-float
   0   8   0   0   4   0   1 | e = containers
   0   0   0   0   0   8   1 | f = tableware
   1   1   0   0   3   0  24 | g = headlamps
6. NaiveBayesMultinomialText:

DataSet 1 (Diabetes):
Class                | Precision | Recall | F-Measure
tested_negative      |   0.651   |  1.000 |   0.789
tested_positive      |   0.000   |  0.000 |   0.000
Weighted Avg.        |   0.424   |  0.651 |   0.513

DataSet 2 (Car):
Class                | Precision | Recall | F-Measure
unacc                |   0.700   |  1.000 |   0.824
acc                  |   0.000   |  0.000 |   0.000
good                 |   0.000   |  0.000 |   0.000
vgood                |   0.000   |  0.000 |   0.000
Weighted Avg.        |   0.490   |  0.700 |   0.577

DataSet 3 (Glass):
Class                | Precision | Recall | F-Measure
build wind float     |   0.000   |  0.000 |   0.000
build wind non-float |   0.355   |  1.000 |   0.524
vehic wind float     |   0.000   |  0.000 |   0.000
vehic wind non-float |   0.000   |  0.000 |   0.000
containers           |   0.000   |  0.000 |   0.000
tableware            |   0.000   |  0.000 |   0.000
headlamps            |   0.000   |  0.000 |   0.000
Weighted Avg.        |   0.126   |  0.355 |   0.186
Confusion Matrix:

DataSet 1 (Diabetes):
   a   b   <-- classified as
 500   0 | a = tested_negative
 268   0 | b = tested_positive

DataSet 2 (Car):
    a    b    c    d   <-- classified as
 1210    0    0    0 | a = unacc
  384    0    0    0 | b = acc
   69    0    0    0 | c = good
   65    0    0    0 | d = vgood

DataSet 3 (Glass):
   a   b   c   d   e   f   g   <-- classified as
   0  70   0   0   0   0   0 | a = build wind float
   0  76   0   0   0   0   0 | b = build wind non-float
   0  17   0   0   0   0   0 | c = vehic wind float
   0   0   0   0   0   0   0 | d = vehic wind non-float
   0  13   0   0   0   0   0 | e = containers
   0   9   0   0   0   0   0 | f = tableware
   0  29   0   0   0   0   0 | g = headlamps

Since none of the three datasets contains any String attributes, NaiveBayesMultinomialText ignores every input attribute and simply predicts the majority class in each case, which explains the degenerate results above.