Abstract:
Naive Bayes (NB) is one of the most efficient and effective learning algorithms for
machine learning and data mining tasks, owing to its linear computational and memory
complexity and its ease of implementation. At its core, NB is a simplification of
the Bayes Optimal Classifier. A central difficulty with the Bayes Optimal
Classifier is the direct estimation of the class-conditional probability distribution (CPD)
from a given training data set with a high dimensional feature space while finding the
maximum a posteriori probability (MAP) hypothesis for an example whose class is
unknown. Estimating the CPD from such a training set requires that every possible
combination of attribute values be present in the training data, which is rarely the
case in real life learning domains. NB avoids this problem with the simplifying
assumption that attribute values are conditionally independent given the class value.
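In symbols, for attribute values a_1, ..., a_n and class c, the two classifiers can be
written as follows (a standard formulation consistent with the description above, not
quoted verbatim from the thesis):

    c_{MAP} = \arg\max_{c} P(c \mid a_1, \ldots, a_n) = \arg\max_{c} P(c)\, P(a_1, \ldots, a_n \mid c)

    c_{NB} = \arg\max_{c} P(c) \prod_{i=1}^{n} P(a_i \mid c)

The first argmax requires the full joint CPD; the second needs only the n univariate
conditionals, which is what makes NB linear in time and memory.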
If the attributes are truly conditionally independent given the class, NB makes the
same prediction as the Bayes Optimal Classifier, and NB is then optimal. In practice,
however, this independence assumption is almost always violated, and as a result the
classification accuracy of NB degrades in a large number of learning domains. In this
thesis, we propose a new Naive Bayesian classification algorithm that alleviates the
independence assumption problem, thereby improving the performance of NB and keeping
its predictions close to optimal across learning domains. In our algorithm, a measure
of attribute dependence is computed for each attribute: the dependence of each attribute
on the other attributes is estimated from the training data set using a dependency
equation.
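The dependency equation itself is developed in the thesis; as an illustrative stand-in
only, pairwise dependence between discrete attributes can be estimated with
class-conditional mutual information (all names below are hypothetical):

    import math
    from collections import Counter

    def conditional_mutual_information(xs, ys, cs):
        # Estimate I(X; Y | C) from parallel lists of discrete
        # attribute values xs, ys and class labels cs.
        # Stand-in dependence measure, NOT the thesis's
        # dependency equation.
        n = len(xs)
        n_xyc = Counter(zip(xs, ys, cs))
        n_xc = Counter(zip(xs, cs))
        n_yc = Counter(zip(ys, cs))
        n_c = Counter(cs)
        cmi = 0.0
        for (x, y, c), count in n_xyc.items():
            # p(x,y,c) * log[ p(x,y|c) / (p(x|c) p(y|c)) ]
            cmi += (count / n) * math.log(
                (count * n_c[c]) / (n_xc[(x, c)] * n_yc[(y, c)]))
        return cmi

The dependence of attribute i on the others could then be summarised, for example, as
the sum of this quantity over all other attributes j.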
The most interdependent attributes are then selected, based on their dependency
measure and a leave-one-out cross validation on the training data set. For a given
test example, a subset of training examples is chosen whose values on these attributes
match those of the test example, and a local NB is applied to this subset to
classify the test example.
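Putting the pieces together, a minimal sketch of this lazy, local procedure might look
as follows (the helper names, the fallback to a global NB, and the Laplace smoothing
are assumptions, not the thesis's exact formulation):

    import math
    from collections import Counter, defaultdict

    def classify_local_nb(test_x, train_X, train_y, selected_attrs, alpha=1.0):
        # Restrict the training data to the neighbourhood that matches
        # the test example on the selected (most interdependent)
        # attributes, then apply a standard NB with Laplace smoothing.
        subset = [(x, y) for x, y in zip(train_X, train_y)
                  if all(x[a] == test_x[a] for a in selected_attrs)]
        if not subset:  # no matching examples: assumed fallback to global NB
            subset = list(zip(train_X, train_y))
        class_counts = Counter(y for _, y in subset)
        value_counts = defaultdict(Counter)  # (attr, class) -> value frequencies
        for x, y in subset:
            for a in range(len(test_x)):
                value_counts[(a, y)][x[a]] += 1

        def log_posterior(c):
            # log P(c) + sum_i log P(a_i | c), estimated on the subset
            score = math.log(class_counts[c] / len(subset))
            for a in range(len(test_x)):
                n_values = len({x[a] for x, _ in subset})
                score += math.log(
                    (value_counts[(a, c)][test_x[a]] + alpha)
                    / (class_counts[c] + alpha * n_values))
            return score

        return max(class_counts, key=log_posterior)

Within the neighbourhood the selected attributes are constant, so they contribute
equally to every class score; only the remaining attributes differentiate the classes.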
The algorithm has been tested on a wide range of natural and artificial learning
domains taken from the UCI machine learning repository.
Experimental results show that the new algorithm obtains a lower error rate than the
NB classifier, BSEJ, and LBR. In some domains its error rate is also lower than that
of the decision tree learning algorithm C4.5 and the lazy decision tree learner LAZYDT.