New learning algorithm of native bayesian classification

BUET ILS
BUET Institutional Repository: Home
→
Dissertations/Theses
→
Dissertations/Theses - Department of Computer Science and Engineering
→
View Item

dc.contributor.advisor	Rahman, Dr. Chowdhury Mofizur
dc.contributor.author	Shamsul Huda, Md.
dc.date.accessioned	2016-01-06T09:18:18Z
dc.date.available	2016-01-06T09:18:18Z
dc.date.issued	2003-05
dc.identifier.uri	http://lib.buet.ac.bd:8080/xmlui/handle/123456789/1605
dc.description.abstract	Naive Bayes (NB) is one of the most efficient and effective learning algorithms for machine leaming and data mining tasks due to its linear computational and memory complexities and easier implementation technique. The main focus of NB is the simplification of Bayes Optimal Classifier. A common problem in Bayes Optimal Classifier is the direct estimation of class-conditional probability distribution (CPD) hom a given training data set with high dimensional feature space while finding thc maximum a posteriori probability (MAP) hypothesis for a given example whose prediction is not specified. Estimation of CPD from a given training data set with high dimensional feature space requires that every possible combination of attribute values must be available in training data which is usually not found in real life learning domains. NB uses some approximations to eliminate this problem by using the simplifying assumption that attribute values are conditionally independent given the class values. If all the attributes are truly independent, NB makes the same prediction as Bayes Optimal Classifier and NB is said to be working perfectly. But this independence assumption is almost always violated in practice and as a result classification accuracy of NB degrades in a large number of leaming domains. In this thesis, wc propose a new learning algorithm of Naive Bayesian classification to alleviate the independence assumption problem in NB: thereby, improving the performance of NB and sustaining its optimality to all learning domains which makes it universal. In our algorithm, a measure of attribute dependence is considered for each attrihute. Attrihute dependence of each attribute on other attributes is estimated from training data sel with the help of dependency equation. Most interdependent attributes are selected based on their dependency and by applying a leave-one-out cross validation on training data set. A subset of examples is chosen using these attributes with their values in a test example. A local NB is applied on this subset to classify the test example. The algorithm has been tested on a wide range of natural and al1ificial learning domains taken from UCI machine learning repository. Experimental result shows that the new algorithm obtains a lower error rate than that of NB classifier, BSEJ and LBR. In some domains its enor rate is lower than that of modern decision tree learning algorithm C4.5, LAZYDT also.	en_US
dc.language.iso	en	en_US
dc.publisher	Department of Computer Science and Engineering, BUET	en_US
dc.subject	Algorithm - Bayesian - Classification	en_US
dc.title	New learning algorithm of native bayesian classification	en_US
dc.type	Thesis-MSc	en_US
dc.contributor.id	040005023 P	en_US
dc.identifier.accessionNumber	98033
dc.contributor.callno	SHA/2003	en_US