DSpace Repository

Conceptual clustering and classification of information using vector space model

dc.contributor.advisor Rahman, Dr. Chowdhury Mofizur
dc.contributor.author Faruk Ahmed, Md.
dc.date.accessioned 2016-01-10T03:47:24Z
dc.date.available 2016-01-10T03:47:24Z
dc.date.issued 2002-10
dc.identifier.uri http://lib.buet.ac.bd:8080/xmlui/handle/123456789/1625
dc.description.abstract Document clustering is a popular tool for organizing a large collection of documents. Clustering algorithms are usually applied to documents represented as vectors in a high-dimensional term space. The two main problems with this approach are clustering correlated documents accurately and determining the proper number of clusters. The first problem is addressed in the current literature in several ways, including active clustering, the partitional k-means algorithm, projection-based methods such as LSI, self-organizing maps, multidimensional scaling, graph-theoretic techniques and many more. As for the second problem, most clustering approaches assume the number of clusters as a prerequisite, as in the case of Markov State Cluster, partitional methods and most graph-theoretic techniques. A few clustering algorithms that can automatically determine the number of clusters have been analyzed. A popular approach is based on an idea borrowed from Principal Component Analysis. Another approach uses a self-refinement process of discriminative feature identification and cluster label voting to converge to the optimal number of clusters. In this work we have implemented an iterative solution with an inductive knowledge base to achieve the optimal clustering. Both the inter-cluster distance and the number of clusters are iteratively varied to achieve this optimization. This new technique for determining the number of clusters and clustering documents shows promising results, with 81% clustering accuracy. For classification we studied an unsupervised clustering technique together with the group vector, which also minimizes the computational cost usually associated with ordinary classification approaches. The outcome is comparable to current practice and gives 78% classification accuracy. en_US
dc.language.iso en en_US
dc.publisher Department of Computer Science and Engineering, BUET en_US
dc.subject Cluster analysis-Computer programme en_US
dc.title Conceptual clustering and classification of information using vector space model en_US
dc.type Thesis-MSc en_US
dc.contributor.id 040005017 F en_US
dc.identifier.accessionNumber 98233
dc.contributor.callno 005.1/FAR/2002 en_US
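
The abstract describes clustering documents as vectors in a term space without fixing the number of clusters in advance. The record does not give the thesis's algorithm, so the following is only a minimal illustrative sketch of that general idea, not the author's method: a simple threshold-based ("leader") clustering over term-frequency vectors with cosine similarity. The function names `tf_vector`, `cosine` and `threshold_cluster`, the whitespace tokenizer and the threshold values are all assumptions made for illustration.

```python
import math
from collections import Counter


def tf_vector(text):
    """Bag-of-words term-frequency vector (assumed: whitespace tokens, lowercased)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def threshold_cluster(docs, threshold):
    """Assign each document to the most similar existing cluster if the
    similarity reaches `threshold`; otherwise start a new cluster.
    The number of clusters is a result of the threshold, not an input."""
    centroids = []  # one summed term vector per cluster
    labels = []
    for doc in docs:
        vec = tf_vector(doc)
        sims = [cosine(vec, c) for c in centroids]
        best = max(range(len(sims)), key=sims.__getitem__, default=None)
        if best is not None and sims[best] >= threshold:
            centroids[best].update(vec)  # Counter.update adds term counts
            labels.append(best)
        else:
            centroids.append(Counter(vec))
            labels.append(len(centroids) - 1)
    return labels


docs = [
    "apple banana apple",
    "banana apple fruit",
    "car engine wheel",
    "engine car road",
]
# Varying the threshold changes how many clusters emerge.
for threshold in (0.3, 0.99):
    labels = threshold_cluster(docs, threshold)
    print(threshold, labels, "clusters:", len(set(labels)))
```

Sweeping the threshold, as in the loop above, is one simple way to see how a distance criterion and the resulting cluster count move together. The abstract says only that both quantities are varied iteratively, so how the thesis actually couples them remains unspecified here.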