DSpace Repository

Label based ensemble framework for multi-label data stream classification with recurring and novel class detection

dc.contributor.advisor Monirul Islam, Dr. Md.
dc.contributor.author Sajjadur Rahman
dc.date.accessioned 2016-05-10T03:50:10Z
dc.date.available 2016-05-10T03:50:10Z
dc.date.issued 2014-06
dc.identifier.uri http://lib.buet.ac.bd:8080/xmlui/handle/123456789/2980
dc.description.abstract Of late, the advent of online social media has led to a new form of data stream called the multi-label data stream, in which each stream record carries multiple class labels and a classifier must associate multiple categories with each record. Data streams present several challenges that any stream classification model has to deal with. Concept drift and infinite length under finite memory and processing time are the challenges that existing multi-label data stream classification models in the literature have addressed. In real-world applications that generate data streams, labeled data is usually very scarce compared to the entire stream. Moreover, with the ever-changing nature of the Internet and social media, the emergence of new classes of data in the stream is a common phenomenon. This phenomenon is known as concept evolution. When this emergence occurs periodically for some classes of data, it is called class recurrence. None of the existing methodologies addresses the scarcity of labeled data, concept evolution, or class recurrence. This thesis proposes a layered ensemble based classification framework (LEAD) for multi-label data streams. The primary component of the LEAD framework is a two-layer ensemble architecture. The top layer of the ensemble reflects the most recent concept of the data stream, whereas the bottom layer represents the older concepts of the stream. As a result, the bottom layer enables LEAD to classify recurrent class instances. The layered approach also helps to differentiate between recurrent and novel class instances, which significantly reduces the false alarm rate of novel class instance identification. LEAD deploys a fuzzy novel class detection technique to identify the emergence of novel concept(s) in the stream. The problem of limited labeled data is handled by a deferred classification mechanism, which allows more labeled data to appear in the stream and may thereby help develop a more informed classifier. Experimental results show that LEAD performs better than the baseline methods. en_US
dc.language.iso en en_US
dc.publisher Department of Computer Science and Engineering (CSE) en_US
dc.subject Data-Computer systems en_US
dc.title Label based ensemble framework for multi-label data stream classification with recurring and novel class detection en_US
dc.type Thesis-MSc en_US
dc.contributor.id 0412052022 P en_US
dc.identifier.accessionNumber 112733
dc.contributor.callno 005.7/SAJ/2014 en_US

