DSpace Repository

Cleaning and clustering of sensor data by k - means algorithm for efficient query processing


dc.contributor.advisor Latiful Hoque, Dr. Abu Sayed Md.
dc.contributor.author Muhidul Islam Khan, Md.
dc.date.accessioned 2015-12-26T10:38:47Z
dc.date.available 2015-12-26T10:38:47Z
dc.date.issued 2009-08
dc.identifier.uri http://lib.buet.ac.bd:8080/xmlui/handle/123456789/1560
dc.description.abstract The way sensor data is collected will change radically once the emerging technology of distributed sensor networks becomes fully functional and widely available. Distributed sensor networks are an attractive technology, but the program/stack memory and battery life of today's nodes do not allow complex data mining at runtime. Effective data mining can instead be implemented on the central base station, where computational power is generally not constrained. Today's real-world databases are highly susceptible to noisy, missing and inconsistent data because of their typically huge size and their likely origin from multiple, heterogeneous sources. Low-quality data leads to low-quality mining results. There are many possible reasons for noisy data (data with incorrect attribute values). The sensor nodes used for data collection may be faulty. Errors in data transmission can also occur. There may be technology limitations, such as limited buffer size for coordinating synchronized data transfer and consumption. Incorrect data may also result from inconsistencies in the naming conventions or data codes used, or from inconsistent formats for input fields. Duplicate tuples also require data cleaning. Preprocessing is required to remove noisy, missing and inconsistent data for efficient mining of Wireless Sensor Network (WSN) data. A number of research works have addressed mining WSN data. No research work has been found on preprocessing WSN data for efficient query processing. In this project, we have evaluated a number of statistical techniques for handling missing data. Among these techniques, mean before after was found most suitable for handling missing data. We have implemented the Approximate Duplicate Record Detection method to remove duplicate records from a dataset. We have used some WSN datasets available on the Internet for experimental purposes. The k-means algorithm has been applied to cluster the dataset. The cleaned and clustered dataset has shown better query processing performance than dirty, non-clustered data. en_US
dc.language.iso en en_US
dc.publisher Department of Computer Science and Engineering, BUET en_US
dc.subject Data mining en_US
dc.title Cleaning and clustering of sensor data by k - means algorithm for efficient query processing en_US
dc.type Thesis-MSc en_US
dc.contributor.id 100705049 P en_US
dc.identifier.accessionNumber 107379
dc.contributor.callno 005.759/MUH/2009 en_US
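
Illustrative sketch (not taken from the thesis): the abstract names two steps, filling missing readings with the "mean before after" technique and clustering the cleaned data with k-means. The Python below shows one plausible reading of each. It assumes that "mean before after" replaces a missing reading with the mean of the nearest known readings before and after it, and it uses a plain one-dimensional k-means. The function names fill_mean_before_after and kmeans_1d and the sample readings are hypothetical, and the thesis implementation may differ.

```python
import random


def fill_mean_before_after(values):
    """Fill each missing reading (None) with the mean of the nearest known
    readings before and after it. Assumed interpretation of the technique."""
    filled = list(values)
    n = len(values)
    for i, v in enumerate(values):
        if v is not None:
            continue
        # Nearest known reading to the left and to the right in the ORIGINAL data.
        before = next((values[j] for j in range(i - 1, -1, -1)
                       if values[j] is not None), None)
        after = next((values[j] for j in range(i + 1, n)
                      if values[j] is not None), None)
        known = [x for x in (before, after) if x is not None]
        if known:  # leave the gap if no known neighbour exists at all
            filled[i] = sum(known) / len(known)
    return filled


def kmeans_1d(points, k, iterations=20, seed=0):
    """Plain k-means on scalar readings; returns the centroids in sorted order."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initial centroids drawn from the data
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        # Assignment step: each point joins its nearest centroid.
        for p in points:
            nearest = min(range(k), key=lambda c: abs(p - centroids[c]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster;
        # an empty cluster keeps its previous centroid.
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return sorted(centroids)


if __name__ == "__main__":
    readings = [20.0, None, 22.0, 80.0, None, 82.0]  # hypothetical sensor values
    cleaned = fill_mean_before_after(readings)
    print(cleaned)
    print(kmeans_1d(cleaned, k=2))
```

Cleaning runs before clustering here because a missing reading has no distance to a centroid. Real WSN records usually carry several attributes, in which case the absolute difference would be replaced by a multi-dimensional distance such as Euclidean distance.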

