DSpace Repository

Improving query execution performance for database of big data

dc.contributor.advisor Adnan, Dr. Muhammad Abdullah
dc.contributor.author Mosharraf, Sharafat Ibn Mollah
dc.date.accessioned 2019-03-18T04:12:16Z
dc.date.available 2019-03-18T04:12:16Z
dc.date.issued 2018-09-24
dc.identifier.uri http://lib.buet.ac.bd:8080/xmlui/handle/123456789/5146
dc.description.abstract Performance is a critical concern when reading and writing data among billions of records stored in a Big Data warehouse. Many researchers have proposed improving query execution performance in distributed Big Data systems through techniques such as indexing, caching, filtering, MapReduce, query execution planning, and data partitioning. In this thesis, we introduce two further avenues for improving query performance. The first is to improve the performance of lookup queries after data deletion in Big Data systems that use the eventual consistency model. We propose replacing the Bloom filter with the Cuckoo filter, a probabilistic data structure better suited to this case because it supports deletion of elements. The second is to avoid unnecessary network round trips for query execution on remote nodes in a Big Data cluster when it is known that those nodes do not hold the requested data partition. We propose a scheme in which probabilistic filters are consulted before a query is delegated to remote nodes, so that queries that would return no data are never sent over the network. We evaluate our schemes on a popular Big Data database (Cassandra) and show that each scheme can improve the performance of lookup queries by up to 100%. We also show that the proposed schemes do not degrade the performance of other data manipulation queries as a side effect. en_US
dc.language.iso en en_US
dc.publisher Department of Computer Science and Engineering en_US
dc.subject Big data en_US
dc.title Improving query execution performance for database of big data en_US
dc.type Thesis-MSc en_US
dc.contributor.id 0412052019 en_US
dc.identifier.accessionNumber 116869
dc.contributor.callno 005.7/SHA/2018 en_US
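
The two ideas summarized in the abstract can be illustrated with a small sketch. This is not the thesis implementation and it does not use any Cassandra API. It is a minimal, self-contained Cuckoo filter in Python, written under assumptions chosen only for illustration: 16-bit fingerprints, four slots per bucket, and a power-of-two bucket count. The class name `CuckooFilter`, the helper `nodes_to_query`, and the node and key names in the usage note are all hypothetical.

```python
import hashlib
import random


class CuckooFilter:
    """Illustrative cuckoo filter: stores short fingerprints and supports delete."""

    def __init__(self, num_buckets=1024, bucket_size=4, max_kicks=500, seed=0):
        # Assumption: a power-of-two bucket count keeps the XOR trick in range.
        assert num_buckets > 0 and num_buckets & (num_buckets - 1) == 0
        self.num_buckets = num_buckets
        self.bucket_size = bucket_size
        self.max_kicks = max_kicks
        self.buckets = [[] for _ in range(num_buckets)]
        self.rng = random.Random(seed)

    @staticmethod
    def _hash(data: bytes) -> int:
        return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")

    def _fingerprint(self, key: str) -> int:
        # 16-bit fingerprint; zero is mapped to 1 so it is never empty.
        return (self._hash(b"fp:" + key.encode()) & 0xFFFF) or 1

    def _index(self, key: str) -> int:
        return self._hash(key.encode()) % self.num_buckets

    def _alt_index(self, index: int, fp: int) -> int:
        # Partial-key cuckoo hashing: the alternate bucket depends only on
        # the current bucket and the fingerprint, so it is its own inverse.
        return (index ^ self._hash(fp.to_bytes(2, "big"))) % self.num_buckets

    def _candidates(self, key: str):
        fp = self._fingerprint(key)
        i1 = self._index(key)
        return fp, i1, self._alt_index(i1, fp)

    def insert(self, key: str) -> bool:
        fp, i1, i2 = self._candidates(key)
        for i in (i1, i2):
            if len(self.buckets[i]) < self.bucket_size:
                self.buckets[i].append(fp)
                return True
        # Both buckets are full: evict a resident fingerprint and relocate it.
        i = self.rng.choice((i1, i2))
        for _ in range(self.max_kicks):
            slot = self.rng.randrange(self.bucket_size)
            fp, self.buckets[i][slot] = self.buckets[i][slot], fp
            i = self._alt_index(i, fp)
            if len(self.buckets[i]) < self.bucket_size:
                self.buckets[i].append(fp)
                return True
        return False  # Filter is too full; a real system would resize.

    def contains(self, key: str) -> bool:
        # May return a false positive, never a false negative for stored keys.
        fp, i1, i2 = self._candidates(key)
        return fp in self.buckets[i1] or fp in self.buckets[i2]

    def delete(self, key: str) -> bool:
        # Only delete keys that were actually inserted.
        fp, i1, i2 = self._candidates(key)
        for i in (i1, i2):
            if fp in self.buckets[i]:
                self.buckets[i].remove(fp)
                return True
        return False


def nodes_to_query(partition_key, node_filters):
    """Return only the nodes whose filter says the partition might be present."""
    return [node for node, f in node_filters.items() if f.contains(partition_key)]
```

As a usage example, a coordinator could keep one filter per remote node (say a hypothetical `{"node-a": CuckooFilter()}`), call `insert("user:1")` when the node stores that partition, and call `nodes_to_query("user:1", filters)` before sending a lookup. After `delete("user:1")`, the filter stops reporting the key, so the lookup can be skipped without a network round trip. A Bloom filter in the same position could not remove the key and would keep reporting it as possibly present.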

