Distributed data warehouse management in a parallel environment using compressed relational structure

dc.contributor.advisor	Latiful Hoque, Dr. Abu Sayed Md.
dc.contributor.author	Fazlul Hasan Siddiqui
dc.date.accessioned	2015-12-19T09:04:48Z
dc.date.available	2015-12-19T09:04:48Z
dc.date.issued	2007-02
dc.identifier.uri	http://lib.buet.ac.bd:8080/xmlui/handle/123456789/1544
dc.description.abstract	Data warehousing is a key technology in everyday activity. A data warehouse usually contains historical data derived from transaction data, but it can include data from other sources. Its objective is to consolidate data from several sources and provide analysts and managers with strategic information underlying the business. Unfortunately, the emergence of e-applications has been creating extremely high volumes of data that reach the terabyte threshold. A conventional data warehouse management system is costly in terms of storage space and processing speed, and sometimes it is unable to handle such a huge amount of data. As a result, queries and analyses are becoming more complex and time consuming. Therefore, there is a crucial need for new algorithms and techniques to store and manipulate these data. Parallel and distributed data warehouse architectures have evolved to support online queries on massive data in a short time. Database compression can be used for scalable storage and faster data access. In this thesis, we present a compression-based distributed data warehouse architecture that stores warehouse data and supports online queries efficiently. We have achieved a compression factor of 25-30 compared to a conventional SQL Server data warehouse. The main computational component of a data warehouse is the generation and querying of the data cube. Our algorithm generates the data cube directly from the compressed form of the data in parallel. The size of the data cube is reduced by a factor of 30-45 compared to existing methods. The response time has also been significantly improved. These improvements are achieved through the elimination of suffix and prefix redundancy, the virtual nature of the data cube, direct addressability of the compressed form of the data, and parallel computation. Experimental evaluation shows improved performance over existing systems.	en_US
dc.language.iso	en	en_US
dc.publisher	Department of Computer Science and Engineering, BUET	en_US
dc.subject	Data management	en_US
dc.title	Distributed data warehouse management in a parallel environment using compressed relational structure	en_US
dc.type	Thesis-MSc	en_US
dc.contributor.id	040405047 P	en_US
dc.identifier.accessionNumber	103149
dc.contributor.callno	005.74/FAZ/2007	en_US