Scalable storage in compressed representation for terabyte data management

Abdur Rouf, Mohammad

URI: http://lib.buet.ac.bd:8080/xmlui/handle/123456789/1404

Date: 2006-04

Abstract:

The emergence of e-applications has been generating extremely high volumes of data, reaching the terabyte threshold. Many organizations produce data volumes that double every year. Conventional data management systems are costly in terms of storage space and processing speed, and are sometimes unable to handle such huge amounts of data. New algorithms and techniques need to be developed to store and manipulate these data. Database compression can be used for scalable storage and faster data access. We propose a compression-based data management system architecture that can handle terabyte-level relational data. Existing compression schemes, e.g. HIBASE or the Three Layer Database Compression Architecture, work well for memory-resident data and provide good performance. They are low-cost solutions for high-performance data management but do not scale to terabyte-level data. We have developed a disk-based columnar multi-block vector structure (CMBVS) that can store relational data in a compressed representation with direct addressability. Parallel data access can be achieved by distributing the vector structure across multiple servers, which improves scalability. The lowest layer of the model is the block structure, which stores the compressed representation of the data. The next higher level is the vector structure, which relates the block structure to an attribute of the relational data model. The structures can execute queries directly on the compressed form of the data, which reduces query time drastically. We have compared our system with a conventional relational DBMS. The experimental results show that our system is about twenty-five times more efficient in storage cost and twenty-seven to seventy-seven times faster in retrieval time than conventional systems.
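
The layering described in the abstract (blocks of compressed codes at the bottom, one vector of blocks per attribute above them) can be illustrated with a small in-memory sketch. This is not the thesis implementation: the class name `CompressedColumn`, the `block_size` parameter, and the use of a simple dictionary encoding are assumptions made for illustration only, and the real CMBVS is disk based and may be organized differently.

```python
from array import array


class CompressedColumn:
    """One attribute stored as a dictionary plus fixed-size blocks of codes.

    Hypothetical illustration; not the CMBVS code from the thesis.
    """

    def __init__(self, block_size=4):
        self.block_size = block_size  # codes per block (assumed parameter)
        self.dictionary = {}          # value -> integer code
        self.values = []              # integer code -> value
        self.blocks = []              # the "vector" of blocks for this attribute

    def append(self, value):
        # Assign a new code the first time a value is seen.
        code = self.dictionary.get(value)
        if code is None:
            code = len(self.values)
            self.dictionary[value] = code
            self.values.append(value)
        # Start a new block when the last one is full.
        if not self.blocks or len(self.blocks[-1]) == self.block_size:
            self.blocks.append(array("I"))
        self.blocks[-1].append(code)

    def get(self, row):
        # Direct addressability: the row number alone locates block and offset.
        block, offset = divmod(row, self.block_size)
        return self.values[self.blocks[block][offset]]

    def select_equal(self, value):
        # Query on the compressed form: translate the constant once,
        # then compare integer codes without decompressing any row.
        code = self.dictionary.get(value)
        if code is None:
            return []
        return [
            b * self.block_size + i
            for b, block in enumerate(self.blocks)
            for i, c in enumerate(block)
            if c == code
        ]


city = CompressedColumn(block_size=4)
for v in ["Dhaka", "Khulna", "Dhaka", "Sylhet", "Dhaka", "Khulna"]:
    city.append(v)
```

In this sketch, `city.get(3)` decodes a single row without touching the others, and `city.select_equal("Dhaka")` scans only integer codes. Because the blocks are independent, they could in principle be placed on different servers and scanned in parallel, which is the scalability idea the abstract describes.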

Files in this item

Name: Full Thesis .pdf

Size: 1.049Mb

Format: PDF

This item appears in the following Collection(s)

Dissertations/Theses - Department of Computer Science and Engineering
Post graduate dissertations (Theses) of Computer Science Engineering (CSE)
