Abstract:
Since the discovery of the Huffman encoding scheme in 1952, Huffman codes
have been widely used in the efficient storing of data, image and video. Huffman
coding has been subjected to numerous investigations in the past 60 years. Many
techniques have been proposed since then. But still this is an important field as it
significantly reduces storage requin:ment and communication cost. In this thesis,
"Study on Huffman Coding", we have directed our works mainly to Repeated
Huffman Coding that is a lossless compression technique.
The application of Huffman coding technique again and again on a file, then it is
called Repeated Huffman coding. While it is expected that encoded message
length will be smaller in every pass of Repeated Huffman coding, nevertheless
encoding the tree itself will be an overhead in each pass. So repetition count will
.depend upon how efficiently we can represent a Huffman tree.
A memory efficient representation of a Huffman tree is suggested in this thesis.
This representation is very important in the context of Repeated Huffman coding
where ultimate compression ratio depends upon the efficiency of encoding the
Huffman tree. It reduces overhead which incurs in every pass of Repeated
Huffman coding. Because of reduction of overhead, the Huffman Coding scheme
can be applied effectively on a file for a larger number of iterations.
The proposed algorithms require less than 15n/4l memory spaces in the worst case
to represent a Huffman tree for n distinct symbols, which is an improvement over
the existing technique that requires at most 13n/2l memory spaces.
Such a saving in the representation of Huffman trees will improve the
performance of the Repeated Huffman coding as well as Block Huffman coding
where multiple tree headers must be stored for a file oflarge size.
The tree clustering algorithm is also analyzed to get an optimal clustering of a
Huffman tree so that along with reduction of memory requirement search process
.for a symbol in a Huffman tree is speeded up.