Abstract:
Three dimensional structure prediction of a protein from its amino acid sequence, known
as protein folding, is one of the most studied computational problem in bioinformatics and
computational biology. Since, this is a hard problem, a number of simpli ed models have
been proposed in literature to capture the essential properties of this problem. An important
class of simpli ed models are known as the lattice models. Lattice models have proven to
be extremely useful tools for reasoning about the complexity of protein folding problems.
Hydrophobic-Polar (HP) model is one of the lattice models where the main force in the
folding process is the hydrophobic-hydrophobic force.
In this thesis, we present three approximation algorithms to solve the protein folding
problem for di erent lattices in HP model. Our rst approxiation algorithm solves protein
folding problem in 2D triangular lattice. The algorithm is polynomial in terms of the length
of the given HP string. The expected approximation ratio of our algorithm is 1
2 log n
n 1
for
n 6, where n2 is the total number of H's in a given HP string. The expected approximation
ratio tends to reach 1 for large values of n. Hence our algorithm is expected to perform very
well for larger HP strings. Our second and third approximation algorithms solve protein
folding problem on hexagonal lattices with diagonals. The second algorithm, based on
partitioning approaches, achieves an approximation ratio of
3
5
. We furthermore present
third algorithm, which gives better approximation ratio than
3
5
for most of the case. We
also brie
y discuss the issue of the reduced alphabet of amino acids. We present the energy
matrix which is the building block of the protein folding problem for some reduced alphabets.