Abstract:
Structural alignment is the process of finding similarities between a pair of proteins based
on their three-dimensional shape. Accurate detection of such similarities could reveal
evolutionary relationships and predict the functions of different proteins. The algorithmic
solutions proposed so far rely on heuristics or approximation methods, but these could
provably deliver poor results in general. Moreover, exact solutions tend to be very slow
in practice. In this thesis we consider a graph-theoretic approach, which consists of
representing tertiary protein structures as labeled graphs, and interpreting the structural
alignment problem as a particular case of the Maximum Common Subgraph problem.
We discuss the utility of employing a recent algorithmic technique for solving Maximum
Common Subgraph. When applied to pairs of proteins, obtained from the Protein Data
Bank (PDB), the proposed algorithm is shown to deliver reasonably accurate results. In
order to improve the time-efficiency of this approach, a scalable parallel version is
presented.