Patricia Evans: Research
Most of this research focuses on Computational Biology, or Bioinformatics:
applying
computer science techniques to problems encountered in the field of Molecular
Biology, in situations that can be helped by computation. This work
spans from the more theoretical, such as the complexity of
finding common approximate substrings, to the more applied, such as designing
and building new tools for the Canadian Potato Genome Project. My interests
lie most specifically with applying relevant aspects of computer science
theory, such as algorithm design, graph theory, and parameterized complexity,
to computational biology. Ther are also potential applications to hypertext
analysis.
Partly funded by:
Projects:
Common Approximate Substrings
One form of motif finding, common for DNA or RNA sequences, is to look for
substrings that are found with a few mismatches in all (or most) of a set of
sequences. This work was a collaboration with Andrew Smith (recent UNB CS PhD
graduate) and Todd Wareham
of Memorial University, and also included collaboration with researchers (Ken Kent of UNB and Jackie Rice of U.Lethbridge) on FPGA designs to solve the problem.
Publications:
P.A. Evans, A.D. Smith, and H.T. Wareham.
On the complexity of finding common approximate substrings.
Theoretical Computer Science, 306(1-3), 407-430.
P.A. Evans and A.D. Smith. Hardness of approximating closest
substring problems.
Proceedings of 14th International Symposium on Foundations of
Complexity Theory (FCT 2003), Springer-Verlag LNCS 2748 (2003),
210-221.
P.A. Evans and A.D. Smith. Towards optimal motif enumeration.
Proceedings of Workshop on Algorithms and Data Structures (WADS 2003),
Springer-Verlag LNCS 2751 (2003), 47-58.
P.A. Evans, A.D. Smith, and H.T. Wareham.
The Parameterized Complexity
of p-Center Approximate Substring Problems.
Technical report TR01-149,
Faculty of Computer Science, University of New Brunswick.
K. Kent, S. Van Schaick, J. Rice, and P.A. Evans. Hardware-based implementation
of the common approximate substring algorithm.
Proceedings of 8th Euromicro Conference on Digital Systems Design (2005),
314-321.
RNA Structure Comparison and Analysis
RNA molecules fold to form 3-d structures based on largely pairwise bonds
between nucleic acids in the chain. These structures can be quite complex,
but have characteristics that can be used to help in their analysis.
Finding common substructures asssists in RNA classification and the
analysis of their functions. The complexity of these algorithms also led to
research into parallelizing them and other complex dynamic programming
recurrences, in collaboration with Dr. Eric Aubanel at UNB.
Publications:
P.A. Evans. Finding common RNA pseuodknot structures in polynomial time.
Proceedings of Combinatorial Pattern Matching (CPM 2006),
Springer-Verlag LNCS 4009 (2006), 223-232.
P.A. Evans and H.T. Wareham. Exact algorithms for computing pairwise
alignments and 3-medians from structure-annotated sequences.
Pacific Symposium on Biocomputing 6 (2001), 559-570.
P.A. Evans. Finding common subsequences with arcs and pseudoknots.
Proceedings of Combinatorial Pattern Matching (CPM 1999),
Springer-Verlag LNCS 1645
(1999),
270-280.
P.A. Evans.
Algorithms and Complexity for Annotated Sequence Analysis.
Ph.D. Thesis, University of Victoria, 1999.
Haplotyping and Phylogeny
Genetic information from organisms can be used to determine their relationships.
Alternatively, genotypes from individuals in known family relationships can
be used to determine their haplotypes.
Publications:
D. Doan and P.A. Evans. A haplotyping algorithm for non-recombinant pedigree
data containing missing members. Proceedings of the IEEE International
Conference on Bioinformatics and Biomedicine (BIBM) (2007), 275-281.
T. Asano, P. Evans, R. Uehara, and G. Valiente. Site consistency in phylogenetic
networks. Algorithms in Bioinformatics 6, London:College Publications,
2006, 15-26.
In this general area of research, I am also involved with an AIF-funded commercial
project developing a DNA-based tracking system for salmon farming.
Metabolic Network Analysis
We are currently exploring potentially
interesting problems in metabolic and regulatory network analysis, including
network comparison and network correction.
The Canadian Potato Genome Project
UNB Bioinformatics and GEELS researchers worked on potato research
in partnership with personnel at Genome Atlantic, Bioatlantech, the Government
of Canada's Potato Research Centre, Nova Scotia Agricultural College, and
Carleton University. The Bioinformatics Research component was co-led by
Dr. Patricia Evans and Dr. Virendra Bhavsar. We developed
tools to assist the project's potato genome researchers in their task of
discovering information about genes.
Funded by:
Applications of Parameterized Complexity
In addition to the uses of parameterized complexity in some of the above projects,
I am also working on applying the techniques to privacy problems,
and am involved with a larger project using parameterized complexity to investigate
issues with cognitive models such as analogy mapping. This latter project is
led by Iris van Rooij at Nijmegen.
Publications:
R. Chaytor, P.A. Evans, and H.T. Wareham. Fixed-parameter tractability of
anonymizing data by suppressing entries. Proceedings of the 2nd Annual
International Conference on Combinatorial Optimization and Applications (COCOA 2008),Springer-Verlag LNCS 5165 (2008), 23-31.
I. van Rooij, P. Evans, M. Muller, T. Gedge, and H.T. Wareham. Identifying
sources of intractability in cognitive models: an illustration using analogical
structure mapping. Proceedings of the 30th Annual Conference of the Cognitive
Science Society (CogSci 2008), 6 pgs.
Students:
Alumni:
- Jian Peng (MCS, 2000)
- Andrew Smith (PhD, 2004)
- Aijazuddin Syed (MCS, 2005)
- Marc Cooper (MCS, 2005)
- Andrew McKim (MCS, 2007)
- Zheng Wang (MCS, 2007)
- Mark Dowe (MCS, 2008)
- Eric Snow (MCS, 2009)
- Emad Bahrami Samani (MCS, 2009)
- En Zhang (MCS, 2010)
- Duong (Peter) Doan (PhD, 2011)
- Rachita Sharma (PhD)
- Steven Stewart (MCS)
Document: http://www.cs.unb.ca/profs/pevans/bioinfor.html
Last Revision : October 21, 2008