Genotype Sequence Segmentation

October 3rd, 2009

Recombination plays an important role in shaping the genetic variation present in current-day populations. We consider populations that evolved from a small number of founders, so that each individual’s genomic sequence is a mosaic of segments copied from the founder sequences. We study the problem of dividing genotype sequences into the minimum number of segments, each attributable to a single founder sequence. The minimum segmentation can be used to infer relationships among sequences and, in turn, to identify the genetic basis of traits, which is important for disease association studies.

In this project, we propose two dynamic programming algorithms that compute minimum segmentations of genotype sequences. Our algorithms run in polynomial time and respect a biological constraint of the genotype segmentation problem: the numbers of segments in the two haplotypes should be comparable. Moreover, our algorithms account for potential sources of noise in the data, including point mutations, gene conversions, genotyping errors, and missing values. [paper]
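To give a feel for the dynamic programming idea, here is a minimal sketch of a much simpler problem: segmenting a single haplotype (not a genotype) against known founders, with no noise handling and no balance constraint between haplotypes. It is not the algorithm from the paper. The function name `min_segments` and the string encoding of sequences are assumptions made for illustration.

```python
from typing import Sequence


def min_segments(target: str, founders: Sequence[str]) -> int:
    """Fewest founder segments that reproduce `target` exactly.

    Simplified haplotype-only illustration (hypothetical helper, not the
    published algorithm): no mutations, errors, or missing values allowed.
    """
    n = len(target)
    if any(len(f) != n for f in founders):
        raise ValueError("founders and target must have equal length")
    if n == 0:
        return 0

    inf = float("inf")
    # cost[k]: fewest segments covering target[0..i] when site i is copied
    # from founder k (inf if founder k does not match the target at site i).
    cost = [1 if f[0] == target[0] else inf for f in founders]

    for i in range(1, n):
        best_prev = min(cost)  # cheapest way to cover sites 0..i-1
        new_cost = []
        for k, f in enumerate(founders):
            if f[i] != target[i]:
                new_cost.append(inf)
            else:
                # Either extend the current segment from founder k,
                # or start a new segment here (one extra segment).
                new_cost.append(min(cost[k], best_prev + 1))
        cost = new_cost

    result = min(cost)
    if result == inf:
        raise ValueError("target cannot be explained by the given founders")
    return int(result)


if __name__ == "__main__":
    print(min_segments("0011", ["0000", "1111"]))
```

This sketch runs in time proportional to the sequence length times the number of founders. Handling genotypes would require tracking a pair of founders per site, and tolerating noise would require allowing a bounded number of mismatches within a segment; both are left out here.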

This tool is currently being rewritten to utilize our computing cluster.

Research Sponsor

NSF IIS 0448392: “CAREER: Mining Salient Localized Patterns in Complex Data”
NSF IIS 0812464: “III-Core: Discovering and Exploring Patterns in Subspaces”
