Archive

Archive for September, 2007

References for Interval Paper

September 11th, 2007 Leonard McMillan No comments
  1. Efficient computation of close lower and upper bounds on the minimum number of recombinations in biological sequence evolution, Yun S. Song, Yufeng Wu and Dan Gusfield, Bioinformatics 2005 21(Suppl 1): i413-i422.
  2. A linear-time algorithm for the perfect phylogeny haplotyping (PPH) problem, Zhihong Ding, Vladimir Filkov, Dan Gusfield. Journal of Computational Biology. 2006, 13(2): 522-553.
  3. Inference about Recombination from Haplotype Data: Lower Bounds and Recombination Hotspots, Vineet Bafna and Vikas Bansal, Journal of Computational Biology. 2006, 13(2): 501-521.
  4. Perfect phylogeny and haplotype assignment, E. Halperin and R. M. Karp, RECOMB ‘04.
  5. Whole genome association mapping by incompatibilities and local perfect phylogenies, Thomas Mailund, Søren Besenbacher, and Mikkel H Schierup, BMC Bioinformatics 2006, 7:454
Categories: Uncategorized Tags:

Ideas about the new phylogeny tree construction algorithm

September 3rd, 2007 Feng Pan No comments

  The new algorithm processes SNPs one by one and constructs a phylogeny tree. Here is an example to illustrate the basic idea of the algorithm.

Example data: (4 strains, a,b,c and d. Each row is a strain, each column is a SNP)

a: 1   0   0   1   1
b: 0  0   1   0   1
c: 1   1    1   0   1
d: 1   0   1   1   0

The steps to construct the tree

In each step, the algorithm splits a node in the tree according to the new SNP. Here we assume that the new SNP is compatible with all the previous SNPs.  The algorithm is able to find out whether the new SNP is compatible or not during the construction.

Some key procedures in the algorithm

(1) given a new SNP, find the node in the tree to be splitted.

      Given two SNP, s4, s5
           s4    s5
          1      1
          0      1
          0      1
          1      0
       The strains that have ‘1′ on s4 are splitted by s5, because they have differnt allels on s5. While the strains that have ‘0′ on s4 are not splitted by s5, becuase they still have the same allel on s5. We say that s5 split s4 on ‘1′. According to the 4 gamete rule, two unique SNPs are compatible if both of them split either ‘0′ or ‘1′ of the other SNP.

       Then given a new SNP si, for each previous SNP sj, j from 1 to i-1, we can find how sj is splitted by si. For example, SNP s4 splits s1 on ‘1′, s2 on ‘0′ and s3 on ‘1′. We get a strain ‘101′,  if we concatenate these numbers.  This strain identifies an unique node in the tree which is exactly the node that need to be splitted in the current step. For example, ‘101′ identifies node ‘d’ in the current tree and it needs to be splitted in step 4. Note that, since only SNPs s1,s2 and s3 have been processed in the previous steps, node ‘d’ currently only represents the first 3 allels of strain d which is ‘101′.  

   There may be other ways to find this node. For example, the nodes in the tree can be divided into two groups according to their allels on the current SNP position. And the splitting node can be found accordingly. However, this method seems to be more complicated  and not consistent. 

(2) after the algorithm finds the node to be splitted, two new nodes are constructed. The strains represented by these two nodes have the same allels on all the previous SNPs and have the opposite allels on the current SNP. All the neighbors of the splitted nodes are re-linked to the two new nodes according to their allels on the current SNP, i.e., if it is 0, the neighbor node is linked to the new node having 0 on the current SNP, vice versa.

(3) for those additional nodes who represent the strains which are not in the given data, the allels on all the processed SNP positoins need to be decided according to the tree structure.¼/p>

Categories: Ideas, Research Tags:

CompGen Meeting

August 17th, 2007 Leonard McMillan No comments
September 27, 2007
3:30 pmto5:00 pm

Email Leonard any agenda items.

Categories: Events Tags:

CompGen Meeting

August 17th, 2007 Leonard McMillan No comments
September 20, 2007
3:30 pmto5:00 pm

This week Lynda will discuss her tool for selecting optimal validating F2 crosses when given a set of QTLs identified in a panel of RIL-strain test trials.

Categories: Events Tags:

CompGen Meeting

August 17th, 2007 Leonard McMillan No comments
September 13, 2007
3:30 pmto5:00 pm

Prof Wei Wang and I will discuss new approaches to linkage and association mapping of QTL. Wei will discuss approaches for correlating phenotypes to phylogenetic trees, and assessing the quality of their match. I will discuss approaches that resample at target positions (e.g. genes) rather than being limited to evaluations only at marker positions, by taking advantage of genome interval assignments.

A related paper can be found at:

Whole genome association mapping by incompatibilities and local perfect phylogenies

Categories: Events Tags:

CompGen Meeting

August 17th, 2007 Leonard McMillan No comments
September 6, 2007
3:30 pmto5:00 pm

Email Leonard any agenda items

Categories: Events Tags: