<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	>

<channel>
	<title>UNC Computational Genetics</title>
	<atom:link href="http://compgen.unc.edu/wp/?feed=rss2" rel="self" type="application/rss+xml" />
	<link>http://compgen.unc.edu/wp</link>
	<description>A systems genetics research team at UNC-Chapel Hill</description>
	<pubDate>Mon, 05 Oct 2009 13:28:21 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.7.1</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Home</title>
		<link>http://compgen.unc.edu/wp</link>
		<comments>http://compgen.unc.edu/wp#comments</comments>
		<pubDate>Fri, 06 Jul 2007 16:39:40 +0000</pubDate>
		<dc:creator>Administrator</dc:creator>
		
		<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://compgen.unc.edu/blog/?page_id=6</guid>
		<description><![CDATA[Welcome to the UNC Computational Genetics Working Group
We are a multidisciplinary research group focused on a new field of integrated biomedical research called systems genetics.   Systems genetics is a non-reductionist field that was simply not practical even a few years ago: it relies on diverse multiscale and multiorgan phenotype data sets obtained from [...]]]></description>
			<content:encoded><![CDATA[<h2>Welcome to the UNC Computational Genetics Working Group</h2>
<p><img title="Old Well Transcription Factor" src="/images/oldwelDNAl.png" alt="Old Well Transcription Factor" align="left" />We are a multidisciplinary research group focused on a new field of integrated biomedical research called <em>systems genetics</em>. Systems genetics is a non-reductionist field that was simply not practical even a few years ago: it relies on diverse multiscale and multiorgan phenotype data sets obtained from large segregating populations. Systems genetics is a biological science that relies on statistical methods, advanced computational algorithms, visualization, and high-performance computing. It has the goal and potential to dissect and reassemble complex molecular and phenotypic networks in the context of natural genetic variation. Our group is a collaboration between members of the Departments of Biostatistics, Computer Science, Environmental Science and Engineering, and Genetics.</p>
<h2>Meetings</h2>
<p>During the fall semester, the UNC Computational Genetics Working Group will hold its regular weekly group meetings on Thursday afternoons from 5:00-6:30 pm in <a title="Floorplan of Sitterson/Brooks" href="http://www.cs.unc.edu/Resources/Sitterson/Floorplans/SittBrooks-3.gif">FB007 of Brooks Hall, the Computer Science building</a>.</p>
<h2>News</h2>
<dl>
<dt><strong>March 4, 2009:</strong></dt>
<dd>UNC Compgen has begun construction on a pool of computational resources to support bioinformatics research and the Collaborative Cross. Over the next month we&#8217;ll be bringing online a large NAS storage array for terabyte-size datasets, as well as a high-performance compute cluster to support the tools developed at UNC for bioinformatics research.</dd>
<dt><strong>December 5, 2008:</strong></dt>
<dd>In January, the U.S. Department of Energy’s (DOE) Oak Ridge National Laboratory (ORNL) is moving its unique colony of 8,000 mice, known as the <a href="http://compgen.unc.edu/?page_id=99">Collaborative Cross</a>, to the University of North Carolina at Chapel Hill. (<a href="http://genomics.unc.edu/articles/081205OakridgeMice.html">More information</a>)</dd>
<dt><strong>October 31, 2008:</strong></dt>
<dd> Compgen.unc.edu has been replaced with a new server. Databases and websites should be functional again. Please contact hulbert@email.unc.edu if you are having trouble accessing any resources.</dd>
<dt><strong>May 15, 2007:</strong></dt>
<dd> CompGen member Prof. Wei Wang, 2007-2008 recipient of the Phillip &amp; Ruth Hettleman Prize for Artistic and Scholarly Achievement, will be presenting her award lecture on <em>Surfing the Data Flood </em>at The Carolina Club from 2pm-4pm. Refreshments will be served. </dd>
</dl>
<h2>Research Sponsors</h2>
<p><a href="http://compgen.unc.edu/wp/?page_id=344"><strong>EPA STAR RD832720:</strong> &#8220;Environmental Bioinformatics Research Center to Support Computational Toxicology Applications&#8221;</a></p>
<p><a href="http://compgen.unc.edu/wp/?page_id=483"><strong>NSF IIS 0534580:</strong> &#8220;Visualizing and Exploring High-dimensional Data&#8221;</a></p>
<p><a href="http://compgen.unc.edu/wp/?page_id=358"><strong>NSF IIS 0448392:</strong> &#8220;CAREER: Mining Salient Localized Patterns in Complex Data&#8221;</a></p>
<p><a href="http://compgen.unc.edu/wp/?page_id=424"><strong>NSF IIS 0812464:</strong> &#8220;III-Core: Discovering and Exploring Patterns in Subspaces&#8221;</a></p>
<p><a href="http://compgen.unc.edu/wp/?page_id=356"><strong>NIH U01 CA105417:</strong> &#8220;Integrative Genetics of Cancer Susceptibility&#8221;</a></p>
<p><a href="http://compgen.unc.edu/wp/?page_id=473"><strong>NIH U01 CA134240:</strong> &#8220;Systems Genetics Research Consortium&#8221;</a></p>
<p><a href="http://compgen.unc.edu/wp/?page_id=354"><strong>NIH GM 076468:</strong> &#8220;The Center for Genome Dynamics at Jackson Laboratory:<br />
An NIGMS National Center of Systems Biology&#8221;</a></p>
<p><a href="http://compgen.unc.edu/wp/?page_id=398"><strong>UCRF:</strong> &#8220;University Cancer Research Fund&#8221;</a></p>
<p><a href="http://compgen.unc.edu/wp/?page_id=501"><strong>Microsoft Research Grant</strong></a></p>
<p><a href="http://compgen.unc.edu/wp/?page_id=366"><strong>Microsoft New Faculty Fellows</strong></a></p>
<h2>Join us!</h2>
<p>If you have a problem related to systems biology or genetics, are in need of an analysis or visualization tool for your data,  or you would like to join our research group, please contact Leonard McMillan via email ( <img title="Leonard's email" src="http://compgen.unc.edu/images/emails/mcmillan.gif" alt="Leonard's email" align="absmiddle" />).</p>
]]></content:encoded>
			<wfw:commentRss>http://compgen.unc.edu/wp/?feed=rss2&amp;page_id=6</wfw:commentRss>
		</item>
		<item>
		<title>Genotype Sequence Segmentation</title>
		<link>http://compgen.unc.edu/wp/?page_id=253</link>
		<comments>http://compgen.unc.edu/wp/?page_id=253#comments</comments>
		<pubDate>Mon, 13 Apr 2009 18:19:16 +0000</pubDate>
		<dc:creator>cwelsh</dc:creator>
		
		<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://compgen.unc.edu/wp/?page_id=253</guid>
		<description><![CDATA[Recombination plays an important role in shaping the genetic variations present in current-day populations. We consider populations evolved from a small number of founders, where each individual&#8217;s genomic sequence is composed of segments from the founders. We study the problem of segmenting the genotype sequences into the minimum number of segments attributable to the founder [...]]]></description>
			<content:encoded><![CDATA[<p>Recombination plays an important role in shaping the genetic variations present in current-day populations. We consider populations evolved from a small number of founders, where each individual&#8217;s genomic sequence is composed of segments from the founders. We study the problem of segmenting the genotype sequences into the minimum number of segments attributable to the founder sequences. The minimum segmentation can be used for inferring the relationship among sequences to identify the genetic basis of traits, which is important for disease association studies.</p>
<p>In this project, we propose two dynamic programming algorithms to compute the minimum segmentations for genotype sequences. Our algorithms run in polynomial time and respect a biological constraint of the genotype segmentation problem, <i>i.e.</i>, that the numbers of segments in the two haplotypes are comparable. Moreover, our algorithms account for potential noise sources in the data, including point mutations, gene conversions, genotyping errors, and missing values. <a href="http://compgen.unc.edu/wp/wp-content/uploads/2008/07/minseg-final.pdf">[paper]</a></p>
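<p>As a rough illustration of the dynamic programming idea, the sketch below handles only a simplified case: a single phased haplotype that must be spelled exactly from founder sequences, with no mutations, genotyping errors, or missing values. It is not the algorithm from the paper; the function name <code>min_segments</code> and the toy founder strings are hypothetical.</p>

```python
import math


def min_segments(target, founders):
    """Fewest founder segments needed to spell `target` exactly.

    Simplified, assumed setting: one phased haplotype, founders of the
    same length as the target, no noise. Returns math.inf when some
    position of the target matches no founder.
    """
    if not target:
        return 0
    # cost[k] = fewest segments covering the prefix seen so far,
    # given that founder k is the one copied at the current position.
    cost = [1 if f[0] == target[0] else math.inf for f in founders]
    for i in range(1, len(target)):
        best_prev = min(cost)  # cheapest cover of the prefix by any founder
        cost = [
            # Either extend founder k's segment, or switch to k (one more segment).
            min(cost[k], best_prev + 1) if founders[k][i] == target[i] else math.inf
            for k in range(len(founders))
        ]
    return min(cost)


# Hypothetical toy data: three founders, six markers each.
founders = ["AAAAAA", "CCCCCC", "AACCAA"]
print(min_segments("AACCCC", founders))  # 2: "AACC" from the third founder, "CC" from the second
```

<p>Each step only needs the costs from the previous marker, so the sketch runs in time proportional to the number of markers times the number of founders. Handling unphased genotypes and noisy data, as described above, requires a larger state space than this example uses.</p>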
<p>This tool is currently being rewritten to utilize our computing cluster.</p>
<h2>Research Sponsor</h2>
<p><a href="http://compgen.unc.edu/wp/?page_id=358"><b>NSF IIS 0448392</b>: &ldquo;CAREER: Mining Salient Localized Patterns in Complex Data&rdquo;</a><br />
<a href="http://compgen.unc.edu/wp/?page_id=424"><b>NSF IIS 0812464</b>: &ldquo;III-Core: Discovering and Exploring Patterns in Subspaces&rdquo;</a></p>
]]></content:encoded>
			<wfw:commentRss>http://compgen.unc.edu/wp/?feed=rss2&amp;page_id=253</wfw:commentRss>
		</item>
		<item>
		<title>Gene Expression Extract: Tool for extraction of subsets from gene expression data</title>
		<link>http://compgen.unc.edu/wp/?page_id=537</link>
		<comments>http://compgen.unc.edu/wp/?page_id=537#comments</comments>
		<pubDate>Thu, 03 Sep 2009 20:44:28 +0000</pubDate>
		<dc:creator>kemal</dc:creator>
		
		<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://compgen.unc.edu/wp/?page_id=537</guid>
		<description><![CDATA[Gene Expression Extraction and Analysis Tool
This web tool allows one to extract a subset of gene expression data by specifying subsets of genes, probes, and strains. Clustering analysis can also be run on the extracted data. An algorithm called SAFE is also integrated so that enrichment of biological pathways can be tested.
Research Sponsor
EPA STAR RD832720: &#8220;Environmental Bioinformatics [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://compgen.unc.edu/GeneExprExtract">Gene Expression Extraction and Analysis Tool</a></p>
<p>This web tool allows one to extract a subset of gene expression data by specifying subsets of genes, probes, and strains. Clustering analysis can also be run on the extracted data. An algorithm called SAFE is also integrated so that enrichment of biological pathways can be tested.</p>
<h2>Research Sponsor</h2>
<p><a href="http://compgen.unc.edu/wp/?page_id=344"><b>EPA STAR RD832720</b>: &#8220;Environmental Bioinformatics Research Center to Support Computational Toxicology Applications&#8221;</a></p>
]]></content:encoded>
			<wfw:commentRss>http://compgen.unc.edu/wp/?feed=rss2&amp;page_id=537</wfw:commentRss>
		</item>
		<item>
		<title>Environmental Bioinformatics Research Center to Support Computational Toxicology Applications</title>
		<link>http://compgen.unc.edu/wp/?page_id=344</link>
		<comments>http://compgen.unc.edu/wp/?page_id=344#comments</comments>
		<pubDate>Mon, 10 Aug 2009 01:42:48 +0000</pubDate>
		<dc:creator>jjsun</dc:creator>
		
		<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://compgen.unc.edu/wp/?page_id=344</guid>
		<description><![CDATA[
EPA STAR RD832720 (October 1, 2005 ~ September 30, 2010) 

The objectives of the Carolina Environmental Bioinformatics Center are to enhance and advance the field of Computational Toxicology.  The Center develops novel analytic and computational methods, creates efficient user-friendly tools to disseminate the methods to the wider community, and applies the computational methods to [...]]]></description>
			<content:encoded><![CDATA[<p><img alt="" src="images/sponsorLogo/logo_epaseal.gif" /><br />
<b><a href="http://cfpub.epa.gov/ncer_abstracts/index.cfm/fuseaction/display.abstractDetail/abstract/7737">EPA STAR RD832720</a> (October 1, 2005 ~ September 30, 2010) </b></p>
<p>
The objectives of the Carolina Environmental Bioinformatics Center are to enhance and advance the field of Computational Toxicology.  The Center develops novel analytic and computational methods, creates efficient user-friendly tools to disseminate the methods to the wider community, and applies the computational methods to data from molecular toxicology and other studies.
</p>
<p>
The center is divided into three Research Projects and an Administrative Unit. </p>
<ul>
<li>Project 1 (Biostatistics in Computational Biology) provides biostatistical support to the Center, performs data analysis at the US EPA and develops new methods in collaboration with EPA personnel and the computational toxicology community.</li>
<li>Project 2 (Chem-informatics) coordinates the compilation and mining of data from relevant external databases and performs analysis and methods development for investigating Quantitative Structure-Activity Relationships with burgeoning high-throughput chem-informatics data.</li>
<li>Project 3 (Computational Infrastructure for Systems Toxicology) works to create a framework for merging data from various –omic technologies in a systems biology approach.</li>
<li>The Administration Core provides staff and support to the Center, and provides oversight for each of the Functional Areas. Public Outreach and Translation Activity (POTA) ensures that the activities of the Center are translated into useable information and materials for the public and policy makers.</li>
</ul>
<p>
The Center is advancing the field of computational toxicology through the development of new methods and tools, as well as through direct collaborative efforts with EPA and other environmental scientists. In each Project, new methods are being developed and published that represent the state-of-the-art. The tools developed within each project are disseminated, and will be useful both to trained bioinformatics scientists and bench scientists. The synthesis of data from a variety of sources will move the field of computational toxicology from a hypothesis-driven science toward a predictive science.
</p>
<h2>Personnel</h2>
<p><b>Investigators:</b></p>
<ul>
<li>Fred Wright (PI)</li>
<li>Fei Zou (co-PI)</li>
<li>Ivan Rusyn (co-PI)</li>
<li><a href="http://www.cs.unc.edu/~mcmillan/">Leonard McMillan</a> (co-PI)</li>
</ul>
<p><b>Students:</b><br />
Kemal Pakatci</p>
<h2>Projects</h2>
<ul>
<li><a href="http://compgen.unc.edu/wp/?page_id=57">NPUTE</a></li>
<li><a href="http://compgen.unc.edu/wp/?page_id=390">Genetic Diversity of Mus musculus Laboratory Strains</a></li>
<li><a href="http://compgen.unc.edu/wp/?page_id=275">FastANOVA: an Efficient Algorithm for Genome-Wide Association Study</a></li>
<li><a href="http://compgen.unc.edu/wp/?page_id=95">FastMap</a></li>
<li><a href="http://compgen.unc.edu/wp/?page_id=537">GeneExprExtract</a></li>
</ul>
<h2>Publications</h2>
<ol>
<li><a target="_blank" href="http://compgen.unc.edu/wp/wp-content/uploads/2007/07/ismb2007.pdf" title="Click to download PDF">Inferring missing genotypes in large SNP panels using fast nearest-neighbor searches over sliding windows</a>, by Adam Roberts, Leonard McMillan, Wei Wang, Joel Parker, Ivan Rusyn, and David Threadgill, <strong><em>Proceedings of the 15th Annual International Conference on Intelligent Systems for Molecular Biology (ISMB)</em></strong>, 2007.</li>
<li><a href="http://compgen.unc.edu/wp/wp-content/uploads/2007/RobertsMamGenome2007.pdf" title="Click to download PDF">The polymorphism architecture of mouse genetic resources elucidated using genome-wide resequencing data: implications for QTL discovery and systems genetics</a>, by Adam Roberts, Fernando Pardo-Manuel de Villena, Wei Wang, Leonard McMillan, and David Threadgill, <strong><em>Mammalian Genome</em>, </strong>Aug 3, 2007.</li>
<li><a href="http://compgen.unc.edu/wp/wp-content/uploads/2008/09/SIGKDD08.pdf">FastANOVA: an efficient algorithm for genome-wide association study</a>, by Xiang Zhang, Fei Zou, and Wei Wang. <em><strong>Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (SIGKDD&#8217;08)</strong></em>.</li>
</ol>
]]></content:encoded>
			<wfw:commentRss>http://compgen.unc.edu/wp/?feed=rss2&amp;page_id=344</wfw:commentRss>
		</item>
		<item>
		<title>CISGen: Systems Genetics of Psychiatric Disorders</title>
		<link>http://compgen.unc.edu/wp/?page_id=135</link>
		<comments>http://compgen.unc.edu/wp/?page_id=135#comments</comments>
		<pubDate>Wed, 05 Nov 2008 18:53:05 +0000</pubDate>
		<dc:creator>hulbert</dc:creator>
		
		<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://compgen.unc.edu/?page_id=135</guid>
		<description><![CDATA[Overview
The success of the interdisciplinary genomics research teams at UNC-Chapel Hill provides an ideal setting for developing a systems genetics approach for exploring psychiatric disorders. The overarching goal of our group, the Center for Integrated Systems Genetics (CISGen), is to exploit and develop the Collaborative Cross (CC) mouse model of the heterogeneous human population to [...]]]></description>
			<content:encoded><![CDATA[<h2>Overview</h2>
<p>The success of the interdisciplinary genomics research teams at UNC-Chapel Hill provides an ideal setting for developing a systems genetics approach for exploring psychiatric disorders. The overarching goal of our group, the Center for Integrated Systems Genetics (CISGen), is to exploit and develop the Collaborative Cross (CC) mouse model of the heterogeneous human population to unearth the genetic and environmental determinants of the complex phenotypes inherent to psychiatry. Finding the determinants of such complex phenotypes has proven to be among the most intractable problems in all of biomedicine. Despite over a century of scientific study, there are few hard facts about the causes of core psychiatric diseases. Accomplishing this goal requires a diversity of scientific expertise: psychiatry, human genetics, mouse behavior, mouse genetics, statistical genetics, computational biology, and systems biology.</p>
<p><img src="/images/people/fernando.png" alt="Fernando Pardo-Manuel de Villena" /> <img src="http://compgen.unc.edu/wp/wp-content/uploads/2008/11/patsullivan.png" alt="Pat Sullivan" /> <img src="/images/people/fred.png" alt="Fred Wright" /> <img src="/images/people/fei.png" alt="Fei Zou" /> <img src="/images/people/leonard.png" alt="Leonard McMillan" /> <img src="/images/people/wei.png" alt="Wei Wang" /> <img src="http://compgen.unc.edu/wp/wp-content/uploads/2008/11/jimcrowley.png" alt="jimcrowley.png" /> <img src="/images/people/daniel-pomp.jpg" alt="Daniel Pomp" /> <img src="/images/people/david.png" alt="David Threadgill" /> <img src="http://compgen.unc.edu/wp/wp-content/uploads/2008/11/weisun.png" alt="weisun.png" /></p>
<p>Our group includes 17 scientists with diverse backgrounds who are committed to this challenge. Extensive interactions among scientists at UNC-Chapel Hill over the last five years have provided the collaborative backdrop for 21st century projects like ours. CISGen proposes to develop and prove a novel systems genetics platform for the molecular dissection of complex traits, first in mouse using a novel model population (F1 crosses (RIX) of Collaborative Cross (CC) strains currently under development) and then, if successful, in humans. Moreover, the CISGen platform could be adapted for the study of many other biomedical disorders. This approach is novel and innovative, and has never been attempted on the scale we propose.</p>
<h2>Motivations</h2>
<p>Determining how genetic and environmental factors interact to yield phenotype diversity in complex traits has become the central question in understanding human genetics. While Genome-Wide Association Studies (GWAS) have brought substantial progress toward this goal, GWAS exhibit fundamental limitations in analyzing complex human traits. Human GWAS can resolve only simple models of disease while complex models are ubiquitous in organisms from yeast to mouse. Furthermore, many important human phenotypes are not amenable to GWAS even if the difficulties and expense of collecting sufficient sample sizes are surmountable. These issues are particularly salient for psychiatric phenotypes.</p>
<p align="center"><img src="http://compgen.unc.edu/wp/wp-content/uploads/2008/11/cisgen-fig2a.png" alt="CISGEN Figure 2A" /></p>
<h2>Background and Significance</h2>
<p>The fundamental idea behind CISGen is to overcome several inescapable limitations to studying the genomic basis of complex traits in humans. Human diseases are exceptionally important, but humans are a poor experimental organism. The chief limitation is that only the simplest genetic models can be resolved with confidence, and these are only a fraction of the genomic search space. Complex models are difficult or impossible to resolve in human samples. Moreover, simple genetic models represent only a small portion of the effects seen. In particular, we are concerned with psychiatric disorders where genomic approaches are infamous for yielding false leads. CISGen proposes to use a novel mouse platform to screen the genomic search space in order to develop realistically complex models of how genetic variation, gene expression, and epigenetic features interact to impact a selected set of relevant phenotypes. These mouse phenotypes have been chosen for their parallels to human diseases and endophenotypes of direct relevance to psychiatry. We are aware that there are important dissimilarities between mouse and human consequent to differences in evolutionary history. However, our goal is to derive realistic models in mouse via a series of unbiased screens. Much of what we propose is impossible in humans given ethical, sample size, and cost constraints. Thus, we intend to avoid “burning” power in precious human samples by testing only high probability models derived from comprehensive analysis of a controlled model population. The performance of Genome-Wide Association Studies (GWAS) has been exceptional, yielding many candidate genes associated with complex human diseases. However, GWAS is no panacea. CISGen is attempting to “get ahead of the curve” by anticipating its limitations and developing a new platform that surmounts the limitations of human GWAS.</p>
<h2>Driving Problems</h2>
<p>To demonstrate our point, we have chosen to work on what are arguably among the hardest problems in biomedicine: psychiatric and related behavioral phenotypes. Psychiatric disorders are top-rank public health problems, are idiopathic, and most likely have strong genetic influences. In designing CISGen, we immediately found congruence between the professional expertise of our investigators and the overall high-risk/high-gain philosophy. Schizophrenia (SCZ), Major Depressive Disorders (MDD), and autism have proven intractable to standard approaches in biomedicine. We thus chose to study these psychiatric disorders rather than a disorder for which GWAS is already offering promise (e.g., T2DM or Crohn’s disease).</p>
<ul>
<li><em><strong>Autism</strong></em> is a neurodevelopmental disorder characterized by deficits in social behavior and communication, as well as ritualistic/repetitive behaviors. There is a broad range of severity of each of these core symptoms in autistic individuals and a sub-clinical phenotype in some family members. The variability of the disorder and its high heritability suggest that autism is a complex genetic disease requiring the perturbation of many loci to produce an autistic phenotype. The genetic basis of autism remains unclear. Recent studies in the human population revealed a higher rate of CNV in autistic individuals, but do not provide mechanistic understanding. As a complement to human studies, we have begun modeling of behaviors relevant to the autism phenotype in the laboratory mouse. The mouse model system provides a wealth of behaviors that vary across different inbred strains to capture the complexity of the genetic etiology of these behaviors. The opportunity to interrogate the mouse genome using the RIX mice will allow us to define more clearly the genetic underpinnings of autism-relevant behaviors. Furthermore, the investigation of GxE interactions proposed here may ultimately give us an understanding of which autistic individuals will either benefit from or be resistant to intensive behavioral therapies in current use.</li>
</ul>
<ul>
<li><em><strong>Anxiety and depressive disorders</strong></em> are the most prevalent mental disorders and ample evidence indicates these mental illnesses often co-occur. In addition, multiple studies have shown a link between stress and the development of mood disorders. Of particular interest are data that both early life and ongoing stress are important risk factors for developing MDD. Although the genetic contribution to both anxiety and depression has been well-established in twin studies, progress in identification of specific genes in humans has been slow. Rodent models – open field for anxiety and forced swim test for depression – have been used extensively to screen for therapeutic activity and have been shown to have both predictive and trait validity. Studies in rodents subjected to different housing conditions that are either stressful (isolation) or provide social interaction and stimulation (enriched) show behavioral differences in stress, anxiety, and depression, for example, and provide the framework for our studies in RIX mice. The unprecedented array of genomic and genetic data available in mice, as well as advanced genetic models like the CC, present the opportunity to study the complex genetics and GxE interactions that contribute to mood disorders.</li>
</ul>
<ul>
<li><em><strong>Side-effects of antipsychotic pharmacogenetics</strong></em>. Antipsychotic medications are the mainstay of treatment for SCZ, but an astounding 75% of patients discontinue assigned treatments due to intolerable side effects and/or inefficacy over relatively short periods of time. Therefore, if it were possible to predict which patients were likely to develop side effects or fail to achieve a therapeutic response, drug treatment of schizophrenia would be more effective, safe, and cost-effective. Two major adverse drug reactions in pharmacotherapy for SCZ are tardive dyskinesia (high-potency, typical, first-generation antipsychotics) and weight gain (certain atypical, second-generation antipsychotics). There is substantial inter-individual variation in liability to these adverse drug reactions and direct and indirect evidence suggest a role for genetic variation. There is significant heterogeneity in therapeutic response to antipsychotics, with roughly equal proportions of patients experiencing remission, partial response, and no benefit. There are highly plausible mouse analogs to the human pharmacogenetic phenotypes: vacuous chewing movements are a widely-used rodent model for tardive dyskinesia, pre-pulse inhibition is widely believed to be a proxy for antipsychotic treatment efficacy in humans and rodents, and body mass and composition changes are reasonably comparable in human and mouse. The basic idea behind these experiments is to use the CISGen model to elucidate the basis of these clinically important pharmacogenomic phenotypes.</li>
</ul>
<p align="center"><a id="file-link-140" class="file-link image" title="vacuous chewing movements in humans" href="http://compgen.unc.edu/cisgen_videos/flash/patients_td/patients_td.html"> <img src="http://compgen.unc.edu/wp/wp-content/uploads/2008/11/humanchew.thumbnail.png" alt="vacuous chewing movements in humans" width="200" height="112" /></a><a id="file-link-141" class="file-link image" title="vacuous chewing movements in mouse" href="http://compgen.unc.edu/cisgen_videos/flash/chewing/chewing.html"> <img src="http://compgen.unc.edu/wp/wp-content/uploads/2008/11/mousechew.thumbnail.png" alt="vacuous chewing movements in mouse" width="200" height="112" /></a></p>
<p align="center"><strong>Click on either of the images above to see a video</strong></p>
<h2>Using the Collaborative Cross</h2>
<p>Through the Collaborative Cross (CC), we propose a new and superior mouse model to develop strong and specific mechanistic hypotheses for complex psychiatric phenotypes. We have intentionally chosen phenotypes which are notably difficult to study in humans yet realizable through the CC. The CC is a large panel of recently established recombinant inbred (RI) mouse lines specifically designed to overcome the limitations of existing genetic resources and to act as an optimal murine model of heterogeneous human populations. The CC captures the complexity of the mammalian genome and permits modeling the complex interactions with the environment that influence disease.</p>
<p align="center"><img src="http://compgen.unc.edu/wp/wp-content/uploads/2008/11/cegs-fig3a.png" alt="cegs-fig3a.png" /></p>
<p>Most importantly, the CC is the only mammalian resource that has high and uniform genome-wide variation effectively randomized across a large, heterogeneous, and infinitely reproducible population which also supports integration across environmental and biological conditions, across genotypes, and over time. This resource provides the platform for the comprehensive analyses outlined in this proposal. In CISGen, we propose to use a novel extension of the CC to identify realistically complex genetic models for human psychiatric diseases. We will use the CC to explore the genomic search space in a manner impossible in humans in order to develop specific and high-confidence models for future human testing. Some important properties that make the CC an ideal systems genetics platform are:</p>
<ul>
<li> <strong>Genomewide variation.</strong> In contrast to traditional RI lines, the CC was derived from a genetically diverse set of 8 founder inbred strains (A/J, C57BL/6J, 129S1/SvImJ, NOD/LtJ, NZO/HlLtJ, CAST/EiJ, PWK/PhJ and WSB/EiJ). This selection of founder strains is predicted to result in uniformly high levels of variation genome-wide, and, unlike other RI panels, no genomic regions are identical in all CC founder strains.</li>
<li><strong>Genetic variation is randomized in the CC lines so that causal relationships can be established.</strong> Parental strains were bred using a combinatorial funnel design to yield a large number of genetically independent RI lines. This breeding design should lead to the generation of RI lines with many random perturbations of allele combinations via recombination and chromosomal assortment.</li>
<li><strong>Infinitely reproducible to support data integration and replication. </strong>As with any RI panel, the CC is an effectively immortal population of genetic clones as the genotypes in each RI line are fixed and any desired number of genotypes can be generated at will. Therefore, it is possible to use a common set of genotypes to reproduce and integrate studies under different environmental conditions (such as differences in housing, drugs, etc).</li>
<li><strong>Sufficiently large to support robust statistical analysis.</strong> Three sets of RI lines using the same overall breeding scheme were initiated by investigators in the US (Elissa Chesler, Oak Ridge National Laboratory), Israel (Fuad Iraqi and Richard Mott, Wellcome Trust), and Australia (Grant Morahan) with the goal of generating 500 independent RI lines. The aim was to combine the surviving RI lines so that the final population size will have statistical power to map genetic factors associated with resistance or susceptibility that would not be possible using available mouse strains or RI lines.</li>
<li><strong>A Better Model for Human Disease – Inbred CC to Outbred RIX. </strong>Two additional features are central to CISGen. First, given that all CC mice are inbred, we devised a method to model outbred human populations. Given the proposed population of 360 CC lines, this approach can potentially give rise to almost 130,000 genetically distinct RIX individuals. Subsets of RIX can be used to evaluate biological predictions of how an individual will respond to environmental perturbations and provide statistical support for prediction accuracy. RIX are the ideal experimental population since they consist of a large number of genetically identical, but non-inbred lines. Consequently, they have genomic characteristics very similar to humans but are infinitely reproducible. Second, the use of a panel of RIX with a new microarray that combines GWAS genotyping, allele-specific gene expression, and ongoing resequencing efforts in the founder strains provides a unique opportunity for investigating causes of gene expression differences in an outbred population (including epigenetic features like imprinting and X inactivation).</li>
</ul>
<p align="center"><a title="rixexample.png" href="http://compgen.unc.edu/wp/wp-content/uploads/2008/11/rixexample.png"><img src="http://compgen.unc.edu/wp/wp-content/uploads/2008/11/rixexample.png" alt="rixexample.png" width="200" height="200" /> </a><a title="rixloop.png" href="http://compgen.unc.edu/wp/wp-content/uploads/2008/11/rixloop.png"><img src="http://compgen.unc.edu/wp/wp-content/uploads/2008/11/rixloop.png" alt="rixloop.png" /></a></p>
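<p>The figure of almost 130,000 RIX from 360 CC lines can be checked with a short calculation. The sketch below is illustrative only: the function name <code>rix_counts</code> is hypothetical, and it assumes the quoted figure counts reciprocal crosses (line A dam with line B sire, and the reverse) as distinct, which is the reading consistent with the number given.</p>

```python
from math import comb

def rix_counts(n_lines):
    """Return (unordered, ordered) counts of crosses between distinct lines."""
    unordered = comb(n_lines, 2)       # reciprocal crosses treated as one genotype
    ordered = n_lines * (n_lines - 1)  # dam x sire order distinguished
    return unordered, ordered

unordered, ordered = rix_counts(360)
print(unordered, ordered)  # 64620 129240
```

<p>Under this assumption the ordered count, 129,240, matches "almost 130,000"; if reciprocal crosses were treated as equivalent the count would be half that.</p>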
<p>More information about why the CC is the optimal platform for Mammalian Systems Genetics can be found on the <a href="http://compgen.unc.edu/wp/?page_id=99">Collaborative Cross Page.</a></p>
<h2>Preliminary Studies</h2>
<p>Over the past 20 years UNC has been at the forefront of mouse genetics. These efforts were recognized in 2007 with the award of the Nobel Prize to Dr. Oliver Smithies, Excellence Professor in the Department of Pathology. This commitment was reinforced in 2000 with the creation of the Department of Genetics and the Carolina Center for Genome Sciences. The CC has been identified as a strategic area for UNC in which to develop integrated approaches for the study and treatment of human complex diseases. We describe below prior work of direct relevance to this application using inbred mouse phenome strains:</p>
<h3>The Collaborative Cross</h3>
<p>The CISGen team has been at the forefront of proposing and developing the CC platform. We have also been instrumental in securing funding for the first set of CC lines, and in using the incipient lines for genetic studies. Moreover, we proposed the concept and demonstrated the crucial advantages of using RIX rather than the CC lines alone.</p>
<h3>Mouse Genomics</h3>
<ol>
<li><strong>Genomewide surveys in mouse:</strong> We published the most comprehensive analysis of genetic variation present in laboratory inbred strains (including the CC founder strains), and described the implications for complex traits analysis and systems genetics.</li>
<li><strong>Custom Arrays:</strong> We developed a custom-designed Affymetrix GeneChip® Mapping Array that gives us unprecedented ability to combine the analysis of different types of genetic variation (SNPs, CNVs) with gene expression using both standard approaches and allele-specific expression.</li>
<li><strong>Systems genetics:</strong> Using existing panels of RI mice, our team performed the initial proof-of-concept experiments in systems genetics by generating transcriptional networks and associating them with specific phenotypic traits. Most efforts used the BXD panel of RI strains derived from C57BL/6J and DBA/2J. Expression levels of genes in forebrain (12,000 genes) and liver (24,000 genes) from 34 BXD strains were measured – each transcript was considered a quantitative trait and we used interval mapping to identify genetic regulators of inter-individual expression differences (termed expression QTL or eQTL).</li>
<li><strong>Imputation resources: </strong>We have developed an imputed genotype resource that combines high quality genotypes from multiple sources and which includes the 8 founder CC strains.</li>
<li><strong>Allele-specific gene expression (ASE):</strong> Allelic variants in genes that alter susceptibility for behavioral phenotypes or differential effects of housing conditions on these phenotypes will likely have functional coding polymorphisms or be cis-regulated through promoter or transcript stability polymorphisms. Potential functional coding polymorphisms are identified computationally in the CC founder strains and most have been cataloged. Regulatory polymorphisms that act in cis can be detected by ASE.</li>
</ol>
<h3>Social Behavior in the Mouse</h3>
<p>As social behavior is perhaps the most complex behavior in the mouse, we propose three overlapping assessments. On the advice of the CISGen statisticians, this approach is preferred because the resulting measures can be combined statistically (e.g., using dimensionality reduction techniques) to yield more powerful phenotypes. Either way, multimodal assessment is crucial. Mouse social behavior has been modeled with the sociability and preference for social novelty tasks. These are choice tasks where a mouse can move freely through a three-chambered apparatus. One side chamber has an unfamiliar mouse in a small wire cage, the center chamber is empty, and the other side chamber contains an empty wire cage.</p>
<p style="text-align: center"><a style="text-decoration: none" href="http://compgen.unc.edu/cisgen_videos/flash/mousesniff/mousesniff.html"><img src="http://compgen.unc.edu/wp/wp-content/uploads/2008/11/sniff.thumbnail.jpg" alt="sniff mouse video image" /></a> <a href="http://compgen.unc.edu/cisgen_videos/flash/preccsocial/preccsocial.html"><img src="http://compgen.unc.edu/wp/wp-content/uploads/2008/11/preccsocial.thumbnail.jpg" alt="presocial cc video image" /></a> <a href="http://compgen.unc.edu/cisgen_videos/flash/socialmouse/socialmouse.html"><img src="http://compgen.unc.edu/wp/wp-content/uploads/2008/11/socialmouse.thumbnail.jpg" alt="social mouse video image" /></a></p>
<p align="center"><strong>Click on any of the three images above to see a video</strong></p>
<p>Measures are time spent on each side, entries into each side, and time spent sniffing the wire cage setup over a 10-minute trial. For the preference for social novelty task, a second stranger mouse is placed in the previously empty cage. The measures are repeated for the second 10-minute trial immediately following the sociability task. Preference for social novelty is defined as spending more time on the side with stranger mouse 2 than on the side with the now familiar stranger mouse 1. In this task, preference was shown in 10 of 17 strains on the measure of duration. Some strains that show sociability do not show preference for social novelty (e.g., C3H/HeJ and SWR/J) and some strains demonstrate preference for social novelty but not sociability (e.g., BTBR, NZB/BlNJ and SJL/J). Thus, these complex behaviors are genetically separable. As a third approach to social behavior, we developed a social coding system based on mice selectively bred for high (NC900) and low aggression (NC100). This validated system measures social motivation in attacking and non-attacking mice on a Likert scale (-3 to +3, from negative aversive responses to positive affiliative responses). Not all RIX male mice will attack in the primary social integration test. Although we will have quantitative continuous variation within the attackers (e.g., number of attacks), we can now also assess social interaction phenotypes within the non-attackers. While attack frequency and latency are important analytic phenotypes, we can analyze quantitative variation across the full spectrum of non-attackers and attackers.</p>
<h3>The Genetics of Depression &amp; Anxiety-Like Behaviors in Mouse</h3>
<p>Numerous studies have reported a change in anxiety- and depression-related behaviors and hypothalamic-pituitary-adrenal (HPA) axis reactivity as a result of home cage environment. We have chosen standard behaviors in mice that have been proven to have validity as models of anxiety (open field), depression (forced swim test) and stress reactivity (acute restraint stress). Our laboratory has a great deal of experience with these behaviors. We have collected data on the open field, forced swim test and stress reactivity in 6 of the 8 CC founder strains. Collection of male and female data provides information on sex differences that could be another possible environmental factor for consideration in these studies. All of the behaviors proposed vary significantly across strain and the forced swim test varies by sex as well. Interestingly, a comparison between behaviors yielded a significant correlation between time spent in the center of the open field and baseline corticosterone levels. Strains with higher baseline corticosterone (indicating increased basal stress levels) are more anxious as reflected by less time spent in the center of the open field. This relationship between basal stress and anxiety might be expected based on similarly correlated changes in these two behaviors in response to housing conditions.</p>
<h3>Vacuous Chewing Movements</h3>
<p>We are in the process of completing a pilot study for a mouse model of antipsychotic pharmacogenetics. Our preliminary results and conclusions are below.</p>
<ol>
<li><strong>Drug delivery.</strong> Our goal for haloperidol administration was to achieve human-like steady-state plasma levels (10-50 nM). This is difficult to achieve in mice by repeated injection due to rapid drug metabolism, and it is preferable to use a continuous-release technology. We deliver antipsychotics via subcutaneous, slow-release drug pellets (Innovative Research of America, Sarasota, Florida). Each mouse is dosed for 60 days. As a pilot study, we implanted five C57BL/6J mice with haloperidol pellets and collected plasma and brain tissue after 30 days. Plasma levels were within the human therapeutic range, and the coefficient of variation was 20.6% (11.6% in brain), far lower than for other routes of administration such as injection (34%), drinking water (44-87%), and mini-pump (45.2%).</li>
<li><strong>Haloperidol-induced VCMs in mice resemble human TD.</strong> We next determined the time course of VCMs in mice. Haloperidol or placebo pellets were implanted in C57BL/6J mice (5/group). Mice given haloperidol displayed significantly more VCMs than placebo-treated mice (p&lt;0.001), and this effect was present not only on day 22 (p&lt;0.001) but also persisted 30 days beyond the expected life of the drug pellet (60 days, p&lt;0.001). Furthermore, the within-strain variability is relatively low for behavioral studies (CV = 19%). We reviewed high resolution digital tapes of VCMs with our colleague Dr. Kirk Wilhelmsen (a board-certified neurologist with an interest in human movement disorders like TD), and he concluded that VCMs are a precise analog of human TD.</li>
<li><strong>Strain differences in VCM. </strong>In a pilot study, we examined two inbred mouse strains with often divergent responses to psychotropic compounds (A/J and 129S1/SvImJ). Haloperidol pellets were implanted in five mice per group. Comparing day 0 to day 30, A/J mice were very sensitive to haloperidol-induced VCMs whereas 129S1/SvImJ mice were not (p&lt;0.001).</li>
<li><strong>Relevant phenotypes in mouse phenome strains. </strong>In a modest extension of our pilot study, we administered haloperidol to 5 male mice in each of 17 phenome strains. (a) We assessed plasma haloperidol at day 30. There were substantial strain differences (p = 9e-11, heritability = 67%, Figure 4L). (b) We assessed haloperidol-induced extrapyramidal symptoms (EPS, via time to paw movement when placed on a flat screen at a 45° angle). EPS are a known acute effect of haloperidol. There were substantial strain effects (p = 1e-21, heritability = 85%) that were unchanged when plasma haloperidol at day 30 (p = 0.78) was included in the model. (c) Coding of tapes for VCMs is in progress. Thus, we have demonstrated: (a) we can deliver haloperidol effectively at human-like steady-state concentrations; (b) mouse VCMs are highly analogous to human tardive dyskinesia; (c) early data suggest substantial strain differences in VCMs; (d) steady-state plasma haloperidol concentration is a trait with relatively high heritability; and (e) another adverse drug reaction of haloperidol (EPS) is highly heritable and independent of plasma concentrations.</li>
</ol>
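<p>The coefficient of variation (CV) figures quoted above compare the spread of measurements relative to their mean. As a minimal sketch of that calculation, assuming the sample standard deviation is used (the text does not say which estimator was applied) and using made-up values rather than study data:</p>

```python
from statistics import mean, stdev

def coefficient_of_variation(values):
    """Sample standard deviation expressed as a percentage of the mean."""
    return 100.0 * stdev(values) / mean(values)

# Hypothetical plasma concentrations in nM, for illustration only.
plasma_nm = [8.0, 10.0, 12.0]
print(coefficient_of_variation(plasma_nm))  # 20.0
```

<p>A lower CV indicates more uniform drug exposure across animals, which is why it is a natural summary for comparing routes of administration.</p>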
<h3>Human GWAS for Psychiatric Disorders</h3>
<p>The CISGen team has considerable expertise in all aspects of human GWAS and has primary roles in two GWAS for schizophrenia (SCZ) and in two for major depressive disorder (MDD). Dr. Sullivan chairs the Coordinating Committee and the MDD working group of the Psychiatric GWAS Consortium (PGC). The purpose of the PGC is to conduct meta-analyses of individual genotype and phenotype data for five critically important psychiatric disorders (ADHD, autism, bipolar disorder, MDD, and SCZ); 47 samples with 59,000 independent cases and controls are slated for a carefully designed and conducted set of meta-analyses by Q4 2008. The PGC has 111 participating scientists from 48 institutions in 12 countries, and includes all known academic and industry GWAS for these disorders. Dr. Sullivan&#8217;s involvement in these GWAS efforts led directly to discussions with Dr. Pardo-Manuel de Villena which ultimately resulted in this CISGen CEGS application. Any firm conclusions are premature, but early results suggest that highly significant and field-changing findings are not as readily apparent for psychiatric disorders as for other human complex diseases. This observation underscores the importance of CISGen.</p>
<h2>Biostatistics, Computation, Database, &amp; Visualization</h2>
<p>The CISGen team is highly experienced in microarray analysis, genetic mapping (genomewide linkage and association in mice and humans), applied statistics, statistical genomics, and improving computational efficiency for complex data manipulation. Working with other team members, the computational teams at UNC-Chapel Hill are focused not only on the analysis of data and modeling but also on the development of publicly available tools for use by scientists to aid in research. While many tools have been developed, a few examples of web-based tools are shown below:</p>
<ul>
<li><a href="http://compgen.unc.edu/DisplayIntervals">Strain Sequence Identity Interval Viewer</a> - A web-based tool used to view common mouse laboratory strain IBD intervals.</li>
</ul>
<p align="center"><a title="IBD viewer" href="http://compgen.unc.edu/wp/wp-content/uploads/2008/11/seq_viewer.png"><img style="width: 442px; height: 348px;" src="http://compgen.unc.edu/wp/wp-content/uploads/2008/11/seq_viewer.png" alt="IBD viewer" width="434" height="347" /></a></p>
<ul>
<li><a href="http://compgen.unc.edu/compatinv">Compatibility Intervals</a> - The Compatibility Intervals application will provide both a visualization and data representing “compatible” intervals between selected laboratory and wild mouse strains.</li>
</ul>
<p align="center"><img src="http://compgen.unc.edu/wp/wp-content/uploads/2008/11/maxk.png" alt="maxk.png" width="240" height="214" /></p>
<ul>
<li><a href="http://compgen.unc.edu/treeqa/">Phylogeny-based GWAS</a> - We have developed a quantitative GWA mapping algorithm, TreeQA, which utilizes local perfect phylogenies constructed in genomic regions exhibiting no evidence of historical recombination.</li>
</ul>
<p align="center"><a title="moz3.jpg" href="http://compgen.unc.edu/treeqa/"><img style="width: 412px; height: 364px;" src="http://compgen.unc.edu/wp/wp-content/uploads/2008/10/moz3.jpg" alt="moz3.jpg" width="408" height="404" /></a></p>
<ul>
<li>Assigning Genotype Sequences to CC Founder Haplotypes - We have developed two dynamic programming algorithms to find the optimal assignment of genotypes to their founding CC strain prior to being fully inbred. Our algorithms incorporate constraints due to the funnel breeding structure and guarantee a minimum segment solution.</li>
</ul>
<p align="center"><a title="1536m143.png" href="http://compgen.unc.edu/wp/wp-content/uploads/2008/11/1536m143.png"><img src="http://compgen.unc.edu/wp/wp-content/uploads/2008/11/1536m143.thumbnail.png" alt="1536m143.png" /></a><a title="1536m142.png" href="http://compgen.unc.edu/wp/wp-content/uploads/2008/11/1536m142.png"><img src="http://compgen.unc.edu/wp/wp-content/uploads/2008/11/1536m142.thumbnail.png" alt="1536m142.png" /></a><a title="1519m154.png" href="http://compgen.unc.edu/wp/wp-content/uploads/2008/11/1519m154.png"><img src="http://compgen.unc.edu/wp/wp-content/uploads/2008/11/1519m154.thumbnail.png" alt="1519m154.png" /></a><a title="1515m165.png" href="http://compgen.unc.edu/wp/wp-content/uploads/2008/11/1515m165.png"><img src="http://compgen.unc.edu/wp/wp-content/uploads/2008/11/1515m165.thumbnail.png" alt="1515m165.png" /></a><a title="1515m164.png" href="http://compgen.unc.edu/wp/wp-content/uploads/2008/11/1515m164.png"><img src="http://compgen.unc.edu/wp/wp-content/uploads/2008/11/1515m164.thumbnail.png" alt="1515m164.png" /></a><a title="1496m138.png" href="http://compgen.unc.edu/wp/wp-content/uploads/2008/11/1496m138.png"><img src="http://compgen.unc.edu/wp/wp-content/uploads/2008/11/1496m138.thumbnail.png" alt="1496m138.png" /></a></p>
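<p>The idea of a minimum-segment assignment of markers to founders can be illustrated with a small dynamic program. This is only a sketch under simplifying assumptions, not the algorithms described above: it ignores the funnel breeding constraints and heterozygous calls, and the input <code>consistent</code> (one set of candidate founder labels per marker) and the function name are hypothetical.</p>

```python
def min_segment_assignment(consistent):
    """Assign one founder per marker so that the number of founder switches is minimal.

    consistent[i] is the set of founder labels compatible with marker i.
    """
    if not consistent:
        return []
    inf = float("inf")
    founders = sorted(set().union(*consistent))
    # cost[f]: fewest switches over the markers seen so far, ending on founder f
    cost = {f: (0 if f in consistent[0] else inf) for f in founders}
    back = []
    for allowed in consistent[1:]:
        best_prev = min(founders, key=lambda f: cost[f])
        new_cost, pointers = {}, {}
        for f in founders:
            if f not in allowed:
                new_cost[f], pointers[f] = inf, None
                continue
            stay, switch = cost[f], cost[best_prev] + 1
            if stay == min(stay, switch):
                new_cost[f], pointers[f] = stay, f           # extend current segment
            else:
                new_cost[f], pointers[f] = switch, best_prev  # start a new segment
        back.append(pointers)
        cost = new_cost
    last = min(founders, key=lambda f: cost[f])
    if cost[last] == inf:
        raise ValueError("some marker is consistent with no founder")
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path

markers = [{"A", "B"}, {"A"}, {"A", "C"}, {"C"}, {"C", "B"}]
print(min_segment_assignment(markers))
```

<p>The number of segments is the number of switches plus one. A realistic implementation would add the breeding-structure constraints as restrictions on which founders may appear together.</p>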
<ul>
<li>For a complete list of analysis tools currently under development visit our <a href="http://compgen.unc.edu/wp/?page_id=10">Projects Page</a>.</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://compgen.unc.edu/wp/?feed=rss2&amp;page_id=135</wfw:commentRss>
		</item>
		<item>
		<title>People</title>
		<link>http://compgen.unc.edu/wp/?page_id=4</link>
		<comments>http://compgen.unc.edu/wp/?page_id=4#comments</comments>
		<pubDate>Thu, 05 Jul 2007 23:08:21 +0000</pubDate>
		<dc:creator>Administrator</dc:creator>
		
		<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.compgen.unc.edu/blog/?page_id=4</guid>
		<description><![CDATA[
Biomedical Engineering

Shawn Gomez

Biostatistics

Fred Wright
Fei Zou

Computer Science

Yi Liu
Leonard McMillan
Isa-Kemal Pakatci
Jan Prins
Abhishek Sarkar
Darshan Singh
Jeremy Wang
Wei Wang
Catie Welsh
Xiang Zhang
Zhaojun Zhang

Environmental Science and Engineering

Daniel Gatti
Ivan Rusyn

Genetics

David Aylor
Ryan Buus
John Calaway
John Didion
Fernando Pardo-Manuel de Villena
Darla Miller
Daniel Pomp
Jason Spence
David Threadgill
Alex Vu
Kirk C Wilhelmsen
Yuying Xie

Lineberger Comprehensive Cancer Center

Todd Taft

Statistics and Operations Research

Yufeng Liu

Alumni

Andrew Hulbert
Daniel Kumar
Jinze Liu
Kyle Moore
Feng Pan
Joel Parker
Adam Roberts
Lynda Yang
Tynia Yang
Qi Zhang
]]></description>
			<content:encoded><![CDATA[<div style="clear:both">
<h2>Biomedical Engineering</h2>
<ul style="list-style:none; float:left">
<li><a href="http://gomezlab.bme.unc.edu/">Shawn Gomez</a></li>
</ul>
</div>
<div style="clear:both">
<h2>Biostatistics</h2>
<ul style="list-style-type: none; list-style-image: none; list-style-position: outside; float: left; width: 250px;">
<li><a href="http://genomics.unc.edu/wright/wright_dw.htm">Fred Wright</a></li>
<li><a href="http://genomics.unc.edu/zou/zou_dw.htm">Fei Zou</a></li>
</ul>
<table style="float:left" border="0">
<tbody>
<tr>
<td><img src="/images/people/fred.png" alt="Fred Wright" title="Fred Wright" /></td>
<td><img src="/images/people/fei.png" alt="Fei Zou" title="Fei Zou" /></td>
</tr>
</tbody>
</table>
</div>
<div style="clear:both">
<h2>Computer Science</h2>
<ul style="list-style-type: none; list-style-image: none; list-style-position: outside; float: left; width: 250px;">
<li><a href="http://www.celeste.cn/">Yi Liu</a></li>
<li><a href="http://www.cs.unc.edu/~mcmillan/">Leonard McMillan</a></li>
<li>Isa-Kemal Pakatci</li>
<li><a href="http://www.cs.unc.edu/~prins/">Jan Prins</a></li>
<li><a href="http://www.unc.edu/~asarkar/">Abhishek Sarkar</a></li>
<li>Darshan Singh</li>
<li><a href="http://www.cs.unc.edu/~jrwang/">Jeremy Wang</a></li>
<li><a href="http://www.cs.unc.edu/~weiwang/">Wei Wang</a></li>
<li><a href="http://www.cs.unc.edu/~cwelsh/">Catie Welsh</a></li>
<li><a href="http://www.cs.unc.edu/~xiang/">Xiang Zhang</a></li>
<li>Zhaojun Zhang</li>
</ul>
<table style="float:left" border="0">
<tbody>
<tr>
<td><img style="height: 82px;" src="/images/people/liuyi.jpg" alt="Yi Liu" title="Yi Liu" /></td>
<td><img src="/images/people/leonard.png" alt="Leonard McMillan" title="Leonard McMillan" /></td>
<td><img src="/images/people/kemal.png" alt="Isa-Kemal Pakatci" title="Isa-Kemal Pakatci" /></td>
</tr>
<tr>
<td><img src="/images/people/prins.png" alt="Jan Prins" title="Jan Prins" /></td>
<td><img src="/images/people/darshan.jpg" alt="Darshan Singh" title="Darshan Singh" /></td>
<td><img src="/images/people/jeremy-wang.jpg" alt="Jeremy Wang" title="Jeremy Wang" /></td>
</tr>
<tr>
<td><img src="/images/people/wei.png" alt="Wei Wang" title="Wei Wang" /></td>
<td><img src="/images/people/catie.png" alt="Catie Welsh" title="Catie Welsh" /></td>
<td><img src="/images/people/xiang.PNG" alt="Xiang Zhang" title="Xiang Zhang" /></td>
</tr>
<tr>
<td colspan="3"><img src="/images/people/zzj.jpg" alt="Zhaojun Zhang" title="Zhaojun Zhang" /></td>
</tr>
</tbody>
</table>
</div>
<div style="clear:both">
<h2>Environmental Science and Engineering</h2>
<ul style="list-style-type: none; list-style-image: none; list-style-position: outside; float: left; width: 250px;">
<li>Daniel Gatti</li>
<li><a href="http://www.unclineberger.org/research/faculty/displayMember.asp?ID=398">Ivan Rusyn</a></li>
</ul>
<table style="float:left" border="0">
<tbody>
<tr>
<td><img src="/images/people/gatti.png" alt="Daniel Gatti" title="Daniel Gatti" /></td>
<td><img src="/wp/wp-content/uploads/2009/04/rusyn1.png" alt="Ivan Rusyn" title="Ivan Rusyn" /></td>
</tr>
</tbody>
</table>
</div>
<div style="clear:both">
<h2>Genetics</h2>
<ul style="list-style-type: none; list-style-image: none; list-style-position: outside; float: left; width: 250px;">
<li>David Aylor</li>
<li>Ryan Buus</li>
<li>John Calaway</li>
<li>John Didion</li>
<li><a href="http://genetics.unc.edu/faculty/pardo.htm">Fernando Pardo-Manuel de Villena</a></li>
<li>Darla Miller</li>
<li><a href="http://genetics.unc.edu/faculty/pomp">Daniel Pomp</a></li>
<li>Jason Spence</li>
<li><a href="http://genetics.unc.edu/faculty/david-threadgill">David Threadgill</a></li>
<li>Alex Vu</li>
<li>Kirk C Wilhelmsen</li>
<li>Yuying Xie</li>
</ul>
<table style="float:left" border="0">
<tbody>
<tr>
<td><img src="/images/people/fernando.png" alt="Fernando Pardo-Manuel de Villena" title="Fernando Pardo-Manuel de Villena" /></td>
<td><img src="/images/people/daniel-pomp.jpg" alt="Daniel Pomp" title="Daniel Pomp" /></td>
<td><img src="/images/people/david.png" alt="David Threadgill" title="David Threadgill" /></td>
</tr>
<tr>
<td colspan="3"><img src="/images/people/yuying.png" alt="Yuying Xie" title="Yuying Xie" /></td>
</tr>
</tbody>
</table>
</div>
<div style="clear:both">
<h2>Lineberger Comprehensive Cancer Center</h2>
<ul style="list-style-type: none; list-style-image: none; list-style-position: outside; float: left; width: 250px;">
<li><a href="http://www.cs.unc.edu/~taft">Todd Taft</a></li>
</ul>
<table style="float:left" border="0">
<tbody>
<tr>
<td><img src="http://compgen.unc.edu/wp/wp-content/uploads/2007/07/ToddTaft.png" alt="Todd Taft" title="Todd Taft" /></td>
</tr>
</tbody>
</table>
</div>
<div style="clear:both">
<h2>Statistics and Operations Research</h2>
<ul style="list-style-type: none; list-style-image: none; list-style-position: outside; float: left; width: 250px;">
<li><a href="http://www.unc.edu/~yfliu/">Yufeng Liu</a></li>
</ul>
</div>
<div style="clear:both; overflow:hidden">
<h2>Alumni</h2>
<ul style="list-style-type: none; list-style-image: none; list-style-position: outside; float: left; width: 250px;">
<li><a href="http://www.cs.unc.edu/~hulbert/">Andrew Hulbert</a></li>
<li><a href="http://www.cs.unc.edu/~ndkumar/">Daniel Kumar</a></li>
<li><a href="http://www.cs.unc.edu/~liuj/">Jinze Liu</a></li>
<li><a href="http://www.cs.unc.edu/~kjmoore/">Kyle Moore</a></li>
<li><a href="http://www.cs.unc.edu/~panfeng/">Feng Pan</a></li>
<li>Joel Parker</li>
<li>Adam Roberts</li>
<li><a href="http://www.cs.unc.edu/~yangl/">Lynda Yang</a></li>
<li>Tynia Yang</li>
<li>Qi Zhang</li>
</ul>
<table style="float:left" border="0">
<tbody>
<tr>
<td><img src="/wp/wp-content/uploads/2008/07/andrew-hulbert.png" alt="Andrew Hulbert" title="Andrew Hulbert" /></td>
<td><img src="/images/people/daniel.png" alt="Daniel Kumar" title="Daniel Kumar" /></td>
<td><img src="/images/people/jinze.png" alt="Jinze Liu" title="Jinze Liu" /></td>
</tr>
<tr>
<td><img src="/images/people/kyle.png" alt="Kyle Moore" title="Kyle Moore" /></td>
<td><img src="/images/people/feng.png" alt="Feng Pan" title="Feng Pan" /></td>
<td><img src="/images/people/joel.png" alt="Joel Parker" title="Joel Parker" /></td>
</tr>
<tr>
<td><img src="/images/people/adam.png" alt="Adam Roberts" title="Adam Roberts" /></td>
<td><img src="/images/people/lynda.png" alt="Lynda Yang" title="Lynda Yang" /></td>
<td><img src="/images/people/zhangq.jpg" alt="Qi Zhang" title="Qi Zhang" /></td>
</tr>
<tr>
<td colspan="3"><img src="/images/people/tynia.jpg" alt="Tynia Yang" title="Tynia Yang" /></td>
</tr>
</tbody>
</table>
</div>
]]></content:encoded>
			<wfw:commentRss>http://compgen.unc.edu/wp/?feed=rss2&amp;page_id=4</wfw:commentRss>
		</item>
		<item>
		<title>Full-Genome SNP Compatibility</title>
		<link>http://compgen.unc.edu/wp/?page_id=71</link>
		<comments>http://compgen.unc.edu/wp/?page_id=71#comments</comments>
		<pubDate>Fri, 03 Aug 2007 17:58:00 +0000</pubDate>
		<dc:creator>Kyle Moore</dc:creator>
		
		<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://compgen.unc.edu/?page_id=71</guid>
		<description><![CDATA[Genome-Wide Compatibility Region Viewer
Introduction
The local block-structure of haplotypes within a population sheds light on many important biological questions. Haplotype blocks are central to quantifying and localizing recombinations (both recent and historical), they are widely used to identify maximally informative marker sets [38], and are essential building blocks for constructing genetic maps. Haplotype-block structure also underlies [...]]]></description>
			<content:encoded><![CDATA[<h2>Genome-Wide Compatibility Region Viewer</h2>
<p><strong>Introduction</strong></p>
<p>The local block-structure of haplotypes within a population sheds light on many important biological questions. Haplotype blocks are central to quantifying and localizing recombinations (both recent and historical), they are widely used to identify maximally informative marker sets [38], and are essential building blocks for constructing genetic maps. Haplotype-block structure also underlies many methods of genomewide association study, provides fundamental biochemical evidence for genetic selection, and offers a tool for ascertaining the ancestral origins of a population.</p>
<p>The task of decomposing a genome into meaningful blocks, however, has proven to be ill-defined, inconsistent, and often ambiguous. In part, the problem resides in the ad hoc definition of what constitutes a haplotype block. Haplotype blocks are often defined to serve a specific purpose. Examples include the minimum number of tagging SNPs sufficient to capture the majority of haplotypes, intervals of SNPs surrounding core SNPs that exceed a given threshold of Linkage Disequilibrium (LD), and maximal regions whose haplotype diversity falls below a threshold. Partitioning haplotypes into blocks that support perfect phylogenies, and the related selection of blocks lacking evidence for recombination, have been used in support of genotype phasing and for constructing Ancestral Recombination Graphs (ARGs).</p>
<p>We propose an unambiguous definition for haplotype blocks and efficient methods for computing them. Where ambiguity is unavoidable, we have uncovered properties that are common to all solutions. Our haplotype block definition directly supports, and has been used for, association mapping, construction of genetic maps, and determining the ancestral origins within local genomic regions.</p>
<p>We assume the availability of haplotype data, which is problematic for human genotypes. However, dense SNP data sets that are homozygous at every locus are readily available for many inbred mammal and plant models commonly used for association mapping. Such data sets do not need to be phased; however, it is still important to identify haplotype blocks in order to explore local diversity structure and ancestral origins. Like those of others, our haplotype blocks are chosen for their lack of evidence of historical recombination.</p>
<p>We define SNP compatibility in terms of the Four-Gamete Test (FGT). Two biallelic SNPs satisfy the FGT when at most three of the four possible allele combinations (gametes) are observed across the samples. The FGT is of particular interest because of its close relation to perfect phylogeny. Specifically, a necessary and sufficient condition for a perfect phylogeny is that all pairs of SNPs satisfy the FGT. We partition the genome into a set of potentially overlapping, maximal compatible intervals, each of which admits a perfect phylogeny, and whose union covers the full data set. We address the question of the minimum number of such intervals required, and we also identify suspect SNPs whose removal reduces the overall complexity of the haplotype-block structure (perhaps indicating genotyping errors, homoplasy, or gene conversions).</p>
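<p>As a minimal sketch of the Four-Gamete Test, assuming biallelic SNPs coded as 0/1 across fully homozygous strains (the function names and the toy data are illustrative, not part of the viewer):</p>

```python
from itertools import combinations

def four_gamete_compatible(snp_a, snp_b):
    """True unless all four gametes (0,0), (0,1), (1,0), (1,1) are observed."""
    gametes = set(zip(snp_a, snp_b))
    return len(gametes) != 4  # biallelic 0/1 coding gives at most four gametes

def admits_perfect_phylogeny(snps):
    """Pairwise FGT over every pair of SNPs in the interval."""
    return all(four_gamete_compatible(a, b) for a, b in combinations(snps, 2))

# Each list holds one SNP's alleles across four strains (toy data).
print(four_gamete_compatible([0, 0, 1, 1], [0, 0, 0, 1]))  # True
print(four_gamete_compatible([0, 0, 1, 1], [0, 1, 0, 1]))  # False
```

<p>The pairwise check costs one pass over the strains per SNP pair, which is what makes a genome-wide compatibility matrix practical to compute.</p>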
<p>Our contribution is an analysis of the problem of dividing a genome into compatible intervals and its computational complexity. We provide an achievable lower-bound on the number of such intervals. While in general there are numerous ways of dividing a genome into a minimum number of compatible intervals (a fact overlooked by others), we also identify non-overlapping core subintervals that are common to all valid solutions. We also define a specific interval set that achieves the interval lower-bound, yet maximizes the block overlap, thus minimizing the number of perfect phylogeny trees, while providing the richest possible set of contributing SNPs to each tree.</p>
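<p>One standard way to reach the minimum number of compatible intervals is a left-to-right greedy scan, which is optimal because every sub-interval of a compatible interval is itself compatible. The sketch below shows that generic technique, not necessarily the method used here; it produces a non-overlapping partition rather than the overlapping maximal intervals and core subintervals described above, and it assumes 0/1-coded biallelic SNPs in genomic order.</p>

```python
def compatible(snp_a, snp_b):
    """Four-Gamete Test for two 0/1-coded SNPs."""
    return len(set(zip(snp_a, snp_b))) != 4

def greedy_compatible_intervals(snps):
    """Split ordered SNPs into the fewest consecutive pairwise-compatible runs.

    Returns inclusive (first_index, last_index) pairs.
    """
    intervals = []
    start = 0
    for i in range(1, len(snps)):
        # Close the current interval at the first SNP that conflicts with it.
        if not all(compatible(snps[j], snps[i]) for j in range(start, i)):
            intervals.append((start, i - 1))
            start = i
    if snps:
        intervals.append((start, len(snps) - 1))
    return intervals

toy = [[0, 0, 1, 1], [0, 0, 0, 1], [0, 1, 0, 1], [0, 1, 1, 1]]
print(greedy_compatible_intervals(toy))  # [(0, 1), (2, 3)]
```

<p>Running the same scan from right to left generally yields different breakpoints with the same interval count, which illustrates why a minimum decomposition is not unique.</p>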
<p><a href="http://compgen.unc.edu/images/CompatV/ChrY_HMM.gif"><img title="Chromosome Y" src="http://compgen.unc.edu/images/CompatV/ChrY_HMM_400.png" border="0" alt="Chromosome Y" /></a><br />
Shown above is a Compatibility Matrix for Chromosome Y - 1420 SNPs (To Enlarge: Right Click &gt; Save Link As&#8230;)</p>
<p>A <a href="http://compgen.unc.edu/wp/wp-content/uploads/2009/08/compat_anim.avi">movie</a> illustrating genome-scale compatibility.<br />
Try our <a href="http://compgen.unc.edu/CompatInv/">demo</a> online.</p>
<h2>Research Sponsor</h2>
<p><a href="http://compgen.unc.edu/wp/?page_id=483"><strong>NSF IIS 0534580:</strong> &#8220;Visualizing and Exploring High-dimensional Data&#8221;</a><br />
<a href="http://compgen.unc.edu/wp/?page_id=354"><strong>NIH GM 076468:</strong> &#8220;The Center for Genome Dynamics at Jackson Laboratory: An NIGMS National Center of Systems Biology&#8221;</a></p>
]]></content:encoded>
			<wfw:commentRss>http://compgen.unc.edu/wp/?feed=rss2&amp;page_id=71</wfw:commentRss>
		</item>
		<item>
		<title>Visualizing and Exploring High-dimensional Data</title>
		<link>http://compgen.unc.edu/wp/?page_id=483</link>
		<comments>http://compgen.unc.edu/wp/?page_id=483#comments</comments>
		<pubDate>Fri, 21 Aug 2009 21:15:45 +0000</pubDate>
		<dc:creator>jjsun</dc:creator>
		
		<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://compgen.unc.edu/wp/?page_id=483</guid>
		<description><![CDATA[
NSF IIS 0534580 (September 1, 2006 ~ August 31, 2010) 
The aim of this project is to develop new methods for interactively exploring relationships within large high-dimensional data sets, such as those typical of high-throughput scientific experiments. The resulting tools will provide an aid to scientists prior to applying traditional offline data-analysis techniques such as [...]]]></description>
			<content:encoded><![CDATA[<p><img src="images/sponsorLogo/nsf.gif" alt="" /><br />
<strong><a href="http://www.nsf.gov/awardsearch/showAward.do?AwardNumber=0534580">NSF IIS 0534580</a> (September 1, 2006 ~ August 31, 2010) </strong></p>
<p>The aim of this project is to develop new methods for interactively exploring relationships within large high-dimensional data sets, such as those typical of high-throughput scientific experiments. The resulting tools will provide an aid to scientists prior to applying traditional offline data-analysis techniques such as clustering, segmentation, and classification. Scientists will be able to explore hypotheses and incorporate their own knowledge to drive traditional unsupervised data-mining algorithms in sensible and more promising directions. The visualization tools will assist scientists in many disciplines, including biologists in studying gene function, medical doctors in comprehending disease susceptibility, chemists in developing candidate drugs, and high-energy physicists in analyzing the data generated by particle accelerators.</p>
<p>A key component of the novel approach is the ability to interactively explore parameter spaces and combine attributes of high-dimensional data points. The visualization tool will provide two alternate views of the data sets: a dissimilarity-matrix view that offers insights into the size, compactness, separation, and relative proximity of clusters, and a point-cloud view that provides a 3-D projection of the high-dimensional source data that best preserves the distances between points. This dual-view approach excels at communicating the flow and migration of points from one cluster to another as parameters are tuned. It also allows the user to probe and interact with the data, including such tasks as hand-clustering the data and examining particular points. The resulting visualization tools will support dynamic cluster formation and migration as the contributions of various data set features are interactively modified. The project provides an excellent interdisciplinary education and research environment, and the collaborative nature of the project also enhances the potential for results dissemination.</p>
<h2>Project Personnel</h2>
<p><strong>Principal Investigators:</strong></p>
<ul>
<li><a href="http://www.cs.unc.edu/~mcmillan/">Leonard McMillan</a> (PI)</li>
<li><a href="http://www.cs.unc.edu/~weiwang/">Wei Wang</a> (co-PI)</li>
</ul>
<p><strong>Collaborators:</strong></p>
<ul>
<li>David Threadgill (Genetics, NC State)</li>
<li>Fernando Pardo Manuel de Villena (Genetics, UNC)</li>
</ul>
<p><strong>Students:</strong></p>
<ul>
<li>Shriram Alapathy</li>
<li>Jeremy R. Wang</li>
<li>Catherine Welsh</li>
<li>Xiang Zhang (Microsoft Ph.D. Fellowship winner)</li>
</ul>
<p><strong>Alumni:</strong></p>
<ul>
<li>Jinze Liu (Ph.D. 2006, Postdoc 2007, Assistant Professor, University of Kentucky)</li>
<li>Kyle Moore (M.S. 2007)</li>
<li>Mengsheng Zhang (M.S. 2008)</li>
<li>Feng Pan (Ph.D. 2009)</li>
<li>Lynda Yang (B.S. 2008, NSF Fellowship, currently at UIUC)</li>
<li>Tynia Yang (M.S. 2006)</li>
<li>Adam Roberts (B.S. 2007, NSF Fellowship, currently at UC-Berkeley)</li>
<li>Qi Zhang (Ph.D. 2009)</li>
</ul>
<p><strong>Projects:</strong></p>
<ul>
<li><a href="http://compgen.unc.edu/wp/?page_id=58">HiDimViewer</a></li>
<li><a href="http://compgen.unc.edu/wp/?page_id=57">NPUTE</a></li>
<li><a href="http://compgen.unc.edu/wp/?page_id=68">Xbox Science</a></li>
<li><a href="http://compgen.unc.edu/wp/?page_id=71">Full-Genome SNP Compatibility</a></li>
<li><a href="http://compgen.unc.edu/wp/?page_id=228">Strain Sequence Identity Interval Viewer</a></li>
<li><a href="http://compgen.unc.edu/wp/?page_id=256">Inferring Genome-wide Mosaic Structure</a></li>
</ul>
<h2>Publications</h2>
<ol>
<li><a title="Click to download PDF" href="http://compgen.unc.edu/wp/wp-content/uploads/2007/07/ismb2007.pdf" target="_blank">Inferring missing genotypes in large SNP panels using fast nearest-neighbor searches over sliding windows</a>, by Adam Roberts, Leonard McMillan, Wei Wang, Joel Parker, Ivan Rusyn, and David Threadgill, <strong><em>Proceedings of the 15th Annual International Conference on Intelligent Systems for Molecular Biology (ISMB)</em></strong>, 2007.</li>
<li><a title="Click to download PDF" href="http://compgen.unc.edu/wp/wp-content/uploads/2007/PanICDM07.pdf">Sample selection for maximal diversity</a>, by Feng Pan, Adam Roberts, Leonard McMillan, Fernando Pardo Manuel de Villena, David Threadgill, and Wei Wang, <em><strong>2007 IEEE International Conference on Data Mining (ICDM&#8217;07).</strong></em></li>
<li>The polymorphism architecture of mouse genetic resources elucidated using genome-wide resequencing data: implications for QTL discovery and systems genetics, by Adam Roberts, Fernando Pardo-Manuel de Villena, Wei Wang, Leonard McMillan, and David W. Threadgill, <strong><em>Mammalian Genome</em></strong>, 2007.</li>
<li><a href="http://compgen.unc.edu/wp/wp-content/uploads/2008/SIGMOD08.pdf">CRD: fast co-clustering on large datasets utilizing sample-based matrix decomposition</a>, by Feng Pan, Xiang Zhang, and Wei Wang, <strong><em>Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD)</em></strong>, 2008, p. 173.</li>
<li><a title="Click to download PDF" href="http://compgen.unc.edu/wp/wp-content/uploads/2007/08/sdm07_21.pdf">Poclustering: lossless clustering of dissimilarity data</a>, by Jinze Liu, Qi Zhang, Wei Wang, Leonard McMillan, and Jan Prins, <strong><em>Proceedings of 2007 SIAM International Conference on Data Mining (SDM2007)</em></strong>, 2007.</li>
<li><a href="http://compgen.unc.edu/wp/wp-content/uploads/2008/ICDE08_2.pdf">Mining approximate order preserving clusters in the presence of noise</a>, by Mengsheng Zhang, Wei Wang, and Jinze Liu, <strong><em>Proceedings of the 24th IEEE International Conference on Data Engineering (ICDE)</em></strong>, 2008, p. 160</li>
<li><a href="http://compgen.unc.edu/wp/wp-content/uploads/2007/ICDE07_2.pdf">Accelerating Profile Queries in Elevation Maps</a>, by Feng Pan, Wei Wang, and Leonard McMillan, <strong><em>International Conference on Data Engineering (ICDE 2007)</em></strong>, 2007.</li>
<li><a href="http://compgen.unc.edu/wp/wp-content/uploads/2008/07/care_icde08.pdf">CARE: Finding Local Linear Correlations in High Dimensional Data</a>, by Xiang Zhang, Feng Pan, and Wei Wang, <em><strong>Proceedings of the 2008 International Conference on Data Engineering (ICDE&#8217;08)</strong></em>.</li>
<li><a href="http://compgen.unc.edu/wp/wp-content/uploads/2009/SSDBM09.pdf">Split-order distance for clustering and classification hierarchies</a>, by Zhang, Q., Liu, E. Y., Sarkar, A., and Wang, W., <strong><em>Proceedings of the 21st International Conference on Scientific and Statistical Database Management (SSDBM)</em></strong>, 2009, p. 517.</li>
<li><a href="http://compgen.unc.edu/wp/wp-content/uploads/2008/09/cikm08.pdf">REDUS: finding reducible subspaces in high dimensional data</a>, by Xiang Zhang, Feng Pan, and Wei Wang. <em><strong>Proceedings of the 17th ACM Conference on Information and Knowledge Management (CIKM&#8217;08).</strong></em></li>
</ol>
]]></content:encoded>
			<wfw:commentRss>http://compgen.unc.edu/wp/?feed=rss2&amp;page_id=483</wfw:commentRss>
		</item>
		<item>
		<title>Projects</title>
		<link>http://compgen.unc.edu/wp/?page_id=10</link>
		<comments>http://compgen.unc.edu/wp/?page_id=10#comments</comments>
		<pubDate>Thu, 12 Jul 2007 17:54:53 +0000</pubDate>
		<dc:creator>Administrator</dc:creator>
		
		<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://compgen.unc.edu/?page_id=10</guid>
		<description><![CDATA[HiDimViewer

HiDimViewer is a visualization tool we are developing for high-dimensional datasets. It is designed to be used as an interactive data exploration tool to aid scientists in selecting and observing clusters in high-dimensional data.


NPUTE

NPUTE is an efficient data structure we have developed for finding pair-wise haplotype similarity. Its simplicity can lead to benefits in speed [...]]]></description>
			<content:encoded><![CDATA[<h2><a href="http://compgen.unc.edu/?page_id=58" target="_blank">HiDimViewer</a></h2>
<p><img src="images/pj_gui.gif" style="float:left;margin:20px;"></p>
<p>HiDimViewer is a visualization tool we are developing for high-dimensional datasets. It is designed to be used as an interactive data exploration tool to aid scientists in selecting and observing clusters in high-dimensional data.</p>
<p>
<div style="clear:both"></div>
<h2><a href="http://compgen.unc.edu/?page_id=57" target="_blank">NPUTE</a></h2>
<p><img src="images/pj_npute.jpg" style="float:left;margin:20px;"></p>
<p>NPUTE is an efficient data structure we have developed for finding pair-wise haplotype similarity. Its simplicity brings gains in speed and makes exhaustive searches over multiple parameters practical.</p>
<p>
<div style="clear:both"></div>
<h2><a href="http://compgen.unc.edu/wp/?page_id=390" target="_blank">Genetic Diversity of <em>Mus musculus</em> Laboratory Strains</a></h2>
<p><img src="images/pj_CCDiversity.png" style="float:left;margin:20px;"></p>
<p>The most commonly used resources harbor only a fraction of <em>Mus musculus</em> genetic diversity, and that diversity is not uniformly distributed, resulting in many blind spots. Only resources that include wild-derived inbred strains from subspecies other than <em>M. m. domesticus</em> have no blind spots and a uniform distribution of the variation. Unlike other resources that are primarily suited for gene discovery, the Collaborative Cross (CC) is the only resource that can support genome-wide network analysis, which is the foundation of systems genetics.</p>
<p>
<div style="clear:both"></div>
<h2><a href="http://compgen.unc.edu/?page_id=68" target="_blank">XBox Science</a></h2>
<p><img src="images/pj_founders.png" style="float:left;margin:20px;"></p>
<p>In XBox Science, we are exploring the potential of employing game interfaces, game-design principles, and game production approaches for constructing bioinformatics tools.</p>
<p>
<div style="clear:both"></div>
<h2><a href="http://compgen.unc.edu/?page_id=17" target="_blank">snpBrowser</a></h2>
<p><img src="images/pj_snpbrowser_magnify_thumb.png" style="float:left;margin:20px;"></p>
<p><em>SnpBrowser</em> is an application designed to analyze and visualize the immense SNP datasets that are currently available. It provides modes for analyzing genetic diversity, marker segregation, strain selection, and QTL mapping.</p>
<p>
<div style="clear:both"></div>
<h2><a href="http://compgen.unc.edu/?page_id=71" target="_blank">Full-Genome SNP Compatibility</a></h2>
<p><img src="images/pj_maxk.png" style="float:left;margin:20px;"></p>
<p>We are developing methods for partitioning a genome into blocks for which there are no apparent recombinations, thus providing parsimonious sets of compatible genome intervals based on the four-gamete test. We have developed theory and methods for dividing a genome into compatible intervals, and have also developed the notion of an interval set that achieves the interval lower bound yet maximizes interval overlap.</p>
<p>
<div style="clear:both"></div>
<h2><a href="http://compgen.unc.edu/?page_id=239" target="_blank">Tree-based Genome-wide Association Mapping</a></h2>
<p><img src="images/pj_moz3.jpg" style="float:left;margin:20px;"></p>
<p>In this project, we developed TreeQA, a quantitative genome-wide association (GWA) mapping algorithm. TreeQA utilizes local perfect phylogenies constructed in genomic regions exhibiting no evidence of historical recombination. Through careful algorithm design and implementation, TreeQA conducts quantitative genome-wide association analysis efficiently and is more effective than previous methods.</p>
<p>
<div style="clear:both"></div>
<h2><a href="http://compgen.unc.edu/?page_id=95" target="_blank">FastMap</a></h2>
<p><img src="http://compgen.unc.edu/wp/wp-content/uploads/2009/08/fastmapicon.png" style="float:left;margin:20px;"></p>
<p>FastMap is a tool for genome-wide association mapping that is designed for ‘Genetical Genomics’ studies using data from gene expression microarrays. It accepts both inbred mouse data, generally consisting of homozygous allele calls, and human SNP data, which include heterozygous allele calls.</p>
<p>
<div style="clear:both"></div>
<h2><a href="http://compgen.unc.edu/?page_id=228" target="_blank">Strain Sequence Identity Interval Viewer</a></h2>
<p><img src="images/pj_screenShot10-20-2008.jpg" style="float:left;margin:20px;"></p>
<p>Strain Sequence Identity (SSI) Interval Viewer is a web application that allows the user to choose a subset of mouse strains from the list. A newer version of this tool, based on a different data representation, is now available at <a href="http://compgen.unc.edu/SIIntervals/" target="_blank">http://compgen.unc.edu/SIIntervals/</a>.</p>
<p>
<div style="clear:both"></div>
<h2><a href="http://compgen.unc.edu/?page_id=251" target="_blank">Collaborative Cross Simulator</a></h2>
<p><img src="images/pj_sim_small.png" style="float:left;margin:20px;"></p>
<p>The Collaborative Cross Simulator will provide both data and visual simulations for the Collaborative Cross experiment. The simulator will give the community a powerful tool for generating synthetic lines and populations. Using these synthetic mice, researchers can compare actual mouse data against statistically neutral and random data.</p>
<p>
<div style="clear:both"></div>
<h2><a href="http://compgen.unc.edu/?page_id=264" target="_blank">SNP Data Retrieval and Filtering</a></h2>
<p><img src="images/pj_filterchromosome.jpg" style="float:left;margin:20px;"></p>
<p>This online tool allows you to retrieve and filter genetic data sets. You can specify the format and fields in the output file, the strains and chromosomes you want included, and a number of special filters to apply to the genetic data before it is returned. In addition to the graphical user interface, an automatic query interface is available that allows you to send queries and retrieve data automatically from within a separate program.</p>
<p>
<div style="clear:both"></div>
<h2><a href="http://compgen.unc.edu/?page_id=253" target="_blank">Genotype Sequence Segmentation</a></h2>
<p><img src="images/pj_gss.jpg" style="float:left;margin:20px;"></p>
<p>In this project, we study the problem of segmenting the genotype sequences into the minimum number of segments attributable to the founder sequences. Our algorithms incorporate biological constraints to greatly reduce the computation, and guarantee that only minimum segmentation solutions with comparable numbers of segments on both haplotypes of the genotype sequence are computed. Our algorithms can also work on noisy data including genotyping errors, point mutations, gene conversions, and missing values.</p>
<p>
<div style="clear:both"></div>
<h2><a href="http://compgen.unc.edu/?page_id=256" target="_blank">Inferring Genome-wide Mosaic Structure</a></h2>
<p><img src="images/pj_minmosaic.jpg" style="float:left;margin:20px;"></p>
<p>In this project, we study the Minimum Mosaic Problem: given a set of genome sequences from individuals within a population, compute a mosaic structure containing the minimum number of breakpoints. This mosaic structure provides a good estimate of the minimum number of recombination events (and their locations) required to generate the existing haplotypes in the population. We solve this problem by finding the shortest path in a directed graph. Our algorithm’s efficiency permits genome-wide analysis.</p>
<p>
<div style="clear:both"></div>
<h2><a href="http://compgen.unc.edu/?page_id=275" target="_blank">FastANOVA: an Efficient Algorithm for Genome-Wide Association Study</a></h2>
<p><img src="images/pj_fastanova.jpg" style="float:left;margin:20px;"></p>
<p>In this project, we studied the problem of finding SNP-pairs that have significant associations with a given quantitative phenotype. We propose an efficient algorithm, FastANOVA, for performing ANOVA tests on SNP-pairs in batch mode, which also supports large permutation tests. FastANOVA only needs to perform the ANOVA test on a small number of candidate SNP-pairs, without the risk of missing any significant ones.</p>
<p>
<div style="clear:both"></div>
<h2><a href="http://compgen.unc.edu/?page_id=537" target="_blank">Gene Expression Extract: Tool for extraction of subsets from gene expression data</a></h2>
<p><img src="http://compgen.unc.edu/wp/wp-content/uploads/2007/07/dendogram-150x150.png" alt="dendogram" title="dendogram" width="150" height="150" class="alignnone size-thumbnail wp-image-545" style="float:left;margin:20px;" /></p>
<p>This web tool allows one to extract subsets of gene expression data by specifying subsets of genes, probes, and strains. Clustering analysis can also be performed on the extracted data. An algorithm called SAFE is also integrated so that enrichment of biological pathways can be tested. The tool is available at <a href="http://compgen.unc.edu/GeneExprExtract">http://compgen.unc.edu/GeneExprExtract</a>.</p>
<p>
<div style="clear:both"></div>
]]></content:encoded>
			<wfw:commentRss>http://compgen.unc.edu/wp/?feed=rss2&amp;page_id=10</wfw:commentRss>
		</item>
		<item>
		<title>Publications</title>
		<link>http://compgen.unc.edu/wp/?page_id=9</link>
		<comments>http://compgen.unc.edu/wp/?page_id=9#comments</comments>
		<pubDate>Thu, 12 Jul 2007 17:48:34 +0000</pubDate>
		<dc:creator>Administrator</dc:creator>
		
		<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://compgen.unc.edu/?page_id=9</guid>
		<description><![CDATA[
A fast approximation to multidimensional scaling, by Tynia Yang, Jinze Liu, Leonard McMillan, and Wei Wang, Proceedings of the ECCV Workshop on Computation Intensive Methods for Computer Vision (CIMCV), 2006.
Poclustering: lossless clustering of dissimilarity data, by Jinze Liu, Qi Zhang, Wei Wang, Leonard McMillan, and Jan Prins, Proceedings of 2007 SIAM International Conference on Data [...]]]></description>
			<content:encoded><![CDATA[<ol>
<li><a href="http://compgen.unc.edu/wp/wp-content/uploads/2006/YangCIMCV06.pdf" title="Click to download PDF">A fast approximation to multidimensional scaling</a>, by Tynia Yang, Jinze Liu, Leonard McMillan, and Wei Wang, <strong><em>Proceedings of the ECCV Workshop on Computation Intensive Methods for Computer Vision (CIMCV)</em></strong>, 2006.</li>
<li><a href="http://compgen.unc.edu/wp/wp-content/uploads/2007/08/sdm07_21.pdf" title="Click to download PDF">Poclustering: lossless clustering of dissimilarity data</a>, by Jinze Liu, Qi Zhang, Wei Wang, Leonard McMillan, and Jan Prins, <strong><em>Proceedings of 2007 SIAM International Conference on Data Mining (SDM2007)</em></strong>, 2007.</li>
<li><a target="_blank" href="http://compgen.unc.edu/wp/wp-content/uploads/2007/07/ismb2007.pdf" title="Click to download PDF">Inferring missing genotypes in large SNP panels using fast nearest-neighbor searches over sliding windows</a>, by Adam Roberts, Leonard McMillan, Wei Wang, Joel Parker, Ivan Rusyn, and David Threadgill, <strong><em>Proceedings of the 15th Annual International Conference on Intelligent Systems for Molecular Biology (ISMB)</em></strong>, 2007.</li>
<li><a href="http://compgen.unc.edu/wp/wp-content/uploads/2007/YangNatureGenetics2007.pdf" title="Click to download PDF">On the subspecific origin of the laboratory mouse</a>, by Hyuna Yang, Timothy Bell, Gary Churchill, and Fernando Pardo-Manuel de Villena, <strong><em>Nature Genetics</em></strong> Jul 22, 2007.</li>
<li><a href="http://compgen.unc.edu/wp/wp-content/uploads/2007/RobertsMamGenome2007.pdf" title="Click to download PDF">The polymorphism architecture of mouse genetic resources elucidated using genome-wide resequencing data: implications for QTL discovery and systems genetics</a>, by Adam Roberts, Fernando Pardo-Manuel de Villena, Wei Wang, Leonard McMillan, and David Threadgill, <strong><em>Mammalian Genome</em></strong>, Aug 3, 2007.</li>
<li><a href="http://compgen.unc.edu/wp/wp-content/uploads/2007/PanICDM07.pdf" title="Click to download PDF">Sample selection for maximal diversity</a>, by Feng Pan, Adam Roberts, Leonard McMillan, Fernando Pardo Manuel de Villena, David Threadgill, and Wei Wang, <em><strong>2007 IEEE International Conference on Data Mining (ICDM&#8217;07)</strong></em>.</li>
<li><a href="http://compgen.unc.edu/wp/wp-content/uploads/2008/JinMamGenome2008.pdf" title="Click to download PDF">An imputed genotype resource for the laboratory mouse</a>, by Jin P. Szatkiewicz, Glen L. Beane, Yueming Ding, Lucie Hutchins, Fernando Pardo Manuel de Villena, and Gary Churchill, <em><strong>Mammalian Genome</strong></em>, 19(3), 199-208.</li>
<li><a href="http://compgen.unc.edu/wp/wp-content/uploads/2008/07/care_icde08.pdf">CARE: Finding Local Linear Correlations in High Dimensional Data</a>, by Xiang Zhang, Feng Pan, and Wei Wang, <em><strong>Proceedings of the 2008 International Conference on Data Engineering (ICDE&#8217;08)</strong></em>.</li>
<li><a target="_blank" href="http://compgen.unc.edu/wp/wp-content/uploads/2008/07/fp171-pan.pdf">CRD: Fast Co-clustering on Large Datasets Utilizing Sampling-Based Matrix Decomposition</a>, by Feng Pan, Xiang Zhang, and Wei Wang, <em><strong>Proceedings of the 2008 SIGMOD/PODS Conference (SIGMOD&#8217;08)</strong></em>.</li>
<li><a href="http://compgen.unc.edu/wp/wp-content/uploads/2008/09/SIGKDD08.pdf">FastANOVA: an efficient algorithm for genome-wide association study</a>, by Xiang Zhang, Fei Zou, and Wei Wang. <em><strong>Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (SIGKDD&#8217;08)</strong></em>.</li>
<li><a href="http://compgen.unc.edu/wp/wp-content/uploads/2008/09/vldb08.pdf">Mining non-redundant high order correlations in binary data</a>, by Xiang Zhang, Feng Pan, Wei Wang, and Andrew Nobel. <em><strong>Proceedings of the 34th International Conference on Very Large Data Bases (VLDB&#8217;08)</strong></em>.</li>
<li><a rel="http://compgen.unc.edu/wp/wp-content/uploads/2008/07/minseg-final.pdf" href="http://compgen.unc.edu/wp/wp-content/uploads/2008/07/minseg-final.pdf" title="Genotype Sequence Segmentation: Handling Constraints and Noise">Genotype Sequence Segmentation: Handling Constraints and Noise</a>, by Qi Zhang, Wei Wang, Leonard McMillan, Jan Prins, Fernando Pardo-Manuel de Villena, and David Threadgill, <strong><em>Proceedings of the 8th Workshop on Algorithms in Bioinformatics (WABI&#8217;08)</em></strong>, 2008.</li>
<li><a href="http://compgen.unc.edu/wp/wp-content/uploads/2008/09/cikm08.pdf">REDUS: finding reducible subspaces in high dimensional data</a>, by Xiang Zhang, Feng Pan, and Wei Wang. <em><strong>Proceedings of the 17th ACM Conference on Information and Knowledge Management (CIKM&#8217;08).</strong></em></li>
<li><a href="http://compgen.unc.edu/wp/wp-content/uploads/2008/09/treeqa.pdf">TreeQA: Quantitative Genome Wide Association Mapping Using Local Perfect Phylogeny Trees</a>, by Feng Pan, Leonard McMillan, Fernando Pardo-Manuel de Villena, David Threadgill, and Wei Wang. <strong><em>Proceedings of the 14th Pacific Symposium on Biocomputing (PSB&#8217;09)</em></strong>.</li>
<li><a href="http://compgen.unc.edu/wp/wp-content/uploads/2009/03/psb09-qi-zhang-minmosaic.pdf">Inferring Genome-Wide Mosaic Structure</a>, by Qi Zhang, Wei Wang, Leonard McMillan, Fernando Pardo-Manuel de Villena, and David Threadgill. <strong><em>Proceedings of the 14th Pacific Symposium on Biocomputing (PSB&#8217;09)</em></strong>.</li>
<li><a href="http://compgen.unc.edu/wp/wp-content/uploads/2008/09/ws-procs9x61.pdf">FastChi: an efficient algorithm for analyzing gene-gene interactions</a>, by Xiang Zhang, Fei Zou, and Wei Wang. <strong><em>Proceedings of the 14th Pacific Symposium on Biocomputing (PSB&#8217;09)</em></strong>.</li>
<li><a href="http://compgen.unc.edu/wp/wp-content/uploads/2009/RECOMB09.pdf" title="Click to download PDF">COE: a general approach for efficient genome-wide two-locus epistasis test in disease association study</a>, by Xiang Zhang, Feng Pan, Yuying Xie, Fei Zou, and Wei Wang. <strong><em>Proceedings of the 13th Annual International Conference on Research in Computational Molecular Biology (RECOMB)</em></strong>, pp. 253-269, 2009.</li>
<li><a href="http://compgen.unc.edu/wp/wp-content/uploads/2006/PS06.pdf">Structure-based function inference using protein family-specific fingerprints</a>, by Deepak Bandyopadhyay, Jun Huan, Jinze Liu, Jan Prins, Jack Snoeyink, Wei Wang, and Alexander Tropsha, <b><i>Protein Science</i></b>, v.15, 2006, p. 1537</li>
<li><a href="http://compgen.unc.edu/wp/wp-content/uploads/2007/DKE07.pdf">Benchmarking the effectiveness of sequential pattern mining methods</a>, by Hye-Chung Kum, J. H. Chang, and Wei Wang, <b><i>Data and Knowledge Engineering</i></b>, v.60, 2007, p. 30.</li>
<li><a href="http://compgen.unc.edu/wp/wp-content/uploads/2006/DAMI05.pdf">Sequential pattern mining in multi-databases via multiple alignment</a>, by Hye-Chung Kum, Joong-Hyuk Chang, and Wei Wang, <b><i>Data Mining and Knowledge Discovery (DMKD)</i></b>, v.12, 2006, p. 151</li>
<li><a href="http://compgen.unc.edu/wp/wp-content/uploads/2008/SIGMOD08.pdf">CRD: fast co-clustering on large datasets utilizing sample-based matrix decomposition</a>, by Feng Pan, Xiang Zhang, and Wei Wang, <b><i>Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD)</i></b>, 2008, p. 173.</li>
<li><a href="http://compgen.unc.edu/wp/wp-content/uploads/2007/ICDE07_2.pdf">Accelerating Profile Queries in Elevation Maps</a>, by Feng Pan, Wei Wang, and Leonard McMillan, <b><i>International Conference on Data Engineering (ICDE 2007)</i></b>, 2007.</li>
<li><a href="http://compgen.unc.edu/wp/wp-content/uploads/2008/ICDE08_2.pdf">Mining approximate order preserving clusters in the presence of noise</a>, by Mengsheng Zhang, Wei Wang, and Jinze Liu, <b><i>Proceedings of the 24th IEEE International Conference on Data Engineering (ICDE)</i></b>, 2008, p. 160</li>
<li><a href="http://compgen.unc.edu/wp/wp-content/uploads/2009/SSDBM09.pdf">Split-order distance for clustering and classification hierarchies</a>, by Zhang, Q., Liu, E. Y., Sarkar, A., and Wang, W., <b><i>Proceedings of the 21st International Conference on Scientific and Statistical Database Management (SSDBM)</i></b>, 2009, p. 517.</li>
</ol>
]]></content:encoded>
			<wfw:commentRss>http://compgen.unc.edu/wp/?feed=rss2&amp;page_id=9</wfw:commentRss>
		</item>
	</channel>
</rss>
