Collaborative Cross

The 8 strains of the Collaborative Cross
Figure 1. Using only eight laboratory mouse strains, the Collaborative Cross (CC) is poised to revolutionize our understanding of system-wide genetic interactions, enabling a new discipline called “Systems Genetics”.

Status

The Collaborative Cross is a large panel of new inbred mouse strains currently being developed through a community effort (Churchill et al. 2004). The CC addresses many shortcomings in available mouse strain resources, including small numbers of strains, limited genetic diversity, and a non-ideal population structure. The CC strains are derived from an eight-way cross using a set of founder strains that includes three wild-derived strains. You can monitor the ongoing development of the Collaborative Cross and pre-order mice at the following URL: http://csbio.unc.edu/CCstatus.

Motivation

Genetics plays a significant role in most common diseases, affecting both susceptibility and treatment success. The classical experimental approach for establishing genetic links involves studying biological systems one component at a time. However, such reductionist approaches do not scale when faced with the complexity of ~30,000 interacting genes, as in humans and other mammals. This limitation is further compounded by environmental interactions. Factorial experimentation, in which many or all components of a system are altered simultaneously through randomization, is far more efficient, as described in the 1930s by Sir Ronald Fisher. Considering all genes concurrently is essential for building predictive models of traits or diseases with complex etiologies.

An example breeding funnel used to generate each new CC strain
Figure 2. Derivation of the CC. A unique funnel breeding scheme will be used to derive each CC strain. This breeding approach is designed to randomize the genetic makeup of each inbred line. Each of the eight parental strains occupies each A-H position an equal number of times.
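The balance property in the caption (each parental strain in each A-H position equally often) can be illustrated with a small sketch. The cyclic-rotation construction below is only one simple way to obtain such balance and is an assumption for illustration; it is not claimed to be the scheme actually used to assign CC funnels. The function names are hypothetical.

```python
from collections import Counter

# The eight founder strains named in this document.
FOUNDERS = ["A/J", "C57BL/6J", "129S1/SvImJ", "NOD/LtJ",
            "NZO/HlLtJ", "CAST/EiJ", "PWK/PhJ", "WSB/EiJ"]
POSITIONS = "ABCDEFGH"


def rotated_funnels(founders):
    """Return one funnel order per cyclic rotation of the founder list.

    Assumption: a Latin-square style rotation, chosen here only because it
    makes the balance easy to verify.
    """
    n = len(founders)
    return [founders[shift:] + founders[:shift] for shift in range(n)]


def position_counts(funnels):
    """Count how often each strain appears in each funnel position."""
    counts = Counter()
    for funnel in funnels:
        for position, strain in zip(POSITIONS, funnel):
            counts[(strain, position)] += 1
    return counts


funnels = rotated_funnels(FOUNDERS)
counts = position_counts(funnels)

# Every (strain, position) pair occurs exactly once across the 8 funnels.
for funnel in funnels:
    print(" ".join(f"{p}={s}" for p, s in zip(POSITIONS, funnel)))
print("balanced:", len(counts) == 64 and set(counts.values()) == {1})
```

Repeating this block of eight rotations keeps the design balanced for any multiple of eight funnels; a real design would also vary which strains are neighbors, which this sketch does not attempt.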

What is the Collaborative Cross?

The Collaborative Cross (CC) was proposed in response to the need for a new model population for understanding complex traits and diseases with complex etiologies (CTC 2004, Threadgill et al. 2002). The CC provides a translational tool to integrate gene functional studies, genetic networks, and variation maps of the biomolecular space (spanning the primary DNA sequence, transcripts, proteins, and metabolites). The CC is a large panel of 1,000 “engineered” recombinant inbred (RI) mouse strains (Fig. 2), which can be converted to roughly 1,000,000 potential ‘outbred’ but completely reproducible genomes through the generation of recombinant inbred intercrosses (RIX) (Zou et al. 2005) (Fig. 3). The CC RI strains and their derivative CC RIX have a population structure that randomizes existing genetic variation, which will provide unparalleled power to assign causality and to understand the intricacies of the biological networks underlying diseases. Moreover, this genetic variation is similar to that of human populations.

The CC combines the genomes of eight genetically diverse founder strains (A/J, C57BL/6J, 129S1/SvImJ, NOD/LtJ, NZO/HlLtJ, CAST/EiJ, PWK/PhJ, and WSB/EiJ) to capture nearly 90% of the known variation present in laboratory mice. By comparison, AXB/BXA and BXD, the two most commonly used mouse RI panels, capture only 13% of the known variation. A combinatorial breeding design has been developed that yields genetically independent incipient CC lines and ensures balanced contributions of sex chromosomes and mitochondria. This large set of genetically defined mice becomes a common denominator, allowing for cumulative and integrated data collection. That in turn enables the detection of networks of functionally important relationships among diverse sets of biological and physiological phenotypes, and a new view of the mammalian organism as a whole and interconnected system. The beauty of the CC is its extensibility: it provides a framework for cumulative data integration over space (different labs) and time (longitudinal or future analyses) and at all levels, from molecules and cells to physiological systems and environments.
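To see why funnel position matters for mitochondria and sex chromosomes, the sketch below traces the two strictly uniparental elements (maternally inherited mitochondria and the paternally inherited Y chromosome) through a three-generation, eight-way funnel. It assumes each cross is written dam-first, so positions pair as (A x B), (C x D), and so on; that convention, the rotation-based set of funnels, and the function name are illustrative assumptions, not a description of the actual CC breeding records.

```python
from collections import Counter

FOUNDERS = ["A/J", "C57BL/6J", "129S1/SvImJ", "NOD/LtJ",
            "NZO/HlLtJ", "CAST/EiJ", "PWK/PhJ", "WSB/EiJ"]


def uniparental_donors(funnel):
    """Return (mitochondrial donor, Y-chromosome donor) for one funnel.

    Assumes the funnel length is a power of two and that, within every
    cross, the first member of each pair is the dam and the second the sire.
    """
    # Track only the founder origin of the mitochondria and of the Y.
    animals = [{"mito": strain, "y": strain} for strain in funnel]
    while len(animals) > 1:
        animals = [
            # Offspring take mitochondria from the dam and the Y from the sire.
            {"mito": dam["mito"], "y": sire["y"]}
            for dam, sire in zip(animals[0::2], animals[1::2])
        ]
    return animals[0]["mito"], animals[0]["y"]


# One funnel per cyclic rotation of the founder order (illustrative design).
funnels = [FOUNDERS[s:] + FOUNDERS[:s] for s in range(len(FOUNDERS))]
mito_counts = Counter(uniparental_donors(f)[0] for f in funnels)
y_counts = Counter(uniparental_donors(f)[1] for f in funnels)

print("mitochondrial donors:", dict(mito_counts))
print("Y-chromosome donors: ", dict(y_counts))
```

Under these assumptions the mitochondria always come from the first funnel position and the Y chromosome from the last, so rotating founders through the positions gives each strain an equal share of both.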

Recombinant Inbred Crosses
Figure 3. Modeling human genetic diversity using Recombinant Inbred Crosses. There are many advantages to performing association studies on RI lines. Because each strain is isogenic, genotyping needs to be performed only once. The mice are also easily reproducible, allowing for repeated testing and thus increasing statistical power. A disadvantage of RI lines is that they are a poor model of human populations because of their loss of heterozygosity. The CC overcomes this problem by supporting a virtual population of F1 RI intercrosses, known as the RIX panel. There are nearly 1 million potential RIX combinations, as illustrated in the blue boxes below (the pink boxes represent the standard CC RI lines).
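The RIX count quoted in the caption follows from simple pair counting over a panel of 1,000 RI strains. The sketch below shows the arithmetic; whether reciprocal crosses (the same two strains with dam and sire swapped) are counted as distinct is an assumption made explicit here, and the function name is hypothetical.

```python
from itertools import permutations


def rix_counts(n_strains):
    """Return (unordered, ordered) counts of F1 crosses between distinct strains.

    unordered: each pair of strains counted once.
    ordered:   reciprocal crosses (dam/sire swapped) counted separately.
    """
    ordered = n_strains * (n_strains - 1)
    unordered = ordered // 2
    return unordered, ordered


unordered, ordered = rix_counts(1000)
print(f"pairs counted once:        {unordered:,}")  # 499,500
print(f"with reciprocals distinct: {ordered:,}")    # 999,000

# Sanity check of the formula by brute-force enumeration on a small panel.
assert len(list(permutations(range(8), 2))) == rix_counts(8)[1]
```

Counting reciprocal crosses separately gives 999,000 combinations, which matches the "nearly 1 million" figure; counting each pair once gives about half that.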

The CC is an ideal test bed for predictive, or more accurately, probabilistic biology, supporting the development of analytic models for whole-organism biological predictions, which will be required for personalized medicine to become a reality. Since the CC is a genetic reference population that can be reproduced ad infinitum, it supports a model of community collaboration, fostering the generation of a comprehensive, ever-expanding compendium of molecular and physiological data. By providing a large, common set of genetically defined mice, the CC will be a focal point for communal biological study and dissemination; the collection of genetic and physiological phenotypes will provide the ultimate platform for examining a mammalian organism as a whole and interconnected system. These extensive datasets will in turn support in silico predictive biology.

The production of the CC is well underway. Breeding began in May 2005 at the Oak Ridge National Laboratory (ORNL). This effort was initiated with support from The Ellison Medical Foundation with ongoing support from the Department of Energy, Office of Biological and Environmental Research and the National Institutes of Health. A parallel effort is also underway at Tel Aviv University with support from the Wellcome Trust. Initial CC mouse strains are currently available from both ORNL and Tel Aviv.

The CC is more than just a biological resource. Using it will require the development of new experimental paradigms with an unprecedented dependence on computational tools. Even traditional tasks, such as phenotyping and association mapping, must be rearchitected to handle the extraordinary volume of data that will be collected. Effective experimental scheduling must be developed to maximize the information derived at every stage of experimentation. New computational tools will also be required for analysis, visualization, and dissemination.

References

CTC (Churchill, G. A., et al.), 2004. The Collaborative Cross, a community resource for the genetic analysis of complex traits. Nat Genet. 36, 1133-7.

Threadgill, D. W., et al., 2002. Genetic dissection of complex and quantitative traits: from fantasy to reality via a community effort. Mamm Genome. 13, 175-8.

Zou, F., Gelfond, J., Airey, D., Lu, L., Manly, K., Williams, R., Threadgill, D., 2005. Quantitative trait locus analysis using recombinant inbred intercrosses (RIX): theoretical and empirical considerations. Genetics. 170, 1299-1311.