Mouse genome sequence published with first comparative analysis of mouse and human genomes

Researchers in the Center for Biomolecular Science and Engineering (CBSE) at the University of California, Santa Cruz, made significant contributions to the analysis of the mouse genome sequence announced this week by the international Mouse Genome Sequencing Consortium. The consortium published a high-quality draft sequence of the mouse genome--the genetic blueprint of a mouse--together with a comparative analysis of the mouse and human genomes describing insights gleaned from the two sequences. The paper appeared in the December 5 issue of the journal Nature.

The achievement represents a landmark advance for the Human Genome Project. It is the first time that scientists have compared the contents of the human genome with those of another mammal. This milestone is all the more significant given that the laboratory mouse is the most important animal model in biomedical research.

David Haussler, professor of computer science and director of the CBSE, and CBSE research scientist Jim Kent both worked on the analysis of the mouse and human genomes and are coauthors on the Nature paper. Their previous contributions to the assembly and analysis of the human genome sequence, which was published last year, earned them widespread acclaim. Other members of the UCSC genome bioinformatics group who are involved in the mouse project and listed as coauthors on the paper include graduate students Ryan Weber, Krishna Roskin, Mark Diekhans, and Robert Baertsch; postdoctoral researcher Terrence Furey; software project manager Donna Karolchik; and software developers Angie Hinrichs and Matt Schwartz.

Kent has created a web-based mouse genome browser similar to the browser he created for the human genome. The browser allows scientists around the world to download data and detailed, customized analyses of the genome. The mouse and human genome browsers and other useful tools for researchers are available on the UCSC genome bioinformatics site.

The mouse sequence gives scientists a powerful tool for extracting meaning from the human genome sequence. It allows them to recognize functionally important regions in the human genome because those regions have been conserved through the 75 million years of evolution separating humans and mice.

"Regions of DNA in genomes evolve under Darwinian selection when they perform a useful function, or under random neutral drift when they do not," Haussler said. "Comparing the human genome to the mouse genome, we can see the statistical evidence for selection in some regions and study the properties of the neutral drift in others."

About 95 percent of the human genome looks like it is evolving under neutral drift, Haussler said. Surprisingly, the rate of neutral drift varies regionally along the chromosomes, with some regions drifting faster than others. The researchers were not able to identify any features of the DNA that would account for these variations in mutation rates.

About 5 percent of the genome contains sections of DNA that are conserved between human and mouse. Because these DNA sequences have been preserved by evolution over tens of millions of years, scientists infer that they are functionally important and under some evolutionary selection.
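As a rough illustration of how conserved stretches can be picked out of a comparison, the sketch below scores fixed-size windows of a pairwise alignment by percent identity and flags the windows above a cutoff. This is a toy example, not the consortium's method: the function names, the window size, the cutoff, and the sequences are all hypothetical, and it assumes the two sequences are already aligned to equal length, with "-" marking gaps.

```python
def window_identity(human, mouse, window):
    """Return (start, identity) for consecutive non-overlapping windows.

    Both inputs are assumed to be rows of one pairwise alignment,
    so they must have the same length. Gap characters never count
    as matches.
    """
    if len(human) != len(mouse):
        raise ValueError("aligned sequences must have equal length")
    results = []
    for start in range(0, len(human) - window + 1, window):
        h = human[start:start + window]
        m = mouse[start:start + window]
        matches = sum(1 for a, b in zip(h, m) if a == b and a != "-")
        results.append((start, matches / window))
    return results


def conserved_windows(human, mouse, window, threshold):
    """Return start positions of windows whose identity meets the threshold."""
    return [start
            for start, identity in window_identity(human, mouse, window)
            if identity >= threshold]


if __name__ == "__main__":
    # Made-up aligned fragments, for illustration only.
    human = "ACGTACGTAAAA"
    mouse = "ACGTACGTTTTT"
    print(conserved_windows(human, mouse, window=4, threshold=0.75))
```

A real analysis would need a statistical model of the neutral background rate, since identity above a fixed cutoff does not by itself demonstrate selection.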

Because the mouse carries virtually the same set of genes as the human but can be used in laboratory research, this information will allow scientists to experimentally test and learn more about the function of human genes, leading to better understanding of human disease and improved treatments and cures.

The mouse genome sequence shows the order of the DNA chemical bases A, T, C, and G along the 20 chromosomes of a female mouse of the "Black 6" strain--the most commonly used mouse in biomedical research. It includes more than 96 percent of the mouse genome with long, continuous stretches of DNA sequence and represents a sevenfold coverage of the genome. This means that the location of every base, or DNA letter, in the mouse genome was determined an average of seven times, a frequency that ensures a high degree of accuracy.
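Sevenfold coverage is an average, so some bases are read more than seven times and some fewer. A common idealization, assumed here and not taken from the consortium's paper, treats the number of reads covering a given base as Poisson-distributed with mean equal to the coverage. The sketch below uses that assumption to estimate what fraction of bases would be read at least a given number of times; the function names are hypothetical.

```python
import math


def poisson_pmf(k, mean):
    """Probability of exactly k events under a Poisson distribution."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def fraction_covered_at_least(min_reads, mean_coverage):
    """Fraction of bases expected to be read at least min_reads times.

    Assumes reads land uniformly at random, so per-base read depth
    follows a Poisson distribution with the given mean coverage.
    """
    below = sum(poisson_pmf(k, mean_coverage) for k in range(min_reads))
    return 1.0 - below


if __name__ == "__main__":
    for coverage in (1.0, 3.0, 7.0):
        covered = fraction_covered_at_least(1, coverage)
        print(f"{coverage:.0f}x coverage: {covered:.4%} of bases read at least once")
```

Under this model the fraction of bases never read at sevenfold coverage is e^-7, or roughly 0.09 percent. Real sequencing falls short of the idealized figure because repeats and hard-to-clone regions are not sampled uniformly.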

Earlier this year, the mouse consortium announced that it had assembled the draft sequence of the mouse and deposited it into public databases. The consortium's paper this week reports the initial description and analysis of this sequence and the first global look at the similarities and differences in the genomic landscapes of the human and mouse.

The draft sequence of the mouse genome was assembled by the Mouse Genome Sequencing Consortium, an international team of scientists at the Whitehead Institute in Cambridge, Mass., Washington University in St. Louis, and the Wellcome Trust Sanger Institute and the European Bioinformatics Institute, in Hinxton, England. The project was funded in part by the National Human Genome Research Institute of the National Institutes of Health and the Wellcome Trust in the U.K.

The sequencing centers were joined in the analysis effort by scientists from 27 institutions in six countries. These include Haussler's group at UCSC, as well as computational biologists at Pennsylvania State University, Oxford University in England, the Institute for Systems Biology in Seattle, Washington University, and the Universitat Pompeu Fabra in Barcelona, Spain, among others.

The sequence information from the mouse consortium has been released immediately and freely to the world, without restrictions on its use or redistribution. The information is scanned daily by scientists in academia and industry, as well as by commercial database companies that provide key information services to biotechnologists.

The work reported in this paper will serve as a basis for research and discovery in the coming decades. Such research will have profound long-term consequences for medicine. It will help elucidate the underlying molecular mechanisms of disease. This in turn will allow researchers to design better drugs and therapies for many illnesses.

"Publishing the sequence in 2001 of the first mammalian genome--our own--was a remarkable and historical achievement. To sequence another mammalian genome in less than two years and to discover the treasure trove of information one can derive from a comparison of the two is beyond nearly anyone's dreams. It constitutes a tremendously exciting and defining moment for biomedical research," said Francis Collins, director of the National Human Genome Research Institute.

"This is an extraordinary milestone. For the first time we have an opportunity to see ourselves in an evolutionary mirror," said Eric Lander, director of the Whitehead/MIT Center for Genome Research. "The mouse genome represents a very important chapter in evolution's lab notebook. Being able to read this notebook and compare genomic information across species allows us to glean important information about ourselves."

Additional details of the analysis of the mouse and human genomes are described on the Whitehead Center's web site.

All of the results from this analysis can be found at the UCSC genome bioinformatics web site, as well as at sites maintained by the National Center for Biotechnology Information at the National Library of Medicine and by the European Bioinformatics Institute.

The project is funded by grants from government agencies and public charities in the various countries. These include the National Human Genome Research Institute at the U.S. National Institutes of Health, the Wellcome Trust in England, and the U.S. Department of Energy, as well as agencies in Japan, France, Germany, and China.