Unprecedented study yields most comprehensive map of cancer genomes to date

Pan-Cancer Project discovers causes of previously unexplained cancers, pinpoints cancer-causing events, and zeros in on mechanisms of development

Results of the Pan-Cancer Project were published in 23 papers in Nature and affiliated journals.

An international team including researchers at the UC Santa Cruz Genomics Institute has completed the most comprehensive study of whole cancer genomes to date, significantly improving our fundamental understanding of cancer and suggesting new directions for its diagnosis and treatment.

Involving more than 1,300 scientists and clinicians from 37 countries, the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (known as PCAWG or the Pan-Cancer Project) analyzed more than 2,600 genomes of 38 different tumor types, creating a huge resource of primary cancer genomes. This resource then served as the launch point for 16 working groups studying multiple aspects of cancer’s development, causation, progression, and classification.

Previous studies focused on the 1 percent of the genome that codes for proteins. The Pan-Cancer Project explored in considerably greater detail the remaining 99 percent of the genome, including key regions that control switching genes on and off. By analogy, if the genome can be viewed as a recipe book for living cells and cancer as a process that makes changes to it, both large and small, then previous efforts looked for changes in the ingredient lists, while this new project also looks for alterations to the instructions for how those ingredients are used.

“This study, which provides the most complete picture to date of cancer-causing mutations in all parts of the genome, was a massive team science effort involving researchers spanning the globe,” said steering committee member Josh Stuart, the Baskin Professor of Biomolecular Engineering at UC Santa Cruz. “At UC Santa Cruz, our strengths in systems biology and RNA expression helped us connect findings in the previously unexplored noncoding genome with the pathways that lead to cancer. Like a charted map, this new work creates a reference and resource that researchers can use to interpret future data and physicians can use to guide treatment.”

In addition to serving on the project’s steering committee, Stuart led a working group that focused on the biological networks and pathways affected by the genetic changes in cancer cells. Other scientists affiliated with the UCSC Genomics Institute also made important contributions to the project, including Angela Brooks, assistant professor of biomolecular engineering; Jingchun Zhu, an associate research scientist at the Genomics Institute; and graduate students David Haan and Cameron Soulette.

Gene expression

Brooks co-led a working group that focused on changes in gene expression, which were revealed by sequencing the RNA molecules in cancer cells in addition to the DNA sequences that are the focus of most cancer genomics studies.

“RNA is the output of the genome, so RNA sequencing can help us interpret the DNA mutations identified in the whole genome sequences and gives us a more complete picture of how cancer genomes are altered,” Brooks explained. “We found that a lot of known cancer genes have alterations at the RNA level that we wouldn’t pick up just from the DNA sequence data.”

The Pan-Cancer Project has made available a comprehensive resource for cancer genomics research, including the raw genome sequencing data, software for cancer genome analysis, and multiple interactive websites exploring various aspects of the PCAWG data.

The project extended and advanced methods for analyzing cancer genomes, including the use of cloud computing. By applying these methods to its large dataset, it produced new insights into cancer biology and confirmed important findings of previous studies.

23 papers

In 23 papers published February 5 in Nature and its affiliated journals, the Pan-Cancer Project reported that:

  • The cancer genome is finite and knowable, but enormously complicated. By combining sequencing of the whole cancer genome with a suite of analysis tools, scientists can characterize every genetic change found in a cancer, all the processes that have generated those mutations, and even the order of key events during a cancer’s life history.
  • Researchers are close to cataloging all of the biological pathways involved in cancer and having a fuller picture of their actions in the genome. At least one causal mutation was found in virtually all of the cancers analyzed, and the processes that generate mutations were found to be hugely diverse, from changes in single DNA letters to the reorganization of whole chromosomes. More than a dozen regions of the genome controlling how genes switch on and off were identified as targets of cancer-causing mutations.
  • A new method can identify mutations that occurred years, sometimes even decades, before the tumor appeared. In theory, this opens a window of opportunity for early cancer detection.
  • Tumor types can be identified accurately according to the patterns of genetic changes seen throughout the genome, potentially aiding the diagnosis of a patient’s cancer where conventional clinical tests could not identify its type. Knowledge of the exact tumor type can help doctors tailor treatments.

“The findings we have shared with the world today are the culmination of an unparalleled, decade-long collaboration that explored the entire cancer genome,” said Dr. Lincoln Stein, a member of the project steering committee and head of adaptive oncology at the Ontario Institute for Cancer Research (OICR). “With the knowledge we have gained about the origins and evolution of tumors, we can develop new tools and therapies to detect cancer earlier, develop more targeted therapies and treat patients more successfully.”

Pan-cancer analysis, which examines the molecular aberrations in cancer cells across a broad range of tumor types, is a powerful tool for identifying common pathways that lead to cancer. Stuart has been a pioneer of this approach, helping to organize the Pan-Cancer Initiative of The Cancer Genome Atlas (TCGA) program of the U.S. National Institutes of Health (NIH). The Pan-Cancer Analysis of Whole Genomes expanded this approach to include noncoding regions of the genome, using whole genome data from TCGA and the International Cancer Genome Consortium (ICGC).

The results provide new insights into the molecular drivers of cancer and an enormously valuable set of resources for the scientific community to advance cancer research.

“I’m very excited to see the results of this work come out, because it was such a grand challenge to come together as a global scientific community and put together these findings into a coherent story,” Stuart said.

All of the papers and related materials are available online at www.nature.com/collections/pcawg.