An international community of researchers is coalescing around the Human Cell Atlas Initiative, an ambitious plan to map and characterize every cell type in the human body.
In a significant step toward this goal, the UC Santa Cruz Genomics Institute will collaborate with the Broad Institute of MIT and Harvard and the European Bioinformatics Institute (EMBL-EBI) to build a data coordination platform for the Human Cell Atlas. The Chan Zuckerberg Initiative (CZI), which is funding the project, announced the collaboration June 1 at a Human Cell Atlas meeting in Stockholm, Sweden.
Creating a comprehensive reference map of all human cells could transform our understanding of human health and disease, with potential impacts on almost every aspect of biomedical research.
"I'm very excited about the initiative and the people who are involved," said Jim Kent, research scientist at the UCSC Genomics Institute. "The science is really fundamental, and having a Human Cell Atlas will make work in almost every biomedical research lab go faster, much like the human genome sequence did. Once we had the genome, it became an incredibly useful framework for layering more information, because so much biology can be mapped to the genome. I think something like that will happen with the Human Cell Atlas too. It's sort of the missing link between the genome and the body."
Building the atlas is a massive undertaking, however, and it will require the combined efforts of a large community of researchers with diverse expertise. At the UCSC Genomics Institute, work on the data coordination platform is being led by Kent and research scientist Benedict Paten.
"The Human Cell Atlas is not only a fascinating and important biology project, it's also a very large computational and engineering project that is leading the way in terms of how to organize big data," Paten said. "This collaboration is bringing together a stellar engineering team to create a data commons that is maximally accessible and usable for a broad base of scientists."
The groups working on the data coordination platform are all distinguished organizations with demonstrated success in genomics, informatics, and data sharing. They have been instrumental in the success of several large international genomics data projects, including the Human Genome Project, and they have built widely used software libraries and data resources, including the UC Santa Cruz Genome Browser, Broad's FireCloud platform, and EMBL-EBI's infrastructure for the analysis, sharing, and long-term preservation of molecular data.
The Human Cell Atlas Initiative is the brainchild of Sarah Teichmann at the Wellcome Trust Sanger Institute and Aviv Regev of the Broad Institute, who organized an international meeting to discuss the initiative in October 2016 in London. The Stockholm conference is the third Human Cell Atlas meeting.
The human body is made of trillions of cells, the most fundamental units of life. Different types of cells (neurons, blood cells, skin cells, etc.) express different sets of genes and make up the various tissues and organs of the body. A Human Cell Atlas would catalogue all cell types and sub-types in the human body, map their locations, identify the genes they express, and potentially much more.
Only recently have tools such as single-cell genomics made it realistic to envision such an atlas, which could reveal a new view of human biology and new opportunities for diagnosing, monitoring, and treating disease. By making the Human Cell Atlas available freely to scientists all over the world, the initiative aims to provide a transformative resource for the global biomedical research community.
The data coordination platform for the Human Cell Atlas will enable many researchers to easily submit large quantities of genomic and imaging data, perform robust and scalable analyses and quality-control checks on those data, and make both the data and the results of the analyses available widely and openly, maximizing the opportunity for downstream innovation.
In addition to providing financial support, CZI will collaborate on aspects of the software engineering and data infrastructure operations. According to Paten, the tools developed for the data coordination platform will have value well beyond the scope of the Human Cell Atlas. The project will help demonstrate a new model for large-scale collaboration, one that brings together biologists and software engineers across multiple research institutions to build a common platform. It will also show the value of using open-source, cloud-based technologies to build the data infrastructure for global scientific collaborations.
At the UCSC Genomics Institute, technical director Brian O'Connor will join Paten and Kent in leading the institute's work on the data coordination platform. Anthony Philippakis will lead the effort at the Broad Institute, and John Marioni will lead EMBL-EBI's work on the project.