Global Alliance for Genomics and Health unveils new genomics interface

Alliance's new Version 0.5 of Genomics API allows for seamless sharing of genetic data

David Haussler directs the UC Santa Cruz Genomics Institute. (Photo by R. R. Jones)

The Global Alliance for Genomics and Health has released a new Application Programming Interface (API) developed by the Global Alliance's Data Working Group that will allow DNA data providers and consumers to better share information and work together on a global scale.

This new open-source Genomics API, referred to as Version 0.5, is a standard, open tool promoting data interoperability. It will be part of a suite of Genomics APIs being developed by the Global Alliance. The API enables the interoperable exchange of information contained in DNA sequence reads across multiple organizations and on multiple platforms.

David Haussler, professor of biomolecular engineering at UC Santa Cruz, is co-chair of the Global Alliance's Data Working Group and a cofounder of the Alliance. The new Genomics API is one of the first products to be developed and distributed by the Global Alliance for Genomics and Health, which was formed only one year ago and is made up of over 200 of the world's leading biomedical research institutions, healthcare providers, research funders, information technology and life science companies, and disease and patient advocacy organizations.

"This new Genomics API is an exciting step toward interoperability in genomic data. It advances the Global Alliance's mission of enabling the sharing of genomic and clinical data to improve human health," said Haussler, who is scientific director of the UC Santa Cruz Genomics Institute. "Because this new API lets researchers work consistently with genomic data across institutions and platforms, it will help realize the benefits that come from large-scale genomic data sharing, allowing us to find the needle in the haystack for patients with rare diseases."

Transparency and collaboration

In keeping with the Global Alliance's goals of transparency and collaboration, Version 0.5 of the Genomics API is being developed through an open process in which the wider bioinformatics community can participate. While the Data Working Group has a core team of active developers, interested developers from any institution can engage with the platform by exploring sample apps, building implementations from scratch or from existing samples, or providing feedback on the API and its documentation. The interface is managed on an open Global Alliance developer site at ga4gh.org.

The newly announced Genomics API Version 0.5 builds on the successful Version 0.1, which was also developed by members of the Data Working Group. Version 0.1 is in use by leading organizations, including the European Bioinformatics Institute (EMBL-EBI), the U.S. National Center for Biotechnology Information (NCBI), Google, Genome Savant, and Harvard Medical School's Biomedical Cybernetics Laboratory, and powers a growing community of applications. As analysis tools adopt the new API, researchers will be able to extend their own infrastructure to use cloud resources, such as those available from Amazon Web Services, Google Cloud Platform, and Microsoft Azure.

The Global Alliance's Genomics API is built on file formats that were developed over the last five years for large-scale genomic sequencing projects and are now also managed by the Global Alliance. Compared with those formats, the API features cleaner models, a modern, easy-to-use data description schema, and a web-enabled interface.

"Modern DNA sequencing, when coupled with modern data and cloud technology, can lead to breakthroughs in understanding and improving human health. This new Genomics API is a big step forward," said David Glazer, co-chair of the Reads Task Team and engineering director for Google Cloud Platform and Google Genomics. "Google already supports Version 0.1 of the API, and we'll be adding support for Version 0.5 soon, as well as continuing to contribute to the Data Working Group."

Breaking new ground

"The Global Alliance is breaking new ground in combining genomic sequencing and clinical care. Amazon Web Services is proud to support these efforts, and help in defining new operating models, such as the latest Genomics API," said Matt Wood, general manager of data science for Amazon Web Services. "We view these new APIs as a vital component for collaboration and development of next-generation tools that can run cost-effectively at massive scale."

"Genome sequencing is transitioning from being a powerful research tool to making an enormous impact in clinical diagnostics and care," said Dr. Richard Durbin, acting head of computational genomics at the Wellcome Trust Sanger Institute and leader of the Genome Informatics group. "This API from the Global Alliance Data Working Group will enable genomic data processing to move beyond research file formats into modern computing and data architectures, facilitating controlled data sharing and the effective use of these new technologies for both clinical and research benefit."

Other Working Groups of the Global Alliance for Genomics and Health are currently identifying best practices to integrate genomic data into clinical practice, reaching agreement on security protocols, and developing a framework to address ethics and regulatory considerations.

About the Global Alliance

The Global Alliance for Genomics and Health is an international, non-profit alliance formed to help accelerate the potential of genomic medicine to advance human health. It brings together over 200 leading institutions working in healthcare, research, disease and patient advocacy, life science, and information technology. Its partners are working together to create a common framework of standards and harmonized approaches to enable the responsible, voluntary, and secure sharing of genomic and clinical data. Learn more at genomicsandhealth.org.