Marine scientists explore the future of open data science

Researchers make recommendations for how to move forward in a world of near-limitless data

Alexa Fredston, an assistant professor of ocean sciences at the University of California, Santa Cruz, uses large data sets and models to understand human impacts on the oceans. (Photo by Britt Lichty)

Around the world, interest is growing in making research results more accessible to the public. The Biden-Harris administration, NASA, NOAA and several other U.S. government agencies declared 2023 the Year of Open Science. Alexa Fredston, an assistant professor of ocean sciences at the University of California, Santa Cruz, and Julia Stewart Lowndes, a marine data scientist at the National Center for Ecological Analysis and Synthesis at the University of California, Santa Barbara, both consider open science a vital part of their academic careers and a key way to make marine science more transparent and inclusive. They recently published an article in the journal Annual Review of Marine Science about improving open data science practices in the field.

“I’m a marine ecologist, but I write code all day,” said Fredston. She uses large data sets and models to understand human impacts on the oceans. Like many other researchers trying to answer large-scale questions, Fredston spends much of her time deciding which data sets to use and how to process them. The amount of data available through open science is exciting, she says, but it can also become overwhelming and requires new skills.

“Twenty or 30 years ago, if you were doing a PhD project on glacier ice, you would probably go to a glacier, take some measurements, write that up, and that would be your dissertation,” said Fredston. “Now, you'll still go to the glacier and take some measurements, but then you're expected to put that in the context of global climate models and historical precipitation. And all of a sudden, you're pulling climate data from a server somewhere in one format. You're dealing with data on precipitation or sea surface temperature in a different format, and that's really where we think a lot of people struggle.” To encourage high-quality science that takes advantage of and contributes to open science, Fredston and Stewart Lowndes make several suggestions about where to start.

They recommend that expert working groups create guides for choosing appropriate data sets. They encourage paid time for learning new skills, including coding. They also suggest that academic incentives should change: institutions tend to reward journal articles over other types of publications and contributions to science, such as software packages and databases.

“The way that academic tenure and promotion happens has been revised many times in academic history, and there's no reason why it couldn't be broad enough to recognize those kinds of contributions,” said Fredston.

The researchers also point to an increase in open-access articles and preprints that eventually link to the final, peer-reviewed version. For-profit journals are having to shift their models toward open access to accommodate recent mandates in Europe and the United States requiring that state- and taxpayer-funded research be published open access within the next few years.

But to take full advantage of the huge amounts of open science data available, the researchers state that journals should also rethink their emphasis on novelty. Currently, journals favor papers with unprecedented results.

“There's nothing wrong with celebrating new exciting results as they come out,” said Fredston. “But we don't have a great system for rewarding going back to papers and saying, ‘Was this result really accurate? If we do it with 100 times as much data, does it still hold up?’ Often, the answer is no.”

The researchers also address where scientific data come from. Popular apps like iNaturalist and eBird provide huge amounts of reliable information from the public about species distribution and human impacts. These apps have been slower to reach the oceans since people don’t spend as much time there, Fredston says. But the public is beginning to use them in marine environments more often.

In addition to better science, the researchers say open science practices have the potential to create a more inclusive environment that supports diversity and collaboration. Fredston is considering these points as she starts her own lab at UC Santa Cruz. She says she will focus on helping students build data science skills and learn to code in an approachable way.

“I don’t ever want them to feel like one person alone in an office,” she said. “What excites me the most about the future of open science is the ability to work on really big, interdisciplinary, collaborative teams.”