Data Science: Beyond the hype of 'big data'

Joseph Konopelski
Joseph Konopelski, Dean of the Baskin School of Engineering (Photo by Carolyn Lagattuta)

These days it seems like everyone’s talking about "big data." If you google "big data in the news," you’ll find a surprisingly large number of articles published in the past few hours. I did. I stopped counting at 20.

What I found truly interesting, in addition to the sheer number of articles, was the almost comical variety of headlines. From "Six ways big data could damage your business" to "Six ways big data can make you a better chief marketing officer," it’s easy to believe that big data really does have something (or six things) for everyone.

But what is big data? And why, beyond all the hype, is it important?

For starters, the term "big data" is now generally considered passé. I think that’s a good thing, because it’s never been just about the size of the data. It’s about what insights can be drawn from the data once it’s been interpreted. Interpretation, however, is a challenge. The problem is not only the scale but also the complexity. Data is messy. It comes in many different formats and fragments through many different sensors and sensing devices.

What we’re interested in at UC Santa Cruz is the science of data: how data can be gathered, structured, organized, managed, interpreted, and applied. What makes this important is that it gives us powerful new ways to address challenging problems. It allows us to ask new questions about complex systems, from the evolution of the universe to the most efficient routing of city buses. It will likely be an integral part of how we address challenges like Ebola and the distribution of basic resources, such as food and water.

The data science we at UCSC are particularly interested in is focused on quality research and education, grounded in ethics and social responsibility—data science for social good. To do this, we need to build deeper and more robust computational and mathematical foundations. We need methods that alert users to potential bias and help users make the most informed decisions possible. And we need to train a generation of responsible data scientists.

UC Santa Cruz is an established leader in the data science revolution, driven for decades by our leadership in astronomy, genomics, and other data-intense fields. We’re known for developing new computational, algorithmic, and analytic theory. With this existing expertise, and with a growing data science research portfolio and curriculum, we’ll continue to train future data scientists in the thoughtful collection, management, and use of data.

Our goal is to model a future that respects issues of privacy, security, and the powers and limitations of data. Researchers and advocates on the frontlines of environmental protection, education, health care, and political action are being deluged by newly available data, including socio-behavioral data that bears on things like sustainability and civic engagement. They will benefit from working with data scientists with strong skill sets and a shared commitment to the greater good. Businesses, constantly striving to develop better products and services and to improve customers’ experience, will also benefit from strong foundational methods and responsible data scientists.

We invite you to get involved! Read more about the Data Science Leadership Initiative of The Campaign for UC Santa Cruz. And consider attending an event, including the Data Science Afternoon on Feb. 27 or Spring Data Science Day on May 21 (details coming soon).