In a world where paper files are almost a thing of the past, how do you know your medical records are accurate? Will that photo you took four years ago still be there? How can you be protected from hackers who could attack your bank and destroy your account data?
Those are just some of the questions on the minds of researchers at the Center for Research in Storage Systems (CRSS), a new Industry/University Cooperative Research Center at UC Santa Cruz, supported by the National Science Foundation (NSF).
"Everything is being stored digitally rather than analog," said Professor Ethan Miller, director of the CRSS. "There's an astounding amount of data that's being created every day. How do I know it's safe and secure? How do I know the data I stored is exactly the data I'm getting back? How can I find what I want?"
It's those issues of security, manageability, and safety that are growing more important, not only to individuals but also to government and business.
The federal government is dealing with those problems by creating the Utah Data Center, a data storage facility for the intelligence community. But companies are turning to places like the CRSS for help with their long-term storage problems. Facebook and Google want to be able to process consumer data, for instance, but they need to be able to store it first.
Leadership in Data Science is a signature initiative in The Campaign for UC Santa Cruz. The university is poised to significantly shape this foundational science of the future. Many of the nation's biggest high-tech companies are already partnering with UC Santa Cruz in storage systems research. And faculty in UCSC's Baskin School of Engineering, such as Darrell Long, Malavalli Professor of Storage Systems Research, have longstanding relationships with the information storage industry.
"We're close to Silicon Valley and have good, tight connections. We meet with industry leaders regularly," Miller said.
Just how much data is out there? Corporations are now regularly dealing with petabytes of information. One petabyte equals 1,000 terabytes; one terabyte equals 1,000 gigabytes.
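To put those figures in perspective, the conversion can be worked through in a few lines. This is only an illustrative sketch: it assumes the decimal units given above (factors of 1,000, not 1,024), and the function name is made up for this example.

```python
# Decimal storage units as stated in the article:
# 1 petabyte = 1,000 terabytes; 1 terabyte = 1,000 gigabytes.
GB_PER_TB = 1_000
TB_PER_PB = 1_000


def petabytes_to_gigabytes(petabytes):
    """Convert petabytes to gigabytes using decimal (factor-of-1,000) units."""
    return petabytes * TB_PER_PB * GB_PER_TB


# One petabyte works out to one million gigabytes.
print(petabytes_to_gigabytes(1))  # 1000000
```

In other words, a single petabyte is a million gigabytes under these definitions.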
Miller compared the data storage problem to a disease that has no known cure.
"Data storage is going to continue being a big issue," Miller said. "The problem will change over time, but I don't think it's something that will be solved in 20 years." Companies and individuals will always want to store more data and do more with it.
Current industry leaders who are members of UCSC's Industry/University Cooperative Research Centers include IBM, EMC, NetApp, Hitachi, Samsung, SanDisk, HP, LSI, Huawei, Permabit, and, most recently, Intel.
Each company pays $50,000 a year for royalty-free patents, said Andy Hospodor, executive director of the CRSS. Members attend semi-annual meetings with faculty and students and have an opportunity to collaborate with the CRSS. Typically, industry-student relationships lead to multiple job offers for M.S. and Ph.D. graduates.
At this point, the leaders of big-name companies are seeking out the brainpower of UC Santa Cruz.
"When I met with the dean a couple of years ago he asked, 'How will I know when this program is successful?'" Hospodor said. "I told him when I no longer have to recruit industry leaders. We're at that point now."
UCSC faculty also partner with leaders at other institutions of higher learning, including Harvard and Santa Clara University.
The data storage field will be a big employer in the future, as government agencies and private corporations grapple with data storage questions.
"It's an important problem for society, not just computer scientists," Miller said.