More and more fields of scientific research are generating large volumes of data in pursuit of scientific questions. Each year, the Large Hadron Collider at CERN generates 25 petabytes of data. Storing all this data would require approximately 400,000 of the highest capacity smartphones, and that is just one year. The NIH recently announced the Brain Research through Advancing Innovative Neurotechnologies (BRAIN) Initiative, which proposes to capture the activity of the human brain's 100 billion neurons. This research proposal and many others like it have led to practical questions about data access and storage. The most cumbersome question of all is who pays for this storage.
In a recent Science editorial, Francine Berman of Rensselaer Polytechnic Institute and Vinton Cerf of Google argue for a shift in scientific culture toward an acceptance of shared data costs. They propose that private companies, as well as academic and corporate labs, invest in computer data centers and storage systems.
In 2008, Google said it would host public scientific data, but it shut the project down by the end of the year for business reasons. The example of Google underscores how difficult it is for any one sector to find sufficient resources and incentives to support research data entirely on its own. A more viable option may be developing partnerships across sectors.
Berman is the chairwoman of the U.S. branch of the Research Data Alliance, an organization of researchers from academic, corporate, and government sectors who are working to develop new systems to store and access data. Berman and Cerf provide recommendations to address the problem of data storage and maintenance, but conclude that the scientific community must embrace paying for digital data, just as people already pay for other forms of digital media. For example, consumers pay for downloads and subscriptions to access digital audio and video files through iTunes and Amazon, among other providers.
More debate on the topic of access to data is likely coming soon. In September, federal agencies must submit proposals for allowing public access to articles describing federally funded research within one year of publication. These proposals were requested in a memorandum from the White House Office of Science and Technology Policy. As an added challenge, they must be carried out within existing agency budgets.