Science Editor-in-Chief Bruce Alberts said that “every scientist should be trained to be highly suspicious about his or her own results,” and he called for uniformly high data-sharing standards during a 5 March Capitol Hill event on scientific integrity and transparency.
Alberts testified as part of an expert panel convened by the U.S. House of Representatives Subcommittee on Research, following the February release of guidelines by the Office of Science and Technology Policy (OSTP) concerning access to scientific research data. Those guidelines call on key federal agencies to develop plans for increasing public access to taxpayer-funded research results. The document also requires agencies to ensure that federally supported research articles are made freely available to the public twelve months after publication, as the journal Science already does.
Science Editor-in-Chief Bruce Alberts (left) speaks with Rep. Daniel Lipinski (D-Illinois) (right) after the hearing. | Credit: AAAS
Alberts described the journal’s efforts to make data more readily accessible to scientists seeking to replicate or refute published findings. He also responded to general concerns about scholarly papers that are found to be in error.
The journal has long required that “all data necessary to understand, assess, and extend the conclusions of the manuscript must be available to any reader of Science,” Alberts explained. Science also recently strengthened its rules regarding access to computer codes needed to understand published results, and it now requires all senior authors to “sign off” on each paper’s primary conclusions. To comply with such rules, Alberts said, scientists must be assured of long-term federal support for critical research databases. Scientists who want to assess published findings also need appropriate tools for working with data, he added.
Alberts then addressed concerns related to scientific integrity. Like others at the hearing, he cited 2011 correspondence in Nature by Florian Prinz and colleagues that questioned the reliability of published data on potential drug targets. “My conclusion,” Alberts said, “is that the standards are lower in some subfields of science than others, and we need to work on setting higher standards.” He also urged individual scientists to more critically assess their own work. “It’s easy to get a result that looks right when it’s really wrong. One can easily be fooled. Every scientist must be trained to be highly suspicious about his or her results.”
In his written testimony, Alberts acknowledged that “Science has on occasion been fooled into publishing articles that contain data that was fabricated by one or more of the authors.” But cases of honest error are far more common than deliberate fraud, he said, and in either case, papers must be retracted, whether by the authors or by journal editors. He noted that Science published a special issue on “Data Replication and Reproducibility” on 2 December 2011.
Finally, Alberts emphasized that the strength of U.S. science and technology helps to advance both economic progress and national security. He noted that federal investment in fundamental, long-term scientific research has declined from 1.25% of the nation’s economy in 1985 to 0.87% in 2013. “The current [budget] sequester now makes our situation even worse,” Alberts said. “I believe this is dangerous for America’s future, for my grandchildren’s future.”
Panelists at the hearing were asked to address how best to define scientific research data, how and when to share it, technological infrastructure requirements, and the potential costs associated with making data more widely available to the research community.
Testifying along with Alberts were Victoria Stodden, an assistant professor of statistics at Columbia University; Stanley Young, assistant director for bioinformatics at the National Institute of Statistical Sciences; and Sayeed Choudhury, associate dean for research data management at Johns Hopkins University and Hodson Director of the Digital Research and Curation Center.
Stodden said that federally funded digital archives will be an essential step toward ensuring the integrity of findings in computer science. Discipline-specific data archives would also help to promote innovation, she said. “Availability means curious [science, technology, engineering and mathematics] students, for example, can try their hand at replicating published results from the data and software, and learn about the science (and perhaps contribute to further discoveries),” Stodden said in her written testimony. Stodden and others noted that agencies such as the National Science Foundation require grant applicants to submit a data management plan, yet enforcing such guidelines has been a challenge.
Young expressed concerns about the integrity of published findings and the transparency of supporting data. He cited studies such as a 2011 PLOS One report by John Ioannidis and colleagues, who found that the authors of only 47 of 500 papers (9%) had deposited full primary raw data online. Young called on Congress, funding agencies, and journals to “step up and manage the scientific process.”
Choudhury spoke to the need for appropriate technologies to support data-sharing. Specifically, he urged a “layered approach for data sharing, access, and preservation” of data. This would include, for example, repositories for less actively used data, archives that preserve data while also supporting data-sharing, and “dark” archives that preserve content over the longer term without offering access. Hardware upgrades to improve storage systems may be needed in many cases, he said. Standards concerning data-sharing will also be essential. “One of the most important non-technical barriers for sustainable digital access and preservation relates to a lack of awareness regarding comprehensive data management,” Choudhury wrote. “Terms such as storage, archiving, preservation and curation are often used interchangeably and inappropriately.”
Subcommittee Chairman Rep. Larry Bucshon (R-Indiana) opened the hearing by stressing the importance of making federally funded scientific research data broadly accessible while taking steps to reduce the number of mistakes in scientific journal articles. “When there is no reliable access to data, the progress of science is impeded and leads to inefficiencies in the scientific discovery process,” said Bucshon, who has a background as a cardiothoracic surgeon. “Important results cannot be verified, and confidence in scientific claims dwindles.”
Both Bucshon and Rep. Daniel Lipinski (D-Illinois), ranking member of the subcommittee, stressed that research-transparency efforts must avoid infringing upon intellectual property, which could hinder innovation. Bucshon also said that improved data access should be achieved “without ridiculous cost or administrative burdens.”