Spurred by a 2013 White House memo, federal science agencies are requiring researchers to make more and more of their data publicly available. The scientific value of these data sets remains uncertain, however. Are there discoveries sitting out in the open, waiting for someone with the right set of analysis tools to dig them out?
This is the question that the Air Force Research Lab (AFRL) has set out to answer through its Materials Science and Engineering Data Challenge, issued in partnership with the National Science Foundation and the National Institute of Standards and Technology. It asks researchers to find novel approaches that use existing, public data to enable the development of new materials.
Materials engineering is a particularly challenging test for this type of data-centric analysis. According to Charles Ward, who is leading the Challenge for AFRL, "a combination of scientific, cultural, and political factors make addressing materials data a complex endeavor."
The world of materials is broad, encompassing everything from metals to biomaterials, as well as spatial and temporal scales that span several orders of magnitude. Both academia and industry, with their different cultures, are active in performing measurements and simulations, and there is little standardization across sectors and subfields in how materials properties are described and stored. The resulting information inhomogeneity can pose a barrier to building a robust suite of data analysis tools.
The Data Challenge is the first challenge issued by the Materials Genome Initiative (MGI)—a government-wide effort launched four years ago to modernize the country's approach to materials science and engineering. MGI's goal, given in its strategic plan, is to allow the creation of advanced materials "at least twice as fast as possible today, at a fraction of the cost."
MGI has been successful in coordinating efforts across the field, generating over $250 million in federal research investment and launching four public-private partnerships. The initiative has three interrelated themes, meant to work in concert to speed materials development. Much of the effort so far has gone toward the first theme, "Computational Tools" such as physics-based simulations, and the second, "Experimental Tools" such as high-throughput lab techniques that can rapidly optimize material properties. The third theme is "Digital Data."
"Digital Data is the least well developed in terms of both having an agreed upon approach and an accepted practice in the community," Ward said in a recent email exchange. That makes the theme well suited to a challenge meant to kickstart efforts. With MGI making progress in getting materials data online, the next logical step was to figure out how to reuse that data, a problem that all scientific fields will face because of the White House memo's requirement to make data publicly available.
The goal of the Data Challenge is to outline new paths of knowledge discovery that will ultimately allow the many participants in the materials community, and more broadly in science, to make use of one another's work in ways that are currently impossible.
"I think the most significant contribution is going to be diminishing the seams between experimentalists and theoreticians, between materials scientists and other disciplines, and between scientists and engineers," Ward said.
The Data Challenge is open to everyone. Participants enter by submitting a written research report to the Challenge website by March 31, 2016. The program will award a total of $50,000 based on merit, with a top award of $25,000.