In recent years, there’s been growing interest in finding ways to better evaluate the public science and technology enterprise: how it works, how it could work better, and what we’re ultimately getting out of it. Now, a group of experts is hoping to take a step forward on some of these questions through a new research endeavor, dubbed the Institute for Research on Innovation & Science, or IRIS.
Housed at the University of Michigan’s Institute for Social Research, IRIS is creating a data platform incorporating an array of metrics on the science and innovation system, from individual participation and career trajectories, to material and equipment purchasing patterns, to publication and patent outputs over the long term. The project hopes to further connect and expand the community that’s engaged in these efforts, building on previous federal and academic pilot efforts. The project was developed via the Committee on Institutional Cooperation, a consortium of the Big Ten schools and the University of Chicago, and has received funding from the Kauffman and Sloan foundations.
The AAAS R&D Budget and Policy Program recently spoke to Dr. Jason Owen-Smith, Professor of Sociology and Organizational Studies at the University of Michigan and Executive Director of IRIS, and E.J. Reedy, Director of Research and Policy at the Kauffman Foundation, to learn more. The transcript below has been edited for length. Visit iris.isr.umich.edu for more info.
AAAS: Jason and E.J., thanks so much for speaking with us today. Now, the ultimate goal of IRIS is to build our understanding of the science system. What do we know about the value of public research, and how does IRIS fit in?
Reedy: I think in terms of the public research enterprise, a lot of the information we know is still fairly opaque…From our perspective at the Kauffman Foundation, where we have an interest in understanding the processes and procedures that lead toward high-growth entrepreneurship, one of the reasons we’re a supporter of this project along with the Sloan Foundation is that we know a lot of the value and commercialization of the scientific enterprise happens through individuals – whether they’re graduate students that become involved in scientific research at a university, or transition out into a startup – and so we’re really excited to see the IRIS project is beginning to put in place a platform to really understand with greater granularity all of the individuals involved in making scientific research occur, and looking at some of these things across time and in platforms that can be built on top of what’s been done before.
"The data can let us see how differences in funding mechanisms feed into differences in the characters of scientific teams, and what those differences mean for the knowledge they produce."
Dr. Jason Owen-Smith, IRIS Executive Director
Owen-Smith: What we know about the value of research right now in economic terms largely comes from either very, very labor intensive data analyses, often of exceptionally high quality but hard to replicate or expand upon; or it comes from macroeconomic estimates that don’t really reach down into the actual nitty gritty of the production of science. So IRIS offers a new way to think about this that emphasizes not just grant inputs and publication or patent outputs, but also the ways in which research teams and individuals use multiple grants to build a sort of social infrastructure, a set of projects that sometimes cross multiple careers, and that simultaneously produce new knowledge and train skilled people who can apply it. And so by being able to systematically try to follow the movements of these science-trained individuals, these science-experienced individuals out into the economy, and to understand the broad range of their entrepreneurial activities, the value they bring to established organizations they end up getting jobs in, and their later discoveries, we have the capacity to really fill in the micro mechanisms, which has real potential immediately as mouth-watering fundamental social science, but also has real implications for policy and for thinking about research portfolios and the value of institutions like universities for society.
AAAS: And can you talk a little more about what we don’t know, and perhaps some things that we need to know, for better policy or funding decisions around science, and how IRIS might be able to contribute to some of those areas?
Owen-Smith: The very obvious example is to think about micro-level means of more accurately estimating the value of public investments. But there are a lot of things about the scientific process that we don’t know. For instance, I’m a social network theorist, and I’m very interested in the ways in which collaborative relationships form and support discovery and innovation, and the ways in which different campuses, even if they have the same mix of fields, end up for a variety of reasons having differently structured networks. So one thing we’ve never been able to see before, but that IRIS could allow us to, is an intra- and inter-campus collaboration network that spans many fields, that is based on data that include all of the people engaged in the production of science – students, technicians, people who might not show up on author lists, but who nevertheless are important to the ability of a campus or an organization generally to do scientific work and to move from the “ah-hah” moment of a new idea to an actual scientific finding, which is a lot of very uncertain work.
The kinds of data we’re taking can let us potentially see with really fine-grained detail how differences in funding mechanisms, for instance, feed into differences in the characters of scientific teams, the networks where people work and are trained, and what those differences mean for outcomes in terms of the knowledge they produce or their later career trajectories. So we can imagine interjecting into that kind of debate a bit more systematic evidence than we have now.
The basic structure of the IRIS network | Credit: IRIS website
AAAS: Tell us about the genesis of the IRIS project and how it came together.
Owen-Smith: The basic history begins in 2008 with the recession and soon thereafter the federal stimulus package, ARRA [the American Recovery and Reinvestment Act of 2009]. One of the things ARRA did was push a big bolus of money into federal R&D spending, mostly through NIH and NSF but through other agencies as well. That came with a bunch of new reporting requirements which universities were having trouble managing. And so a group of people in the federal government at the time, led by Julia Lane [then with NSF, currently a fellow at the American Institutes for Research, and a AAAS Fellow], started a program called STAR METRICS. It was a federal program where universities brought in a limited swath of data on their expenditures to the NIH.
Graphic from a recent report on the regional effects of federal funds using UMETRICS data. | Credit: CIC
That project was a very successful demonstration that useful information about science could come from university administrative records…So a group of us in 2013, at Julia’s impetus after she left the federal government, formed an initial partnership with the Committee on Institutional Cooperation, or CIC…The result of that was the beginning of what we called the UMETRICS project, which is an initiative to take the kind of data that was being sent to the federal government under STAR METRICS and share it under strict confidentiality requirements to see what we could do with it in research terms. As early products of that, for instance, Julia gave congressional testimony, [and] we published a small Science Policy Forum piece. These were basically demonstrations of value that led universities to be interested in expanding possibilities for the data, and to think about forging partnerships with the U.S. Census Bureau.
Reedy: Some of the vestiges of this project come out of things like the LEHD [Longitudinal Employer-Household Dynamics] at Census, which is a program that brought together the states to consent and provide data to better understand how workers and businesses change and adapt over time…And the Sloan Foundation also initially funded some proof-of-concept work at Caltech, and I think that was really helpful. We’d followed STAR METRICS with interest but also seen some of the frustrations and limitations to what was occurring with that project.
Owen-Smith: The other thing is that the team that came together around this is not one that I built, but it brought together a bunch of really strong folks working in complementary areas. There’s Julia, whose history with the Census, and privacy and confidentiality, and the STAR METRICS program is essential. Our collaborator Bruce Weinberg at Ohio State is the PI of a large program project grant out of [the National Institute on Aging] on the effects of the aging scientific workforce. Barb McFadden Allen at the CIC is an essential component for working with the universities. James Evans at Chicago has deep expertise in understanding the contents of science using natural language processing and computational linguistics techniques. And Ron Jarmin at the U.S. Census Bureau is one of the world experts on this as well as the person in Census who’s responsible for many of Census’s economic research and statistical products. Without that mix of skills this wouldn’t have gotten off the ground.
AAAS: What data are you tracking, where is it coming from, and how will the platform ultimately work?
Owen-Smith: The core data for the UMETRICS project, which is what IRIS is building out, are administrative data from universities that are essentially record-level information about expenditures out of federal grants. Most of the direct cost expenditures of grants go for personnel, supplies and services, or subcontracts. So the data are information drawn from sponsored projects, HR, and procurement records at university campuses that allow us to get a statistical portrait of who’s doing the scientific work, who are the people being paid the wages, what are the inputs, what materials are bought from what kinds of vendors in what places, and how do the collaborations work. Bringing in those data then allows us to use a variety of sources of public and restricted data, ranging from publication information in places like MEDLINE, to patent information, to details of the grant from NIH RePORTER, and also more proprietary and restricted information, which is where the Census partnership comes in.
"As we start to disentangle all the different roles in the science process, we can more effectively design policy and programs to serve the different communities involved."
E.J. Reedy, The Kauffman Foundation
We then do the work of cleaning, integrating, documenting and validating those data to create a multi-university data set that has potential value for researchers. That is pushed to Census where additional work is done with it, and then we de-identify the data to ensure confidentiality and move it into a researcher-accessible environment that we’re building right now, which will be a virtual data enclave that researchers can access under the terms of a data use agreement that’s currently being drafted. The data Census is developing will be made available to researchers through the Census Research Data Center system. Once Census data have been integrated with these administrative data, they can only be accessed under the very strict protections that the Census Bureau holds, so that’s another important partnership because Census data aren’t managed by IRIS.
My aspiration for this would be to have it become the center of a national or international community of researchers working in these areas.
AAAS: Ultimately, what is the upside to having all of this information? What kinds of impacts might we see for university decisions, appropriator decisions, stewardship of tax dollars, et cetera?
Reedy: It’s really very broad potentially. As an example, I already talked a little about the parts of the scientific process and the people in the process who we don’t know very much about, people like postdocs or people that are involved in nuts and bolts of carrying out a lot of this research. As we start to disentangle and give a face to all the different roles involved in the process, we can much more accurately and effectively design policy and programs to serve the different communities that are involved, and to make the process more efficient.
There’s a lot of debate around the scientific enterprise and some of the people like postdocs that are involved in kind of a shadow economy for lack of a better comparison, because they’re not very well measured and they’re not very well tracked, and yet I think they’re very important in providing contributions…It shouldn’t be missed that universities wouldn’t likely be participating if they weren’t getting new insights into their own activities, and new insights that are comparable and benchmark-able for themselves. So there are returns at the individual level, there are returns at the societal level.
Owen-Smith: The only thing I’d add is, in some ways another historical genesis of this is the move started under Jack Marburger when he was the science advisor to President George W. Bush to think about what it would mean to develop a science of science and innovation policy, to think about what it would take to actually put on firm and rigorous evidentiary ground the kinds of decisions that have to get made by appropriators, that have to get made by vice presidents of research, or university presidents, or state legislatures as they’re thinking about investments in the public universities of this nation. And so I think for me the real promise of the data, the partnerships, and the scientific community we’re hoping to help support and expand is the idea that this is a place where really rich fundamental research can have pretty immediate implications for hard decisions about resources that I think are really essential for society.