Results and Responsibility: Science, Society, and GPRA
Susan E. Cozzens
The first purpose of the Government Performance and Results Act (GPRA) is to increase the confidence of the American people in government in general and, in particular, in science and technology, or at least in their federally funded portions. This is a task that members of AAAS will find very familiar. Most scientists and engineers entered their professions with public service in mind. Overwhelmingly, they either work on topics that make a difference in everyday life (such as food, health, environment, or energy) or are engaged in a process of discovery that sparks the imaginations of children old and young. Most scientists are vitally concerned with the social context and consequences of their work. They try to choose research programs that will help answer fundamental questions, and they pay attention to the ethical standards of their professions and the moral standards of their communities. Most of us would agree that these are the kinds of actions and attitudes among scientists and engineers that win public confidence. In old-fashioned terminology, they fall into the category of "social responsibility of science."
The question I would like to explore is: Will GPRA as it is currently being implemented make scientists and engineers better at winning the confidence of the American public? Will it make them more socially responsible? Perhaps surprisingly, my answer is yes, it could. Directing attention toward defining outcomes (that is, what research delivers to the public) and asking agencies whether they are moving toward those outcomes are fundamentally healthy processes for science. We have only to listen to the goals in the strategic plans that apply to federally sponsored research to see clearly that the public has a stake in this activity. (I provide only a small selection here.)
These are, indeed, the kinds of goals that we in the science and engineering community have dedicated ourselves to. GPRA provides an opportunity to see and show clearly how we are contributing to them. Unfortunately, such broad, strategic aspects of the GPRA system are in real danger of being lost to view in the current attention to quantification and annual performance goals. My objective is to bring them back into focus.
Most of us have heard of GPRA only within the last year or so, but a few of us have watched the full history of its impact on research agencies. I want to start by briefly relating the story so far of GPRA and the research community, from one up-close observer's perspective. If it were fully told, the story would not be a pretty one. Some would see its main character as a group that receives billions in public support but had to be dragged kicking and screaming into compliance with this law. Others would explain the screams as those of a maiden in distress, tied to a log that is floating toward the sawmill. For me, the main character of the story is Mr. Magoo, myopically wandering through the cityscape, somehow still alive, but only by the sheerest of accidents.
Phase One: Agency Learning
In the summer of 1993, when GPRA was first passed, I was doing a set of interviews with officials in the six major research funding agencies on how they evaluated their research programs. There were two main patterns, which have persisted into GPRA responses. The mission agencies (e.g., the Department of Energy [DOE], the Office of Naval Research, and the Agricultural Research Service) basically did retrospective program assessment through some form of expert review panels (sometimes looser, sometimes more structured, but usually on a regular schedule) that covered all activities. The fundamental research agencies (the National Science Foundation [NSF] and the National Institutes of Health), in contrast, placed great faith in the proposal competition process to maintain the quality of their funding portfolios, and only occasionally undertook retrospective evaluations of program-level (aggregate) results.
In the fall of 1993, most of the agency assessment officials I was in contact with had not yet heard of GPRA. That fall, a congressional fellow who had just finished a year with the House Science Committee briefed the Federal Research Assessment Network on GPRA. Many of those present learned of the law's existence for the first time. The National Science Foundation was an exception: It was already planning to volunteer GPRA pilot projects. DOE headquarters was also gearing up for GPRA, although the research assessment office in Basic Energy Sciences had not been informed. The National Oceanic and Atmospheric Administration was already experimenting with performance-based budgeting. At the Office of Science and Technology Policy (OSTP), one person was following the implications of the law for research. When she left the following spring, she asked me to make the case to MRC Greenwood for OSTP guidance to the agencies. Greenwood took up the challenge.
The following year saw two parallel efforts to define possible approaches to GPRA for research. One was the OSTP process: a formal, interagency consultation resulting in a White Paper by the National Science and Technology Council.1 For the first time, the process involved high-level decision makers from research agencies in discussions of the law (discussions considerably more skeptical than those among their staff). Of course, such a process was slow, and the resulting consensus gave less practical guidance and made more high-level pronouncements than many would have hoped. But at least the agencies were officially engaged. The central messages of the White Paper were that (1) no convenient outcome measures were available (more research was needed on this topic) and (2) one size of performance indicators would never fit all agencies.
A second effort was informal, started by staff in the budget office of the Department of Health and Human Services (HHS) and soon involving dozens of lower-level agency personnel. This process, called the Research Roundtable, considered the practical specifics of performance planning and of linking performance goals to the budget. Interestingly, this group came to the opposite conclusion from the OSTP process. Its report stated, "The results of research program performance can be measured," and it recommended a template for assessment that it claimed could be adapted to all kinds of research activities. While the OSTP process revealed the gap in GPRA approach between top-level decision makers and their evaluation staff, the Roundtable process revealed the gap between budget office staff and planning and policy staff. In some agencies, one group was in charge of GPRA; in others, the other was. This gap has persisted. At a recent workshop on performance plans, one agency reported that late in the process of developing its document, the planning office discovered that hundreds of performance goals had already been transmitted to Congress in the budget itself. Another policy office claimed no knowledge of what "the controller's office" had written about science and technology in its department's performance plan.
By spring 1995, these two parallel processes had sufficiently informed agencies about the existence and requirements of the law that a standard litany of its challenges for research had emerged. In my view, there are two key points. First, research in general, and research funding in particular, are activities that produce outcomes over long time scales and at unpredictable intervals. They therefore do not lend themselves conveniently to setting goals to be achieved in a particular fiscal year. Second, while many aspects of research activity are measurable, quality is not among them, so any system that focuses obsessively on measures will leave out what is most important about research. Frequent repetition of these points can begin to sound like whining to those charged with implementing GPRA, even if the points themselves are true.
Phase Two: OMB and Congressional Feedback
In spring 1995, the Office of Management and Budget (OMB) began to ask agencies for the first elements of their GPRA frameworks, requesting performance goals and indicators covering as much of each agency's activities as possible. From that point, OMB gradually scaled up its requests for GPRA-related information, working toward the full implementation deadline of September 1997. In the early stages of this process, OMB staff said they did not want to be prescriptive about GPRA approaches, although they consistently stressed that performance plans should include measures of good management. But beyond this generic point, the OMB scale-up process apparently produced little or no feedback or guidance to agencies until the very final stages of producing each document (and, in the case of performance plans, after the GPRA deadline). (Staff at one GPRA pilot agency told me that they had never received any feedback on any of the three performance plans and performance reports they turned in.) Budget examiners for research agencies seemed to consult informally among themselves, but agencies continued to receive very different signals from different parts of OMB about what was and was not acceptable in strategic and performance plans.
In contrast, once congressional staff took up the task of responding to GPRA requirements, they were very specific and standardized in their evaluations (which have taken the form of the now-infamous rating schemes).2 Unfortunately, as with the different parts of OMB, the various committees overseeing each agency did not necessarily ask for the same things. Many agencies reported that their authorizing committees were much more interested in GPRA implementation than were their appropriating committees. And what congressional committees wanted was not necessarily what OMB wanted. Fortunately for the public, congressional requests were more likely to focus on outcomes than on the measures of good management asked for by OMB. Unfortunately for the public, some congressional staff insisted on quantitative performance goals even when these obscured longer-term processes. They thus risk becoming stuck in the position of insisting on output measures (for example, publication counts).
The Result: Short-termism
A pincers movement was thus established, squeezing agencies between OMB's stress on management measures and congressional staff's search for annual, quantitative targets. The results are now in, in the form of performance plans. For research agencies, in my view, they are alarming: The short term has decisively won out over the long term, and occupies the bulk of the plans. The strategic view has been lost.
I recently reviewed the research elements of the performance plans of 10 agencies for a National Research Council workshop. The short-termism was visible in several forms.
First, short-termism appears in the large number of plans that use the roadmap/milestone format for performance goals. These performance plans lay out a set of technical targets and specify what steps toward them will be taken in the specific fiscal year for which the plan is being written. This clearly places the emphasis on the predictable and keeps the agency focused on what will happen over the next two years: between the time the budget is written one fall and the time the target fiscal year ends two falls later.
Second, performance goals focused on outcomes are in short supply, while goals for agency activities predominate. This is exactly the opposite of GPRA's intention, which is to shift attention away from agency activities and toward the results being produced for the American public. Among the performance goals in the sample I collected for the Academy workshop, about 30 percent focused on agency activities, such as putting programs in place or making grants. About half focused on outputs, including the milestones already mentioned, as well as tangible but immediate products of research. Only 20 percent, by a generous definition, focused on outcomes, and over half of these were phrased qualitatively. If several agencies had not clung tenaciously to qualitative formats for performance goals, the percentage would have been even lower.
Among mission agencies, the plans that seem to have withstood short-termism best are those that do not focus in detail on research, but instead state simple goals for the contribution of research to larger mission efforts. The performance plan of the Department of Defense (DOD) could serve as an exemplar of this type. In justifying $7.2 billion in S&T spending, it articulates three performance goals, which fit conveniently into a single paragraph: first, keep investment at the level requested in the President's budget (presumably despite pressures to divert funds to more immediate, practical purposes); second, make progress toward DOD technical objectives; and third, fund projects on leap-ahead technologies. Is there really anything more we need to say?
Phase Three: The Research Community Reacts
In the meantime, the extramural research community has gradually become more aware of the existence of GPRA. For those of us who have been around for a while, what is striking about this spread of information is that everyone who meets the Act for the first time seems to need to go through the same stages that their predecessors did. As in facing the death of a loved one, one goes through shock, disbelief, and denial, to grief, and finally to acceptance and healing. It is presumably a sign of progress that folks are going through this process faster now than we did in the first wave.
If this describes the pattern of diffusion, then the future of the research community's involvement in GPRA is foreseeable from the present experiences of leadership in the research agencies. There, the recent experience with strategic planning is encouraging. One industry executive on the National Science Board told us at NSF that groups don't "get" strategic planning until the third time through. The first time, people tell you what they do; the second time, they tell you they are going to do what they do; and the third time, they actually get strategic. It was clear to me at NSF that we were actually getting strategic even on our second time through. We did come down to Earth from the "lofty" goals in our first strategic plan. We were clarifying what we wanted to deliver to the public. We were asking ourselves in a serious way how we would know whether we were delivering those observable outcomes. And we did begin to build those clarified expectations into a whole variety of existing assessment processes at the Foundation, including the criteria used by proposal reviewers, the questions grantees answer in final reports, and the questions addressed by committees of visitors doing ex post assessment of programs. This last step, building the expectation of outcomes into the everyday processes of an agency, is the key to effective GPRA implementation.
I am proud that at NSF, we did not succumb to short-termism in that process. And as a result, I can state confidently that NSF in its GPRA implementation is in fact encouraging social responsibility, rather than mere accountability, among its grantees. If NSF can do it, other agencies can, too.
My advice to those guiding Results Act implementation is to stop and smell the roses. Or perhaps the flower I have in mind is the marigold, from the old song: "Inchworm, inchworm, measuring the marigolds, seems to me you'd stop and see how beautiful they are."
1. "Assessing Fundamental Science," July 1996 (http://www.nsf.gov/sbe/srs/ostp/assess).
Susan E. Cozzens is a professor in the School of Public Policy, Georgia Institute of Technology. This article is based on remarks delivered at the 23rd Annual AAAS Colloquium on Science and Technology Policy, held April 29–May 1, 1998, in Washington, DC.