
New Statistical Tools Can Make Sense of Biased Policing Data

Knox's study makes use of causal inference to understand imperfect policing data. | Penn State/ Flickr

The lack of progress in identifying and addressing the racial disparities in law enforcement stems in part from inconsistent record-keeping and misleading statistical analyses of incomplete data, according to Dean Knox, recipient of the inaugural 2021 NOMIS & Science Young Explorer Award.

His prize-winning essay illustrates the value of applying new tools and statistical techniques to imperfect data to reveal the extent and severity of racial bias in policing.

"At a time of heightened exposure of police policy and practice, it is intriguing to read how this prize-winning research used causal reasoning to make sense of a complex and incomplete dataset to reveal racial disparities in policing," said Beverly Purnell, senior editor at Science.

Despite decades of high-profile and widely publicized incidents of excessive force against minority communities in the United States and amid growing demands for police reform, courts, policymakers and the public struggle to understand the nature of racial disparities in law enforcement.

According to Knox, an assistant professor at the Wharton School of the University of Pennsylvania, the root of this problem may lie within the data being used to evaluate these questions.

There is a pressing need for methods to make sense of policing data, which is often rife with inaccuracies, selective reporting and possibly deliberately misleading information.

"We need to be extremely careful when drawing conclusions from messy data. Especially on issues as high-stakes as racial bias in policing, where getting the answer wrong has real consequences," said Knox.

"As we have shown, it can lead to seriously underestimating the severity of the problem, and when we don't get an accurate picture of the problem, it's hard to identify the right reforms to fix it," he said.

Policing presents significant challenges for statistical analysis. Most data on police-civilian interactions is collected and shared by the police agencies themselves and generally documents only incidents that must be reported, namely violent or property crimes. While some agencies increasingly document other interactions, such as stops, frisks, arrests and use of force against civilians, many, if not most, police interactions remain unreported to the public.

The use of imperfect or biased data like this leads to contradictory results and undermines our understanding of policing, often leading scholars to conclude that there is surprisingly little evidence of discrimination or racial disparity.

Making Sense of Imperfect Data

One way to make sense of imperfect policing data is using causal inference — an increasingly important subfield of statistics.

"The goal of causal inference is to understand the how and why that underlies what we can see," said Knox. "Put differently, it aims to say whether things would have unfolded differently if we had done X instead of Y."

Instead of ignoring the inaccuracies, selective reporting and omitted variables in the data, causal inference asks what range of interpretations the data can support and what new information could be collected to narrow that range.

When all potential alternative explanations of correlations are ruled out, the remaining conclusion is causation.
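The idea of a "range of possible interpretations" can be made concrete with a small worked example. The Python sketch below is an illustration only, not the method used in Knox's research: it computes simple worst-case bounds on how often force is used when some encounters are never recorded. The function name and all counts are hypothetical, chosen only to show the arithmetic.

```python
def worst_case_bounds(n_recorded, n_recorded_with_force, n_unrecorded):
    """Return (lower, upper) bounds on the overall rate of force.

    Hypothetical helper for illustration. Nothing is assumed about the
    unrecorded encounters: the lower bound treats all of them as involving
    no force, the upper bound treats all of them as involving force.
    """
    total = n_recorded + n_unrecorded
    if total <= 0:
        raise ValueError("there must be at least one encounter")
    lower = n_recorded_with_force / total
    upper = (n_recorded_with_force + n_unrecorded) / total
    return lower, upper


# Hypothetical counts, not real policing data.
n_recorded = 800            # encounters that appear in agency records
n_recorded_with_force = 80  # recorded encounters in which force was used
n_unrecorded = 200          # encounters that were never documented

# The naive estimate silently ignores the unrecorded encounters.
naive_rate = n_recorded_with_force / n_recorded
lower, upper = worst_case_bounds(n_recorded, n_recorded_with_force, n_unrecorded)

print(f"naive rate from records alone: {naive_rate:.2f}")
print(f"range consistent with the data: {lower:.2f} to {upper:.2f}")
```

With these made-up numbers, the records alone suggest a single rate of 0.10, while the data is actually consistent with anything from 0.08 to 0.28. Collecting information about the unrecorded encounters is what narrows that range.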

"It's about saying that when officers interact with racial or ethnic group X, they're treated differently from group Y because of their identity — not because of any other factors," said Knox. "It's about ruling out alternative explanations for differences in how the law is enforced."

In other disciplines with similar data challenges, the causal-inference framework has proven invaluable for scholars and policymakers seeking to make sense of imperfect data.

However, according to Knox, in policing research, careful causal analysis remains the exception, not the rule.

"We all see injustice in the world and we all do what we can to address it. Pushing this research forward, and getting it into the hands of policymakers, is how I can help the most," said Knox.

The NOMIS Foundation & Science Young Explorer Award recognizes bold early-career researchers who ask fundamental questions at the intersection of the life and social sciences. The winner receives $15,000 and publication of his or her essay in Science.

"Following up on bold ideas by looking across disciplinary boundaries is particularly risky for early-career researchers who need to prove themselves within the academic system and there are very few who dare," said Markus Reinhard, CEO of the NOMIS Foundation. "NOMIS is very happy to support these few through this partnership with AAAS and Science."

2021 Finalist

Geoffrey Supran for his essay, "Fueling their own climate narrative." Supran received his undergraduate degree from Trinity College, University of Cambridge, and a Ph.D. from the Massachusetts Institute of Technology. After completing joint postdoctoral fellowships at MIT and Harvard University, Supran became a research fellow in the Department of the History of Science at Harvard and the director of climate accountability communication at the Climate Social Science Network. His research focuses on the historical analysis of climate change disinformation and propaganda by fossil fuel interests.