Science & Human Rights Coalition Meeting: Big Data & Human Rights
Throughout the first day of this meeting, participants deepened their knowledge about emerging human rights opportunities and concerns connected to Big Data, especially the implications for the work of scientists and engineers. Sessions explored how collection, analysis, and access to massive data sets can impact human rights, both positively and negatively, and identified ways in which human rights principles offer guidance for responsible data use.
The meeting also hosted a workshop on how members can effectively inform their organizations about the Coalition’s many resources and, more generally, about the intersection of science and human rights.
Since the launch of the AAAS Science and Human Rights Coalition in January 2009, Coalition meetings have convened scientists, engineers, and health professionals with human rights leaders and policy makers to discuss emerging issues at the nexus of science and human rights. The Coalition serves as a catalyst for the increased involvement of scientific and engineering associations and their members in human rights-related activities.
Welcoming Remarks and Opening Plenary Session: What’s New about Big Data?
Jessica Wyndham, Associate Director of the AAAS Scientific Responsibility, Human Rights, and Law Program, welcomed attendees to the meeting. She described the role of the Coalition in the fields of science, technology, and human rights, urging members to recognize the connections among these fields. Wyndham reported on several of the Coalition’s accomplishments since its last meeting, including statements by many Coalition member associations urging the U.S. to ratify the Convention on the Rights of Persons with Disabilities. Wyndham described the human rights activities of several member associations, including the American Chemical Society, the American Sociological Association, the National Center for Science and Civic Engagement (NCSCE), the Council on Undergraduate Research, the American Psychological Association, the American Political Science Association, the Society for the Psychological Study of Social Issues, Sigma Xi, the American Association of Geographers, the American Physical Society, and the Linguistic Society of America. These projects ranged from webinars on the connections between science and human rights to Congressional briefings, research publications, and work on behalf of the rights of members of a specific scientific discipline.
Kavita Berger, Associate Director of the Center for Science, Technology and Security Policy at AAAS, described big data and its risks, technologies, and implications for human rights, through the lens of security. Several projects have emerged from these connections, including an evaluation of the use of chemical weapons in Syria and a broader assessment of the risks and benefits of big data’s advancement in the natural sciences and of ways to address the problems it raises. She used Tim Gartner’s definition of high volume, high velocity, and high veracity information to describe this emerging technology, and noted that both the Podesta report on privacy (2014) and the President’s Council of Advisors on Science and Technology (PCAST) report stated that the term ‘big data’ has diverse implications for individuals, groups, and industry sectors, and that there is a need for privacy protection in this mix.
Berger then extended this definition to include variety, in addition to velocity, veracity, and volume, to indicate that data come from more than one source, are collected over time, and are integrated, added to, and deleted in a heterogeneous manner. Some of the data received are structured, such as those produced by efforts to define the human genome. Unstructured data, like feeds from Twitter, are diverse in their degrees of error, size, organization, and incorporation into different data sources. Data records range from healthcare and disease records to cell phone and travel logs. Big data analysis technologies are similarly varied and include methods for standardizing and analyzing this information.
Throughout its recent work on big data, Berger explained, her program consulted with professionals working in healthcare and environmental sciences about the challenges, advances, and solutions to problems in big data. The challenges outlined include a lack of standard terminology and languages, a lack of access to the needed technological infrastructure, the need to create analysis technologies that support larger troves of stored and shared data, and the need for research and development to investigate and improve cyber security without reliance on specific datasets.
In the context of national security, Berger explained, the risks and benefits of such technology are not routinely assessed together. The two primary risks are the vulnerability of the system and the potential for malicious abuse. However, risks are present at both the individual and population levels, in the potential for intentional harmful use, invasion of privacy, and discrimination. Furthermore, any determination of the benefits of such use must weigh an individual’s human rights, freedoms, and liberties against the benefits of security.
For example, Berger noted, through the On-Call Scientists program at AAAS, Human Rights Watch was able to evaluate the use of chlorine barrel bombs in Syria using YouTube videos, satellite imagery, and subject matter expertise. Satellite imagery has also been used to map the locations of known or suspected sites of human rights violations.
Plenary Session: Human Rights Implications of Big Data
This session, moderated by Jay Aronson, highlighted the unique role of big data as a commodity in discovering and addressing violations of human rights, as well as its use in preventing these violations. Speakers explained the human rights implications of big data for privacy and transparency; how big data could be used to discover and address human rights violations; and the risk that the use of big data could itself violate human rights.
Emmanuel Letouzé, founder of Data Pop Alliance, reflected on the implications and applications of big data for development. He redefined big data as community, capacity, and crumbs: respectively, the persons involved in the process, the computing power, and the pieces of data involved. However, gaps in technology and understanding are also an important part of this progression, and they raise ethical, political, and legal questions. Letouzé noted that an important risk of organizing data is the potential for re-identification and de-anonymization of the information. A person’s consent to the release of data is also of great importance, as is determining whether the benefits outweigh the risks in the process; often, this is unclear. Letouzé explained that societal, commercial, and individual considerations must be included as key factors in the process of identifying property rights, data control, and overall transparency. Human rights, especially the rights to agency and participation, should be a primary component in the development of, use of, and policies surrounding data.
Samir Goswami, Director of Government Professional Solutions at LexisNexis, described the vast quantities of data made available by emerging technologies and the potential for this information to advance and uphold the rule of law. Goswami outlined a joint project of Amnesty International and Purdue University that catalogues and digitizes documented human rights abuses to make the information more easily available. He also described the potential to bring data together so that scientists can contribute to social good through predictive testing and monitoring of human rights violations and by encouraging communities to respond. However, engaging in and responding to these questions poses certain ethical and legal risks, in addition to the potential use of these data with malicious intent. Goswami described a pilot project at LexisNexis to promote increased awareness of government supply chains through the collection of information from workers in corporations to assist both sectors in making responsible ethical decisions. He briefly spoke about his work at the Chicago Coalition for the Homeless, which used big data to determine the prevalence of youth homelessness in the state of Illinois and found that the existing facilities were grossly inadequate; that problem has yet to be solved. He emphasized that the political will to address issues is more important than simply having the mechanisms and information to describe social phenomena.
Jeramie Scott, National Security Counsel at the Electronic Privacy Information Center, discussed the emerging privacy issues associated with big data, including the right to privacy and freedom of thought, among numerous other human rights. Increasing amounts of data—especially through mechanisms of social media and larger connected systems, such as toll collectors on highways—can lead to greater discrimination, manipulation, and lack of protections. Considerations to protect these rights involve legal, political, and ethical systems. He described examples of potential criminal determination mechanisms; the modes of determination of the “no-fly” list; price discrimination by location; and manipulation by behavioral advertising. He acknowledged that, while big data can be used for good, the potential pitfalls must be recognized, especially in the context of human rights. The risks of information collection and potential analysis must be considered before any activity, with algorithmic transparency an essential component of these transactions.
Introduction to the AAAS Science and Human Rights Coalition
Plenary Session: Big Data in the Service of Human Rights: Opportunities and Responsibilities and Closing Remarks
This session, moderated by Patrick Vinck of the Harvard Humanitarian Initiative, examined the importance of responsibility and thorough examination of the benefits and risks associated with the use of big data.
Mark Latonero, Research Director of the Data & Society Research Institute at the University of Southern California, discussed the complexities of applying big data to combat human trafficking. Digital technologies have opened an unprecedented window into the world of these networks of advertisement, sales, and control of consumers, by way of the digital traces left by online behavior. Using this potential, Latonero argued, scientists and human rights officials can interrupt the process. Using big data, scientists can understand human behavior in the aggregate and work with decision makers to create solutions for human rights violations. He discussed the example of public online advertisements for commercial sex and how scientists were able to analyze the data to determine the geographic distribution, contact information, and individuals involved in trafficking rings. He also discussed the potential for the private sector to contribute relevant information that it already collects, such as money transfers, to efforts to combat human trafficking and exploitation. Latonero addressed the complexities of potential over-surveillance, the responsibility of reporting these crimes, and the potential bias in tools used to examine troves of data. However, the most important component of these measures is to focus on improving conditions for the vulnerable, victims, and survivors through multi-sector cooperation, without further violation of human rights. The Internet, as a discovery ground for documentation of human rights abuses, must be further explored and used to find and help address human rights violations.
Megan Price, Director of Research at the Human Rights Data Analysis Group (HRDAG), analyzed the importance of unconventional data sources in the discovery and documentation of human rights violations, describing documents such as communications between the secret police and the president in Chad, police documents of daily activities in Guatemala, and border crossing records in Albania. The primary challenges of using these data include document preservation and the protection of victims and witnesses. Furthermore, relationships between recorders, victims, witnesses, incentives, and security can create bias. For example, she detailed the relatively well-documented conflict in Iraq from 2003: most of the data have been acquired from media sources, so the examiner must acknowledge potential discrepancies. Higher reporting or documentation of events appears to correlate with a greater number of victims; thus, the visibility of particular events can misrepresent what took place and who was injured. Price noted that big data is not synonymous with complete or representative data, and that scientists and human rights activists must question how the data were collected and what is missing from those datasets. When complete, data can contribute to a narrative about a country and be used in a court of law.
Kalev Leetaru, Founder of the Global Database of Events, Language, and Tone (GDELT) Project, described his project on documenting media and predicting the mood and situation of countries around the world. However, this information can also be highly unrepresentative, especially if only a single organization collected and analyzed the information. Social networks of citizens on the ground, more so than governments, are dictating the location, time, occurrence, and persons involved, especially through the increasing penetration of mobile devices. Documentation of natural disasters, for example, and what people feel, observe, and think is increasingly available. Leetaru described the potential of using similar information to predict the pattern of human protests, emotional flow of disease outbreaks, or movement of drug cartels in a running catalogue of society. This can prove extremely useful, even with the potential biases of the media. For human rights violations, activists can collectively examine these reports from around the world with updates on location and emerging situations.
Max Richman, Chief Data Scientist for Geopoll and head of the D.C. Chapter of DataKind, provided closing remarks. He provided background on DataKind, which organizes a network of volunteers to solve data problems for social good. He then recapped some of the ideas discussed throughout the day. He reminded the audience of the “four Vs” of big data (volume, variety, velocity, and veracity) but said that they all raise the ultimate fifth V: the question of value. He also noted the importance of striking a balance between the risks and benefits of data. While big data presents major concerns, particularly to privacy, it offers significant opportunities for the advancement of human rights. He closed by encouraging the audience to contemplate the promise and problems associated with modern data technology.