
Predicting the Location of Hidden Graves in Mexico Using Machine Learning Models


Dr. Tal Simmons, Virginia Commonwealth University; human rights representative on the Coalition Steering Committee



  • Presenters
    • Patrick Ball, Director of Research, Human Rights Data Analysis Group
    • Kristian Lum, Lead Statistician, Human Rights Data Analysis Group
    • Mónica Meltis, Executive Director, Data Cívica
    • Jorge Ruiz Reyes, Researcher, Human Rights Program, Universidad Iberoamericana
  • Summary
    • This session presented the work of a group of scholars and activists who are collecting information on the crisis of violence and human rights violations (disappearances and homicides) in Mexico and correlating these reports with the locations of mass graves.
    • There have been 35,424 missing persons reported, with 98% of cases documented since 2007; the UN Committee on Enforced Disappearances concluded that state actors were committing these crimes. The slow pace of positive identification of the missing is also an issue: the federal national genetic database holds 42,467 profiles but has yielded only 661 identifications, and the federal police genetic database holds 6,089 profiles and has yielded none.
  • Locating Hidden Graves
    • The stated goal of the project is to identify the municipalities with a high probability of containing graves, so as to know where to search. The sources of information incorporated into the model were reports of mass grave locations from the authorities and reports from the press, between which there is virtually no overlap. Data from families that searched for and located mass graves were not included in the analysis, as these fall outside the realm of the authorities and press coverage.
    • Other predictive variables explored included sociodemographic and geographic characteristics and the number of reported disappearances. The study used a random forest model (RFM) to predict where graves would be most likely to be found; the model worked best when the previous year’s probabilities were used to predict the subsequent year.
  • Limitations and Concerns
    • Presenters and audience members discussed at length the model’s biases and limits of interpretation: potential tautologies (e.g. the need to exclude data from mass graves that were themselves found using data generated by the model), the lack of detail on the actual location of graves within municipalities, and the different and potentially evolving body disposal methods employed by perpetrators.
    • Concerns raised included how best to communicate the model’s results to families of the missing, and how those results might be used to help families advocate and pressure authorities. The hope is both to draw on knowledge generated and held by families and, in turn, to provide additional knowledge to them.
  • Impact of Work
    • The model was noted to have:
      • Raised awareness among families of the missing, who have used it to pressure authorities.
      • Been used as a tool for the implementation of a new law concerning enforced disappearances committed by non-state actors.
      • Influenced search and exhumation on a national level. 
  • Discussion and Question Period
    • Additional issues were raised concerning the nature and limitations of the machine learning model itself (e.g. roads are a significant indicator of the presence of hidden graves, but due to the nature of the RFM algorithm, one cannot identify whether it is the presence or absence of roads that is the predictor).
    • A question was raised as to whether the model could predict whether the perpetrators were state or non-state actors, to which the answer was no.
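
The point about roads can be illustrated with a toy calculation. The sketch below is not the presenters' model: it assumes an impurity-based (Gini) importance score, which is one common way to rank variables in a random forest, and it uses hypothetical data for eight municipalities. The names `has_road`, `graves_near_roads` and `graves_away_from_roads` are invented for illustration. It shows that a single tree split on "road present" earns the same importance whether graves cluster where roads are present or where they are absent.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def impurity_decrease(feature, labels):
    """Gini decrease from one split on a binary feature."""
    left = [y for x, y in zip(feature, labels) if x == 0]
    right = [y for x, y in zip(feature, labels) if x == 1]
    n = len(labels)
    weighted = (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
    return gini(labels) - weighted


# Hypothetical toy data: 1 = municipality has a major road, 0 = it does not.
has_road = [1, 1, 1, 1, 0, 0, 0, 0]

# Scenario A: graves reported mostly where roads are present.
graves_near_roads = [1, 1, 1, 0, 0, 0, 0, 0]
# Scenario B: the mirror image, graves reported mostly where roads are absent.
graves_away_from_roads = [0, 0, 0, 0, 0, 1, 1, 1]

importance_a = impurity_decrease(has_road, graves_near_roads)
importance_b = impurity_decrease(has_road, graves_away_from_roads)

# The score measures how well the split separates the classes,
# not the direction of the association, so both scenarios tie.
print(importance_a, importance_b)
```

Under these assumptions the importance score alone carries no sign. Recovering the direction would require a separate step, such as comparing grave rates between municipalities with and without roads.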


Key Points/Takeaways:

  • Machine learning can be used to help in uncovering and identifying human rights abuses, particularly in relation to investigations of mass graves.
  • Recommendations for future work:
    • Data from family members who have located grave sites and/or satellite imagery which shows the existence of mass graves must be input into a GIS-based model that provides additional data to the RFM discussed here.
  • Without the precise location of the graves and/or the geographic features that define their location, the model’s utility remains theoretical; it cannot be employed in practice for the actual search, location, exhumation and identification of victims.



  • Human Rights Investigations
  • Machine Learning
  • Technology
  • South America


Additional Resources: