Improving environmental exposure analysis using cumulative distribution functions and individual geocoding

Citation: Zandbergen, Paul A and Jayajit Chakraborty. Improving environmental exposure analysis using cumulative distribution functions and individual geocoding. International Journal of Health Geographics 2006, 5:23. URL:

Background: Assessments of environmental exposure and health risks that utilize Geographic Information Systems (GIS) often make simplifying assumptions when using: (a) one or more discrete buffer distances to define the spatial extent of impacted regions, and (b) aggregated demographic data at the level of census enumeration units to derive the characteristics of the potentially exposed population. A case study of school children in Orange County, Florida, is used to demonstrate how these limitations can be overcome by the application of cumulative distribution functions (CDFs) and individual geocoded locations. Exposure potential for 159,923 school children was determined at the children’s home residences and at school locations by determining the distance to the nearest gasoline station, stationary air pollution source, and industrial facility listed in the Toxic Release Inventory (TRI). Errors and biases introduced by the use of discrete buffer distances and data aggregation were examined.

Results: The use of discrete buffer distances in proximity-based exposure analysis introduced substantial bias in determining the potentially exposed population, and the results are strongly dependent on the choice of buffer distance(s). Comparisons of exposure potential between home and school locations indicated that different buffer distances yield different results and contradictory conclusions. The use of a CDF provided a much more meaningful representation and is not based on the a priori assumption that any particular distance is more relevant than another. The use of individual geocoded locations also provided a more accurate characterization of the exposed population and allowed for more reliable comparisons among sub-groups. In the comparison of children’s home residences and school locations, the use of data aggregated at the census block group and tract level introduced variability as well as bias, leading to incorrect conclusions as to whether exposure potential was higher at school or at home.

Conclusion: The use of CDFs in distance-based environmental exposure assessment provides more robust results than the use of discrete buffer distances. Unless specific circumstances warrant the use of discrete buffer distances, their application should be discouraged in favor of CDFs. The use of aggregated data at the census tract or block group level introduces substantial bias in environmental exposure assessment, which can be reduced through individual geocoding. The use of aggregation should be minimized when individual-level data are available. Existing GIS analysis techniques are well suited to determine CDFs as well as reliably geocode large datasets, and computational issues do not present a barrier to their more widespread use in environmental exposure and risk assessment.
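The contrast between discrete buffers and a CDF can be illustrated with a minimal Python sketch. It is not the authors' implementation: the coordinates, the function names `nearest_distance` and `empirical_cdf`, and the buffer distances are all hypothetical, and the sketch assumes locations are already geocoded into a projected coordinate system measured in meters. Each child contributes one home location and one school location, and the empirical CDF gives the share of children whose nearest facility lies within a given distance.

```python
import math
from bisect import bisect_right


def nearest_distance(point, facilities):
    """Euclidean distance (meters) from one location to the closest facility."""
    return min(math.dist(point, f) for f in facilities)


def empirical_cdf(distances):
    """Return F(d): the share of locations whose nearest facility is within d."""
    ordered = sorted(distances)
    n = len(ordered)

    def cdf(d):
        # bisect_right counts how many sorted distances are <= d
        return bisect_right(ordered, d) / n

    return cdf


# Hypothetical projected coordinates in meters (illustration only).
facilities = [(0.0, 0.0), (5000.0, 0.0)]
homes = [(300.0, 0.0), (0.0, 900.0), (5000.0, 1200.0), (2500.0, 0.0)]
# One school location per child, so children at the same school repeat.
schools = [(600.0, 0.0), (600.0, 0.0), (5000.0, 700.0), (5000.0, 700.0)]

home_cdf = empirical_cdf([nearest_distance(p, facilities) for p in homes])
school_cdf = empirical_cdf([nearest_distance(p, facilities) for p in schools])

# Evaluating the CDF at a single distance is equivalent to one buffer analysis.
for buffer_m in (500, 1000, 1500):
    print(buffer_m, home_cdf(buffer_m), school_cdf(buffer_m))
```

With these made-up points, a 500 m buffer suggests exposure potential is higher at home, while a 1000 m buffer suggests it is higher at school. Comparing the two CDFs over the full range of distances shows both patterns at once, without committing to a single cutoff.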
