Brown University

Mining Spatio-temporal Data on Industrialization from Historical Registries

Description

Abstract:
Despite the growing availability of big data in many fields, historical data on socioenvironmental phenomena are often not available due to a lack of automated and scalable approaches for collecting, digitizing, and assembling them. We have developed a data-mining method for extracting tabulated, geocoded data from printed directories. While scanning and optical character recognition (OCR) can digitize printed text, these methods alone do not capture the structure of the underlying data. Our pipeline integrates both page layout analysis and OCR to extract tabular, geocoded data from structured text. We demonstrate the utility of this method by applying it to scanned manufacturing registries from Rhode Island that record 41 years of industrial land use. The resulting spatio-temporal data can be used for socioenvironmental analyses of industrialization at a resolution that was not previously possible. In particular, we find strong evidence for the dispersion of manufacturing from the urban core of Providence, the state’s capital, along the Interstate 95 corridor to the north and south.
Notes:
This research was supported by the National Institute of Environmental Health Sciences (NIEHS) Superfund Research Program of the National Institutes of Health under award number P42ES013660.

Access Conditions

Use and Reproduction
© Copyright 2016 the authors. This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International license.

Citation

Berenbaum, David, Deighan, Dwyer, Marlow, Thomas, et al., "Mining Spatio-temporal Data on Industrialization from Historical Registries" (2016). Superfund Project: Socio-environmental Cities, Brown Superfund Presentations & Publications. Brown Digital Repository. Brown University Library. https://doi.org/10.26300/cqzk-f016

Relations

Collections:

  • Superfund Project: Socio-environmental Cities

    The Community Engagement Core (CEC) advances social science of environmental health and justice through a deliberative and participatory process of research, education, and advocacy in the state of Rhode Island. Combining academic and community-based approaches builds mutual trust and promotes …
  • Brown Superfund Presentations & Publications

    This collection contains research articles, conference papers, research posters, and slide presentations associated with Brown University Superfund Research Program's investigators and research projects.