Orange County, North Carolina Address Import

Goals

To add every missing address in Orange County to OpenStreetMap without creating duplicates.

Schedule

I converted the data into OSM format and removed duplicate addresses. It can be found here. I imported the data on July 1st, 2018, in 11 chunks of fewer than 10,000 addresses each, so that a failed upload would affect only a small part of the import.

Import Data

Background

Data source site: Chapel Hill Open Data

Data license: Open Database License

Type of license: Open Database License (ODbL)

Link to permission: N/A

ODbL Compliance verified: Yes.

OSM Data Files

Already transformed data I imported

Import Type

One-time import, completed with JOSM.

Data Preparation

Data Reduction & Simplification

Tagging Plans

For the source tagging, I used "source"="Chapel Hill Open Data" on each address. I also used addr:housenumber, addr:street, addr:city, addr:state, addr:postcode, and addr:country. addr:unit was used on addresses that include a unit.
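As an illustration of this tagging scheme, here is a minimal sketch that builds one address node as OSM XML. The function name, the coordinates, the house number, and the postcode are hypothetical placeholders, and the "NC" and "US" values for addr:state and addr:country are assumptions, since the values actually used are not stated above. The negative id follows the usual convention for objects that have not been uploaded yet.

```python
import xml.etree.ElementTree as ET


def address_node(node_id, lat, lon, housenumber, street, city, postcode, unit=None):
    """Build one <node> element carrying the address tags described above."""
    node = ET.Element("node", id=str(node_id), lat=str(lat), lon=str(lon))
    tags = {
        "addr:housenumber": housenumber,
        "addr:street": street,
        "addr:city": city,
        "addr:state": "NC",      # assumed value
        "addr:postcode": postcode,
        "addr:country": "US",    # assumed value
        "source": "Chapel Hill Open Data",
    }
    # addr:unit is only added when the address actually has a unit.
    if unit:
        tags["addr:unit"] = unit
    for key, value in tags.items():
        ET.SubElement(node, "tag", k=key, v=value)
    return node


# Placeholder values for illustration only.
node = address_node(-1, 35.9132, -79.0558, "100", "North Columbia Street",
                    "Chapel Hill", "27514")
xml_text = ET.tostring(node, encoding="unicode")
print(xml_text)
```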

Changeset Tags

I used "source"="Chapel Hill Open Data", "source:website"="https://www.chapelhillopendata.org/explore/dataset/addresses/export/?sort=-objectid", and provided adequate comments.

Data Transformation

I completed all of the data transformation using a combination of JOSM, the opendata plugin, and a custom-made Java program for editing XML.

  • First, I downloaded the data in KML format from chapelhillopendata.org.
  • Then, I opened the data in JOSM using the opendata plugin.
  • I then saved the result as an OSM file.
  • Next, I downloaded every object with "addr:housenumber" and "addr:street" in Orange County using the Overpass API, and saved that data as a file as well.
  • I then ran my address editing program with the new data and the existing data as inputs. The program corrected casing (CHAPEL HILL -> Chapel Hill), created addr:street ("streetDirection"="N", "streetName"="COLUMBIA", & "streetType"="ST" -> "addr:street"="North Columbia Street"), and removed duplicates by comparing the dataset to the existing addresses in the OSM database.
  • I opened the file created by the program in JOSM and made some final modifications.
  • Finally, I saved the ready-for-upload files on Google Drive so that anyone could view them.
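The casing and street-name steps can be sketched as follows. This is not the original Java program, only a minimal Python illustration of the same idea. The lookup tables and the function names are assumptions and cover just a few common abbreviations.

```python
# Assumed abbreviation tables; a real dataset would need more entries.
DIRECTIONS = {"N": "North", "S": "South", "E": "East", "W": "West"}
STREET_TYPES = {
    "ST": "Street", "RD": "Road", "AVE": "Avenue", "DR": "Drive",
    "LN": "Lane", "CT": "Court", "BLVD": "Boulevard",
}


def title_case(text):
    """Turn 'CHAPEL HILL' into 'Chapel Hill'."""
    return " ".join(word.capitalize() for word in text.split())


def build_street(direction, name, street_type):
    """Combine the three source fields into one addr:street value."""
    parts = [
        DIRECTIONS.get(direction.upper(), title_case(direction)),
        title_case(name),
        STREET_TYPES.get(street_type.upper(), title_case(street_type)),
    ]
    # Skip empty fields, for example a street without a direction prefix.
    return " ".join(part for part in parts if part)


print(title_case("CHAPEL HILL"))
print(build_street("N", "COLUMBIA", "ST"))
```

Simple word-by-word capitalization mishandles names with internal capitals (it would produce "Mcdonald" rather than "McDonald"), so a manual check of the output, like the final modifications in JOSM mentioned above, is still needed.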

Data Transformation Results

Transformed data.

Data Merge Workflow

Team Approach

I completed this import alone. It was simple because it only involved adding new nodes to the database.

Workflow

For each of the 11 smaller address OSM files (available on Google Drive):

  • Download from Google Drive.
  • Open with JOSM.
  • Upload. Each file contains 5,000 to 9,000 addresses, so the limit of 10,000 changes per changeset is not a problem.
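Splitting the address list into upload-sized files can be sketched like this. The function name and the default of 9,000 per chunk are assumptions chosen to stay under the 10K limit; the actual files were not necessarily cut at a fixed size.

```python
def split_into_chunks(items, max_size=9000):
    """Split a list into consecutive chunks of at most max_size items."""
    if max_size <= 0:
        raise ValueError("max_size must be positive")
    return [items[i:i + max_size] for i in range(0, len(items), max_size)]


# Stand-in data: integers in place of address nodes.
chunks = split_into_chunks(list(range(25000)))
print([len(chunk) for chunk in chunks])
```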

I used the account LeifRasmussen_import to do the import. Using a dedicated account made it much easier to undo part of the import.

Conflation

My data transformation Java program automatically removed addresses that already existed in OSM from the dataset, so conflation was not a significant problem.
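The duplicate removal can be sketched roughly as below. This is a Python illustration under stated assumptions, not the original Java program. The Overpass query is one plausible way to fetch existing addresses in the county and was not taken from the import itself. The matching key (house number, street, and unit, compared case-insensitively) is also an assumption, since the exact comparison rule is not described here. The existing data is expected in the Overpass JSON shape, where each element carries a "tags" object.

```python
# Assumed query: existing objects with both address tags in Orange County, NC.
OVERPASS_QUERY = """
[out:json][timeout:180];
area["ISO3166-2"="US-NC"]->.state;
rel["boundary"="administrative"]["admin_level"="6"]["name"="Orange County"](area.state);
map_to_area->.county;
nwr["addr:housenumber"]["addr:street"](area.county);
out tags center;
"""


def address_key(tags):
    """Normalized comparison key for one address (assumed matching rule)."""
    return tuple(
        tags.get(name, "").strip().lower()
        for name in ("addr:housenumber", "addr:street", "addr:unit")
    )


def remove_existing(new_addresses, overpass_elements):
    """Drop new addresses that already exist in OSM or repeat within the dataset."""
    existing = {address_key(el.get("tags", {})) for el in overpass_elements}
    seen = set()
    kept = []
    for tags in new_addresses:
        key = address_key(tags)
        if key in existing or key in seen:
            continue
        seen.add(key)
        kept.append(tags)
    return kept


# Hypothetical sample data for illustration.
existing = [{"type": "node", "id": 1, "tags": {
    "addr:housenumber": "100", "addr:street": "North Columbia Street"}}]
new = [
    {"addr:housenumber": "100", "addr:street": "North Columbia Street"},
    {"addr:housenumber": "102", "addr:street": "North Columbia Street"},
    {"addr:housenumber": "102", "addr:street": "north columbia street"},
]
kept = remove_existing(new, existing)
print(kept)
```

Leaving the city and postcode out of the key is a deliberately cautious choice: existing OSM addresses often lack those tags, so a narrower key drops a few legitimate addresses rather than risking a duplicate.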

How it went

I uploaded the files over the course of the day and only ran into trouble with the 10th upload (of 11). My internet connection dropped halfway through, so I had to close the changeset, delete the addresses it had added, and upload that file again. This did no harm to the OSM database. Other than that, the import went wonderfully.