Import/Catalogue/Norway Building Import

Introduction

Building polygons became available as open data in Norway in 2020, when the mapping authority released them as part of its EU INSPIRE delivery. The dataset contains high-quality 2D polygons of the building footprints, as well as building type, from the Cadastral registry of Kartverket, the Norwegian mapping authority. There are 4.3 million buildings in the dataset. This page describes the plan to import it into OSM.

As of February 2020, there are 1.1 million building polygons in OSM for Norway. Many of them have a large offset because of poor orthorectification of Bing and other satellite-based imagery for the region. Orthophotos were not available for OSM until 2018.

Goals

The goals of this import are to:

  • Import missing buildings in Norway into OSM.
  • Improve the quality of existing buildings in OSM (offset etc).
  • ... while keeping existing buildings which are correct, including preserving tagging information for existing buildings.

Schedule

Import data

Background

The building data is available through three datasets from the Cadastral registry ("Matrikkelen"):

  • Building points - Building information/attributes, including building type - GeoNorge catalogue.
  • INSPIRE Buildings Core2D - Polygons of building footprints - GeoNorge catalogue.
  • Address apartment level - Contains information about building levels for a subset of buildings, mostly apartments - GeoNorge catalogue. The subset used is "bruksenhet" (occupancy unit in a dwelling).

License/permission information:

OSM data files

Data files will be available per municipality. Examples of generated geoJSON files with OSM tagging are available in this folder, and examples of source GML and CSV data files are available in the source sub folder. The number of buildings in the source data and in OSM for Norway is provided on the import progress page.

Data quality

The data quality is very high in general, and more accurate and detailed than what would be possible to trace from high quality orthophotos. The Cadastral registry is the official source of property rights and the basis for real estate taxation in Norway, so it is being kept up to date, on a daily/weekly basis.

As usual, there are a few issues to look out for:

  • Buildings under construction are provided only with one node (will not be imported).
  • Delays in updating data could result in some outdated polygons in cases where a building has been replaced or extended.
  • Some old buildings are only included as a node (will not be imported).
  • Some buildings may not be included in the data set (which is not a problem).
  • Buildings which have been demolished, burnt down or become ruins are not supposed to be included in the dataset, but exceptions may occur.
  • Some buildings are not included due to national security.
  • The provided building type may be outdated, for example a school may have closed down and is now a civic centre.
  • The building type may not be fully representative for a mixed-use building, such as a building with retail stores on the ground floor and apartments on the other floors.
  • Some of the circular or curved buildings have irregular polygons, and a few buildings are slightly overlapping, including underground garages.

Import type

The local OSM community in Norway has discussed the import plan and local contributors will carry out the import.

Data preparation

Data reduction and simplification

The source data is transformed into geoJSON files by a Python program, building2osm. The following modifications are made to the building polygons:

  • Duplicate nodes at the exact same position are being removed.
  • The building 2D polygons are derived from a richer 3D model which also contains shapes of roof tops and building parts. These extra building elements are not available in the data set, but nodes have been kept in the source polygons for each intersection with the extra elements. For example there are extra nodes at the top/end of a gabled rooftop. These extra nodes are being removed to avoid clutter in OSM if they are on an (almost) straight line.
  • The source polygons have almost 90° corners (where applicable), but not exactly 90°. These polygons are rectified / orthogonalised by the program. Connected buildings are rectified as one group. Multipolygons are also supported. Very short walls of less than 20 cm are removed if they are located on an (almost) straight line. Short walls of less than 1 metre are rectified with a higher corner angle tolerance due to lower tracing quality in the source data for short walls. Rectification of a building is aborted if any node of the resulting polygon would be relocated by more than 20 cm.
  • Curved walls are identified and only get a light simplification (tighter tolerance for removing nodes).

The removed nodes may be inspected with the -verify option of building2osm (see the list of verification keys at the end of the next section).
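The first two simplification steps can be illustrated with a minimal sketch. It is not the building2osm code: the function names, the 2° tolerance and the use of planar coordinates in metres are assumptions made for illustration only, and each node is judged against its original neighbours rather than iteratively.

```python
import math

def remove_duplicate_nodes(ring):
    """Drop consecutive nodes at exactly the same position."""
    cleaned = [ring[0]]
    for node in ring[1:]:
        if node != cleaned[-1]:
            cleaned.append(node)
    return cleaned

def turn_angle(a, b, c):
    """Absolute change of direction at b, in degrees, when walking a -> b -> c."""
    heading_in = math.atan2(b[1] - a[1], b[0] - a[0])
    heading_out = math.atan2(c[1] - b[1], c[0] - b[0])
    diff = math.degrees(heading_out - heading_in)
    return abs((diff + 180) % 360 - 180)

def remove_straight_nodes(ring, max_turn=2.0):
    """Remove nodes of a closed ring (first node == last node) that lie on
    an almost straight line. max_turn is an assumed tolerance in degrees."""
    nodes = ring[:-1]  # open the ring so the first node is not counted twice
    count = len(nodes)
    kept = []
    for i in range(count):
        previous, current, following = nodes[i - 1], nodes[i], nodes[(i + 1) % count]
        if turn_angle(previous, current, following) > max_turn:
            kept.append(current)
    return kept + [kept[0]]  # close the ring again

# A square with one duplicated node and one redundant node on the bottom wall
ring = [(0, 0), (5, 0), (5, 0), (10, 0), (10, 10), (0, 10), (0, 0)]
simplified = remove_straight_nodes(remove_duplicate_nodes(ring))
```

In this example only the four corners of the square remain. Curved walls would need a tighter tolerance, as described in the list above, which the sketch does not attempt.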

Tagging plans

The buildings are tagged according to this table:

Feature | Values | OSM tagging | Comment
Building number | E.g. "9399828" | ref:bygningsnr=* | The official and unique building number used in the Cadastral registry ("Matrikkelen"). Will be used for later updates.
Heritage building | true/false | heritage=yes | The building has an official status and protection as a heritage building. No further information provided in the dataset.
Building type | Codes, e.g. "321" | building=* | Please see the Excel sheet conversion table in building2osm for tagging. If no tagging is specified, building=yes will be used.
Building levels | Level codes, e.g. "H03" | building:levels=* | The highest number of "H" (main levels) + "U" (ground levels) in the address source data. "K" (underground/basement) is not used. Levels must be 2 or higher to be used because "1" is not reliable. Only used for residential or mixed-use building types.
Roof levels | Level codes, e.g. "L01" | roof:levels=* | The highest number of "L" (roof/loft levels). Few cases.

The following keys are for information only during import and should not be uploaded to OSM:

Feature | Values | OSM tagging | Comment
Building type | Codes, e.g. "321" | TYPE=* | Building type in the source data. Code which describes the usage of the building according to the conversion table in Excel.
Status | Status codes, see the list below | STATUS=* | Building status in the source data.
Date | E.g. "2021-02-13" | DATE=* | Date of last modification in source. Could be used to identify recent updates in source.
Sefrak id | E.g. "1151-101-73" | SEFRAK=* | Id in Sefrak (register for old buildings before 1900). Only included when using the -original option in building2osm. Not imported because ref:bygningsnr=* is also a key in Sefrak.

Status codes (Norwegian term, followed by its meaning):

  • RA: Rammetillatelse. Permit for construction plan (usually no polygon)
  • IG: Igangsettingstillatelse. Permit for construction work
  • MB: Midlertidig brukstillatelse. Temporary permit to use building
  • FA: Ferdigattest. Construction completed and approved
  • TB: Bygning er tatt i bruk. Building in use
  • MT: Meldingsak registrert. Note of construction registered
  • MF: Meldingsak fullført. Note of construction completed
  • GR: Bygning godkjent, revet eller brent. Building approved, demolished or burnt
  • IP: Ikke pliktig registrert. No obligation to register
  • FS: Fritatt for søknadsplikt. Exempted from application requirement
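The rule for building:levels=* and roof:levels=* can be sketched as follows. This is an illustration under assumptions, not the building2osm implementation: the function name is hypothetical, and the level codes are assumed to have the form shown in the table (one letter followed by digits). The restriction to residential or mixed-use building types is left out.

```python
import re

def building_levels(level_codes):
    """Return (building_levels, roof_levels) from codes such as "H03", "U01", "L01", "K01".

    A value is None when the tag should be omitted."""
    highest = {"H": 0, "U": 0, "L": 0}  # "K" (basement) is deliberately not tracked
    for code in level_codes:
        match = re.fullmatch(r"([HULK])(\d+)", code)
        if match and match.group(1) in highest:
            kind, number = match.group(1), int(match.group(2))
            highest[kind] = max(highest[kind], number)
    levels = highest["H"] + highest["U"]
    # A single level is considered unreliable, so it is not tagged
    return (levels if levels >= 2 else None, highest["L"] or None)

print(building_levels(["H01", "H03", "U01", "K01", "L01"]))  # (4, 1)
print(building_levels(["H01"]))                              # (None, None)
```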

When the import files are generated with the -verify option, the following additional keys are provided for reviewing rectification and simplification:

  • verify_rectify=* – Marks rectified buildings. Value is maximum relocation of nodes in metres.
  • verify_group=* – Marks groups of connected buildings which have been rectified. One building in the group is marked. Value is number of buildings in group.
  • verify_short_corner=* – Marks buildings where a short wall has been rectified beyond the conservative tolerance for 90° corners. Value is corner angle in degrees.
  • verify_short_remove=* – Marks buildings where a very short wall on (almost) straight lines has been removed. Value is length of wall in metres.
  • verify_curve=* – Marks buildings where a curved wall has been identified. Value is number of nodes in curved wall.
  • verify_simplify=* – Marks buildings where nodes have been removed to simplify building polygon, but no rectification. Value is number of removed nodes for curved buildings, and angle of removed node for non-curved buildings.
  • remove=yes – Marks nodes which have been removed.

These extra keys are not included in the files used for importing.
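When reviewing a file generated with -verify, it may help to list the buildings whose nodes were moved the most. The sketch below assumes that the file is a geoJSON FeatureCollection already loaded into a Python dict, and that verify_rectify is stored in the feature properties as a number or a numeric string. The function name and the 0.15 m threshold are hypothetical.

```python
def large_relocations(collection, threshold=0.15):
    """Return (building number, relocation in metres) for rectified buildings
    whose maximum node relocation is at or above the threshold, largest first."""
    hits = []
    for feature in collection["features"]:
        properties = feature.get("properties", {})
        value = properties.get("verify_rectify")
        if value is not None and float(value) >= threshold:
            hits.append((properties.get("ref:bygningsnr"), float(value)))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)

# Made-up sample data with the same shape as a geoJSON FeatureCollection
sample = {
    "features": [
        {"properties": {"ref:bygningsnr": "1", "verify_rectify": "0.18"}},
        {"properties": {"ref:bygningsnr": "2", "verify_rectify": 0.05}},
        {"properties": {"ref:bygningsnr": "3"}},
    ]
}
print(large_relocations(sample))  # [('1', 0.18)]
```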

Changeset Tags

When uploading to OSM, the changesets will be tagged with:

description=Kartverket building import for <municipality>
source=Kartverket
source:date=20xx-xx-xx

Data transformation

A Python program, building2osm, has been made to generate files for the import. The program handles the following tasks:

  • Loads data directly from Kartverket based on input name of municipality.
  • Matches the three data sources (building types, polygons and addresses/building levels). Address coordinates must be located within the building polygon for building levels to be included.
  • Tags building=* and building:levels=* according to conversion tables.
  • Rectifies buildings where applicable, simplifies polygons (redundant nodes).
  • Saves a geoJSON file which may be opened in JOSM with the OpenData plugin installed.
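The requirement that an address coordinate must be located within the building polygon can be illustrated with a standard ray-casting test. This is a minimal sketch with hypothetical function names, not the matching code of building2osm. It assumes a closed outer ring (first node equal to last node) and ignores inner rings of multipolygons.

```python
def point_in_ring(point, ring):
    """Ray casting: count how many ring edges a horizontal ray from the point crosses."""
    x, y = point
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        if (y1 > y) != (y2 > y):  # the edge spans the ray's y coordinate
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def levels_for_building(ring, addresses):
    """Collect the level codes of all address points located inside the building ring."""
    return [code for point, code in addresses if point_in_ring(point, ring)]

building = [(0, 0), (10, 0), (10, 10), (0, 10), (0, 0)]
addresses = [((5, 5), "H02"), ((15, 5), "H04")]
print(levels_for_building(building, addresses))  # ['H02']
```

Only the first address lies inside the square, so only its level code would contribute to building:levels=*.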

Data merge workflow

Team approach

Import will be carried out municipality by municipality by members of the local community in Norway. We will try to recruit users with local knowledge to import their own municipality. A progress page has been created to avoid conflicting imports. The number of buildings and progress percentage are being updated via a tool in building2osm. This import is expected to go on for several years. The approach is to just make the import files available for anyone who is interested rather than pushing to get to completion.

References

The Norway Orthophoto ("Norge i Bilder") imagery will be used to check locations. The Building overlay WMS in JOSM may also be used to check the data source.

A JOSM style for building types has been designed for this import, and may be downloaded here. Also, a data validator for the import may be downloaded here.

Workflow

Please also consider the semi-automated method in the next Conflation section.

  1. Enter your user name in the progress page for the relevant municipality to avoid conflicts, or skip the municipality if it has already been imported by someone else.
  2. Download all buildings for the municipality in JOSM and store in a file for reference and for correcting mistakes during the import.
  3. Carefully review existing buildings in OSM for the chosen municipality to consider if an import is at all needed, or to determine which areas need an import.
  4. Get a generated geoJSON file from this folder, generate one using building2osm or ask user NKA to get one.
  5. Open the import file in JOSM. Delete all buildings with only one node (no polygon) or copy them to a separate file for later. Also delete the extra keys with capital letters if you are not going to use them. Then select what to import.
    • Select a smaller section of the municipality for each session of import to be able to carry out the import with high quality. For example select 100 - 1000 buildings for each session. You may select all buildings in a city subdivision ("bydel") by downloading the relevant admin_level=9 relation from OSM, then use the Selection->All inside function in JOSM to get all buildings within that boundary.
    • Avoid districts which already have high quality building polygons or which are close to complete. For example, do not import buildings in downtown Oslo inside Ring 3, except merging the ref:bygningsnr=* tag.
    • Copy the selected district to the layer with existing OSM buildings, then delete this selection of buildings from the import file to keep track of what is left. It is important that the import buildings are in the same layer as the existing OSM buildings, otherwise you will get a large number of duplicate nodes in the next step.
  6. Use the Conflation plugin to identify matches between existing buildings in OSM and new buildings from Kartverket.
    • Walk through the matches and accept or reject conflation. Make sure that existing tagging is kept, and avoid conflating buildings which contain 3D tagging (for example building:part=* or roof:shape=*).
    • Change the suggested building type if it seems to be wrong based on background orthophotos or other OSM objects at the location.
    • Consider removing building:type=* if you come across it.
    • Refine building polygons manually if needed, for example circular buildings may be improved with the Align Nodes in Circle function (shortcut O) in JOSM, and parts of buildings or larger building groups may be rectified with the Orthogonalise shape function (shortcut Q).
    • Please observe that underground or basement garages will often overlap with other buildings. You may consider tagging them as amenity=parking + parking=underground + access=private instead of building=*.
    • In a few cases, high buildings may need an extra building:part=* polygon if for example a central tower has more levels than the supplied building polygon.
    • Please do not delete existing buildings, but rather replace them with the new import buildings. This will help in preserving existing tagging.
  7. These functions in the More tools menu of JOSM will be useful during the import:
    • Replace Geometry for swapping an existing building polygon with a new one.
    • Add nodes at intersections for resolving overlapping buildings.
    • Split Object for dividing a building into two parts.
    • If you do not have the More tools menu, you will need to select the UtilPlugin2 plugin.
    • When splitting a building, you may copy the ref:bygningsnr=* tag to each building. When joining buildings together into one building, you may also join the ref:bygningsnr=* tags with ; between each building number. This way, it will be possible to keep track of updates later.
  8. Use the JOSM Validation window and search to identify and handle conflicting buildings, for example overlapping buildings which have not been matched with the Conflation tool. If in doubt, keep the existing building.
    • Please pay attention in particular to validation problems for: Overlapping buildings, Building inside buildings, Self-intersecting ways, Unclosed ways - building, Ways with same position and Building duplicated nodes.
    • You may want to use the To-do plugin in JOSM to walk through all buildings identified by Validation as overlapping, to handle conflicts between new and existing buildings (first select the overlapping ways in the Validation window, then search for -modified and click Find in selection in the Search window).
    • Search for buildings with 3D tagging to check that they have not been distorted, for example building:part=* and roof:shape=*.
  9. Delete any source=* tags on the conflated buildings, since there is now a new source (which is tagged on the changeset, not on the building).
  10. Upload to OSM after all errors have been resolved. Tag the changeset as described above.
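The convention in step 7 for joining ref:bygningsnr=* values when buildings are merged can be sketched as a small helper. The function name is hypothetical, and the second building number in the example is made up for illustration.

```python
def join_building_refs(*refs):
    """Join ref:bygningsnr values with ";" between building numbers,
    keeping the original order and dropping repeated numbers."""
    numbers = []
    for ref in refs:
        for number in ref.split(";"):
            number = number.strip()
            if number and number not in numbers:
                numbers.append(number)
    return ";".join(numbers)

print(join_building_refs("9399828", "1234567;9399828"))  # 9399828;1234567
```

When a building is split instead, the same ref:bygningsnr=* value is simply copied to each part, so no helper is needed.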

Conflation

Conflation is described in the workflow above. An optional script, building_merge, has been developed for a conservative automation of the clear and non-conflicting cases of conflation, leaving the rest for manual conflation. Here is the workflow for the script:

  1. You need to have Python installed on your computer, but no other dependencies are required.
  2. Run python3 building_merge.py <municipality name> from the folder where you have the geoJSON import file stored.
    • The script will download all existing buildings from OSM and conflate them with the import buildings.
    • Buildings will be conflated if all the closest points between them are not more than 10 metres apart (Hausdorff method). Tagged buildings must be within half that range. Also, the buildings cannot differ by more than 50% in area. Furthermore, the two buildings to be conflated must be each other's best match.
    • If the building=* values are not within similar categories, an OSM_BUILDING=* tag will be included to indicate that a building type conflict must be resolved.
  3. Load the merged OSM file into JOSM and complete the rest of the conflation manually, first with these 4 steps (please note that you need to click the Validation button to get the two first validation selections below - it is not sufficient to click the Upload button to get full validation):
    • Overlapping buildings after validation: Click on this selection in the Validation window and search for -"ref:bygningsnr" with Find in selection chosen in the Search window. Walk through all overlaps with the To-do plugin and resolve. Then validate again and resolve the remaining overlaps.
    • Building inside buildings after validation: Click on this selection in the Validation window and search for type:node with Find in selection chosen in the Search window. Walk through all cases with the To-do plugin and resolve. Then validate again and resolve the remaining cases.
    • Conflicting tags: Search for OSM_BUILDING=*, walk through all cases with To-do plugin and resolve. Please also check the building type for church vs chapel.
    • Remaining existing buildings: Search for building=* -modified to check if any of the remaining existing buildings from OSM need to be adjusted. Some of these buildings may have had a large offset and should be merged with a new import building.
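The distance and area criteria used by the script in step 2 can be illustrated with a simplified sketch. It is not the building_merge code. It assumes planar coordinates in metres and closed rings, measures the Hausdorff distance between polygon nodes only, interprets the 50% area rule relative to the larger of the two buildings, and leaves out the mutual best-match requirement. All function and parameter names are hypothetical. math.dist requires Python 3.8 or later.

```python
import math

def ring_area(ring):
    """Area of a closed ring using the shoelace formula."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2

def hausdorff(ring_a, ring_b):
    """Discrete Hausdorff distance between the nodes of two rings."""
    def directed(source, target):
        return max(min(math.dist(p, q) for q in target) for p in source)
    return max(directed(ring_a, ring_b), directed(ring_b, ring_a))

def may_conflate(osm_ring, import_ring, osm_is_tagged=False,
                 max_distance=10.0, max_area_diff=0.5):
    """Check the distance and area criteria for one candidate pair."""
    # Tagged existing buildings must be within half the range
    limit = max_distance / 2 if osm_is_tagged else max_distance
    if hausdorff(osm_ring, import_ring) > limit:
        return False
    area_osm, area_import = ring_area(osm_ring), ring_area(import_ring)
    # Assumption: the difference is measured against the larger building
    return abs(area_osm - area_import) <= max_area_diff * max(area_osm, area_import)

existing = [(0, 0), (10, 0), (10, 10), (0, 10), (0, 0)]
imported = [(7, 0), (17, 0), (17, 10), (7, 10), (7, 0)]  # same shape, 7 m offset
print(may_conflate(existing, imported))                      # True
print(may_conflate(existing, imported, osm_is_tagged=True))  # False
```

The example shows why an existing building with a 7 metre offset would be conflated automatically only if it carries no extra tagging. Tagged buildings with that offset are left for manual conflation.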

Please also check the other verification steps in the Workflow section above.

The import process described above usually takes 1-2 hours for an average size municipality. If thousands of buildings already exist in OSM for the municipality, it is recommended to first split the import file into smaller post districts using the municipality_split.py script, and then process each district separately.

Updates

Possible future extensions:

  • Kartverket provides a daily feed of new and modified buildings, which could be used in the future for providing suggestions for new additions or edits.
  • A future extension could be to include more data about the heritage buildings, such as name, construction date etc., which is available from a different source.
  • Kartverket has building heights, roof shapes etc for all buildings, and they could be included if they become available in the future.

All of these updates are possible via the ref:bygningsnr=* tag.

See also