Import CDAU


CDAU Import describes a proposal, under development, to import CDAU, an address dataset for Andalusia provided by the Institute of Statistics and Cartography of Andalusia (Instituto de Estadística y Cartografía de Andalucía).

Goals

The goal is to merge addresses from CDAU into the Spanish Cadastre/Buildings Import.

Schedule

Import Data

CDAU is the geographic dataset of roads and portals (building entrances) of Andalusia. It has a topological structure and makes it possible to locate in the territory any geographic object (and its associated variables) that has a postal address, with portal-level precision.

The basic entities maintained and updated in CDAU are roads, road sections, and the portals where people reside (housing) or where an activity is carried out (establishments or premises), covering all population centers as well as scattered settlements.

Background

Data source site: http://www.callejerodeandalucia.es/portal/web/cdau/inf_alfa

Online sources: http://www.callejerodeandalucia.es/

Data license: CC BY 4.0

Link to permission: Explicit permission.

Attribution in OSM: In the Contributors page and in the changesets.

ODbL Compliance verified: Yes, with explicit permission.

OSM Data Files

The source files in CSV format can be opened directly in JOSM with the OpenData plugin and using the ETRS89 UTM 30 reference system (EPSG: 25830). No files have been prepared since the data are incorporated into the buildings within the workflow of the Spanish Cadastre/Buildings Import.

Import Type

Community import with manual review.

Data Preparation

Data Reduction & Simplification

Only addresses of the types "PORTAL" (house number) and "ACCESORIO" (accessory) are imported. The types "DISEMINADOS" (disseminated) and "PUNTO KILOMÉTRICO" (kilometric point) are excluded.
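The filter above can be sketched in a few lines of Python. This is only an illustration: the function name, the sample rows and the semicolon delimiter are assumptions, since this page does not state the exact layout of the CSV files.

```python
import csv
import io

# Address types kept by the import, as listed above.
IMPORTED_TYPES = {"PORTAL", "ACCESORIO"}


def filter_addresses(rows):
    """Keep only rows whose 'tipo_portal_pk' is one of the imported types."""
    return [
        row for row in rows
        if (row.get("tipo_portal_pk") or "").strip() in IMPORTED_TYPES
    ]


# Hypothetical sample data; the real delimiter and encoding may differ.
sample = io.StringIO(
    "id_por_pk;tipo_portal_pk;num_por_desde\n"
    "1;PORTAL;15\n"
    "2;DISEMINADOS;3\n"
    "3;ACCESORIO;7\n"
    "4;PUNTO KILOMÉTRICO;12\n"
)
rows = list(csv.DictReader(sample, delimiter=";"))
kept = filter_addresses(rows)
```

With the sample rows, only the records with id_por_pk 1 and 3 remain.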

Tagging Plans

These are the fields of the address source file, which contains one node per address.

Source field | Description | OSM tag | Notes
id_vial | Road identifier | N/A |
ine_via | Road identifier in INE (National Institute for Statistics) | N/A |
dgc_via | Road identifier in Cadastre | N/A | Used to link with Cadastre addresses
tvian | Type of road (short) | N/A | Five-character abbreviations.
nom_tip_via | Type of road (long) | addr:street | addr:street = nom_tip_via + ' ' + nom_via. See #Places tagging
nom_via | Road name | addr:street | addr:street = nom_tip_via + ' ' + nom_via
sobrenombre | Nickname | N/A |
id_por_pk | House number code | N/A |
tipo_portal_pk | Type of house number | N/A | Only addresses of the types "PORTAL" (house number) and "ACCESORIO" (accessory) are imported. The types "DISEMINADOS" (disseminated) and "PUNTO KILOMÉTRICO" (kilometric point) are excluded.
num_por_desde | Start house number | addr:housenumber |
ext_desde | Start house letter | addr:housenumber | If present, addr:housenumber = num_por_desde + ext_desde, e.g. 15A
num_por_hasta | End house number | addr:housenumber | If present, addr:housenumber = num_por_desde + '-' + num_por_hasta
ext_hasta | End house letter | addr:housenumber | If present, addr:housenumber = num_por_desde + ext_desde + '-' + num_por_hasta + ext_hasta, e.g. 15A-15B
bloque | Block | N/A |
portal | Entrance | N/A |
escalera | Stairs | N/A |
refcatparc | Cadastral parcel reference | N/A | Used to link with Cadastre addresses
txt_app | Additional location data | N/A |
nom_tipo_agrupación | Type of grouping | N/A |
nom_agrup | Grouping name | N/A |
ine_nucleo | Settlement code | N/A |
nom_nucleo | Settlement / disseminated name | N/A |
ine_mun | Code of the municipality | N/A | Used for query purposes. Matches the codes in Cadastre according to these tables: [1] or [2].
nom_municipio | Name of the municipality | N/A |
cod_postal | Postal code | addr:postcode |
x | X coordinate | <node lat=* lon=*> | Transformed from EPSG:25830 to EPSG:4326
y | Y coordinate | <node lat=* lon=*> | Transformed from EPSG:25830 to EPSG:4326

This document and glossary have been used for reference.
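As an illustration of the composition rules in the Notes column, the following Python sketch builds the address tags from one source row. The function names are hypothetical and this is not the CatAtom2Osm code. The case of an end letter without an end number is not covered by the table, so the sketch ignores it.

```python
def build_housenumber(num_desde, ext_desde="", num_hasta="", ext_hasta=""):
    """Compose addr:housenumber from the start/end number and letter fields."""
    start = f"{num_desde}{ext_desde}"
    if num_hasta:
        # Range form, e.g. 15-17 or 15A-15B.
        return f"{start}-{num_hasta}{ext_hasta}"
    # Assumption: an end letter without an end number is ignored.
    return start


def build_tags(row):
    """Return the OSM address tags for one source row (a dict of strings)."""
    def field(name):
        return (row.get(name) or "").strip()

    tags = {}
    street = f"{field('nom_tip_via')} {field('nom_via')}".strip()
    if street:
        tags["addr:street"] = street
    housenumber = build_housenumber(
        field("num_por_desde"), field("ext_desde"),
        field("num_por_hasta"), field("ext_hasta"),
    )
    if housenumber:
        tags["addr:housenumber"] = housenumber
    if field("cod_postal"):
        tags["addr:postcode"] = field("cod_postal")
    return tags


# Hypothetical sample row.
example = build_tags({
    "nom_tip_via": "CALLE", "nom_via": "LARIOS",
    "num_por_desde": "15", "ext_desde": "A", "cod_postal": "29005",
})
```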

Places tagging

The addresses with certain values in the 'nom_tip_via' field are assigned to the addr:place=* tag instead of addr:street=*. The list of values is configured in the variable 'place_types_es' within the file setup.py.
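A minimal Python sketch of this rule follows. The values in the set are placeholders invented for the example; the real list is the one in 'place_types_es' and is not reproduced on this page. The sketch also assumes the tag value is composed in the same way for both keys.

```python
# Placeholder values for illustration only; see 'place_types_es' in setup.py
# for the real list.
PLACE_TYPES_EXAMPLE = {"LUGAR", "URBANIZACION", "POLIGONO"}


def street_or_place_tag(nom_tip_via, nom_via):
    """Return (key, value): addr:place for place-like road types, else addr:street."""
    is_place = nom_tip_via.strip().upper() in PLACE_TYPES_EXAMPLE
    key = "addr:place" if is_place else "addr:street"
    # Assumption: the value is built the same way for both keys.
    value = f"{nom_tip_via.strip()} {nom_via.strip()}".strip()
    return key, value
```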

Changeset Tags

Key | Value
comment | #CDAU_Import
source | Instituto de Estadística y Cartografía de Andalucía
source:date | 2018-02-01
type | import
url | https://wiki.openstreetmap.org/wiki/Import CDAU

Data Transformation

The CatAtom2Osm tool will be used. The modification to download, read and merge the CDAU data was developed here.

Conflation with Cadastre

The two datasets contain different numbers of addresses, and the house number for a given address may differ between them. Each address in Cadastre is uniquely identified by the field 'localId', a string with the format PP.MMM.VVV.N.CCCCC, as in 29.900.845.5.3350109UF7635S, where each part has the following meaning:

Part | Meaning | Example | CDAU field*
PP | Two-digit code for province | 29 | Fixed for each municipality
MMM | Three-digit code for municipality | 900 | Fixed for each municipality
VVV | Road identifier | 845 | dgc_via
N | House number | 5 | Values could differ
CCCCC | Cadastral parcel reference | 3350109UF7635S | refcatparc

* The values to link both sets are not present in all the data.
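Splitting a 'localId' into its parts can be sketched as below. The class and function names are hypothetical, and the sketch assumes that none of the five parts contains a dot.

```python
from typing import NamedTuple


class LocalId(NamedTuple):
    province: str       # PP
    municipality: str   # MMM
    road_id: str        # VVV, comparable with CDAU 'dgc_via'
    housenumber: str    # N
    parcel_ref: str     # CCCCC, comparable with CDAU 'refcatparc'


def parse_local_id(local_id):
    """Split a Cadastre localId such as 29.900.845.5.3350109UF7635S."""
    parts = local_id.split(".")
    if len(parts) != 5:
        raise ValueError(f"unexpected localId format: {local_id!r}")
    return LocalId(*parts)


parsed = parse_local_id("29.900.845.5.3350109UF7635S")
```

For the example identifier, parsed.road_id is "845" and parsed.parcel_ref is "3350109UF7635S", the two values used to link with CDAU.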

The CDAU addresses are considered more up to date and take precedence over those of Cadastre. For each municipality, they will be combined in the following way:

  • For each CDAU address we take the group of Cadastre addresses with matching values for 'dgc_via' and 'refcatparc'.
  • If this group is empty and there are no nearby Cadastre addresses, we take the CDAU address.
  • If the group has exactly one element, it is replaced by the CDAU address.
  • If the group has several elements, the nearest Cadastre address is replaced by the CDAU address.
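The rules above can be sketched in Python as follows. This is an illustration under stated assumptions, not the CatAtom2Osm implementation: records are plain dicts with projected coordinates in metres, the key names 'road_id' and 'parcel_ref' for the Cadastre records are hypothetical, the 10 m value for "nearby" is an invented parameter, and a CDAU address with an empty group but a nearby Cadastre address is simply skipped because the rules do not cover that case.

```python
import math


def dist(a, b):
    """Planar distance between two records with projected 'x' and 'y'."""
    return math.hypot(a["x"] - b["x"], a["y"] - b["y"])


def conflate(cdau, cadastre, nearby_m=10.0):
    """Merge CDAU addresses into a copy of the Cadastre address list."""
    merged = list(cadastre)
    for addr in cdau:
        # Cadastre addresses with matching road identifier and parcel reference.
        group = [
            i for i, c in enumerate(merged)
            if c.get("source") != "CDAU"
            and c["road_id"] == addr["dgc_via"]
            and c["parcel_ref"] == addr["refcatparc"]
        ]
        new = dict(addr, source="CDAU")
        if group:
            # One element: replace it. Several: replace the nearest one.
            nearest = min(group, key=lambda i: dist(addr, merged[i]))
            merged[nearest] = new
        elif not any(
            c.get("source") != "CDAU" and dist(addr, c) <= nearby_m
            for c in merged
        ):
            # Empty group and no nearby Cadastre address: take the CDAU address.
            merged.append(new)
    return merged


# Hypothetical sample data.
cadastre = [
    {"road_id": "845", "parcel_ref": "P1", "x": 0.0, "y": 0.0, "hn": "5"},
    {"road_id": "845", "parcel_ref": "P1", "x": 50.0, "y": 0.0, "hn": "7"},
    {"road_id": "900", "parcel_ref": "P2", "x": 200.0, "y": 0.0, "hn": "1"},
]
cdau = [
    {"dgc_via": "845", "refcatparc": "P1", "x": 48.0, "y": 0.0, "hn": "7A"},
    {"dgc_via": "", "refcatparc": "", "x": 1000.0, "y": 1000.0, "hn": "9"},
    {"dgc_via": "", "refcatparc": "", "x": 201.0, "y": 0.0, "hn": "2"},
]
merged = conflate(cdau, cadastre)
```

In the sample, the first CDAU address replaces the nearest of two matching Cadastre addresses, the second is added because nothing is near it, and the third is skipped because an unlinked Cadastre address lies one metre away.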

Conflation with OSM

The names of the streets are combined with those existing in OSM in a two-phase process through the software and manual review. In a first phase, for each set of addresses with the same road name, the program locates the street in OSM with the closest name in the vicinity. The software generates a conversion table between the source names and their match in OSM that must be reviewed manually. In this phase:

  • We detect the streets that have no name in OSM, those whose name needs to be corrected and those whose name is in doubt. They will be checked by on-the-ground survey or by looking for street name plaques in the Cadastral facade photos. Corrections and new names are edited manually in OSM.
  • The incorrect pairings made by the program are detected and corrected.
  • For streets whose addresses should not be imported, the conversion entry is left blank.
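To make the idea of the conversion table concrete, here is a small Python sketch that pairs each source name with the most similar OSM name. It uses the standard library's difflib as a stand-in; the matching method actually used by the software is not described on this page, and the function name, the cutoff value and the sample names are assumptions. The search "in the vicinity" is left out: the OSM names passed in are assumed to be already limited to the surrounding area.

```python
import difflib


def build_conversion_table(source_names, osm_names, cutoff=0.8):
    """Map each source street name to its closest OSM name, or '' if none."""
    by_lower = {name.lower(): name for name in osm_names}
    table = {}
    for src in source_names:
        matches = difflib.get_close_matches(
            src.lower(), list(by_lower), n=1, cutoff=cutoff
        )
        # An empty value means no candidate was found and needs manual review.
        table[src] = by_lower[matches[0]] if matches else ""
    return table


# Hypothetical sample names.
table = build_conversion_table(
    ["CALLE MARQUES DE LARIOS", "CAMINO XQZW"],
    ["Calle Marqués de Larios", "Plaza de la Constitución"],
)
```

In the sample, the first name is paired with "Calle Marqués de Larios" and the second is left empty. In this sketch an empty entry only means that no candidate was found; the reviewer still decides whether to fill it in or leave it blank.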

In a second phase, the software incorporates the corrected names to the addresses and merges them with the buildings to be reviewed and imported. For the buildings that do not have an address in the CDAU, the Cadastre address is used.

Data Transformation Results

You can review some samples of the results for the city of Malaga in this repository. The 'address.osm' file contains the results for the entire city. This data is not imported; it is used only as a reference and to access the facade photos through the link contained in the image tag (the Tag2Link JOSM plugin is required). The addresses are combined with the buildings by the CatAtom2Osm program, which generates files by blocks. The files 'u????.osm.gz' are examples of some of them.

Data Merge Workflow

We will follow the workflow described in the Spanish Cadastre/Buildings Import.

Team Approach

A manager for each area will be responsible for the transformation, the preliminary review of the data and the unification of street names.

Workflow

The manager will publish projects in the Task Manager open to participation.

Conflation

The program excludes those addresses that are already present in OSM. The addresses collected in the field have priority over the data to be imported.
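One possible way to express this exclusion is sketched below in Python. How the program actually decides that an address is already present is not described here, so the comparison key (street or place name plus house number, ignoring case) and the function names are assumptions made for the example.

```python
def address_key(tags):
    """Build a comparison key from address tags (assumed matching criterion)."""
    street = tags.get("addr:street") or tags.get("addr:place") or ""
    number = tags.get("addr:housenumber") or ""
    return (street.strip().casefold(), number.strip().casefold())


def exclude_existing(candidates, osm_addresses):
    """Drop candidate addresses whose key already exists in OSM."""
    existing = {address_key(tags) for tags in osm_addresses}
    return [tags for tags in candidates if address_key(tags) not in existing]


# Hypothetical sample data.
in_osm = [{"addr:street": "Calle Larios", "addr:housenumber": "15A"}]
to_import = [
    {"addr:street": "Calle Larios", "addr:housenumber": "15a"},
    {"addr:street": "Calle Larios", "addr:housenumber": "17"},
]
remaining = exclude_existing(to_import, in_osm)
```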

QA

During the manual incorporation of the data into OSM, participants resolve conflicts between the two datasets and check each portal number against the facade photos.

Updates

When new data is published, the differences will be filtered and incorporated manually.

References

  • Thread about the license in the Spanish OpenStreetMap community.