Data Quality Approach LAC

From OpenStreetMap Wiki

An approach to improve OpenStreetMap data quality in LATAM

Please comment on this document.

OpenStreetMap (OSM) is a collaborative online mapping project that allows users to create, update and share geographic data worldwide. As a result, the quality of data in OSM can vary according to the thematic focus and knowledge levels of the mapper. Similarly, the parameters for determining data quality depend on the focus and purpose for which the data was created, so in this document we will focus specifically on the elements that apply to all OSM data in general.

There are a number of commonly encountered problems that should be avoided. Although there is a lot of high quality data available in OSM, it is also possible to find incomplete or inaccurate data, especially in less developed or less populated areas.

This document seeks to be a guide that contributes to the improvement of the quality of existing data and facilitates the construction and incorporation of quality information.

Objectives:

  1. To provide the main tools for identifying and verifying data quality problems in the region, and to build the capacity to solve them.
  2. To provide guidelines for improving the quality of the data generated in remote group mapping projects and in field data collection activities.
  3. To provide a practical guide that helps new mappers generate quality data and avoid common mapping errors.

Target population

Community members of different levels of mapping knowledge (beginners, intermediate and experts), governmental organizations, private sector entities and non-governmental organizations and in general any organization that participates in OSM by organizing group mapping projects.

Our Approach

What is data quality?

Data quality may vary by geographic region and by the level of contributor activity in that area. Some areas may have more active contributors and, therefore, a greater amount of accurate and up-to-date data, while other areas may have fewer contributors and, therefore, less data available.

However, there are some parameters to determine the quality of the data entered in OpenStreetMap among which the most important are:

Positional accuracy: The information entered in OSM must be located in the correct place to be useful. Positional errors are common and usually stem from poorly georeferenced base information or from the scale used at the time of mapping.

[Figure: Presicion posicional.JPG]
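As an illustration of how a positional offset could be measured, the following minimal sketch computes the great-circle distance between a mapped coordinate and a trusted reference coordinate. The function names and the 5 m default tolerance are assumptions made for this example, not values defined by OSM.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two WGS84 points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def exceeds_tolerance(mapped, reference, tolerance_m=5.0):
    """True if the mapped (lat, lon) is farther than tolerance_m from the reference.

    tolerance_m is a hypothetical project threshold; choose it per project.
    """
    return haversine_m(*mapped, *reference) > tolerance_m
```

A suitable tolerance depends on the imagery resolution and on the purpose of the project.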

Thematic accuracy: OSM uses a collaboratively developed data model established by convention, so the correct use of tags is essential to guarantee the quality of the information and to differentiate correctly the elements added to the map.

[Figure: Tags.JPG]
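A simple way to picture a thematic check is to compare the tags of an element against the data model agreed for a project. The sketch below assumes a hypothetical `DATA_MODEL` dictionary; the keys and values in it are examples only and are not a complete list of the tags documented in the OSM Wiki.

```python
# Hypothetical project data model: key -> allowed values (None means any value).
DATA_MODEL = {
    "building": {"yes", "residential", "school", "hospital"},
    "highway": {"residential", "tertiary", "footway", "path"},
    "name": None,
}


def check_tags(tags, model=DATA_MODEL):
    """Return a list of problems found in an element's tags."""
    problems = []
    for key, value in tags.items():
        if key not in model:
            problems.append(f"unknown key: {key}")
        elif model[key] is not None and value not in model[key]:
            problems.append(f"unexpected value: {key}={value}")
    return problems
```

For example, `check_tags({"buildng": "yes"})` reports the misspelled key, which is the kind of error the editor validators also flag.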

Completeness: The base information usually shows a large number of elements that could be added to the map (buildings, bodies of water, rivers, roads, etc.). This parameter is relative, since some projects focus on only a few of these elements, so completeness depends on the specific project being worked on.

[Figure: Completitud.JPG]

Consistency: The data entered must comply with established topological rules and be free from errors, such as crossing or overlapping elements, that affect the optimal use of the information.

[Figure: Consistencia.JPG]
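To make the idea of a topological rule concrete, the sketch below looks for segments of two ways that cross without sharing a node. It is a simplified illustration, not the algorithm used by any particular validator: each way is assumed to be a list of `(node_id, x, y)` tuples, coordinates are treated as planar, and legitimate crossings such as bridges or tunnels are not taken into account.

```python
def _orient(a, b, c):
    """Sign of the cross product (b - a) x (c - a): 1, -1 or 0."""
    v = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    return (v > 0) - (v < 0)


def segments_cross(p1, p2, q1, q2):
    """True if the two segments properly cross at an interior point."""
    return (_orient(p1, p2, q1) * _orient(p1, p2, q2) < 0
            and _orient(q1, q2, p1) * _orient(q1, q2, p2) < 0)


def unconnected_crossings(way_a, way_b):
    """Return pairs of segments (as node id pairs) that cross without a shared node."""
    found = []
    for a1, a2 in zip(way_a, way_a[1:]):
        for b1, b2 in zip(way_b, way_b[1:]):
            if {a1[0], a2[0]} & {b1[0], b2[0]}:
                continue  # the segments share a node, so they are connected
            if segments_cross(a1[1:], a2[1:], b1[1:], b2[1:]):
                found.append(((a1[0], a2[0]), (b1[0], b2[0])))
    return found
```

A real check would also read tags such as `bridge`, `tunnel` and `layer` before reporting a crossing as an error.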

Temporal accuracy: Keep in mind that not all images have the same capture date, so mapped elements may have been traced from an outdated image.

[Figures: Presicion temporal1.JPG, Presicion temporal2.JPG]

Taken together, the parameters described above provide criteria for evaluating the quality of the data in a specific region.

Data quality assurance

Quality assurance tools

Several tools and processes can be used to evaluate these parameters, both automatically and manually.

Automatic validation tools

JOSM Validator: JOSM has a validator that identifies tagging and consistency errors in the selected data, whether it is existing data under review or newly added data.

iD Editor Validator: Like JOSM, iD Editor also has a validation tool to detect mapping errors before uploading changes to the map.

Osmose: A tool that displays detected errors in a map viewer, such as duplicate nodes, incorrect tags, crossing features and highways that do not connect to each other.

Keep Right: This is a tool that uses a web viewer to identify errors mainly related to roads.

MapRoulette: A web application that presents mapping errors of all kinds as small tasks, grouped into challenges created by contributors. Its game-like mechanics make it ideal for beginner mappers looking for a friendly way to contribute within their regions.

Turn Restrictions and Restriction Validator: Web viewers that are specifically designed to identify the turn restrictions mapped in OSM, pointing out the restrictions that have some topological inconsistency.

MapSwipe: A mobile app designed to generate maps of remote and vulnerable areas from satellite imagery through mass user participation. The objective of MapSwipe is to quickly identify and mark elements of interest, such as buildings, roads or other infrastructure, in areas where detailed mapping is scarce.

Manual validation processes

In order to identify accuracy and completeness problems, in many cases it is necessary to use procedures other than the tools described above.

- Comparison with external sources: It is advisable to compare OpenStreetMap data with external sources of information, such as satellite images, aerial photographs or official map data. This makes it possible to identify inconsistencies or errors in OSM data and correct them. In the Latin America and the Caribbean region, several open data portals use geoservices to provide official information that can improve the quality of the data entered in OSM. Geoservices are a fundamental source of data that cannot be extracted directly from an image, such as toponymy.

- Manual review: This type of procedure requires advanced knowledge of techniques such as photointerpretation, as well as local knowledge, which makes it possible to recognize elements on the map that a novice user cannot easily identify.

- Use of GPX traces and Strava heatmaps: The information provided by these tools is ideal for checking the accuracy of OSM data and for entering unmapped road data.

- MapSwipe: Verification projects let contributors validate objects mapped in OSM one by one from a mobile phone.
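The comparison with GPX traces mentioned above could be sketched as follows. The example reads the track points of a GPX 1.1 file and reports the largest distance between any track point and a mapped way. The function names are hypothetical, the way is assumed to be a list of at least two `(lat, lon)` nodes, and a local equirectangular approximation is used, which is only reasonable over short distances.

```python
import math
import xml.etree.ElementTree as ET

GPX_NS = {"gpx": "http://www.topografix.com/GPX/1/1"}
EARTH_RADIUS_M = 6_371_000


def track_points(gpx_text):
    """Return the (lat, lon) of every <trkpt> in a GPX 1.1 document."""
    root = ET.fromstring(gpx_text)
    return [(float(p.get("lat")), float(p.get("lon")))
            for p in root.findall(".//gpx:trkpt", GPX_NS)]


def to_local_m(lat, lon, lat0, lon0):
    """Project to metres around (lat0, lon0) with an equirectangular approximation."""
    x = math.radians(lon - lon0) * math.cos(math.radians(lat0)) * EARTH_RADIUS_M
    y = math.radians(lat - lat0) * EARTH_RADIUS_M
    return x, y


def point_segment_distance(p, a, b):
    """Distance from point p to segment a-b in planar coordinates."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    if dx == 0 and dy == 0:
        return math.hypot(p[0] - a[0], p[1] - a[1])
    t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))  # clamp the projection onto the segment
    return math.hypot(p[0] - (a[0] + t * dx), p[1] - (a[1] + t * dy))


def max_offset_m(gpx_text, way):
    """Largest distance in metres from any track point to the way."""
    lat0, lon0 = way[0]
    way_xy = [to_local_m(lat, lon, lat0, lon0) for lat, lon in way]
    worst = 0.0
    for lat, lon in track_points(gpx_text):
        p = to_local_m(lat, lon, lat0, lon0)
        nearest = min(point_segment_distance(p, a, b)
                      for a, b in zip(way_xy, way_xy[1:]))
        worst = max(worst, nearest)
    return worst
```

A large offset does not prove the way is wrong, since GPS traces have their own error, but a consistent offset across several traces is a good reason to review the geometry.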

Data for quality assessment

As mentioned above, OSM data can have quality issues of widely varying complexity. Some of the issues that can be reviewed with the help of Osmose are:

  • Bad tag key: Tags that are not yet properly documented in the OpenStreetMap Wiki.
  • Duplicated node: More than one node in the same position.
  • Geometry: Wrong geometry type on a particular object.
  • Highway: Problems related to roads including hanging highways, crossing ways, labels and their values.
  • Highway crossing: Problems related to the highway crossing tag.
  • Incompatible tags: Wrong combination of tags.
  • Objects overlap: Objects that are connected with others in an illogical way.
  • Orphan nodes: Untagged nodes that are not connected to any other element.
  • Overlapping building: Intersections between buildings.
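Two of the issue types listed above, duplicated nodes and orphan nodes, are simple enough to sketch in a few lines. The code below only illustrates the idea and is not how Osmose implements its checks. The input structures (a dictionary of node positions, a dictionary of node tags and a list of ways given as node id lists) are assumptions made for this example.

```python
from collections import defaultdict


def duplicate_nodes(nodes, precision=7):
    """Group node ids that share the same position.

    nodes maps node id -> (lat, lon). Positions are rounded to 7 decimal
    places, the resolution at which OSM stores coordinates.
    """
    by_position = defaultdict(list)
    for node_id, (lat, lon) in nodes.items():
        by_position[(round(lat, precision), round(lon, precision))].append(node_id)
    return [sorted(ids) for ids in by_position.values() if len(ids) > 1]


def orphan_nodes(nodes, node_tags, ways):
    """Return ids of untagged nodes that are not part of any way."""
    used = {node_id for way in ways for node_id in way}
    return sorted(node_id for node_id in nodes
                  if node_id not in used and not node_tags.get(node_id))
```

This sketch ignores relations, so a node that is only a member of a relation would be reported as an orphan.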

How to ensure data quality in events and mapping activities?

Organized mapping projects are becoming more and more common and are led by different organizations. These activities are very important because they are the settings in which the most information is added to OSM. For this reason, some steps must be followed to ensure that the data generated are of the required quality.

Mapping activities can be of different types (remote, field or mixed). The steps described below do not always apply to all types, so as a project coordinator you must decide which steps apply to each type of activity.

Pre-event


  1. Define a hashtag for the project. This is very useful for tracking the edits and changesets made during the mapping activity. If you use a Tasking Manager, set the hashtag when you create the project so that it is added automatically to the edits made through the platform. The hashtag should not be generic and, if possible, should include elements specific to the project so that it does not clash with previous or subsequent projects.
  2. Contact a regional community leader and keep them regularly informed of the project's progress. Participating in local Telegram channels will let you learn the mapping rules of the region, get constant feedback from someone with local knowledge and even find people interested in participating in the project.
  3. Clean the existing data in the region where the project will take place. It is always better to work on an error-free canvas. However, you must respect the work of others: as stated throughout this document, the information stored in OSM is generated for multiple purposes, so you must be sure before modifying existing data on the map. If you have doubts about information on the map, try to contact the user who created the data and wait for a response before making a final modification (communication is essential to maintain a good atmosphere in the OSM community).
  4. Share the data model that will be used in the project. This gives the project an identity and leads to homogeneous mapping among the participants. The model must be aligned with the tags documented in the OSM Wiki.
  5. Check what free information is available. Consulting the open data portals of the region provides secondary information that can be used to produce higher-quality data.
  6. Create a wiki page describing the general aspects that project participants should take into account. Elements such as the hashtag, main tags, methodology and other important items must be documented; some information will be added as the project develops.
  7. If possible, obtain a list of the participants with their OSM usernames. This will be useful for following up on the edits made within the project.
  8. If the project includes remote mapping activities, consider using the Tasking Manager. This tool gives you better control over the areas worked on and makes it easier to assign tasks to different mappers. It makes clear each part that still needs to be mapped, allows simultaneous mapping and validation activities, and provides partial and final statistics on the edits made in the project (HOT's Tasking Manager can be used by external organizations upon request).
  9. Hold training sessions beforehand. These sessions should cover all the items mentioned above and give the project participants the basic information they need to generate quality data.
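The project hashtag and the participant list from the steps above can be combined to follow up on edits. The sketch below assumes that changesets are already available as dictionaries with `id`, `user` and `comment` keys (a hypothetical structure chosen for this example) and separates the changesets that carry the project hashtag into those made by registered participants and those made by other users.

```python
import re

# A hashtag is "#" followed by letters, digits, underscores or hyphens.
HASHTAG_RE = re.compile(r"#[\w-]+")


def hashtags(comment):
    """Return the set of hashtags in a changeset comment, lowercased."""
    return {tag.lower() for tag in HASHTAG_RE.findall(comment)}


def project_changesets(changesets, project_hashtag, participants):
    """Split changesets that carry the project hashtag by who made them.

    Returns (ids by registered participants, ids by other users).
    """
    wanted = project_hashtag.lower()
    by_participants, by_others = [], []
    for changeset in changesets:
        if wanted in hashtags(changeset["comment"]):
            if changeset["user"] in participants:
                by_participants.append(changeset["id"])
            else:
                by_others.append(changeset["id"])
    return by_participants, by_others
```

Matching is exact on the whole hashtag, which is one reason a specific, non-generic hashtag makes tracking more reliable.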

During the event

Very short events (one day, for example) are usually hard to track, so these recommendations apply to projects that last enough days for the manager to take corrective action to ensure the quality of the data generated.

1. Monitor discussions in changesets and in the Tasking Manager. If there is any discussion about your mapping activity, provide an appropriate and timely response.

2. Monitor participants' edits, both geometry and tagging. There are a number of tools to help you keep track of this:

- OSMCha filter: A tool with an extensive set of changeset filtering options, including detection of suspicious edits by time period, location, hashtag, user or mapping activity.

- Google Alerts: Google's free notification and change detection service. It automatically sends emails to the user when it finds results that match the settings the user has defined.

- Changeset discussion: OSM contributors can discuss an issue directly on the OpenStreetMap platform. The discussion is public, which allows collaboration with other mappers.

3. Be aware of any questionable edits by participants.
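As a complement to the tools above, changesets that have received comments can be listed with a short script. The sketch below parses changeset XML and assumes the shape returned by the OSM API 0.6, in which each `<changeset>` element carries `id`, `user` and `comments_count` attributes and its comment is stored in a `<tag k="comment">` child. Downloading the XML is left out, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET


def changesets_with_discussion(osm_xml):
    """Return (id, user, comment count, changeset comment) for discussed changesets."""
    root = ET.fromstring(osm_xml)
    flagged = []
    for changeset in root.iter("changeset"):
        # Assumption: a missing comments_count attribute means no discussion.
        count = int(changeset.get("comments_count", "0"))
        if count > 0:
            comment = next(
                (tag.get("v") for tag in changeset.findall("tag")
                 if tag.get("k") == "comment"),
                "",
            )
            flagged.append((int(changeset.get("id")), changeset.get("user"),
                            count, comment))
    return flagged
```

The resulting list can serve as a checklist of discussions that still need a response from the project team.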

After the event

  1. Validate the mapping activity. Regardless of whether the activity is performed in the field or remotely, verification is required whenever OSM data is used.
  2. Ensure that all changeset discussions and Tasking Manager comments are documented and responded to.
  3. Update public documentation related to the activity.

Transparency

Be part of the OpenStreetMap community by communicating the quality and usability of existing data in priority regions and impact areas. To make information public, publish documentation on GitHub, the OpenStreetMap Wiki and other available sites. You can organize an event and formalize mapping conventions by creating a page for your country on the OpenStreetMap Wiki.

You can also actively participate in the changeset discussions in the countries of your choice by subscribing to the latest changeset discussions. There are numerous valuable technical discussions on this site.

As part of the OpenStreetMap community, as well as being proactive with existing quality assurance tools, you could help the development of community-driven tools. You can contribute by submitting ideas or reporting bugs to the development team. Eventually, more data quality improvement tools will be available that are tailored to the needs of the community.

What is your opinion of this document?

We are always looking for ways to improve this document. Please share your comments with us.