PT TEC Wallonie BE Import

About

This import project page is about the TEC public transport import. TEC released its data under a Creative Commons Attribution 4.0 International (CC BY 4.0) license (http://opendata.awt.be/dataset/tec).

This dataset contains all the stops and timetables for buses and trams in Wallonia (and some in Flanders, Brussels and the German-speaking region).

Import Plan Outline

Goals

The goal of the import is to use this dataset in mapping activities. We are not attempting a blind import; all data has to be seen by human eyes before it can appear in the OSM data.

The nodes have to be moved to the right locations and the names need to be double checked.

Schedule

There is no fixed schedule. It takes however long it takes. Updated data coming in from upstream is continuously integrated.

Import Data

Background

Import Type

Each stop and each route is vetted before it gets added to OSM. No automatic import is to take place. The generated OSM file has an extra line:

<osm version="0.6" upload="no" generator="Python script">

which causes JOSM to show an additional message if somebody tries to upload all of the data at once.

The file is only meant as an aid to facilitate adding/integrating stops manually, not for automatic upload/import.
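
As a rough illustration of how such a file could be produced, here is a minimal sketch using Python's standard library. The function name and the shape of the stop dictionaries are assumptions made for this example; the actual conversion script may work differently.

```python
import xml.etree.ElementTree as ET


def build_osm_document(stops):
    """Return OSM XML for the given stops, flagged as not meant for upload.

    `stops` is a hypothetical list of dicts with "lat", "lon" and "tags".
    """
    # upload="no" is the attribute that makes JOSM show the extra message.
    root = ET.Element("osm", version="0.6", upload="no", generator="Python script")
    for counter, stop in enumerate(stops, start=1):
        # Negative ids mark objects that do not exist on the server yet.
        node = ET.SubElement(
            root,
            "node",
            id=str(-counter),
            lat=str(stop["lat"]),
            lon=str(stop["lon"]),
            visible="true",
        )
        for key, value in stop["tags"].items():
            ET.SubElement(node, "tag", k=key, v=value)
    return ET.tostring(root, encoding="unicode")


example = build_osm_document(
    [{"lat": 50.7, "lon": 4.4, "tags": {"highway": "bus_stop", "operator": "TEC"}}]
)
```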

Data Preparation

Data Reduction & Simplification

The technical nitty gritty for converting the data can be found here:

https://wiki.openstreetmap.org/wiki/WikiProject_Belgium/De_Lijndata

I'm adapting what is found there to the peculiarities of the data coming from TEC. De Lijn provided a dump of their tables. TEC provides a more 'intelligent' format, which unfortunately makes it harder to load it into PostGIS.

The latest version of the resulting osm file can be found here:

https://dl.dropboxusercontent.com/u/42418402/TEC.osm.zip

In this file, all stops which are not in OSM yet get an odbl=new tag. This has nothing to do with odbl, but those tags get removed automatically before JOSM uploads the data.

The file contains all stops. For each stop, a route_ref has been calculated from the timetable information that is part of the data. To select a group of stops in order to add route relations in the next step, this JOSM search expression (using a regular expression) can be used:

RR route_ref="(^|.+;)26(;.+|$)" inview odbl=new

26 gets replaced with the route number you want to work on.

Then copy/paste the selected stops to your work layer and reposition them one by one, checking the names for abbreviations which weren't converted properly and adding zone information.

In order to add the route relations, the member stops need to be uploaded first. Then the file needs to be saved and a script needs to be run to update the local DB.

After that, createOSMrouterelations.py can be used to create a route relation for every variation that has a different sequence of stops. In the case of telescopic lines, only the longest sequence of stops gets a route relation.
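
The selection of variations can be pictured with the sketch below. It is not taken from createOSMrouterelations.py; it assumes that a telescopic variation is one whose stops form a contiguous part of a longer variation, and all names are made up for the example.

```python
def is_contiguous_part(short, long):
    """True if the tuple `short` appears as an unbroken run inside `long`."""
    size = len(short)
    return any(long[i:i + size] == short for i in range(len(long) - size + 1))


def variations_needing_relations(sequences):
    """Keep one entry per distinct stop sequence, dropping shortened runs."""
    unique = []
    for sequence in sequences:
        as_tuple = tuple(sequence)
        if as_tuple not in unique:  # identical sequences need one relation
            unique.append(as_tuple)
    kept = []
    for sequence in unique:
        covered = any(
            other != sequence and is_contiguous_part(sequence, other)
            for other in unique
        )
        if not covered:
            kept.append(list(sequence))
    return kept


variations = variations_needing_relations(
    [
        ["A", "B", "C", "D"],
        ["B", "C", "D"],       # telescopic: part of the first sequence
        ["D", "C", "B", "A"],  # opposite direction, different order
        ["A", "B", "C", "D"],  # duplicate of the first sequence
    ]
)
```

Under these assumptions, the two directions each get a relation, while the shortened run and the duplicate do not.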

Tagging Plans

stops
tag                value
highway            bus_stop
name               ongoing effort to expand abbreviations automatically and to streamline/generalise others like Eglise -> Église, Ecole -> École, Chssée -> Chaussée
operator           TEC
ref                internal ref number of TEC
zone               4 digits instead of the 2 visible on the poles
route_ref          1;3;20a;708
public_transport   platform
bus                yes
tram               yes (where applicable)
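
The name clean-up mentioned in the name row could look roughly like the following sketch. Only the three replacements listed in that row are included; the function name and the whole-word matching rule are assumptions, not a description of the real script.

```python
import re

# Replacements taken from the examples in the name row; a real list would be longer.
REPLACEMENTS = {
    "Eglise": "Église",
    "Ecole": "École",
    "Chssée": "Chaussée",
}

# Match whole words only, so that longer words starting the same way are left alone.
_WORDS = re.compile(r"\b(" + "|".join(map(re.escape, REPLACEMENTS)) + r")\b")


def normalise_name(name):
    """Expand known abbreviations and restore accented capitals."""
    return _WORDS.sub(lambda match: REPLACEMENTS[match.group(1)], name)


cleaned = normalise_name("Chssée de Bruxelles")
```

Names still have to be checked by hand afterwards, since a fixed list like this cannot cover every abbreviation.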

Remarks

When a stop is served by more than one operator (common in the Brussels region), one node per operator is used to facilitate automated QA. Unfortunately, TEC is divided into entities which all assign their own ref codes. To keep things manageable, each of these entities is considered a separate operator. All these stops are combined in a stop_area relation.

In Brussels this leads to 4 nodes for the same stop, when the stop is served by MIVB/STIB, De Lijn, TEC Brabant-Wallon and TEC Charleroi. It looks a bit odd, but nodes are cheap.


route relations
tag        value
from       Bruxelles Midi
name       TEC W Bruxelles Midi - Waterloo
operator   TEC
ref        W
route      bus (or tram)
to         Waterloo
via        needs to be added manually

ways: get no roles and form an ordered sequence from beginning to end (they need to be added manually, although I did write a script that runs inside JOSM and can find the nearest way to a stop)
stops: get a platform role automatically; this needs to be changed to a more correct role for stops where one can only board or get off.
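
The tagging and the default roles described above can be summarised in a short sketch. The function and its plain-dictionary output are hypothetical. The type=route tag is not in the table above; it is added here on the assumption that the relations follow the usual OSM route relation tagging.

```python
def build_route_relation(ref, origin, destination, stop_node_ids, mode="bus"):
    """Return tags and members for one route variation as plain Python data."""
    tags = {
        "type": "route",  # assumption: standard for OSM route relations
        "route": mode,    # "bus" or "tram"
        "operator": "TEC",
        "ref": ref,
        "from": origin,
        "to": destination,
        "name": f"TEC {ref} {origin} - {destination}",
        # "via" is left out on purpose: it has to be added manually.
    }
    # Every stop starts with the platform role; ways are added by hand
    # later and get an empty role.
    members = [("node", node_id, "platform") for node_id in stop_node_ids]
    return {"tags": tags, "members": members}


relation = build_route_relation("W", "Bruxelles Midi", "Waterloo", [101, 102, 103])
```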


route_master relations
tag            value
name           Bruxelles Midi - Waterloo
operator       TEC
ref            W
route_master   bus
type           route_master
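
A route_master groups the variations of one line. The sketch below derives its tags from the tags of the route relations; the input format, the function name, and the choice to take the name from the first variation are assumptions made for illustration.

```python
def build_route_master(route_relations):
    """Combine route variations that share a ref into one route_master.

    `route_relations` is a hypothetical list of dicts with "id" and "tags".
    """
    refs = {relation["tags"]["ref"] for relation in route_relations}
    if len(refs) != 1:
        raise ValueError("all variations must share exactly one ref")
    first = route_relations[0]["tags"]
    tags = {
        "type": "route_master",
        "route_master": first["route"],
        "operator": "TEC",
        "ref": first["ref"],
        "name": f'{first["from"]} - {first["to"]}',
    }
    # Assumption: the variations are members with an empty role.
    members = [("relation", relation["id"], "") for relation in route_relations]
    return {"tags": tags, "members": members}


master = build_route_master(
    [
        {"id": 1, "tags": {"ref": "W", "route": "bus", "from": "Bruxelles Midi", "to": "Waterloo"}},
        {"id": 2, "tags": {"ref": "W", "route": "bus", "from": "Waterloo", "to": "Bruxelles Midi"}},
    ]
)
```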

Changeset Tags

source = TEC;Bing2011

Data Transformation

https://wiki.openstreetmap.org/wiki/WikiProject_Belgium/De_Lijndata

Data Transformation Results

Data Merge Workflow

Tedious manual labour

Team Approach

If people want to join in, send me a message (Polyglot). I'll explain what you need to know during a few hangouts. This usually takes several hours...

Workflow

Dedicated upload account

Every stop gets vetted. The data from upstream serves as one of many references; integration/conflation is manual labour.

Conflation

Conflation has to be done by each individual contributor. It's better to let a human decide on this.

QA

For the stops I have a script which generates output in wiki format where names and route_refs are compared. JOSM RC is used to make it easy to upload them.
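
The comparison can be pictured as follows. This is not the actual QA script; it is a minimal sketch that assumes both data sets are available as dictionaries keyed by the TEC ref, and it leaves out the JOSM RC links.

```python
def find_differences(upstream, osm):
    """List (ref, tag, upstream value, OSM value) for every mismatch."""
    rows = []
    for ref in sorted(upstream):
        mapped = osm.get(ref)
        if mapped is None:
            continue  # stop has not been added to OSM yet
        for key in ("name", "route_ref"):
            if upstream[ref][key] != mapped.get(key):
                rows.append((ref, key, upstream[ref][key], mapped.get(key, "")))
    return rows


def to_wikitable(rows):
    """Format the mismatches as a MediaWiki table."""
    lines = ['{| class="wikitable"', "! ref !! tag !! upstream !! OSM"]
    for row in rows:
        lines.append("|-")
        lines.append("| " + " || ".join(row))
    lines.append("|}")
    return "\n".join(lines)


upstream = {
    "B1": {"name": "Église", "route_ref": "1;3"},
    "B2": {"name": "Gare", "route_ref": "20a"},
}
osm = {"B1": {"name": "Eglise", "route_ref": "1;3"}}
differences = find_differences(upstream, osm)
report = to_wikitable(differences)
```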

For routes, it's work in progress. QA and maintenance on them would be a lot easier if it were possible to use route segments.