User:Sjwhitak/RIPTA Import Plan

From OpenStreetMap Wiki

The RIPTA Import is an import of the RIPTA GTFS dataset, which covers the bus routes in Rhode Island. The automatic import is complete as of 2024-03-07. The bus stops are now being added to the bus routes manually.

Goals

Add the bus stops in Rhode Island so I don't have to use Google Maps to find when the next bus arrives.

Schedule

I could finish this in a few days. I just want confirmation that my procedure is solid, since I don't want to mess anything up on OpenStreetMap.

Import Data

Background

Data source site: https://www.ripta.com/mobile-applications/
Data license: Public domain.

Hello, I am interested in working with the GIS data related to the bus stops throughout Rhode Island using the GTFS data found here: https://www.ripta.com/mobile-applications/ Mainly, importing the bus routes, stops, and schedules into OpenStreetMaps. I was directed to this email from [redacted], hopefully this is a good contact to email.

Since this is a public service, this data is presumably public domain, but I would like confirmation to make sure I'm not doing anything I'm not supposed to. Could I have confirmation this data is publicly available? And if it is stated somewhere publicly (such as a website) could you point that out to me? That would be nice for records' sake.


Awesome that you want to put RIPTA data into OpenStreetMaps! Yes, this data is publicly available and you’re welcome to use it.

In case you’re not aware, RIPTA makes planned service changes generally three times a year (January, June, and August/September) and when we make those changes we also update the GTFS data on our website. Most of these changes are not drastic – adding or removing a bus stop, small route deviations, retiming of schedules, etc. – but to be most accurate you may need to occasionally update the data or at least note the publish date.

Import Type

I do not trust my coding enough to automate updates, so this will simply be a one-time import. If RIPTA makes significant changes to the bus routes, I can plot each one manually.
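
To decide whether a later RIPTA service change is "significant", two published versions of the feed's `stops.txt` can be compared. The following is a minimal sketch, not part of the import itself. It assumes the standard GTFS columns `stop_id`, `stop_name`, `stop_lat`, and `stop_lon`; the helper names and the sample rows are made up for illustration.

```python
import csv
import io


def load_stops(fileobj):
    """Map stop_id -> (stop_name, stop_lat, stop_lon) from a GTFS stops.txt."""
    return {
        row["stop_id"]: (row["stop_name"], row["stop_lat"], row["stop_lon"])
        for row in csv.DictReader(fileobj)
    }


def diff_stops(old, new):
    """Return sorted stop_ids that were added, removed, or changed."""
    added = sorted(new.keys() - old.keys())
    removed = sorted(old.keys() - new.keys())
    changed = sorted(k for k in old.keys() & new.keys() if old[k] != new[k])
    return added, removed, changed


# Made-up rows standing in for two published versions of stops.txt.
OLD = """stop_id,stop_name,stop_lat,stop_lon
100,Example St & First Ave,41.8200,-71.4100
200,Example St & Second Ave,41.8210,-71.4110
"""
NEW = """stop_id,stop_name,stop_lat,stop_lon
100,Example St & First Ave,41.8200,-71.4100
300,Example St & Third Ave,41.8220,-71.4120
"""

added, removed, changed = diff_stops(
    load_stops(io.StringIO(OLD)), load_stops(io.StringIO(NEW))
)
print("added:", added, "removed:", removed, "changed:", changed)
```

With real files, open each `stops.txt` with `encoding="utf-8-sig"` in case the feed starts with a byte-order mark, and pass the file object to `load_stops`.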

Data Preparation

Data Reduction & Simplification

The only risk in this import is that GTFS-OSM-Validator may miss bus stops that are already mapped in OSM. Its range setting for automatically detecting existing bus stops works poorly: too large a range blurs everything together, and too small a range just removes everything. To mitigate this risk, I will not search for pre-existing OSM bus stops with GTFS-OSM-Validator and will instead check every bus stop manually in JOSM. There are 3629 bus stops in total, and I have no problem checking each one by hand.
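
The manual check can be narrowed down with a simple distance shortlist. The sketch below is only an illustration and is not what GTFS-OSM-Validator does internally: it computes great-circle distances with the haversine formula and lists existing OSM stops near a GTFS stop, so a human can review them in JOSM. The 30 m radius, the function names, and the coordinates are all assumptions.

```python
import math


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two WGS84 points."""
    r = 6371000.0  # mean Earth radius in metres
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def nearby_candidates(gtfs_stop, osm_stops, radius_m=30.0):
    """Return (distance, osm_stop) pairs within radius_m, nearest first.

    radius_m is an arbitrary starting value, not a recommended setting.
    """
    lat, lon = gtfs_stop
    hits = [
        (haversine_m(lat, lon, olat, olon), (olat, olon))
        for olat, olon in osm_stops
    ]
    return sorted(h for h in hits if h[0] <= radius_m)


# Made-up coordinates: one existing stop about 11 m away, one far away.
gtfs_stop = (41.8240, -71.4128)
osm_stops = [(41.8241, -71.4128), (41.8300, -71.4128)]
candidates = nearby_candidates(gtfs_stop, osm_stops)
for dist, stop in candidates:
    print(f"{dist:.1f} m -> {stop}")
```

The output is a review list only; whether a candidate really is the same stop still has to be decided by eye.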

Changeset Tags

Unfortunately, I am unfamiliar with what a changeset is, but these are the tags that I am adding to each bus stop:

Key                Value
bus                yes
public_transport   platform
gtfs_stop_code     code provided by GTFS
name               bus stop provided by GTFS
route_ref          bus route provided by GTFS
network            RIPTA
network:wikidata   Q7320944
network:wikipedia  en:Rhode Island Public Transit Authority
highway            bus_stop
operator           Rhode Island Public Transit Authority

I am unsure what the purpose of `gtfs_stop_code` and `gtfs_id` is, but they are used in GTFS, so I presume there is one.
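
For context, the GTFS specification defines `stop_id` in `stops.txt` as the feed's internal unique identifier for a stop and `stop_code` as the short rider-facing code. The sketch below shows how one `stops.txt` row could be turned into the tag set listed above. It is an illustration only: that `gtfs_stop_code` comes from the `stop_code` column is an assumption, and the function name and sample row are made up. Multiple routes are joined with semicolons, as is usual for multi-valued OSM tags.

```python
import csv
import io

# Tags that are identical for every RIPTA stop (taken from the table above).
STATIC_TAGS = {
    "bus": "yes",
    "public_transport": "platform",
    "highway": "bus_stop",
    "network": "RIPTA",
    "network:wikidata": "Q7320944",
    "network:wikipedia": "en:Rhode Island Public Transit Authority",
    "operator": "Rhode Island Public Transit Authority",
}


def tags_for_stop(row, route_refs):
    """Build an OSM tag dict for one stops.txt row (hypothetical helper)."""
    tags = dict(STATIC_TAGS)
    tags["name"] = row["stop_name"].strip()
    # Assumption: the tool's gtfs_stop_code corresponds to GTFS stop_code.
    tags["gtfs_stop_code"] = row["stop_code"].strip()
    if route_refs:  # never emit an empty route_ref tag
        tags["route_ref"] = ";".join(sorted(route_refs))
    return tags


# Made-up sample row; real data comes from the feed's stops.txt.
SAMPLE = """stop_id,stop_code,stop_name,stop_lat,stop_lon
100,12345, Example St & First Ave ,41.8200,-71.4100
"""
row = next(csv.DictReader(io.StringIO(SAMPLE)))
tags = tags_for_stop(row, {"92", "1"})
for key, value in sorted(tags.items()):
    print(key, "=", value)
```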

Data Transformation

The Python script inserts the OSM-specific tags at each bus stop. This script may fail on other GTFS formats, but it worked for me with RIPTA's.

# This code runs through the RIPTA-GTFS dataset after running through gtfs-osm-sync 
# and updates tags for OSM.
import xmltodict
from itertools import compress

def specific_route(nodes, route):
    N = len(nodes)
    route_mask = [False] * N
    for i in range(N):
        for tag in nodes[i]['tag']:
            if 'ref' in tag.values():
                if route in tag['@v']:
                    route_mask[i] = True    
    return route_mask

def remove_dupes(list_of_dicts):
    # https://stackoverflow.com/a/41704996
    """Source: answer from wim
    """ 
    list_of_unique_dicts = []
    for dict_ in list_of_dicts:
        if dict_ not in list_of_unique_dicts:
            list_of_unique_dicts.append(dict_)
    return list_of_unique_dicts

def update_nodes(nodes):
    for node in nodes:
        
        # Remove white space from lat, lon
        node['@lat'] = node['@lat'].strip()
        node['@lon'] = node['@lon'].strip()
        
        # Search for broken/extra tags and remove them
        # or modify them. Compare against the tag key ('@k') only,
        # so a tag whose value happens to equal a key name is left alone.
        rm = []
        stop_id = -1
        for i in range(len(node['tag'])):
            key = node['tag'][i]['@k']
            if key == 'network':
                node['tag'][i] = {'@k': 'network', '@v': 'RIPTA'}
            if key == 'gtfs_id':
                rm.append(i)
            # In my previous import, I used `ref_route` instead of `route_ref`
            # which messed with things.
            if key == 'ref_route':
                rm.append(i)
            if key == 'ref':
                node['tag'][i]['@k'] = "route_ref"
            if key == 'gtfs_stop_code':
                # Adjust to "gtfs:stop_id" according to KevinMapsThings
                node['tag'][i]['@k'] = "gtfs:stop_id"
                stop_id = node['tag'][i]['@v']
        # Delete from the highest index down so earlier indices stay valid.
        for r in sorted(rm, reverse=True):
            del node['tag'][r]
        
        # Add OSM-specific tags.
        node['tag'].append({'@k':'network:wikidata', 
                            '@v':'Q7320944'})
        node['tag'].append({'@k':'network:wikipedia', 
                            '@v':'en:Rhode Island Public Transit Authority'})
        node['tag'].append({'@k':'network:short', 
                            '@v':'RIPTA'})
        node['tag'].append({'@k':'highway', 
                            '@v':'bus_stop'})
        node['tag'].append({'@k':'operator', 
                            '@v':'Rhode Island Public Transit Authority'})
        node['tag'].append({'@k':'operator:short', 
                            '@v':'RIPTA'})
        node['tag'].append({'@k':'operator:wikidata', 
                            '@v':'Q7320944'})
        node['tag'].append({'@k':'ref', 
                            '@v':stop_id})
        node['tag'].append({'@k':'gtfs:feed', 
                            '@v':'US-RI-RIPTA'})   
        
        # If previous tags already exist, remove the duplicates.
        node['tag'] = remove_dupes(node['tag'])
    return nodes

with open('test.osc') as f:
    data = xmltodict.parse(f.read())

# I do not want to delete any nodes, so just in case I did, ignore them.
# I do not want to modify/create relations, so ignore those too.
data['osmChange']['delete'] = None
data['osmChange']['create']['relation'] = None
data['osmChange']['modify']['relation'] = None

# Add tags
data['osmChange']['create']['node'] = update_nodes(data['osmChange']['create']['node'])
data['osmChange']['modify']['node'] = update_nodes(data['osmChange']['modify']['node'])

# Save to XML file. The `with` blocks make sure each file is flushed and
# closed before the next step reads it.
with open('test2.osc', 'w') as f:
    f.write(xmltodict.unparse(data, pretty=True))
with open('test2.osc') as f:
    x = f.readlines()

# My stupidity from the previous edit required me removing this empty tag.
y = [line for line in x if '<tag k="route_ref" v=""></tag>' not in line]
with open('test3.osc', 'w') as f:
    f.writelines(y)

Data Merge Workflow

Team Approach

Solo, unless someone cares to help.

Workflow

The import is a multi-step, manual process.

  1. Download the RIPTA GTFS data from the RIPTA website.
  2. Use GTFS-OSM-Validator to compare the GTFS bus stops with current bus stops already on OSM.
  3. After exporting to a .osc file, run the Python script I wrote (shown under Data Transformation) to update the tags so they follow OSM standards for bus stops.
    1. NOTE: This was done so I didn't have to insert each of these tags manually in JOSM. There might be an automated method in JOSM, but I couldn't find it.
  4. In JOSM, zoom in to every bus stop to see if I missed anything before uploading to OSM.
  5. Finally, connect the route lines that were laid out by njtbusfan.
    1. NOTE: I personally don't know how to do this in JOSM, so I'll do it manually in OSM.
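
Before step 4, the generated .osc file can be given a quick automated sanity check. The sketch below is one possible check, not part of the original workflow. It uses only Python's standard `xml.etree.ElementTree` and assumes the usual osmChange layout (`create`, `modify`, and `delete` blocks containing `node` elements with `tag` children). The function name and the sample document are made up.

```python
import xml.etree.ElementTree as ET


def check_osc(xml_text):
    """Return a list of human-readable problems found in an osmChange document."""
    root = ET.fromstring(xml_text)
    problems = []
    for action in root:  # <create>, <modify>, <delete>
        if action.tag == "delete" and len(action):
            problems.append("delete block is not empty")
        for elem in action:
            if elem.tag == "relation":
                problems.append(f"{action.tag} contains a relation")
            if elem.tag != "node":
                continue
            node_id = elem.get("id")
            tags = {t.get("k"): t.get("v") for t in elem.findall("tag")}
            if tags.get("highway") != "bus_stop":
                problems.append(f"node {node_id}: missing highway=bus_stop")
            for key, value in tags.items():
                if not value:
                    problems.append(f"node {node_id}: empty value for {key}")
    return problems


# Made-up sample with one deliberate mistake (an empty route_ref).
SAMPLE = """<osmChange version="0.6">
  <create>
    <node id="-1" lat="41.82" lon="-71.41">
      <tag k="highway" v="bus_stop"/>
      <tag k="route_ref" v=""/>
    </node>
  </create>
  <delete/>
</osmChange>"""

problems = check_osc(SAMPLE)
for problem in problems:
    print(problem)
```

An empty list means none of these particular problems were found; it does not replace the visual check in JOSM.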

BONUS: RIPTA publishes bus times for each route on its website, and the same data is in the GTFS dataset. GTFS-OSM-Validator does not handle bus times, however, so I will need to either write a second Python script for them or extend GTFS-OSM-Validator to handle them properly.
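
As a starting point for such a script, bus times live in the feed's `stop_times.txt`. The sketch below groups departure times by stop and looks up the next departure. It is a rough illustration under stated assumptions: it uses the standard GTFS columns `stop_id` and `departure_time`, it ignores service days (`calendar.txt`), and the helper names and sample rows are made up. GTFS allows times past 24:00:00 for trips that run after midnight, so times are kept as seconds since the start of the service day.

```python
import csv
import io
from collections import defaultdict


def to_seconds(hhmmss):
    """Convert a GTFS time such as '25:10:00' to seconds since service-day start."""
    h, m, s = (int(part) for part in hhmmss.strip().split(":"))
    return h * 3600 + m * 60 + s


def departures_by_stop(fileobj):
    """Map stop_id -> sorted list of departure times in seconds."""
    by_stop = defaultdict(list)
    for row in csv.DictReader(fileobj):
        if not row["departure_time"].strip():
            continue  # times may be blank at stops that are not timepoints
        by_stop[row["stop_id"]].append(to_seconds(row["departure_time"]))
    for times in by_stop.values():
        times.sort()
    return by_stop


def next_departure(by_stop, stop_id, now_hhmmss):
    """Return the first departure at or after now_hhmmss, or None."""
    now = to_seconds(now_hhmmss)
    return next((t for t in by_stop.get(stop_id, []) if t >= now), None)


# Made-up rows standing in for stop_times.txt.
SAMPLE = """trip_id,arrival_time,departure_time,stop_id,stop_sequence
T1,08:00:00,08:00:00,100,1
T1,08:05:00,08:05:00,200,2
T2,25:10:00,25:10:00,100,1
"""
by_stop = departures_by_stop(io.StringIO(SAMPLE))
print(next_departure(by_stop, "100", "08:01:00"))
```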

I have the JOSM reverter plugin, so I can use that to revert changes.

Current state (2024-03-07)

  1. All bus stops are imported
  2. All bus routes are added
  3. Bus stops are NOT connected to bus routes yet

Status             Bus routes
Not completed      R, QX, 6, 9x, 10, 12x, 13, 14, 29, 33, 76, 87
Bus route cleaned  1, 17, 18, 19, 20, 27, 28, 30, 31, 32, 34, 35, 40, 50, 51, 54, 55, 56, 57, 58, 59x, 63, 64, 65x, 66, 67, 71, 72, 73, 75, 78, 80, 92, 95x, 301
Completed          21, 22, 23, 24L, 60, 61x
Add to OSM         3, 4, 16, 68, 69, 88, 89, 203, 204, 231, 242, 281, 282, BB, PVD
Remove from OSM    3A, 3B, 8x, 49, 62

See also

The post to the community forum was sent on 2023-12-19 and can be found here