Talk:Key:gtfs:trip id:sample


@ToniE: -- you mentioned on the GTFS page that this key is needed to link a route relation to a specific set of stops (as opposed to where the bus goes). I'm confused by this, as it seemed like the shape_id value would be sufficient to identify a set of stops, unless there are multiple different sets of stops that use the same shape. Could you clarify this further? JesseFW (talk) 16:42, 11 September 2023 (UTC)

A GTFS trip identifies a single trip of a bus, ...
* GTFS file trips.txt
** specifies the trip_id
** links a trip to a route_id in routes.txt
** links a trip to a service_id in calendar.txt and calendar_dates.txt (on which days the trip will be done)
** if shapes exist, shape_id links the trip to a path that a vehicle travels
** specifies some more information like the headsign
for each trip_id there are several entries in the GTFS file stop_times.txt
* trip_id to identify which trip the entry belongs to
* stop_id, the id of a stop of the trip
* stop_sequence, the order of the stop for this particular trip (1,2,3,...)
* arrival_time, departure_time are other entries, not so relevant for OSM
for detailed information of stops, the GTFS file stops.txt exists
* stop_id of course to link it to trips via stop_times.txt
* stop_name
* stop_lat and stop_lon, the position of the stop (platform)
GTFS shapes identify the path that the vehicles have to travel along to get from stop to stop
* GTFS file shapes.txt identifies a path (GPX route)
** several trips use the same path/shape at different departure times (every 10 minutes between 7 AM and 7 PM?)
** for the passengers the path/shape is not so important as long as departure and arrival stops are on the path
** the bus driver must know which roads to take to get from stop(n) to stop(n+1)
GTFS stops are usually platforms beside the road (public_transport=platform in PTv2 jargon)
GTFS shapes define how buses will travel along roads to come close to the stops (OSM's PTv2 defines public_transport=stop_position for this purpose)
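As a rough illustration of how the files described above fit together, here is a minimal Python sketch that resolves a trip_id to its ordered list of stops via stop_times.txt and stops.txt. The feed excerpts, ids and stop names are made up for the example, and the helper names read_rows and stops_of_trip are hypothetical; only the column names come from GTFS.

```python
import csv
import io

# Tiny made-up GTFS excerpts; all ids, names and coordinates are illustrative only.
STOP_TIMES_TXT = """trip_id,arrival_time,departure_time,stop_id,stop_sequence
T1,07:10:00,07:10:00,B,2
T1,07:00:00,07:00:00,A,1
T1,07:20:00,07:20:00,C,3
T2,07:10:00,07:10:00,A,1
T2,07:30:00,07:30:00,C,2
"""

STOPS_TXT = """stop_id,stop_name,stop_lat,stop_lon
A,Alpha Street,48.10,11.50
B,Beta Square,48.11,11.51
C,Gamma Park,48.12,11.52
"""


def read_rows(text):
    """Parse a GTFS CSV text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def stops_of_trip(trip_id, stop_times, stops):
    """Return (stop_id, stop_name) pairs of one trip, ordered by stop_sequence."""
    stop_by_id = {s["stop_id"]: s for s in stops}
    entries = [r for r in stop_times if r["trip_id"] == trip_id]
    # stop_sequence is text in the CSV, so compare it as an integer
    entries.sort(key=lambda r: int(r["stop_sequence"]))
    return [(r["stop_id"], stop_by_id[r["stop_id"]]["stop_name"]) for r in entries]


t1_stops = stops_of_trip("T1", read_rows(STOP_TIMES_TXT), read_rows(STOPS_TXT))
```

Note that shapes.txt plays no role here: the stop list of a trip comes from stop_times.txt alone, and the shape only describes the path between those stops.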
@ToniE: That all makes sense (and someone should copy the above onto the GTFS page because it is very useful!), but it still seems like PTNA should be able to go from a gtfs:shape_id=* (and gtfs:feed=* or a fallback) to a list of stops, so long as all the trips linked to that shape_id have the same stop sequence. It would be nice to have a warning in PTNA about type=route relations with just gtfs:shape_id=* where that invariant doesn't hold. But for ones where it does, leaving out the gtfs:trip_id:sample=* seems better, since the trip_ids seem to get changed more often (at least, for the MBTA). JesseFW (talk) 22:51, 11 September 2023 (UTC)
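The invariant mentioned above (all trips linked to one shape_id share the same stop sequence) could be checked roughly like this. This is only a sketch on made-up rows, not PTNA code; the function name shapes_with_ambiguous_stops is hypothetical, and it groups by shape_id alone rather than by route and shape.

```python
from collections import defaultdict

# Made-up rows in the shape of trips.txt and stop_times.txt (illustrative ids only).
trips = [
    {"trip_id": "T1", "shape_id": "S1"},
    {"trip_id": "T2", "shape_id": "S1"},  # same shape as T1, but skips stop B
    {"trip_id": "T3", "shape_id": "S2"},
]
stop_times = [
    {"trip_id": "T1", "stop_id": "A", "stop_sequence": "1"},
    {"trip_id": "T1", "stop_id": "B", "stop_sequence": "2"},
    {"trip_id": "T1", "stop_id": "C", "stop_sequence": "3"},
    {"trip_id": "T2", "stop_id": "A", "stop_sequence": "1"},
    {"trip_id": "T2", "stop_id": "C", "stop_sequence": "2"},
    {"trip_id": "T3", "stop_id": "C", "stop_sequence": "1"},
    {"trip_id": "T3", "stop_id": "A", "stop_sequence": "2"},
]


def shapes_with_ambiguous_stops(trips, stop_times):
    """Return {shape_id: set of stop-id tuples} for shapes whose trips disagree."""
    by_trip = defaultdict(list)
    for row in stop_times:
        by_trip[row["trip_id"]].append((int(row["stop_sequence"]), row["stop_id"]))

    stop_lists = defaultdict(set)
    for trip in trips:
        shape_id = trip.get("shape_id")
        if not shape_id:
            continue  # trips without a shape cannot violate the invariant
        ordered = tuple(stop_id for _, stop_id in sorted(by_trip[trip["trip_id"]]))
        stop_lists[shape_id].add(ordered)

    # keep only shapes that map to more than one distinct stop sequence
    return {s: lists for s, lists in stop_lists.items() if len(lists) > 1}


ambiguous = shapes_with_ambiguous_stops(trips, stop_times)
```

Any shape_id that shows up in the result would be a candidate for the warning; for all other shapes the shape_id alone determines the stop list.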
PTNA does not use the GTFS information given in the CSV data and/or the OSM relation to do any checks against OSM data - that's for further development. But your point here should be included, yeah. --ToniE (talk) 10:16, 12 September 2023 (UTC)
But it does use it to show linkages... but it seems like it doesn't link OSM routes to GTFS shapes (yet), only OSM route_masters to GTFS routes. Unless I've misunderstood something (which is plausible!). JesseFW (talk) 11:33, 12 September 2023 (UTC)
No you're right. That's caused by some blinkers in my head. My local 'network' DE-BY-MVV does not provide "shapes". I'll have a look at the code. --ToniE (talk) 11:54, 12 September 2023 (UTC)
* CSV data uses feed and route_id only - that's OK and sufficient
* gtfs:trip_id and gtfs:trip_id:sample in an OSM relation are linked to the single-trip.php page using parameter &trip_id= - presenting trip and shape related information on the map: example
* in the absence of gtfs:trip_id and gtfs:trip_id:sample, gtfs:shape_id in an OSM relation is linked to the single-trip.php page using parameter &shape_id= - presenting also trip and shape related information on the map: example - same amount of information via different id
--ToniE (talk) 14:23, 12 September 2023 (UTC)
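The fallback described in the list above could be sketched as follows. This is not PTNA's actual code: the function name link_parameter is hypothetical, and checking gtfs:trip_id before gtfs:trip_id:sample is an assumption, since the list only says that gtfs:shape_id is used when both are absent. The sketch returns just the parameter name and value and does not build a full URL.

```python
def link_parameter(tags):
    """Pick the query parameter for a relation's tags, trip ids first.

    Returns a (parameter_name, value) pair, or None if no GTFS id is tagged.
    """
    # Assumption: gtfs:trip_id takes precedence over gtfs:trip_id:sample.
    for key in ("gtfs:trip_id", "gtfs:trip_id:sample"):
        value = tags.get(key)
        if value:
            return ("trip_id", value)
    # Only fall back to the shape when no trip id is tagged at all.
    shape_id = tags.get("gtfs:shape_id")
    if shape_id:
        return ("shape_id", shape_id)
    return None


# Illustrative tag sets with made-up values.
with_sample = link_parameter({"gtfs:trip_id:sample": "T1", "gtfs:shape_id": "S1"})
shape_only = link_parameter({"gtfs:shape_id": "S1"})
```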
Thanks! I've identified an example where the existing logic falls down, and made an issue for it, here: https://github.com/osm-ToniE/ptna/issues/139 JesseFW (talk) 22:20, 12 September 2023 (UTC)
And I've copied it now: https://wiki.openstreetmap.org/w/index.php?title=GTFS&diff=2594967&oldid=2594820 - JesseFW (talk) 23:01, 11 September 2023 (UTC)
Interestingly, for the MBTA, all the bus routes satisfy the invariant, but many of the commuter rail routes don't (along with one or two of the subway routes). I analyzed this with the following sqlite query:
select count(distinct stop_list) as "#_stop_lists", route_id, shape_id
from (
    select distinct trip_id,
           group_concat(stop_id) over (
               partition by trip_id
               order by stop_sequence
               rows between unbounded preceding and unbounded following
           ) as stop_list
    from gtfs_stop_times
) as x
join gtfs_trips using (trip_id)
join gtfs_routes using (route_id)
group by route_id, shape_id
having count(distinct stop_list) > 1;
So having PTNA distinguish these cases would certainly be useful. JesseFW (talk) 03:45, 12 September 2023 (UTC)
Thanks for pointing to that and for the intricate SQL query - I'm an SQL novice, so I'm quite impressed.
I had to look up various bits of sqlite-specific weirdness to get it working, so thanks. :-) JesseFW (talk) 11:33, 12 September 2023 (UTC)
I think PTNA's GTFS analysis implements a check which comes quite close to what you're proposing (if I understand that correctly), w/o using shapes at all.
* Kindly have a look at the "PTNA info comments" (last column) on the CR-Franklin route_id or the summary for MBTA. Is this something you're looking for? --ToniE (talk) 10:12, 12 September 2023 (UTC)
That is very similar, but it's noticing inconsistencies between stop names and stop_ids, rather than between (a sequence of) stop_ids and (single) shape_ids. JesseFW (talk) 11:33, 12 September 2023 (UTC)
You're right here again: I aggregate trips by stop_id (different stop_ids => different trips) to a single representative trip (trip_id:sample) and neglect whether they have different shape_ids. I'll check the code. --ToniE (talk) 11:54, 12 September 2023 (UTC)
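To make the aggregation described above concrete, here is a minimal sketch of grouping trips by their stop_id sequence and keeping one representative trip per group, while also recording which shape_ids occur in each group. It is not PTNA's implementation: the function name representative_trips is hypothetical, and taking the first trip encountered as the sample is an assumption.

```python
from collections import defaultdict

# Made-up rows (illustrative ids only): T1 and T2 serve the same stops
# but reference different shapes.
trips = [
    {"trip_id": "T1", "shape_id": "S1"},
    {"trip_id": "T2", "shape_id": "S2"},
    {"trip_id": "T3", "shape_id": "S1"},
]
stop_times = [
    {"trip_id": "T1", "stop_id": "A", "stop_sequence": "1"},
    {"trip_id": "T1", "stop_id": "B", "stop_sequence": "2"},
    {"trip_id": "T2", "stop_id": "A", "stop_sequence": "1"},
    {"trip_id": "T2", "stop_id": "B", "stop_sequence": "2"},
    {"trip_id": "T3", "stop_id": "A", "stop_sequence": "1"},
]


def representative_trips(trips, stop_times):
    """Group trips by ordered stop_ids; keep one sample trip and all shape_ids."""
    by_trip = defaultdict(list)
    for row in stop_times:
        by_trip[row["trip_id"]].append((int(row["stop_sequence"]), row["stop_id"]))

    groups = {}
    for trip in trips:
        key = tuple(stop_id for _, stop_id in sorted(by_trip[trip["trip_id"]]))
        # Assumption: the first trip seen for a stop sequence becomes the sample.
        group = groups.setdefault(
            key, {"sample_trip_id": trip["trip_id"], "shape_ids": set()}
        )
        if trip.get("shape_id"):
            group["shape_ids"].add(trip["shape_id"])
    return groups


groups = representative_trips(trips, stop_times)
```

A group whose shape_ids set holds more than one entry is the case neglected so far: identical stop lists that travel along different paths.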