--- title: "Transfer Tables" author: "Mark Padgham" date: "`r Sys.Date()`" output: html_document: toc: true toc_float: true number_sections: false theme: flatly vignette: > %\VignetteIndexEntry{Transfer Tables} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r DTthread, echo = FALSE} # Necessary for CRAN to avoid CPU / elapsed time ratios being too high nthr <- data.table::setDTthreads (1) ``` GTFS feeds may include a table of possible transfers between stops, and potentially between specific connecting services. There feeds are nevertheless optional, and many feeds omit transfer tables, making them difficult if not impossible to use for routing. The `gtfsrouter` package includes a function, [`gtfs_transfer_table()`](https://UrbanAnalyst.github.io/gtfsrouter/reference/gtfs_transfer_table.html), which calculates and inserts a transfer table into a feed. ## Basic Usage This function can also be demonstrated with the very small feed included with this package, by first running [`berlin_gtfs_to_zip()`](https://UrbanAnalyst.github.io/gtfsrouter/reference/berlin_gtfs_to_zip.html) to create the feed in the temporary directory of the current R session. ```{r berlin_gtfs} library (gtfsrouter) f <- berlin_gtfs_to_zip () gtfs <- extract_gtfs (f, quiet = TRUE) ``` The transfers table of the feed looks like this: ```{r transfers-struct} gtfs$transfers ``` A transfers table is required for successful routing between different services, like with the following code: ```{r route1} gtfs_route ( gtfs, from = "Friedrichstr.", to = "Rosenthaler Platz", start_time = 12 * 3600, day = "Monday" ) ``` That trip requires a transfer, and is possible because the feed has a table specifying the possibility of transferring at the "S+U Gesundbrunnen Bhf" station. Removing the transfers table demonstrates what typically happens when attempting to calculate routes on feeds which lack this information: ```{r no-transfers} gtfs$transfers <- NULL gtfs_route ( gtfs, from = "Friedrichstr.", to = "Rosenthaler Platz", start_time = 12 * 3600, day = "Monday" ) ``` No transfer is able to be make, and so no route is found. [The `gtfs_transfer_table()` function](https://UrbanAnalyst.github.io/gtfsrouter/reference/gtfs_transfer_table.html) automatically calculates and adds a transfer table to a feed, enabling it once again to be used to generate routes: ```{r transfer-table-200} gtfs <- gtfs_transfer_table (gtfs, d_limit = 200) gtfs_route ( gtfs, from = "Friedrichstr.", to = "Rosenthaler Platz", start_time = 12 * 3600, day = "Monday" ) ``` That route is considerably faster than the original one, because it utilises a transfer (at "S+U Alexanderplatz") which is not in the original transfers table. That particular transfer highlights the most important caveat to using [the `gtfs_transfer_table()` function](https://UrbanAnalyst.github.io/gtfsrouter/reference/gtfs_transfer_table.html): that tables may and often will still require some degree of manual checking and adjustment to make them usable, as demonstrated in the following section. ## Extending transfer tables The above example first removed the transfer table by setting `gtfs$transfers <- NULL`. Applying [the `gtfs_transfer_table()` function](https://UrbanAnalyst.github.io/gtfsrouter/reference/gtfs_transfer_table.html) to a feed which already has a transfer table will extend that table by adding any additional transfers within the specified distance ([see below](#transfer-distances)). Any transfers already present in the transfers table will be retained as is, and the table will be extended by adding any new transfers not present in the original table. Transfers are calculated based on distance alone, and do not utilise any information on routes or trips. Some feeds, such as the [example Berlin feed included with this package](https://UrbanAnalyst.github.io/gtfsrouter/reference/berlin_gtfs_to_zip.html), include transfer information between specific routes and services, resulting in a transfers table in which station pairs are repeated numerous times for distinct combinations of routes and services: ```{r transfer-table-size} gtfs <- extract_gtfs (f, quiet = TRUE) # 'f' is the location generated above nrow (gtfs$transfers) length (which (!duplicated (gtfs$transfers [, c ("from_stop_id", "to_stop_id")]))) ``` ```{r transfer-table-size-data, echo = FALSE} n0 <- nrow (gtfs$transfers) n1 <- length (which (!duplicated (gtfs$transfers [, c ("from_stop_id", "to_stop_id")]))) ``` The table only has `r format (n1, big.mark = ",")` unique pairs of `from` and `to` stops, yet details `r format (n0, big.mark = ",")` possibles ways of transferring between these stops. Removing and re-generating these transfers gives the following results: ```{r transfer-table-regen} gtfs$transfers <- NULL gtfs <- gtfs_transfer_table (gtfs) nrow (gtfs$transfers) ``` The [`gtfs_transfer_table()` function](https://UrbanAnalyst.github.io/gtfsrouter/reference/gtfs_transfer_table.html) generates `r nrow (gtfs$transfers) - n1` additional transfers not present in the original feed, but as stated contains no information on transfers between specific routes or trips. Alternatively, using the function to extend the existing table, rather than re-generate it, gives the following result: ```{r transfer-table-regen2} gtfs <- extract_gtfs (f, quiet = TRUE) gtfs <- gtfs_transfer_table (gtfs) nrow (gtfs$transfers) ``` Note that each call to [`gtfs_transfer_table()`](https://UrbanAnalyst.github.io/gtfsrouter/reference/gtfs_transfer_table.html) will only extend the table through adding any transfers not present in the previous table, and that no transfers will be removed. The following code uses the `d_limit` parameter [described below](#transfer-distances) to calculate transfer tables for increasing maximal transfer distances. ```{r transfer-table-extend} gtfs <- extract_gtfs (f, quiet = TRUE) vapply (0:10, function (i) { gtfs <- gtfs_transfer_table (gtfs, d_limit = i * 100) return (nrow (gtfs$transfers)) }, integer (1)) ``` The transfer table increases in size with each call, and retains all transfers generated from all previous calls. Because each call involves extending the transfer table, the table at each stage may be used for routing with transfers out to each specified distance. Subsequently reducing the transfer distance would then not be appropriate, as the table would still retain all transfers at previously-calculated (longer) distances: ```{r transfers-reduce-d_lim} gtfs <- gtfs_transfer_table (gtfs, d_limit = 100) nrow (gtfs$transfers) ``` In this case, the table would first have to be removed (`gtfs$transfers <- NULL`), and the recalculated at the specified distance. ```{r transfers-reduce-d_lim2} gtfs$transfers <- NULL gtfs <- gtfs_transfer_table (gtfs, d_limit = 100) nrow (gtfs$transfers) ``` ## Modifying transfer tables This section demonstrates that transfer tables generated using [the `gtfs_transfer_table()` function](https://UrbanAnalyst.github.io/gtfsrouter/reference/gtfs_transfer_table.html) often need to be manually modified to ensure sensible results. The second route generated above after using [the `gtfs_transfer_table()` function](https://UrbanAnalyst.github.io/gtfsrouter/reference/gtfs_transfer_table.html) includes a transfer at "S+U Alexanderplatz" from a route named "S7" to one named "U8". This train station is in fact one of the largest stations in Berlin, with the "S" routes using elevated platforms above ground, and the "U" denoting underground. The "U8" in particular is quite a long way underground, and while transfer in the indicated time of just under three minutes may be theoretically possible, it would likely require running, and ought not be generally presumed to reflect a viable transfer. Note further that the actual locations of these stations can be extracted with the following lines: ```{r route2} r <- gtfs_route ( gtfs, from = "Friedrichstr.", to = "Rosenthaler Platz", start_time = 12 * 3600, day = "Monday", include_ids = TRUE ) stns <- r$stop_id [3:4] # the transfer station IDs gtfs$stops [match (stns, gtfs$stops$stop_id), ] ``` The distance between them is: ```{r dist} s <- gtfs$stops [match (stns, gtfs$stops$stop_id), ] as.numeric (geodist::geodist (s [1, ], s [2, ])) ``` The transfer table algorithm presumes a pedestrian speed of 4km / hour, which is 1.11 m/s, meaning this distance should be able to be covered in just over 50 seconds. In reality, it would be impossible to run between these two stops in 50 seconds, primarily because of a need to traverse 3 or 4 vertical levels. We now demonstrate how to add more realistic transfer times, by adding additional time penalties for transferring between underground and overground services, to reflect the extra time required to travel up or down between stops. The sample feed included with this package only has "U" and "S" routes. The following code demonstrates how to use this distinction to add additional time penalties to the transfers table. The first step is to find the `trip_id` values of all "S" and "U" services, which is itself a two-step procedure to first extract `route_id` values and then match these to `trip_id` values. Procedures may vary between feeds, but in the case of Berlin, the routes are all identified by their `route_short_name`, as follows: ```{r SU_routes} S_routes <- gtfs$routes$route_id [grep ("^S", gtfs$routes$route_short_name)] U_routes <- gtfs$routes$route_id [grep ("^U", gtfs$routes$route_short_name)] S_trips <- gtfs$trips$trip_id [which (gtfs$trips$route_id %in% S_routes)] U_trips <- gtfs$trips$trip_id [which (gtfs$trips$route_id %in% U_routes)] ``` The following code then uses those `trip_id` values to extract all corresponding stops. ```{r SU_stops} S_stops <- gtfs$stop_times$stop_id [which (gtfs$stop_times$trip_id %in% S_trips)] S_stops <- unique (S_stops) U_stops <- gtfs$stop_times$stop_id [which (gtfs$stop_times$trip_id %in% U_trips)] U_stops <- unique (U_stops) ``` Although in Berlin the stops for "U" and "S" services are strictly separated, this may not be the case for all other feeds. It might, for example, be useful to remove stops which are used by both types of services, leaving stops that can only serve as transfers between different kinds of services. The following two lines suffice for that: ```{r SU_stops2} S_stops <- S_stops [which (!S_stops %in% U_stops)] U_stops <- U_stops [which (!U_stops %in% S_stops)] ``` All we then need to do is to add additional transfers times for any transfers between these two sets of stops, arbitrarily choosing here a value of 2 minutes: ```{r penalty} index <- which ((gtfs$transfers$from_stop_id %in% S_stops & gtfs$transfers$to_stop_id %in% U_stops) | (gtfs$transfers$from_stop_id %in% U_stops & gtfs$transfers$to_stop_id %in% S_stops)) gtfs$transfers$min_transfer_time [index] <- gtfs$transfers$min_transfer_time [index] + 120 ``` Our route query then returns the following: ```{r route3} gtfs_route ( gtfs, from = "Friedrichstr.", to = "Rosenthaler Platz", start_time = 12 * 3600, day = "Monday" ) ``` The route follows exactly the same connections, but now allows a far more realistic time of just over 5 minutes for the transfer at Alexanderplatz. This exemplifies that, while [the `gtfs_transfer_table()` function](https://UrbanAnalyst.github.io/gtfsrouter/reference/gtfs_transfer_table.html) can "automatically" generate transfer tables, these will often still need manual tweaking and adjustment to reflect the unique characteristics of any given system. ## Transfer distances {#transfer-distances} [The `gtfs_transfer_table()` function](https://UrbanAnalyst.github.io/gtfsrouter/reference/gtfs_transfer_table.html) includes an additional parameter, `d_limit`, quantifying the maximum permissible walking distance between transfers, with a default value of 200 metres. Increasing this value generates greater numbers of possible transfers, as the following code illustrates, starting by removing and re-creating the transfer table, followed by adding the additional penalties for transfers between the two types of services, this time constructed as a function for easy re-use: ```{r transfer_penalties_fn} gtfs$transfers <- NULL gtfs <- gtfs_transfer_table (gtfs, d_limit = 500) transfer_penalties <- function (gtfs, penalty = 120) { S_routes <- gtfs$routes$route_id [grep ("^S", gtfs$routes$route_short_name)] U_routes <- gtfs$routes$route_id [grep ("^U", gtfs$routes$route_short_name)] S_trips <- gtfs$trips$trip_id [which (gtfs$trips$route_id %in% S_routes)] U_trips <- gtfs$trips$trip_id [which (gtfs$trips$route_id %in% U_routes)] S_stops <- gtfs$stop_times$stop_id [which (gtfs$stop_times$trip_id %in% S_trips)] S_stops <- unique (S_stops) U_stops <- gtfs$stop_times$stop_id [which (gtfs$stop_times$trip_id %in% U_trips)] U_stops <- unique (U_stops) S_stops <- S_stops [which (!S_stops %in% U_stops)] U_stops <- U_stops [which (!U_stops %in% S_stops)] index <- which ((gtfs$transfers$from_stop_id %in% S_stops & gtfs$transfers$to_stop_id %in% U_stops) | (gtfs$transfers$from_stop_id %in% U_stops & gtfs$transfers$to_stop_id %in% S_stops)) gtfs$transfers$min_transfer_time [index] <- gtfs$transfers$min_transfer_time [index] + 120 return (gtfs) } gtfs <- transfer_penalties (gtfs) ``` Submitting the same routing query now gives the following: ```{r route4} gtfs_route ( gtfs, from = "Friedrichstr.", to = "Rosenthaler Platz", start_time = 12 * 3600, day = "Monday" ) ``` The route departs and arrives at the same time, but now includes an additional walking transfer over a distance which can be calculated by repeating the lines above: ```{r route4b} r <- gtfs_route ( gtfs, from = "Friedrichstr.", to = "Rosenthaler Platz", start_time = 12 * 3600, day = "Monday", include_ids = TRUE ) s <- gtfs$stops [match (r$stop_id [2:3], gtfs$stops$stop_id), ] as.numeric (geodist::geodist (s [1, ], s [2, ])) ``` At a speed of 1.11m/s, this would take just under 6 minutes. With the additional penalty of 2 minutes for transferring from an "S" to a "U" service, this transfer would require just under 8 minutes, which is less than the time given in the route of just over 8.5 minutes. This demonstrates how increasing maximal walking distances increases numbers of possible transfers. The following code shows the sizes of transfer tables as a function of maximal walking distances: ```{r d_limit1} d_limit <- 1:20 * 100 n <- vapply (d_limit, function (i) { gtfs$transfer <- NULL gtfs <- gtfs_transfer_table (gtfs, d_limit = i) nrow (gtfs$transfers) }, integer (1)) d_limit <- d_limit / 1000 # in km n <- n / 1000 # in thousands plot ( d_limit, n, type = "l", col = "red", lwd = 2, xlab = "Maximal transfer distance", ylab = "Number of transfers (1000s)" ) ``` The line is initially flat because this number (around 3,300) is the number of transfers possible between stops which have identical spatial locations (and so a distance of 0.0). Numbers of transfers can never be less than this number. The following section delves further into the effect of this distance limit, though examining the result of [`gtfs_traveltimes()`](https://UrbanAnalyst.github.io/gtfsrouter/reference/gtfs_traveltimes.html) for different distance limits. ## Transfer distances and travel times The preceding result might be sensibly interpreted to reflect something like a plausible upper limit for a transfer, with a distance of 370 metres, plus a vertical transfer between underground and overground services, expected to be covered in under 8.5 minutes. This might -- and indeed likely should -- in turn be considered to demonstrate how this `d_limit` parameter, along with any additional manual adjustments applied to transfer tables, like the `transfer_penalties()` function constructed above, need to be carefully considered with regard to concrete knowledge of a particular system. The `gtfsrouter` package is nevertheless not intended to be used only to generate individual routing queries, but also includes [the `gtfs_traveltimes()` function](https://UrbanAnalyst.github.io/gtfsrouter/reference/gtfs_traveltimes.html) designed to enable efficient estimation of travel times anywhere in a system from any nominated station. The result of this function will obviously be influenced by the transfers table, and in particular by the maximal walking distance used to construct transfer tables with [the `gtfs_transfer_table()` function](https://UrbanAnalyst.github.io/gtfsrouter/reference/gtfs_transfer_table.html). The effect of different distances can also be examined using the internal data set provided with this package. The following function generates travel times for a given `d_limit` a random sample of starting stations: ```{r travel_times_d_limit-fn, message = FALSE} library (dplyr) library (hms) # 'parse_hms' function travel_times_d_limit <- function (gtfs, from, d_limit = 200) { gtfs$transfers <- NULL gtfs <- gtfs_transfer_table (gtfs, d_limit = d_limit) gtfs <- transfer_penalties (gtfs) start_time_limits <- 12:13 * 3600 get_one_times <- function (gtfs, from, start_time_limits) { x <- gtfs_traveltimes ( gtfs, from = from, start_time_limits = start_time_limits, quiet = TRUE ) ret <- NULL if (nrow (x) > 0) { # Convert duration to minutes: dur <- as.integer (parse_hms (x$duration)) / 60 ret <- data.frame ( from = from, to = x$stop_id, duration = dur ) } return (ret) } res <- lapply (from, function (i) { get_one_times (gtfs, i, start_time_limits) }) res <- do.call (rbind, res) %>% group_by (from, to) %>% summarise (duration = mean (duration), .groups = "drop") res$d_limit <- d_limit return (res) } ``` This function returns travel times from all stations listed in `from` to all other (reachable) stations for the specified value of `d_limit`. The following lines then compare these travel times between identical origin and destination stations for a range of distance limits. ```{r plot1, warning = FALSE, message = FALSE} set.seed (10L) from <- sample (gtfs$stops$stop_name, size = 10) d_limit <- 1:10 * 100 x <- lapply (d_limit, function (i) { travel_times_d_limit (gtfs, from = from, d_limit = i) }) xall <- do.call (rbind, x) [, c ("from", "to")] xall <- xall [which (!duplicated (xall)), ] for (i in seq_along (x)) { y <- left_join (xall, x [[i]], by = c ("from", "to")) xall [paste0 ("d_", x [[i]]$d_limit [1])] <- y$duration } index <- which (xall$d_1000 != xall$d_100) # record number of stations for which travel times did not change: nsame <- nrow (xall) - length (index) xall <- xall [index, ] # travel times only: times <- xall %>% select (!c ("from", "to")) # differences with each increase in d_limit: times <- t (apply (times, 1, function (i) { diff (i) })) times <- data.frame (t = as.vector (times)) library (ggplot2) ggplot (times, aes (x = t)) + stat_bin (bins = 30, col = "red", fill = "orange") + scale_y_log10 () + xlab ("time difference (minutes)") ``` And these additional transfer possibilities can make a significant difference to estimated travel times. Note, however, that the travel times only changed for `r length (index)` stations, while the value of `nsame` value recording numbers of stations for which travel times did not change was `r format (nsame, big.mark = ",")`. Travel times may also increase as additional transfers may be used to reduce numbers of connecting services required for a particular trip, and the algorithm generally searches for connections with minimal numbers of transfers. The following code the quantifies the overall effect of these reductions in travel times, measured as the linear reduction in comparison with travel times calculated using the original transfers table provided with the feed. The function also returns the proportion of all stations for which travel times were affected by the modified transfer table. ```{r transfer_difference_fn} transfer_difference <- function (gtfs, nsamples = 10, d_limit = 200, day = "Monday", start_time_limits = 12:13 * 3600) { g1 <- data.table::copy (gtfs) g1 <- gtfs_timetable (g1, day = day) g2 <- data.table::copy (gtfs) g2 <- gtfs_transfer_table (g2, d_limit = d_limit, quiet = TRUE) g2 <- gtfs_timetable (g2, day = day) get1 <- function (gtfs, gtfs2, start_time_limits) { from <- sample (gtfs$stops$stop_name, size = 1) x1 <- gtfs_traveltimes ( gtfs, from = from, start_time_limits = start_time_limits, quiet = TRUE ) x2 <- gtfs_traveltimes ( gtfs2, from = from, start_time_limits = start_time_limits, quiet = TRUE ) if (nrow (x1) == 0L | nrow (x2) == 0L) { return (rep (NA, 2)) } x2 <- data.frame ( stop_id = x2$stop_id, duration2 = x2$duration ) x2 <- dplyr::left_join (x1, x2, by = "stop_id") dat <- data.frame ( x = as.integer (parse_hms (x2$duration)) / 60, y = as.integer (parse_hms (x2$duration2)) / 60 ) mod <- lm (y ~ x + 0, data = dat) return (c ( prop = length (which (dat$x != dat$y)) / nrow (dat), change = as.numeric (mod$coefficients) )) } vapply (seq (nsamples), function (i) { get1 (g1, g2, start_time_limits) }, numeric (2), USE.NAMES = FALSE ) } ``` That function can then be used to loop over a range of values of `d_limit`, and to examine the results with the following code: ```{r transfer_d_limit_results, warning = FALSE} d_limit <- 1:10 * 100 set.seed (1L) d <- lapply (d_limit, function (i) { transfer_difference ( gtfs, nsamples = 20, d_limit = i, day = "Monday", start_time_limits = 12:13 * 3600 ) }) d <- lapply (seq_along (d_limit), function (i) { data.frame ( d_limit = d_limit [i], prop = d [[i]] [1, ], change = d [[i]] [2, ] ) }) d <- do.call (rbind, d) ggplot (d, aes (x = d_limit, y = change)) + geom_point (pch = 19, col = "orange") + geom_smooth (method = "loess", formula = "y ~ x") ``` That result shows that travel times do decrease with increasing limits of maximal transfer distance, although the overall effect is very minimal. The object `d` also records the proportion of station pairs for which modifications to the transfers table actually changed travel times, which in this case amounts to a mean value of `r round (mean (d$prop, na.rm = TRUE), digits = 4)`, clearly revealing that very few trips actually change. ### A more realistic example The following results apply the code given above to the full feed for Stuttgart, Germany, showing what might more typically be expected as the `d_limit` parameter is increased. ```{r stuttgart-results, echo = FALSE, warning = FALSE} d <- list () # 1 d [[1]] <- rbind ( c ( 0.04787234, 0.3024283, 0.2137809, 0.4226190, 0.4782609, 0, 0.1996497, 0.0324826, 0.3122142, 0.1051136, 0.1330166, 0.1755486, 0.3659229, NA, 0.2471751, 0.03344482, 0.3617021, 0.3400000, NA, 0.05964215 ), c ( 0.97835370, 0.9045062, 0.9552470, 0.8157467, 0.8343242, 1, 0.8956478, 0.9857173, 0.9197187, 0.9813956, 0.9488823, 0.9505516, 0.8601199, NA, 0.9367492, 0.99126114, 0.8190608, 0.9064265, NA, 0.97887272 ) ) # 2 d [[2]] <- rbind ( c ( 0.2675343, 0.3167203, 0.4016824, 0.3119881, 0.3680556, 0.09902913, 0.1257367, 0.1349578, 0.1366906, 0.4723442, 0.2439614, 0.2401575, 0.2959184, 0.1376147, 0.4624113, 0.1189189, 0.2706450, 0.1621622, 0.01632653, 0.3847584 ), c ( 0.9333812, 0.8716062, 0.8790074, 0.9153534, 0.8889449, 0.97546764, 0.9747218, 0.9620882, 0.9235575, 0.8427352, 0.8973988, 0.8534331, 0.8745992, 0.9666719, 0.8746626, 0.9395889, 0.9442811, 0.9733787, 0.99658716, 0.8350441 ) ) # 3 d [[3]] <- rbind ( c ( 0.4449339, 0.3257576, 0.08494208, 0.07758621, 0.2345679, 0.009615385, 0.1959877, 0.06282723, 0.05785124, 0.4670588, 0.0562500, 0.1203704, 0.1000000, 0.1720779, 0.4771167, 0.4372331, 0.2102273, 0.1291449, 0.4567901, 0 ), c ( 0.8480413, 0.8881877, 0.97683839, 0.95861325, 0.8839587, 0.990835401, 0.9353503, 0.97107607, 0.97832732, 0.8946960, 0.9568263, 0.9783830, 0.9800815, 0.9538397, 0.8978858, 0.8865221, 0.9256468, 0.9507978, 0.8657352, 1 ) ) # 4 d [[4]] <- rbind ( c ( 0.7241379, 0.2611807, 0.2901961, 0.02531646, 0.5215054, 0.2336735, 0.5795779, 0.1518987, NA, 0.0625000, NA, 0, 0.6513274, 0.4292683, 0.3418573, 0.1452703, 0.2619543, 0.4628450, 0.2417375, 0.3950216 ), c ( 0.7022668, 0.9451780, 0.8701041, 0.98733569, 0.8588725, 0.9451916, 0.8372441, 0.9076009, NA, 0.9842143, NA, 1, 0.7893403, 0.9412012, 0.9203506, 0.9627622, 0.9228610, 0.8665221, 0.9418134, 0.9021016 ) ) # 5 d [[5]] <- rbind ( c ( 0.7239508, 0.8513952, 0.1402439, 0.1448468, 0.4007092, 0.09883721, 0.08634538, 0.2936747, 0.7217235, 0.5807783, 0.7090415, 0.1440678, 0.4746581, 0.2164179, 0.05633803, 0.3898587, NA, 0.1363636, 0.6066116, 0.2217195 ), c ( 0.8041968, 0.7696561, 0.9385228, 0.9466528, 0.8734857, 0.97151782, 0.97918464, 0.9076568, 0.7783634, 0.8431704, 0.7343155, 0.9412296, 0.8566643, 0.9716889, 0.99403550, 0.9124251, NA, 0.9785596, 0.8332798, 0.9380071 ) ) # 6 d [[6]] <- rbind ( c ( 0.4354430, 0.6561666, 0.2995392, NA, 0.3553922, 0.3404826, 0.2711864, 0.2535433, 0.4971098, 0.6451231, 0.117378, 0, 0.6751592, 0.4408353, 0.6392199, 0.5826598, NA, 0.1823770, 0.2105263, 0.1016043 ), c ( 0.7925198, 0.8567067, 0.9134449, NA, 0.8825668, 0.9020107, 0.9247888, 0.9654961, 0.8274002, 0.7975106, 0.979065, 1, 0.7964435, 0.8632960, 0.8738289, 0.7831721, NA, 0.9436972, 0.9274683, 0.9744991 ) ) # 7 d [[7]] <- rbind ( c ( 0.5275288, 0.8809990, 0.2042254, 0.2562101, 0, 0.3214286, 0.3479730, 0.2478032, 0.3501229, 0.5036298, 0.0560000, 0.7117819, 0.4114833, 0.2748092, 0.1865342, 0.2213115, 0.1674208, 0.3247863, 0.01388889, 0.2974013 ), c ( 0.8974057, 0.7428764, 0.9503437, 0.9217992, 1, 0.9228010, 0.9036412, 0.9161946, 0.9596850, 0.8494607, 0.9632148, 0.7661997, 0.8901800, 0.9080806, 0.9574235, 0.9429245, 0.9445154, 0.9383823, 0.99440270, 0.9226305 ) ) # 8 d [[8]] <- rbind ( c ( 0.4239291, 0.6470588, 0.4225146, 0.2475728, 0.4358491, 0.4682081, 0.8519306, 0.5047619, 0.03007519, 0.4723404, 0.1925676, 0.2028470, 0.5584677, 0.2478386, 0.06122449, 0, 0.1219512, 0.6993865, 0.3742455, NA ), c ( 0.9030931, 0.8842219, 0.8946512, 0.9000109, 0.8482130, 0.9018473, 0.7631321, 0.6506485, 0.98723649, 0.8749593, 0.9249907, 0.9497207, 0.8826570, 0.9193413, 0.97003127, 1, 0.8942574, 0.8027865, 0.8630442, NA ) ) # 9 d [[9]] <- rbind ( c ( 0.1160991, 0.4269663, 0.2256858, 0.3259604, 0.7434613, 0.3766376, 0.3615819, 0.5671642, 0.1721311, 0.7014749, 0.1000000, 0.07798165, 0.1622276, 0.4407583, 0.3865188, 0.2470309, 0.3603175, 0.1516588, 0.3102310, 0.4602133 ), c ( 0.9666432, 0.7528710, 0.9193014, 0.9227199, 0.7834500, 0.8515732, 0.8863982, 0.8002182, 0.9434899, 0.7971016, 0.9594046, 0.98316892, 0.9457987, 0.8288464, 0.9053315, 0.8986293, 0.8698507, 0.9590871, 0.8651403, 0.8851207 ) ) # 10 d [[10]] <- rbind ( c ( 0.3747811, NA, 0.4325744, NA, 0.7301301, 0.06694561, 0.2693267, 0.2651163, 0.5702840, 0.5865947, 0.6845638, 0.1111111, 0.5206897, NA, 0.2322275, 0.2739726, 0.03968254, 0.1763367, 0.9158513, 0.3980100 ), c ( 0.8318046, NA, 0.8894870, NA, 0.8242255, 0.97770335, 0.8996235, 0.8986981, 0.8131951, 0.8357680, 0.8225032, 0.8502140, 0.8353131, NA, 0.9366721, 0.8817567, 0.99278373, 0.9559705, 0.6732793, 0.8348557 ) ) d <- lapply (seq_along (d_limit), function (i) { data.frame ( d_limit = d_limit [i], prop = d [[i]] [1, ], change = d [[i]] [2, ] ) }) d <- do.call (rbind, d) ggplot (d, aes (x = d_limit, y = change)) + geom_point (pch = 19, col = "orange") + geom_smooth (method = "loess", formula = "y ~ x") ``` The decrease in travel times in this case approaches 15% as transfers are implemented with walks of up to 1km, although values are highly variable depending on starting stations. The equivalent proportion of stops for which these transfers actually change travel times is shown in the following graph: ```{r stuttgart-plot2, warning = FALSE} ggplot (d, aes (x = d_limit, y = prop)) + geom_point (pch = 19, col = "orange") + geom_smooth (method = "loess", formula = "y ~ x") ``` While travel times to most stops still remain unaffected, the proportion affected does increase to a large minority of around 30-40% as transfer distances approach 1km. ```{r DTthread-reset, echo = FALSE} data.table::setDTthreads (nthr) ```