vlarmet / cpprouting Goto Github PK

Algorithms for Routing and Solving the Traffic Assignment Problem

R 24.77% C++ 75.17% C 0.07%

r dijkstra-algorithm isochrones shortest-paths parallel-computing rcpp c-plus-plus distance bidirectional-a-star-algorithm contraction-hierarchies

cpprouting's Introduction

Support Our Open Source Project

Creating and maintaining an open source project takes significant time and effort. If you find value in our work and would like to see it continue to grow and improve, please consider making a donation. Your support helps us cover hosting costs, development tools, and allows us to dedicate more time to enhancing this project. Every contribution, no matter how small, makes a big difference. Thank you for supporting open source!

cppRouting v3 : Algorithms for Routing and Solving the Traffic Assignment Problem

Vincent LARMET November 24, 2022

Package presentation
Installation
- Stable version from CRAN
- or from github
What are we talking about ?
Readme Data
- Set the number of threads used by cppRouting
- Instantiate the graph
Main functions
Routing
Network simplification
Traffic assignment
- All-or-Nothing (AON)
  - Choosing the best routing algorithm
- User Equilibrium (UE)
Algorithm compatibility
Parallel implementation
Applications
Benchmark with other R packages
Citation

Package presentation

cppRouting is an R package which provide routing algorithms (shortest paths/distances, isochrones) and traffic assignment solvers on non-negative weighted graphs.
cppRouting is characterized by :

its ability to work on large road graphs (country/continent scale)
its large choice of one-to-one shortest path algorithms
its implementation of contraction hierarchies and associated routing algorithms
its large choice of algorithms for solving the traffic assignment problem (TAP)
its high performance through memory usage and parallel programming

cppRouting is therefore particularly adapted for geographer, or whoever who need to calculate accessibility indicators at large scale.
All algorithms are written in C++ and mainly use containers from the Standard Template Library (STL).
This package have been made with Rcpp and RcppParallel packages.

Installation

Stable version from CRAN

install.packages("cppRouting")

or from github

library(remotes)
remotes::install_github("vlarmet/cppRouting")

What are we talking about ?

cppRouting implements algorithms belonging to graph theory, so let’s define what a graph is.
A graph is commonly used to represent a network, which is composed of vertices connected by edges.

In cppRouting, an edge has at least three attributes : vertice’s ID from which it start, vertice’s ID from which it end and a weight (length, flow, travel time …).

Readme Data

**This README file and all time measurements were made on a Windows 10 computer, with 11th generation i5 (6 cores) processor and 32GB of memory. **
The data presented here is the official french road network describing over 500000 km of roads.
All data used in this README are free and can be downloaded here :

roads : http://professionnels.ign.fr/route500
general practitioners location : https://www.insee.fr/fr/statistiques/3568614?sommaire=3568656#consulter
maternity wards location : https://www.insee.fr/fr/statistiques/3568611?sommaire=3568656#dictionnaire
shapefile of the ~36000 communes in France : http://professionnels.ign.fr/adminexpress
commuting to work from the French national census : https://www.insee.fr/fr/statistiques/3566477#consulter

Graph data have been preprocessed for more readability (see data_preparation.R).

The final graph is composed of 234615 nodes and 685118 edges.
Data has to be a 3 columns data.frame or matrix containing from, to and a cost/distance column. Here the cost is the time needed to travel in each edges (in minutes). From and to are vertices IDs (character or numeric).

library(cppRouting)
library(dplyr)
library(sf)
library(ggplot2)
library(concaveman)
library(ggmap)
library(tmap)
library(microbenchmark)
library(reshape2)
library(kableExtra)

#Reading french road data
roads  <-  read.csv("roads.csv",colClasses = c("character","character","numeric"))
#Shapefile data of communes (polygons)
com  <-  read_sf("com_simplified_geom.shp")
#Correspondance file between communes and nodes in the graph (nearest node to each commune centroid)
ndcom  <-  read.csv("node_commune.csv",colClasses = c("character","character","numeric"))
#General practitioners locations
med  <-  read.csv("doctor.csv",colClasses = c("character","numeric","character","numeric"))
#Import materinty ward locations
maternity  <-  read.csv("maternity.csv",colClasses = c("character","numeric"))
#Commuting data from national census
load("commuting.Rds")
#Import nodes coordinates (projected in EPSG : 2154)
coord  <-  read.csv("coordinates.csv",colClasses = c("character","numeric","numeric"))

Head of road network data

head(roads)

##   from     to    weight
## 1    0 224073 0.4028571
## 2    1  65036 3.5280000
## 3    2 173723 1.8480000
## 4    3      2 2.5440000
## 5    4 113129 4.9680000
## 6    5      4 1.6680000

Head of coordinates data

head(coord)

##   ID        X       Y
## 1  0 805442.8 6458384
## 2  1 552065.9 6790520
## 3  2 556840.2 6790475
## 4  3 554883.7 6790020
## 5  4 548345.2 6791000
## 6  5 547141.3 6790434

Set the number of threads used by `cppRouting`

RcppParallel::setThreadOptions(numThreads = 1)

Instantiate the graph

#Instantiate a graph with coordinates
graph  <-  makegraph(roads, directed = T, coords = coord)

Graph object have some useful attributes for the user :

graph$nbnode : total number of vertices,
graph$dict$ref : vertices ids.

Other attributes are internals data and have no interest for the user. All graph attributes should never be modified by the user.

Main functions

cppRouting package provide these functions :

get_distance_matrix : compute distance matrix (between all combinations origin-destination nodes - one-to-many),
get_distance_pair : compute distances between origin and destination by pair (one-to-one),
get_path_pair : compute shortest paths between origin and destination by pair (one-to-one),
get_multi_paths : compute shortest paths between all origin nodes and all destination nodes (one-to-many),
get_isochrone : compute isochrones/isodistances with one or multiple breaks.
get_detour : return nodes that are reachable within a fixed additional cost around shortest paths. This function can be useful in producing accessibility indicators.
cpp_simplify : remove non-intersection nodes, duplicated edges and isolated loops in the graph. Graph topology is preserved so distance calculation is faster and remains true. This function can be applied to very large graphs (several millions of nodes).
cpp_contract : contract the graph by applying contraction hierarchies algorithm.
get_aon : given an origin-destination matrix, compute All-or-Nothing assignment.
assign_traffic : given an origin-destination matrix, estimate the traffic flows on the network.

Routing

As the package name suggest, cppRouting is initially aimed to provide efficient algorithms for finding shortest paths.

Algorithms

Path algorithms proposed by the package are :

1 uni-directional Dijkstra algorithm,
2 bi-directional Dijkstra algorithm,
3 uni-directional A* algorithm
4 New bi-directional A* algorithm (Piljs & Post, 2009 : see http://repub.eur.nl/pub/16100/ei2009-10.pdf)
5 one-to-one bi-directional Dijkstra adapted to contraction hierarchies (Geisberger & al., 2008)
6 many-to-many bi-directional Dijkstra adapted to contraction hierarchies (Geisberger & al., 2008)
7 PHAST algorithm (Hardware-accelerated shortest path trees), one-to-all algorithm adapted to contraction hierarchies (Delling & al., 2011)

1, 2, 3 and 4 are available for one-to-one calculation in get_distance_pair and get_path_pair functions on a non-contracted graph. In these functions, uni-directional Dijkstra algorithm is stopped when the destination node is reached.
A* and NBA* are relevant if geographic coordinates of all nodes are provided. Note that coordinates should be expressed in a projection system.
To be accurate and efficient, A* and NBA* algorithms should use an admissible heuristic function (here the Euclidean distance), i.e cost and heuristic function must be expressed in the same unit.
In cppRouting, heuristic function h for a node (n) is defined such that :
h(n,d) = ED(n,d) / k with h the heuristic, ED the Euclidean distance, d the destination node and a constant k.
So in the case where coordinates are expressed in meters and cost is expressed in time, k is the maximum speed allowed on the road. By default, constant is 1 and is designed for graphs with cost expressed in the same unit than coordinates (for example in meters).
If coordinates cannot be provided, bi-directional Dijkstra algorithm is the best option in terms of performance.

5 is used for one-to-one calculation in get_distance_pair and get_path_pair functions on a contracted graph.

1 is used for one-to-many calculation in get_distance_matrix function on a non-contracted graph.

6 and 7 are available for one-to-many calculation in get_distance_matrix function on a contracted graph.

Let’s compare different path algorithms in terms of performance.
For A* and NBA algorithms, coordinates are defined in meters and max speed is 110km/h; so for the heuristic function to be admissible, we have to convert meters to minutes by setting constant to 110/0.06 :

#Generate 2000 random origin and destination nodes
origin  <-  sample(graph$dict$ref,  2000)
destination  <-  sample(graph$dict$ref,  2000)
microbenchmark(dijkstra=pair_dijkstra  <-  get_distance_pair(graph, origin, destination, algorithm = "Dijkstra"), 
               bidir=pair_bidijkstra  <-  get_distance_pair(graph, origin, destination, algorithm = "bi"), 
               astar=pair_astar  <-  get_distance_pair(graph, origin, destination, algorithm = "A*", constant = 110/0.06), 
               nba=pair_nba  <-  get_distance_pair(graph, origin, destination, algorithm = "NBA", constant = 110/0.06), 
               times=1)

## Unit: seconds
##      expr       min        lq      mean    median        uq       max neval
##  dijkstra 28.648378 28.648378 28.648378 28.648378 28.648378 28.648378     1
##     bidir 22.463780 22.463780 22.463780 22.463780 22.463780 22.463780     1
##     astar 19.207150 19.207150 19.207150 19.207150 19.207150 19.207150     1
##       nba  9.976939  9.976939  9.976939  9.976939  9.976939  9.976939     1

Output

head(cbind(pair_dijkstra,pair_bidijkstra,pair_astar,pair_nba))

##      pair_dijkstra pair_bidijkstra pair_astar pair_nba
## [1,]      258.4616        258.4616   258.4616 258.4616
## [2,]      464.5499        464.5499   464.5499 464.5499
## [3,]      294.5504        294.5504   294.5504 294.5504
## [4,]      302.7190        302.7190   302.7190 302.7190
## [5,]      120.1393        120.1393   120.1393 120.1393
## [6,]      578.0233        578.0233   578.0233 578.0233

So, how to choose the algorithm ? It’s simple, the faster, the better. If coordinates are provided, go for NBA, else go for bidirectional Dijkstra. Uni-directional Dijkstra and A* algorithms should be used if main memory is (almost) full because they require only one graph instead of two (forward and backward).

Compute isochrones

An isochrone is a set of nodes reachable from a node within a fixed limit.
Let’s compute isochrones around Dijon city

#Compute isochrones
iso  <-  get_isochrone(graph, from = "205793", lim = c(15, 25, 45, 60, 90, 120))
#Convert nodes to concave polygons with concaveman package
poly  <-  lapply(iso[[1]], function(x){
  x  <-  data.frame(noeuds=x, stringsAsFactors = F)
  x  <-  left_join(x, coord, by=c("noeuds"="ID"))
  return(concaveman(summarise(st_as_sf(x, coords=c("X", "Y"), crs=2154))))
})

poly  <-  do.call(rbind, poly)
poly$time  <-  as.factor(names(iso[[1]]))
#Multipolygon
poly2  <-  st_cast(poly, "MULTIPOLYGON")
poly2$time  <-  reorder(poly2$time, c(120, 90, 60, 45, 25, 15))
#Reproject for plotting
poly2  <-  st_transform(poly2, "+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs")
#Import map backgroung

dijon   <-   get_stamenmap(bbox = c(left = 1.708, 
                             bottom = 45.126, 
                             right = 8.003, 
                             top = 49.232),  maptype = "toner-2010",  zoom = 7)
#Plot the map
p  <-  ggmap(dijon)+
  geom_sf(data=poly2, aes(fill=time), alpha=.8, inherit.aes = FALSE)+
  scale_fill_brewer(palette = "YlOrRd")+
  labs(fill="Minutes")+
  ggtitle("Isochrones around Dijon")+
  theme(axis.text.x = element_blank(), 
        axis.text.y = element_blank(), 
        axis.ticks = element_blank(), 
        axis.title.y=element_blank(), axis.title.x=element_blank())
p

Compute possible detours within a fixed additional cost

get_detour function returns all reachable nodes within a fixed detour time around the shortest path between origin and destination nodes. Returned nodes (n) meet the following condition :
SP(o,n) + SP(n,d) < SP(o,d) + t
with SP shortest distance/time, o the origin node, d the destination node and t the extra cost.
The algorithm used is a slightly modified bidirectional Dijkstra.
Let’s see an example for the path between Dijon and Lyon city :

#Compute shortest path
trajet <- get_path_pair(graph,from="205793",to="212490")

#Compute shortest path
distance <- get_distance_pair(graph,from="205793",to="212490")

#Compute detour time of 25 and 45 minutes
det25 <- get_detour(graph,from="205793",to="212490",extra=25)
det45 <- get_detour(graph,from="205793",to="212490",extra=45)

#Create sf object of nodes
pts <- st_as_sf(coord,coords=c("X","Y"),crs=2154)
pts <- st_transform(pts,crs=4326)
pts$time <- ifelse(pts$ID %in% unlist(det45),"45","0")
pts$time <- ifelse(pts$ID %in% unlist(det25),"25",pts$time)
pts$time <- ifelse(pts$ID %in% unlist(trajet),"Shortest Path",pts$time)
pts$time <- factor(pts$time,levels = c("25","45","Shortest Path","0"))

#Plot
dijon   <-   get_stamenmap(bbox = c(left = 3.2, 
                             bottom = 45.126, 
                             right = 6.5, 
                             top = 47.8),  maptype = "toner-2010",  zoom = 8)

p <- ggmap(dijon)+
  geom_sf(data=pts[pts$time!="0",],aes(color=time),inherit.aes = FALSE)+
  ggtitle(paste0("Detours around Dijon-lyon path - ",round(distance,digits = 2)," minutes"))+
  labs(color="Minutes")+
  theme(axis.text.x = element_blank(),
        axis.text.y = element_blank(),
        axis.ticks = element_blank(),
        axis.title.y=element_blank(),axis.title.x=element_blank())
p

Contraction hierarchies

Contraction hierarchies is a speed-up technique for finding shortest path on a network. It was proposed by Geisberger & al.(2008).
Initially created for one-to-one queries, it has been extended to many-to-many and distance matrix calculation.
This technique is composed of two phases:

preprocessing phase called contraction with cpp_contract function
query phase : a slightly modified version of bidirectional search for one-to-one query, available in get_distance_pair and get_path_pair; PHAST algorithm and a many-to-many algorithm using buckets available in get_distance_matrix function.

Contraction phase consists of iteratively removing a vertex v from the graph and creating a shortcut for each pair (u,w) of v’s neighborhood if the shortest path from u to w contains v. To be efficient and avoid creating too much shortcuts, vertices have to be ordered according to several heuristics. The two heuristics used by cppRouting are :

edge difference (number of shortcuts potentially created by removing v - number of incoming edges - number of outcoming edges)
deleted neighbors (number of already contracted neighbors)

The nodes are initially ordered using only edge difference, then importance of v is lazily updated during contraction phase with the combined two heuristics. To see more detailed explanations, see these ressources :

#Contraction of input graph
graph3 <- cpp_contract(graph, silent=TRUE)

#Calculate distances on the contracted graph
system.time(
  pair_ch <- get_distance_pair(graph3, origin, destination)
)

## utilisateur     système      écoulé 
##        0.27        0.01        0.28

Compare outputs

summary(pair_ch-pair_dijkstra)

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
##       0       0       0       0       0       0      38

Performance comparison

Distance by pair

Here are the measurements of contraction time and query time (in second) of contraction hierarchies on different graphs :

					Number of queries
Graph	Vertices	Edges	preprocessing	algorithm	1000	2000	5000	10000	100000	1000000
ROUTE 500	234615	685118	14	ch	0.22	0.25	0.44	0.64	4.60	45.86
ROUTE 500	234615	685118	14	bi	11.00	22.00	55.00	110.00	1100.00	11000.00
ROUTE 500	234615	685118	14	nba	5.05	10.10	25.25	50.50	505.00	5050.00
OSM France	4559270	10389741	154	ch	4.08	3.92	4.66	5.11	18.20	147.05
OSM France	4559270	10389741	154	bi	309.40	618.80	1547.00	3094.00	30940.00	309400.00
OSM France	4559270	10389741	154	nba	144.80	289.60	724.00	1448.00	14480.00	144800.00
OSM Europe	16210743	36890020	536	ch	13.44	13.61	14.50	15.91	35.14	227.22
OSM Europe	16210743	36890020	536	bi	1159.20	2318.40	5796.00	11592.00	115920.00	1159200.00
OSM Europe	16210743	36890020	536	nba	630.60	1261.20	3153.00	6306.00	63060.00	630600.00
ch : bidirectional search on the contracted graph
bi : bidirectional search on the original graph
nba : new bidirectional A\* on the original graph

Here are the plots (in log-log) of query time improvement factor of one to one CH algorithm compared to bidirectional Dijkstra and NBA :

As we can see on the plot, the larger is the graph, the higher is the benefit of using contraction hierarchies. For OSM Europe, query time can be faster by a factor of 5000 compared to bidirectional Dijkstra and 2500 to NBA.

Distance matrix

Here are the measurements of query time (in second) of contraction hierarchies on different graphs.
We compare PHAST and many to many CH to Dijkstra algorithm on square matrix (i.e the sets of source and target nodes are of equal length).

		\|S\|=\|T\|
Graph	algorithm	1000	2000	5000	10000
ROUTE 500	mch	0.64	1.31	5.19	19.31
ROUTE 500	phast	4.21	8.42	21.05	42.10
ROUTE 500	Dijkstra	27.45	54.90	137.25	274.50
OSM France	mch	11.15	18.66	46.13	102.18
OSM France	phast	100.50	201.00	502.50	1005.00
OSM France	Dijkstra	781.50	1563.00	3907.50	7815.00
OSM Europe	mch	39.17	64.50	144.39	291.28
OSM Europe	phast	460.00	920.00	2300.00	4600.00
OSM Europe	Dijkstra	2886.40	5772.80	14432.00	28864.00
mch : many-to-many bidirectional search on the contracted graph
phast : phast algorithm on the contracted graph
Dijkstra : Dijkstra search on the original graph
\\|S\\| : number of source nodes
\\|T\\| : number of target nodes

Here are the plots (in log-log) of query time improvement factor of PHAST and many to many CH compared to Dijkstra algorithm :

Benefits are less important than one-to-one queries but still interesting. For OSM Europe, query time can be faster by a factor of *100.
PHAST’s improvement is constant since it iteratively perform an *one-to-all* search, just like original Dijkstra.
many to many CH is well adapted for **square matrix**.

Here are the plots of query time of PHAST and many to many CH on assymetric matrix (i.e. number of source and number of target are unequal) with |S| / |T| the number of sources divided by the number of targets :

Note that this plot is the same for |T| / |S|.
PHAST algorithm is much faster for rectangular matrix. The rate |S| / |T| where many to many CH is better varies according the graph size. For example, if we have to calculate a distance matrix between 10000 sources and 10 targets (or 10 sources and 10000 targets) on OSM France, we must use PHAST. On the other hand, if we want a matrix of 10000 sources and 8000 targets, we use many to many CH algorithm.

Work with dual weighted network

Sometimes it can be useful to sum up additional weights along the shortest path. For a use-case example, let’s say we want to compute the distance along the shortest time path or the time needed to travel the shortest distance path. It is now possible to set an auxiliary set of edge weights during graph construction in makegraph() function and set aggregate_aux to TRUE in get_distance_* functions.
Let’s see an example where we would like to compute the number of edges within each shortest path :

# The weight to be minimized is set in 'df' argument. Additional weights are set in "aux"
# We set auxiliary weights to 1 in order to count number of edge in shortest paths
dgr <- makegraph(df = roads, directed = TRUE, coords = coord, aux = 1) 

# Compute number of edge
hops <- get_distance_pair(Graph = dgr, from = origin, to = destination, aggregate_aux = TRUE)

# plot
dfp <- data.frame(n_edges = hops, travel_time = pair_nba)
p <- ggplot(dfp, aes(x = travel_time, y = n_edges))+
  geom_point()+
  labs(x = "Travel time (min)", y = "Number of edge")+
  theme_bw()
p

Note that this functionality work for contracted graphs as well.

Network simplification

cpp_simplify’s internal function performs two major steps :

removing non-intersection nodes between two intersection nodes then calculate cost of the new edges,
removing duplicate edges that are potentially created in the first step.

In order to remove maximum number of nodes, some iterations are needed until only intersection nodes are remained.

Let’s see a small example :

library(igraph)
#Create directed graph
edges<-data.frame(from=c("a","b","c","d",
                         "d","e","e","e",
                         "f","f","g","h","h","h",
                         "i","j","k","k","k",
                         "l","l","l","m","m","m",
                         "n","n","o","p","q","r"),
                  to=c("b","c","d","e","k","f","d",
                       "h","g","e","f","e","i","k",
                       "j","i","h","d","l","k",
                       "m","n","n","o","l","l","m","m",
                       "r","p","q"),
                  dist=rep(1,31))

#Plotting with igraph
par(mfrow=c(1,2),mar=c(3,0,3,0))

igr1<-graph_from_data_frame(edges)
set.seed(2)
plot.igraph(igr1,edge.arrow.size=.3,main="Original graph")
box(col="black")

#Instantiate cppRouting graph, then simplify without iterations
graph_ex<-makegraph(edges,directed = TRUE)
simp<-cpp_simplify(graph_ex,rm_loop = FALSE)
#Convert graph to df
edges2<-to_df(simp)

#Plotting simplified graph
igr2<-graph_from_data_frame(edges2)
set.seed(20)
plot(igr2,edge.arrow.size=.3,edge.label=E(igr2)$dist,main="One iteration - keeping loop")
box(col="black")

Here, junction nodes are e, h, d, k, l, i and m. So b, c, f and n have been contracted in the first step of the function. By contracting n, an edge with cost of 2 has been created between m and l nodes.
The second step of the function has removed this edge which is greater than the original one (i.e 1), and the whole process now need a second iteration to remove m and l that aren’t intersection nodes anymore.
Let’s try with iterate argument set to TRUE :

par(mfrow=c(1,2),mar=c(3,0,3,0))
#Simplify with iterations
simp2<-cpp_simplify(graph_ex,rm_loop = FALSE,iterate = TRUE)
edges3<-to_df(simp2)
igr3<-graph_from_data_frame(edges3)
set.seed(2)
plot(igr3,edge.arrow.size=.3,edge.label=E(igr3)$dist,main="Second iteration - keeping loop")
box(col="black")

#The same but removing loops
simp3<-cpp_simplify(graph_ex,rm_loop = TRUE,iterate = TRUE)
edges4<-to_df(simp3)

igr4<-graph_from_data_frame(edges4)
set.seed(2)
plot(igr4,edge.arrow.size=.3,edge.label=E(igr4)$dist,main="Second iteration - removing loop")
box(col="black")

French road network simplification

#Simplify original graph by keeping nodes of interest
graph2<-cpp_simplify(graph,
                     iterate = TRUE,
                     keep = unique(c(origin,destination))) #we don't want to remove origin and destination nodes

#Running NBA*
system.time(
  pair_nba_2<-get_distance_pair(graph2,origin,destination,algorithm = "NBA",constant = 110/0.06)
)

## utilisateur     système      écoulé 
##        8.12        0.01        8.14

Compare outputs

summary(pair_nba-pair_nba_2)

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
##       0       0       0       0       0       0      38

Running time

Here are running times in second on graphs of different sizes :

data presented here,
OpenStreetMap french road network from Geofabrik,
an assembly of several european countries from OSM (France, Italy, Spain, Germany, Portugal, Switzerland, Belgium and Netherlands).

OpenStreetMap data have been extracted with osm2po tool, which I highly recommend.

Network	Nodes	Edges	Runtime without iterations	Removed nodes in first iteration	Runtime with iterations	Number of iteration	Total removed nodes	Removed nodes percentage
README data	234,615	685,118	0.47	39,737	0.51	4	41,843	17.83
OSM France	4,559,270	10,389,741	6.73	818,096	11.62	9	842,252	18.47
OSM ‘Europe’	16,210,743	36,890,020	27.94	3,365,240	46.47	11	3,465,724	21.38

Traffic assignment

Traffic assignment models are used to estimate the traffic flows on a network. It take as input origin-destinations matrix describing volume between each OD pair.
For this part, we will use standard networks massively used in traffic modelling studies : Siouxfalls and Chicago networks (source).
Let’s import Siouxfalls. It is composed of 24 nodes and 76 links.

net <- read.csv("siouxfalls_net.csv")
head(net)

##   Init.node Term.node  Capacity Length Free.Flow.Time    B Power Speed.limit
## 1         1         2 25900.201      6              6 0.15     4           0
## 2         1         3 23403.473      4              4 0.15     4           0
## 3         2         1 25900.201      6              6 0.15     4           0
## 4         2         6  4958.181      5              5 0.15     4           0
## 5         3         1 23403.473      4              4 0.15     4           0
## 6         3         4 17110.524      4              4 0.15     4           0
##   Toll Type
## 1    0    1
## 2    0    1
## 3    0    1
## 4    0    1
## 5    0    1
## 6    0    1

We also have OD matrix as well :

trips <- read.csv("siouxfalls_trips.csv")
head(trips)

##   from to demand
## 1    1  2    100
## 2    1  3    100
## 3    1  4    500
## 4    1  5    200
## 5    1  6    300
## 6    1  7    500

All-or-Nothing (AON)

All-or-Nothing assignment (AON) is the most simplistic (and fastest) method to load flow on a network, since it assume there is no congestion effects. The assignment algorithm itself is the procedure that loads the origin-destination matrix to the shortest path trees and produces the flows.
In cppRouting, OD matrix is represented as 3 vectors of equal length :

from : origin node,
to : destination node,
demand : volume.

Let’s load flows on the network using get_aon() function :

sgr <- makegraph(df = net[,c("Init.node", "Term.node", "Free.Flow.Time")], directed = TRUE) 

# Compute AON assignment using OD trips
aon <- get_aon(Graph = sgr, from = trips$from, to = trips$to, demand = trips$demand)
head(aon)

##   from to cost  flow
## 1    1  2    6  3800
## 2    1  3    4  6000
## 3    2  1    6  3800
## 4    2  6    5  6600
## 5    3  1    4  6000
## 6    3  4    4 10200

g <- graph_from_data_frame(aon[,c("from", "to", "flow")], directed = TRUE)
c_scale <- colorRamp(c('#60AF66', 'yellow', 'red'))
maxflow <- max(c(aon$flow))

E(g)$color = apply(c_scale(aon$flow/maxflow), 1, function(x) rgb(x[1]/255,x[2]/255,x[3]/255) )
set.seed(5)
plot(g, edge.width = net$Capacity/2000, edge.arrow.size = 0, main = "AON assignment")
box(col = "black")

get_aon work also for contracted networks.

Choosing the best routing algorithm

Computing time is directly linked to shortest paths calculations. The user has the choice between two kind of routing algorithms, depending the sparsity of OD matrix. By default, algorithm argument is set to bidirectional Dijkstra. Just like get_*_pair functions, it run length(from) times routing algorithm for finding each OD pair’s shortest path. If OD matrix is dense, recursive one-to-many methods like Dijkstra algorithm would be preferred.
Given an origin node, bidirectional Dijkstra et NBA* algorithms are on average 2 and 5 times faster than Dijkstra algorithm, respectively.

This rule-of-thumb apply also when the network have been contracted using cpp_contract function. phast algorithm is for matrix-like shortest paths calculation, and bi is pairwise. In that case, bi algorithm is 200 times faster than phast for a given node.

For resuming :

matrix-like calculation : optimal for dense matrix, run one-to-many search for min(length(unique(from)), length(unique(to))) times,
pairwise calculation : optimal for highly sparse matrix, run one-to-one search for length(from) times.

User Equilibrium (UE)

The term “User Equilibrium” is used to describe a route choice assumption formally proposed by Wardrop :
“The journey times on all the routes actually used are equal and less than those which would be experienced by a single vehicle on any unused routes”.
Note that this principle follows directly from the assumptions that drivers choose minimum time paths, and are well-informed about network conditions.

Unlike AON assignment, this more realistic way to assign flows on a network take into account congestion effect. In this paradigm, the cost of a given link is dependent of the flow on it.

As an example, let’s assume 3000 commuters going from one node to another connected by two links. The User equilibrium is illustrated on this figure (source):

The relation between cost and flow is called volume decay function and is written as :

$t = t_{0}.(1+\alpha.(\frac{v}{c})^{\beta})$
with the cost, $t_{0}$ the free-flow cost, the flow (or volume), the link capacity and $\alpha, \beta$ unitless parameters.

Traffic Assignment Problem (TAP) is a convex optimization problem solved by iterative algorithms. The relative gap is the metric to minimize and is written as :

$gap=|(\frac{TSTT}{SPTT}) - 1|$

Total System Travel Time is written as $TSTT = \sum_{x \in E} v_{x}.t_{x}(v_{x})$
Shortest Path Travel Time is written as $SPTT = \sum_{x \in E} \hat{v}_{x}.t_{x}(v_{x})$
With the set of edge in the network, the volume or flow, the cost, $\hat{v}$ the flow estimated by All-or-Nothing assignment.

Link-based algorithms

These methods use the descent direction given by AON assignment at each iteration. All links are updated simultaneously using descent direction and a step size parameter $\theta$ .

Link-based algorithms implemented in cppRouting are, in increasing order of complexity :

Method of Successive Average (msa) : $\theta$ is defined as $\frac{1}{it}$ with the actual number of iteration.
Frank-Wolfe (fw) : $\theta$ is computed by minimizing the Beckmann function with bisection method.
Conjugate Frank-Wolfe (cfw) : $\theta$ is computed by minimizing the Beckmann function with bisection method. Descent direction is computed using AON assignment and direction at .
Bi-Conjugate Frank-Wolfe (bfw) : $\theta$ is computed by minimizing the Beckmann function with bisection method. Descent direction is computed using AON assignment and directions at and .

By going down through that list, we slightly increase computing time in each iteration but we will need less iterations to reach a given gap.
Let’s equilibrate traffic within our small network, first we need to construct the network with important parameters : capacity, alpha and beta.

For these algorithms, 99% of computation time is done within AON assignment.

sgr <- makegraph(df = net[,c("Init.node", "Term.node", "Free.Flow.Time")], 
                 directed = TRUE,
                 capacity = net$Capacity,
                 alpha = net$B,
                 beta = net$Power)


traffic <- assign_traffic(Graph = sgr,  from = trips$from, to = trips$to, demand = trips$demand, 
                          max_gap = 1e-6, algorithm = "bfw", verbose = FALSE)
  
traffic$gap

## [1] 0.0000009230227

Returned data contains the equilibrated network with the following edge attributes :

ftt : free-travel time i.e. the initial cost set during graph construction,
cost : actual travel time at equilibrium,
flow : equilibrated flow.

head(traffic$data)

##   from to ftt     cost      flow  capacity alpha beta
## 1    1  2   6 6.000779  4442.890 25900.201  0.15    4
## 2    1  3   4 4.008658  8111.525 23403.473  0.15    4
## 3    2  1   6 6.000829  4511.531 25900.201  0.15    4
## 4    2  6   5 6.545270  5940.297  4958.181  0.15    4
## 5    3  1   4 4.008369  8042.884 23403.473  0.15    4
## 6    3  4   4 4.262114 13910.673 17110.524  0.15    4

Now, we can plot the result with line size varying with road capacity and the color with flow.

par(mfrow=c(1,2),mar=c(3,0,3,0))
g <- graph_from_data_frame(aon[,c("from", "to", "flow")], directed = TRUE)
c_scale <- colorRamp(c('#60AF66', 'yellow', 'red'))
maxflow <- max(c(aon$flow, traffic$data$flow))

#Applying the color scale to edge weights.
#rgb method is to convert colors to a character vector.
E(g)$color = apply(c_scale(aon$flow/maxflow), 1, function(x) rgb(x[1]/255,x[2]/255,x[3]/255) )
set.seed(5)
plot(g, edge.width = traffic$data$capacity/2000, edge.arrow.size = 0, main = "AON assignment")
box(col = "black")


g <- graph_from_data_frame(traffic$data[,c("from", "to", "flow")], directed = TRUE)
E(g)$color = apply(c_scale(traffic$data$flow/maxflow), 1, function(x) rgb(x[1]/255,x[2]/255,x[3]/255) )
set.seed(5)
plot(g, edge.width = traffic$data$capacity/2000, edge.arrow.size = 0, main = "Flows at equilibrium")
box(col = "black")

Bush-based algorithms

cppRouting also propose a bush-based algorithm called Algorithm B from R.B. Dial (2006).
The problem is decomposed into sub-problems, corresponding to each origin of the OD matrix, that operate on acyclic sub-networks of the original transportation network, called bushes. Link flows are shifted from the longest path to the shortest path recursively within each bush using Newton method. Unlike link-based algorithm, Algorithm B can achieve very precise solution by minimizing relative gap down to 1e-16.

The main steps of the procedure are :

Initialization

bushes initialization : for each unique origin node of the OD matrix, an acyclic sub-network is created by computing shortest path tree from root node to all other nodes.

For each bush :

Topological ordering of the nodes
for each node, shortest path and longest path is computed and stored.
given an OD matrix, flows are loaded on each bush using shortest path labels.

Iteration
For each bush :

shortest and longest paths are updated
bush optimization : the bush is updated by removing unused links (no flow on it) and adding links leading to shorter paths. Network costs are updated in the same time.
bush equilibration : the flow is shifted from the shortest to the longest path using Newton method.

Evaluation
We compute AON assignment and relative gap.

For detailed explanations of Algorithm B, please see this course (part 1, part 2).

Important note : computation time for algorithm-B is depending of the number of origin node AND AON assignment.

Algorithm B is used by setting dial to algorithm argument :

traffic <- assign_traffic(Graph = sgr,  from = trips$from, to = trips$to, demand = trips$demand, 
                          max_gap = 1e-6, algorithm = "dial", verbose = TRUE)

## Bushes initialization...
## Iterating...
## iteration 1 : 0.05019
## iteration 2 : 0.0152742
## iteration 3 : 0.00363109
## iteration 4 : 0.000592523
## iteration 5 : 0.000106401
## iteration 6 : 3.00525e-05
## iteration 7 : 3.5758e-05
## iteration 8 : 1.21886e-05
## iteration 9 : 6.67853e-06
## iteration 10 : 1.659e-06
## iteration 11 : 2.34866e-07

traffic$gap

## [1] 0.000000234866

Performance comparison

Now, let’s measure performance of Traffic assignment algorithms on a larger road network. We use Chicago network, composed of 12982 nodes and 39018 edges.

net <- read.csv("chicagoregional_net.csv")
head(net)

##   from    to capacity length ftime    B power speed toll type
## 1    1 10293   100000   0.45     0 0.15     4    25    0    3
## 2    2 10294   100000   0.50     0 0.15     4    25    0    3
## 3    3 10295   100000   0.39     0 0.15     4    25    0    3
## 4    4 10296   100000   0.58     0 0.15     4    25    0    3
## 5    5 10297   100000   0.50     0 0.15     4    25    0    3
## 6    6 10298   100000   0.58     0 0.15     4    25    0    3

OD matrix contains 2,297,945 trips.

trips <- read.csv("chicagoregional_trips.csv")
head(trips)

##   from to demand
## 1    1  1  28.94
## 2    1  2  16.58
## 3    1  3   2.01
## 4    1  4  13.99
## 5    1  5   4.90
## 6    1  6   5.10

Since All-or-Nothing algorithm will be called multiple times, we have to choose the fastest AON assignment algorithm.

# Construct graph with link attributes
sgr <- makegraph(df = net[,c("from", "to", "ftime")], directed = TRUE,
                 capacity = net$capacity, alpha = net$B, beta = net$power)

# benchmark using all cores. We don't test NBA* because we don't have node coordinates
RcppParallel::setThreadOptions(numThreads = parallel::detectCores())

microbenchmark(
  pairwise = pw <- get_aon(Graph = sgr, from = trips$from, to = trips$to, demand = trips$demand, algorithm = "bi"),
  matrixlike = ml <- get_aon(Graph = sgr, from = trips$from, to = trips$to, demand = trips$demand, algorithm = "d"),
  times = 1
)

## Unit: seconds
##        expr       min        lq      mean    median        uq       max neval
##    pairwise 75.414786 75.414786 75.414786 75.414786 75.414786 75.414786     1
##  matrixlike  1.411425  1.411425  1.411425  1.411425  1.411425  1.411425     1

By setting matrix-like AON calculation, we speed up computation time by a factor of 50 for link-based methods.

sgr <- makegraph(df = net[,c("from", "to", "ftime")], directed = TRUE,
                 capacity = net$capacity, alpha = net$B, beta = net$power)

# we set 5 minutes for each algorithm (about 600 iterations for link-based, and 20 for algorithm-b)
## capture console output for plotting
outs <- list()
times <- list()
for (i in c("msa", "fw", "cfw", "bfw", "dial")){
  max_it <- 600
  if (i == "dial"){
    max_it <- 20
  }
  
  stdout <- vector(mode = "character")
  con    <- textConnection('stdout', 'wr', local = TRUE)
  sink(con)
  st <- system.time(
  traffic <- assign_traffic(Graph = sgr,  from = trips$from, to = trips$to, demand = trips$demand, 
                          max_gap = 1e-6, algorithm = i, aon_method = "d", verbose = TRUE, max_it = max_it)
  )
  sink()
  outs[[i]] <- stdout
  times[[i]] <- st
  rm(stdout)
}

# Plotting gap vs time

dfp <- list()
for (i in c("msa", "fw", "cfw", "bfw", "dial")){
  out <- outs[[i]]
  out <- out[grepl("iteration", out)]
  out <- sapply(strsplit(out, " : "), function(x) as.numeric(x[2]))
  
  df <- data.frame(gap = out, time = 1:length(out)  * (times[[i]][1]/length(out)), algo = i)
  dfp[[i]] <- df
}
dfp <- do.call(rbind, dfp)

options(scipen=999)
p <- ggplot(dfp, aes(x = time, y = gap, colour = algo))+
  geom_line(size = 1)+
  scale_y_log10()+
  theme_bw()+
  labs(x = "Time(second)", y = "Relative gap (log10 scale)", colour = "Algorithm")+
  scale_color_hue(labels = c("Bi-Conjugate FW", "Conjugate FW", "Algorithm B", "Frank-Wole", "MSA"))
  
p

Algorithm B could achieve very small gap within an reasonable amount of time, while link-based methods seems to converge much more slowly.
For achieving highly precise solutions, algorithm B should be implemented. However, algorithm B generally use more memory than link-based methods, depending network size and OD matrix. On the other hand, link-based method may have a higher benefit through parallel computing since computation time is mainly due to AON assignment.

Note : network can be contracted “on-the-fly” at each iteration to speed-up AON calculation, by setting cphast or cbi in aon_method argument.

Algorithm compatibility

Here is a table summarizing cppRouting functions compatibility with network nature (contracted, simplified, normal).

Functions	contracted	simplified
assign_traffic	no	yes
get_aon	yes	yes
get_detour	no	yes
get_distance_matrix - one weight	yes	yes
get_distance_matrix - two weights	yes	no
get_distance_pair - one weight	yes	yes
get_distance_pair - two weights	yes	no
get_isochrone	no	yes
get_multi_paths	no	yes
get_path_pair	yes	yes

Parallel implementation

Now a table summarizing cppRouting functions compatibility with parallel computing.

Functions	multithreaded
makegraph	no
cpp_contract	no
cpp_simplify	no
assign_traffic - link-based	yes
assign_traffic - bush-based	partly
get_aon	yes
get_detour	yes
get_distance_matrix	yes
get_distance_pair	yes
get_isochrone	yes
get_multi_paths	yes
get_path_pair	yes

Applications

Except application 4, all indicators are calculated at the country scale but for the limited R’s ability to plot large shapefile, only one region is mapped.

Application 1 : Calculate Two Step Floating Catchment Areas (2SFCA) of general practitioners in France

2SFCA method is explained here : https://en.wikipedia.org/wiki/Two-step_floating_catchment_area_method

First step
Isochrones are calculated with the cppRouting function get_isochrone

#Isochrone around doctor locations with time limit of 15 minutes
iso<-get_isochrone(graph,
                   from = ndcom[ndcom$com %in% med$CODGEO,"id_noeud"],
                   lim = 15,
                   keep=ndcom$id_noeud, #return only commune nodes
                   long = TRUE) #data.frame output
#Joining and summing population located in each isochrone
iso<-left_join(iso,ndcom[,c("id_noeud","POPULATION")],by=c("node"="id_noeud"))
df<-iso %>% group_by(origin) %>%
  summarise(pop=sum(POPULATION))
#Joining number of doctors 
df<-left_join(df,med[,c("id_noeud","NB_D201")],by=c("origin"="id_noeud"))
#Calculate ratios
df$ratio<-df$NB_D201/df$pop

Second step

#Isochrone around each commune with time limit of 15 minutes (few seconds to compute)
iso2<-get_isochrone(graph,
                    from=ndcom$id_noeud,
                    lim = 15,
                    keep=ndcom$id_noeud,
                    long=TRUE)
#Joining and summing ratios calculated in first step
df2<-left_join(iso2,df[,c("origin","ratio")],by=c("node"="origin"))
df2<-df2 %>% group_by(origin) %>%
  summarise(sfca=sum(ratio,na.rm=T))

Plot the map for Bourgogne-Franche-Comte region

#Joining commune IDs to nodes
df2<-left_join(df2,ndcom[,c("id_noeud","com")],by=c("origin"="id_noeud"))
#Joining 2SFCA to shapefile
com<-left_join(com,df2[,c("com","sfca")],by=c("INSEE_COM"="com"))
#Plot for one region
p<-tm_shape(com[com$NOM_REG=="BOURGOGNE-FRANCHE-COMTE",]) + 
  tm_fill(col = "sfca",style="cont")+
  tm_layout(main.title="2SFCA applied to general practitioners",legend.outside=TRUE)
p

Application 2 : Calculate the minimum travel time to the closest maternity ward in France

Shortest travel time matrix
The shortest travel time is computed with the cppRouting function get_distance_matrix. In order to compute multiple distances from one source, original uni-directional Dijkstra algorithm is ran without early stopping.
We compute travel time from all commune nodes to all maternity ward nodes (i.e ~36000*400 distances).

#Distance matrix on contracted graph
dists<-get_distance_matrix(graph3,
                           from=ndcom$id_noeud,
                           to=ndcom$id_noeud[ndcom$com %in% maternity$CODGEO],
                           algorithm = "phast")#because of the rectangular shape of the matrix
#We extract each minimum travel time for all the communes
dists2<-data.frame(node=ndcom$id_noeud,mindist=apply(dists,1,min,na.rm=T))
#Joining commune IDs to nodes
dists2<-left_join(dists2,ndcom[,c("id_noeud","com")],by=c("node"="id_noeud"))
#Joining minimum travel time to the shapefile
com<-left_join(com,dists2[,c("com","mindist")],by=c("INSEE_COM"="com"))

Plot the map of minimum travel time in Bourgogne-Franche-Comte region

p<-tm_shape(com[com$NOM_REG=="BOURGOGNE-FRANCHE-COMTE",]) + 
  tm_fill(col = "mindist",style="cont",palette="Greens",title="Minutes")+
  tm_layout(main.title="Travel time to the closest maternity ward",legend.outside=T)
p

Application 3 : Calculate average commuting time to go to job in France

Commuting data from national census is composed of 968794 unique pairs of origin - destination locations (home municipality, job municipality). Using other R packages like igraph or dodgr, we would have to calculate the whole matrix between all communes (36000 x 36000). We would end with a 10Gb matrix whereas we only need 0.075% of the result (968794/(36000 x 36000)).
So this example illustrate the advantage of calculating distance by pair.

#Merge node to communes
ndcom$id_noeud<-as.character(ndcom$id_noeud)
cmt$node1<-ndcom$id_noeud[match(cmt$CODGEO,ndcom$com)]
cmt$node2<-ndcom$id_noeud[match(cmt$DCLT,ndcom$com)]
cmt<-cmt[!is.na(cmt$node1) & !is.na(cmt$node2),]

#Calculate distance for each pair using contracted graph
dist<-get_distance_pair(graph3,from=cmt$node1,to=cmt$node2)

#Mean weighted by the flow of each pair
traveltime<-cbind(cmt,dist)
traveltime<-traveltime %>% group_by(CODGEO) %>% summarise(time=weighted.mean(dist,flux,na.rm=T))

Plot the map of average travel time of Bourgogne-Franche-Comte inhabitants

#Merge to shapefile
com<-left_join(com,traveltime,by=c("INSEE_COM"="CODGEO"))

p<-tm_shape(com[com$NOM_REG=="BOURGOGNE-FRANCHE-COMTE",]) +
  tm_fill(col = "time",style="quantile",n=5,title="Minutes",palette="Reds")+
  tm_layout(main.title="Average travel time",legend.outside=TRUE)
p

Application 4 : Calculate the flow of people crossing each municipality in the context of commuting in Bourgogne-Franche-Comte region

First, we must determine in which commune each node is located.

#Convert nodes coordinates to spatial points
pts<-st_as_sf(coord,coords = c("X","Y"),crs=2154)
st_crs(com)<-st_crs(pts)

## Warning: st_crs<- : replacing crs does not reproject data; use st_transform for
## that

#Spatial join commune ID to each node
spjoin<-st_join(pts,com[,"INSEE_COM"])
st_geometry(spjoin)<-NULL
spjoin$ID<-as.character(spjoin$ID)
#Vector of commune IDs of Bourgogne-Franche-Comte region
comBourg<-com$INSEE_COM[com$NOM_REG=="BOURGOGNE-FRANCHE-COMTE"]
#Calculate shortest paths for all commuters on the contracted graph
shortPath<-get_path_pair(graph3,
                         from=cmt$node1,
                         to=cmt$node2,
                         keep = spjoin$ID[spjoin$INSEE_COM %in% comBourg],#return only the nodes in Bourgogne Franche-Comte because of memory usage
                         long = TRUE)#return a long data.frame instead of a list

#Joining commune ID to each traveled node
shortPath<-left_join(shortPath,spjoin,by=c("node"="ID"))

If a commuter crosses multiple nodes in a municipality, we count it only once.

#Remove duplicated 
shortPath<-shortPath[!duplicated(shortPath[,c("from","to","INSEE_COM")]),]

#Joining flow per each commuter
shortPath<-left_join(shortPath,
                     cmt[,c("node1","node2","flux")],
                     by=c("from"="node1","to"="node2"))

#Aggregating flows by each commune
shortPath<-shortPath %>% group_by(INSEE_COM) %>% summarise(flow=sum(flux))

**Plot the flow of people crossing Bourgogne-Franche-Comte’s communes **

#Merge to shapefile
com<-left_join(com,shortPath,by="INSEE_COM")

p<-tm_shape(com[com$NOM_REG=="BOURGOGNE-FRANCHE-COMTE",]) + 
  tm_fill(col = "flow",style="quantile",n=10,palette="Blues")+
  tm_layout(main.title="Commuters flow",legend.outside=TRUE)
p

Benchmark with other R packages

To show the efficiency of cppRouting, we can make some benchmarking with the famous R package igraph, and the dodgr package. Estimation of the User Equilibrium cannot be compared because of the absence of this kind of algorithm within R landscape (to my knowledge).

Distance matrix : one core

library(igraph)
library(dodgr)
#Sampling 1000 random origin/destination nodes (1000000 distances to compute)
set.seed(10)
origin<-sample(unique(roads$from),1000,replace = F)
destination<-sample(unique(roads$from),1000,replace = F)

RcppParallel::setThreadOptions(1)
#igraph graph
graph_igraph<-graph_from_data_frame(roads,directed = TRUE)

#dodgr graph

roads2<-roads
colnames(roads2)[3]<-"dist"

#benchmark
microbenchmark(igraph=test_igraph<-distances(graph_igraph,origin,to=destination,weights = E(graph_igraph)$weight,mode="out"),
               dodgr=test_dodgr<-dodgr_dists(graph=roads2,from=origin,to=destination),
               cpprouting=test_cpp<-get_distance_matrix(graph,origin,destination),
               contr_mch=test_mch<-get_distance_matrix(graph3,origin,destination,algorithm = "mch"),
               contr_phast=test_phast<-get_distance_matrix(graph3,origin,destination,algorithm = "phast"),
               times=1,
               unit = 's')

## Unit: seconds
##         expr       min        lq      mean    median        uq       max neval
##       igraph 64.924762 64.924762 64.924762 64.924762 64.924762 64.924762     1
##        dodgr 79.129622 79.129622 79.129622 79.129622 79.129622 79.129622     1
##   cpprouting 28.453509 28.453509 28.453509 28.453509 28.453509 28.453509     1
##    contr_mch  0.637912  0.637912  0.637912  0.637912  0.637912  0.637912     1
##  contr_phast  4.360011  4.360011  4.360011  4.360011  4.360011  4.360011     1

Even if we add the preprocessing time (i.e. 14s) to the query time, the whole process of contraction hierarchies is still faster.

Ouput

head(cbind(test_igraph[,1],test_dodgr[,1],test_cpp[,1],test_mch[,1],test_phast[,1]))

##            [,1]     [,2]     [,3]     [,4]     [,5]
## 151604 442.6753 442.6753 442.6753 442.6753 442.6753
## 111317 415.5493 415.6265 415.5493 415.5493 415.5493
## 212122 344.0450 344.0450 344.0450 344.0450 344.0450
## 215162 340.6700 340.6700 340.6700 340.6700 340.6700
## 159720 385.1082 385.1082 385.1082 385.1082 385.1082
## 168762 513.4273 513.5044 513.4273 513.4273 513.4273

All-or-Nothing assignment : sparse OD matrix

#generate random flows
flows <- sample(10:1000, 1000, replace = TRUE)
flows2 <- diag(flows, nrow = 1000, ncol = 1000)

microbenchmark(dodgr=test_dodgr<-dodgr_flows_aggregate(graph = roads2, from = origin, to = destination, flows = flows2, contract = FALSE, norm_sums = FALSE, tol = 0),
               cpprouting=test_cpp <- get_aon(graph, from = origin, to = destination, demand = flows, algorithm = "bi"),
               cpprouting_ch=test_ch <- get_aon(graph3, from = origin, to = destination, demand = flows, algorithm = "bi"),
               times=1,
               unit = 's')

## Unit: seconds
##           expr        min         lq       mean     median         uq
##          dodgr 39.9192510 39.9192510 39.9192510 39.9192510 39.9192510
##     cpprouting 11.6453517 11.6453517 11.6453517 11.6453517 11.6453517
##  cpprouting_ch  0.6071558  0.6071558  0.6071558  0.6071558  0.6071558
##         max neval
##  39.9192510     1
##  11.6453517     1
##   0.6071558     1

All-or-Nothing assignment : dense OD matrix

#generate random flows
flows <- sample(10:1000, 1000000, replace = TRUE)
flows2 <- matrix(flows, nrow = 1000, ncol = 1000)
df_flows <- as.data.frame(flows2)
colnames(df_flows) <- destination
df_flows$from <- origin
df_flows <- reshape2::melt(df_flows, id.vars = "from", variable.name = "to")

microbenchmark(dodgr=test_dodgr<-dodgr_flows_aggregate(graph = roads2, from = origin, to = destination, flows = flows2, contract = FALSE, norm_sums = FALSE, tol = 0),
               cpprouting=test_cpp <- get_aon(graph, from = df_flows$from, to = df_flows$to, demand = df_flows$value, algorithm = "d"),
               cpprouting_ch=test_ch <- get_aon(graph3, from = df_flows$from, to = df_flows$to, demand = df_flows$value, algorithm = "phast"),
               times=1,
               unit = 's')

## Unit: seconds
##           expr      min       lq     mean   median       uq      max neval
##          dodgr 98.68688 98.68688 98.68688 98.68688 98.68688 98.68688     1
##     cpprouting 31.16458 31.16458 31.16458 31.16458 31.16458 31.16458     1
##  cpprouting_ch  5.91544  5.91544  5.91544  5.91544  5.91544  5.91544     1

Citation

Please don’t forget to cite cppRouting package in your work !

citation("cppRouting")

cpprouting's People

Contributors

Stargazers

Watchers

Forkers

imarcello trafficonese phewson burikinc jacobuss kdtiankong zowerlopp mobilyticsin tongjt

cpprouting's Issues

Apache Arrow?

Hi, didn't want to barge in on #12 , so thought I would raise this as a separate issue. Have you considered using an Arrow table to hold your graph object rather than a base R data.frame? In addition to the better memory management, there are other advantages to going the Arrow route, like storing metadata. If this is something you think is worthwhile, I would be happy to help implement at least the R side of it - afraid the C++ side of things is a little bit beyond me at the moment.

I'm coming at this from ecology, where the conversion from a grid to a graph can be prohibitively costly in terms of memory.

Also, a somewhat related question: why did you choose to represent the numeric vertex IDs as character strings? Is there some advantage to this in your C++ code?

This is a great package, btw. Thanks!

Data_preprocessing do not output coordinates.csv file

Hi,
I was able to create roads.scv file from my own network.
however how do I create a coordinates.csv file?

Fantastic library by the way.

Dijkstra algorithm in get_path_pair

Hey,

First off, thank you for this useful and efficient package.

I was using get_path_pair in a simulation, using the Dijkstra algorithm to get the shortest path between an origin and destination. I recently installed this package on a remote computer to run the simulation, and it no longer works as expected. Now, when I run with the same code in the new installed version, my result is this:

1 new("standardGeneric", .Data = function (x) 2 standardGeneric("res"), generic = structure("res", package = "terra"), 3 package = "terra", group = list(), valueClass = character(0), 4 signature = "x", default = NULL, skeleton = (function (x) 5 stop("invalid call in method dispatch to 'res' (no default method)", 6 domain = NA))(x))

and this is the function used to get that result:

spp=get_path_pair(gr,cno,cnd,algorithm="Dijkstra")

I think what's maybe happening is that it isn't assigning anything to res, the returned object, because I'm calling an algorithm name that isn't there anymore in the most recent commit? I noticed there is only 'bi' and 'nbd' now in the function. However, I tried the 'bi' option and I get the same result. Any thoughts on what is going on here?

Thank you,
Kayleigh

Network pre-processing instructions

After conversations in the Geocomputation with R issue tracker, and thanks to your input @vlarmet, I'm keen to give the package a try. However, I have no idea how to get started, beginning with just an OSM dataset, e.g. downloaded and imported into R with osmextract.

Here's a minimal reproducible example:

library(osmextract)
osm_lines = oe_get("Isle of Wight", stringsAsFactors = FALSE, quiet = TRUE)
osm_points = oe_get("Isle of Wight", layer = "points", stringsAsFactors = FALSE, quiet = TRUE)
nrow(osm_lines)
#> [1] 48281
nrow(osm_points)
#> [1] 61329
par(mar = rep(0, 4))
plot(st_geometry(osm_lines), xlim = c(-1.59, -1.1), ylim = c(50.5, 50.8))
plot(st_geometry(osm_points), xlim = c(-1.59, -1.1), ylim = c(50.5, 50.8))

I'm keen to contribute to this package if that would be helpful, could well be useful for my work...

Compiling and Debugging cppRouting's NBA Function for Algorithm Analysis

I'm writing because I have a question about using cppRouting.
When I enter the command trajet <- get_path_pair(graph,from="32",to="6",algorithm="NBA"), I assume that the pathPair::nba() function in path_pair.cpp is called. Is this correct?
I'm trying to study how the algorithm works by adding code to pathPair::nba() in path_pair.cpp to print arguments to a txt file. After modifying path_pair.cpp, I tried running the command trajet <- get_path_pair(graph,from="32",to="6",algorithm="NBA"), but no txt file was created.
I think I need to compile it, but I'm new to R packages (and RStudio as well), so I'm wondering if there's a way to compile it or set a breakpoint to see how the code executes line by line. Thank you.

Enable lon/lat coordinates?

This package is brilliant & I use it all the time (mainly for calculating the shortest distances between locations). This is a feature request - it would be really useful if the get_distance_*() functions in the package could handle longitude and latitude coordinates. I appreciate that you can project lon/lat coordinates, but this can induce distortions which are undesirable & cause issues around the International Date Line. Ideally, it would be possible to supply a data frame with from/to/cost to makegraph(), where cost represents geodesic distances, with lon/lat coords supplied to the coords argument of the same function, and then the package (e.g., get_distance_matrix()) would correctly compute shortest distances between selected nodes. Is this possible?

Uniform params for get_multi_paths

Hi all, I think wouldbe great, just to have an easier way to use the functions, adjust the params of get_multi_paths for the same as the others.

get_multi_paths<-function(Graph, from ,to ,keep=NULL, long=FALSE)

Ideally add the algorithm param, with the one that supports:
get_multi_paths<-function(Graph, from ,to ,keep=NULL, long=FALSE, algorithm = "Dijkstra")

Thx!

Multiple auxiliary edge weights

Thank you for this incredible work !

I would like to know if it could be possible to have simultaneously multiple auxiliary edge weights. It could be passed to the function makegraph as a matrix for example.

This would be very helpfull because we would not have to recalculate several times the same itineraries when we have multiple auxiliary edge weights.

All the best !

Not all shortest paths returned by get_multi_paths

I've noticed some surprising behaviour in get_multi_paths:

If I use an undirected graph:

Represented by adjacency matrix:

library(igraph)

demoGraphMat =
  matrix(c(0, 1, 0, 0, 1, 0,
           1, 0, 1, 0, 1, 0,
           0, 1, 0, 1, 0, 0,
           0, 0, 1, 0, 1, 1,
           1, 1, 0, 1, 0, 0,
           0, 0, 0, 1, 0, 0), 
         nrow = 6)

demoGraph =  as.undirected(graph_from_adjacency_matrix(demoGraphMat)) 
V(demoGraph)$name = LETTERS[1:vcount(demoGraph)]

Then get_multi_paths does not return all of the paths between nodes B and F:

library(cppRouting)

edgeList =
  demoGraph |>
    igraph::as_data_frame(what = 'edges') |>
    dplyr::mutate(cost = 1) 

nodes = unique(c(edgeList$from, edgeList$to))

graph = makegraph(edgeList, directed = FALSE)

allPathsList =
  get_multi_paths(Graph=graph, 
                               from=nodes, 
                               to=nodes, 
                               long = FALSE) 

allPathsList$B

$B
$B$A
[1] "A" "B"

$B$B
character(0)

$B$C
[1] "C" "B"

$B$D
[1] "D" "C" "B"

$B$E
[1] "E" "B"

$B$F
[1] "F" "D" "C" "B"

There are two degenerate paths from B to F: one via C, which is reported, and one via E, which is not.

Is this a bug or have I used the tool incorrectly? Or assumed that it should return all paths when it wont?

About Contraction hierarchies

Hi, I was wondering from a lot of time, I can't get, if the contraction hierarchies is completely safe, just to know, that method alters the actual nodes? or it keeps all the graph nodes?

Example of using osm2po for osm data extraction

I'm using cppRouting for some work I'm doing with openstreetmaps but I can't figure out how you extracted the data using osm2po. So far I'm creating the graph myself by importing the data with osmextract, and then calculating the distances between vertices but this is very slow for large files.

Could you provide an example of how you did this with osm2po?

change direction of get_path_pair output

Hi,

first of all, thanks for the awesome package. Super fast implementations.
I came accross a mileading implementation of the get_path_pair function.
It seems as if the visited nodes are listed backwards. See:

graph<-data.frame(from=c(0,1,2,3,4,2),
                  to=c(1,2,3,0,2,4),
                  weight=c(1,1,1,1,1,1))


cpp_graph<-makegraph(graph,directed=TRUE)
get_path_pair(cpp_graph,from="0",to="3")

output:

$`0_3`
[1] "3" "2" "1" "0"

with graph example from: https://stackoverflow.com/questions/60784024/djikstra-method-using-cpprouting-package-in-r

I think a note in the user manual would be helpful.

Thanks!

Lukas

Performance matrix vs pairs

Hi, this is just a tricky question, I would like to know..., if there is a difference between the matrix function and pairs functions, like the shortest path or the distance one. Is there any difference in the performance? I can always do a `` èxpand.grid``` the nodes from a matrix and do it in the pair function, the actual functions (matrix and pair) are helpers to run in a low level function, or they have indeed code to have better performance.

The only thing.., I can think if there is no performance, maybe, is the cost of store the expanded matrix in pairs, some of them can be very heavy in pairs.

Thx!

Python package !

All cores not working properly

Hi, I has tryied to use the allcores, but then I notice it, is not using all my cores D:

Well, maybe technically is using it, but every core is used half of it..., I don't know how it works, but would be great if the option get the best performance over all cores.

Thx!

get_isochrone - return traveltime

Hi,
would it be possible to implement an option for the get_isochrone() function so that it returns the duration (or cost in general) from origin to node. As of now it only returns the limit.

makegraph - Rcpp_precious_remove error

I'm using v2.0.0 of cppRouting and the example for makegraph gives an error

#Data describing edges of the graph
edges<-data.frame(from_vertex=c(0,0,1,1,2,2,3,4,4), 
                  to_vertex=c(1,3,2,4,4,5,1,3,5), 
                  cost=c(9,2,11,3,5,12,4,1,6))

#Construct directed and undirected graph 
directed_graph<-makegraph(edges,directed=TRUE)

> Error in Remove_duplicate(df[, 1], df[, 2], df[, 3], length(Nodes)) : 
>   function 'Rcpp_precious_remove' not provided by package 'Rcpp'

─ Session info ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
setting  value                       
version  R version 4.1.0 (2021-05-18)
os       macOS Big Sur 11.3.1        
system   aarch64, darwin20           
ui       RStudio                     
language (EN)                        
collate  en_AU.UTF-8                 
ctype    en_AU.UTF-8                 
tz       Australia/Melbourne         
date     2021-08-04                  

─ Packages ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
package      * version date       lib source        
cppRouting   * 2.0     2020-01-07 [1] CRAN (R 4.1.0)
data.table   * 1.14.0  2021-02-21 [1] CRAN (R 4.1.0)
Rcpp           1.0.6   2021-01-15 [1] CRAN (R 4.1.0)
RcppParallel   5.1.4   2021-05-04 [1] CRAN (R 4.1.0)
RcppProgress   0.4.2   2020-02-06 [1] CRAN (R 4.1.0)

I also get an occasional crash / abort of RStudio which suggests there's an index out of range issue somewhere, but I can't track this down*.

This just happend when running the examples for cppRouting:: get_path_pair()

feature suggestions

Hello Vincent,

This is not an issue, as I haven't had any serious ones using cppRouring, it's actually phenomenal how good it is. We are publishing a paper soon in which all the routing was done with it. In short, thank you for your work ! I have some feature suggestions and I would be happy to contribute to them if you can use an extra pair of hands. The first one is being able to rout by giving geographic coordinates of points, instead of node ids. This would be very handy, I currently made my own function for that using h3 indexing, and if the data gets very big I run a nearest feature search in a postgis database. I am sure it can be sped up in R with some cpp code specific to the package. Having a cppRouting function do it as fast as a database in R would be truly great.
The second feature I would be happy to use and contribute to is a 'network voronoi’ function that from a set of input nodes/coordinates returns a list of sets of nodes from the graph that are nearest to them. Let me know if these seem relevant.

Looking forward to your reply,
Best,
Ivann

get_multi_paths fails to return the right value between the same nodes

Hi, I was using the shortest path function and I found something that seems to be a bug, usually the results are catalogued in the same way:

If the shortestpath does not exists because are in different components, return logical(0)
if is requested from the same node to the same node, like from x to x, it returns a vector like c(x)
if is requested from/to different nodes, a vector with their respective nodes

In get_multi_paths if we request node x to node x, we are getting logical(0) instead c(c), this is pretty important in the moment to recognize if the path exists or not.

library(wrapr)

network <- local({
  nodes <- sf::st_sf(
    geom = sf::st_sfc(
      sf::st_point(c(0, 0)),
      sf::st_point(c(0, 1))
    ),
    crs = CITLib::sf_planar_crs
  )

  edges <- sf::st_sf(
    geom = sf::st_sfc(
      sf::st_linestring(c(
        sf::st_point(c(0, 0)),
        sf::st_point(c(0, 1))
      )),
      crs = CITLib::sf_planar_crs
    ),
    from = c(1),
    to = c(2)
  )

  edges$weight <- sf::st_length(sf::st_geometry(edges))

  sfnetworks::sfnetwork(nodes, edges, directed = FALSE)
})

weight <- "weight"

cppRouting_graph <- cppRouting::makegraph(
  network %.>%
    sfnetworks::activate(., "edges") %.>%
    sf::st_as_sf(.) %.>%
    sf::st_drop_geometry(.) %.>%
    as.data.frame(.) %.>%
    dplyr::transmute(., .data$from, .data$to, cost = .data[[weight]]),
  directed = igraph::is_directed(network)
)

ret <- cppRouting::get_multi_paths(Graph = cppRouting_graph, from = c(1), to = c(1))
ret
$`1`
$`1`$`1`
character(0)

Thx!

Unique nodes: makegraph only checks x coordinates, and not y coordinates

Hello and thanks for the great package on network analysis!

I am going through your tutorial with my own data, and stumbled upon the following error:
"Nodes should be unique in the coordinates data frame"

I have checked my data, and I don't think I have duplicate nodes. The function makegraph throws an error if two nodes have identical x coordinates, but does not check if the y coordinates are different.

I would argue that one should replace the following line in the function definition:
sum(duplicated(coords[, 1])) > 0
with:
sum(duplicated(coords[, c(1,2)])) > 0

What do you think? Am I maybe missing something?

warning on some graphs

Hi, I was running some tests in a lib, and I found the next warning which seems to be new:

Warning message:
In cbind(from, to) :
  number of rows of result is not a multiple of vector length (arg 1)

I does not know how/why happens, here a reprex:

library(wrapr)

sfnet2cppRouting <- function(
    network,
    weight) {
  cppRouting_graph <- cppRouting::makegraph(
    network %.>%
      sfnetworks::activate(., "edges") %.>%
      sf::st_as_sf(.) %.>%
      sf::st_drop_geometry(.) %.>%
      as.data.frame(.) %.>%
      dplyr::transmute(., .data$from, .data$to, cost = .data[[weight]]),
    directed = igraph::is_directed(network)
  )
  # }
  original_nodes <- network %.>%
    sfnetworks::activate(., "nodes") %.>%
    sf::st_as_sf(.)
  if (nrow(original_nodes) != cppRouting_graph$nbnode) {
    stop("The net was simplified by cppRouting, the nodes assignation can be broken.")
  }
  cppRouting_graph
}

network <- local({
  file <- "net_flat_5_sfnet-0.6.3.gpkg"

  sfnetworks::sfnetwork(
    sf::st_read(file, layer = "nodes", quiet = TRUE),
    sf::st_read(file, layer = "edges", quiet = TRUE),
    directed = FALSE
  )
})


ret <- cppRouting::get_multi_paths(
  Graph = sfnet2cppRouting(network, "weight"),
  from = c(1, 2, 3),
  to = c(5, 6, 7, 8)
)
Warning message:
In cbind(from, to) :
  number of rows of result is not a multiple of vector length (arg 1)

Network used:
net_flat_5_sfnet-0.6.3.zip

Thx!

R version 4.3.2 (2023-10-31) Platform: x86_64-pc-linux-gnu (64-bit) Running under: Gentoo Linux

Matrix products: default
BLAS: /usr/lib64/libblas.so.3.11.0
LAPACK: /usr/lib64/liblapack.so.3.11.0

locale:
[1] LC_CTYPE=es_CL.utf8 LC_NUMERIC=C
[3] LC_TIME=es_CL.utf8 LC_COLLATE=es_CL.utf8
[5] LC_MONETARY=es_CL.utf8 LC_MESSAGES=es_CL.utf8
[7] LC_PAPER=es_CL.utf8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=es_CL.utf8 LC_IDENTIFICATION=C

time zone: Chile/Continental
tzcode source: system (glibc)

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] wrapr_2.1.0

loaded via a namespace (and not attached):
[1] crayon_1.5.2 vctrs_0.6.5 sfnetworks_0.6.3 cli_3.6.2
[5] rlang_1.1.2 DBI_1.1.3 KernSmooth_2.23-22 purrr_1.0.2
[9] generics_0.1.3 sf_1.0-15 RcppParallel_5.1.7 glue_1.6.2
[13] colorspace_2.1-0 cppRouting_3.1 lwgeom_0.2-13 e1071_1.7-14
[17] scales_1.3.0 fansi_1.0.6 sfheaders_0.4.3 grid_4.3.2
[21] munsell_0.5.0 classInt_0.4-10 tibble_3.2.1 lifecycle_1.0.4
[25] compiler_4.3.2 dplyr_1.1.4 igraph_1.6.0 Rcpp_1.0.11
[29] pkgconfig_2.0.3 tidyr_1.3.0 R6_2.5.1 class_7.3-22
[33] tidyselect_1.2.0 utf8_1.2.4 pillar_1.9.0 magrittr_2.0.3
[37] tidygraph_1.2.3 gtable_0.3.4 tools_4.3.2 proxy_0.4-27
[41] ggplot2_3.4.4 units_0.8-5

Support sfnetworks

Hi all, actually cppRouting has a lot of tools to internally handle some changes like contraction hierarchies.

At the same time, rn sfnetworks is a popular and very useful library to work on, I think would be a good idea have a function to convert sfnetwork graph to cpprouting graph.

I already have a function for it, but would be good to have for it some extra tests and I don't fully understand all the cppRouting graph structure to construct them for every case that may be of interestest.

The function I have:

sfnet2cppRouting <- ? function(
    network = NULL ? geotypes::sfnetworks_sfnetwork(null_ok = FALSE),
    weight = NULL ? typed::Character(1, anyNA = FALSE, null_ok = FALSE)) {
  cppRouting_graph <- cppRouting::makegraph(
    network %.>%
      sfnetworks::activate(., "edges") %.>%
      sf::st_as_sf(.) %.>%
      sf::st_drop_geometry(.) %.>%
      as.data.frame(.) %.>%
      dplyr::transmute(., .data$from, .data$to, cost = .data[[weight]]),
    directed = igraph::is_directed(network)
  )
  # }
  original_nodes <- network %.>%
    sfnetworks::activate(., "nodes") %.>%
    sf::st_as_sf(.)
  if (nrow(original_nodes) != cppRouting_graph$nbnode) {
    stop("The net was simplified by cppRouting, the nodes assignation can be broken.")
  }
  cppRouting_graph
}

Personally, the latest section, the stop is a need, happened, very low times, that sadly I was not able to reproduce, some nodes disappear using makegraph, so it would helps to detect that cases, has been some time from when that happened, maybe is fixed, sadly I was not able to found any issue here to check it.

Thx!

Recommend Projects

React

A declarative, efficient, and flexible JavaScript library for building user interfaces.
Vue.js

🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
Typescript

TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
TensorFlow

An Open Source Machine Learning Framework for Everyone
Django

The Web framework for perfectionists with deadlines.
Laravel

A PHP framework for web artisans
D3

Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

javascript

JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
web

Some thing interesting about web. New door for the world.
server

A server is a program made to process requests and deliver data to clients.
Machine learning

Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Visualization

Some thing interesting about visualization, use data art
Game

Some thing interesting about game, make everyone happy.

Recommend Org

Facebook

We are working to build community through open source technology. NB: members must have two-factor auth.
Microsoft

Open source projects and samples from Microsoft.
Google

Google ❤️ Open Source for everyone.
Alibaba

Alibaba Open Source for everyone
D3

Data-Driven Documents codes.
Tencent

China tencent open source team.

vlarmet / cpprouting Goto Github PK

cpprouting's Introduction

Support Our Open Source Project

cppRouting v3 : Algorithms for Routing and Solving the Traffic Assignment Problem

Package presentation

Installation

Stable version from CRAN

or from github

What are we talking about ?

Readme Data

Head of road network data

Head of coordinates data

Set the number of threads used by cppRouting

Instantiate the graph

Main functions

Routing

Algorithms

Output

Compute isochrones

Compute possible detours within a fixed additional cost

Contraction hierarchies

Compare outputs

Performance comparison

Distance by pair

Distance matrix

Work with dual weighted network

Network simplification

French road network simplification

Compare outputs

Running time

Traffic assignment

All-or-Nothing (AON)

Choosing the best routing algorithm

User Equilibrium (UE)

Link-based algorithms

Bush-based algorithms

Performance comparison

Algorithm compatibility

Parallel implementation

Applications

Application 1 : Calculate Two Step Floating Catchment Areas (2SFCA) of general practitioners in France

Application 2 : Calculate the minimum travel time to the closest maternity ward in France

Application 3 : Calculate average commuting time to go to job in France

Application 4 : Calculate the flow of people crossing each municipality in the context of commuting in Bourgogne-Franche-Comte region

Benchmark with other R packages

Citation

cpprouting's People

Contributors

Stargazers

Watchers

Forkers

cpprouting's Issues

Recommend Projects

Recommend Topics

Recommend Org

Set the number of threads used by `cppRouting`