Nice. Very lightweight compared to proper local routers like GraphHopper, OSRM, etc., which can be overkill for simple tasks. Although the 'routing' here is nx.shortest_path, which is just Dijkstra, so pretty slow compared to other easy-to-implement routing algorithms (even just bidirectional Dijkstra or A*... although contraction hierarchies would be a huge gain here, since edge weights are fixed). Also, not sure why the README describes it as an approximation? Dijkstra is guaranteed to return the lowest-cost path. Maybe it's an approximation because of the free-flow assumption, or because the NAR dataset is incomplete?
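For reference, A* needs nothing beyond stock networkx, just an admissible heuristic. A minimal sketch, assuming nodes carry lat/lon attributes and edge weights are road lengths in meters (G, src, dst and the attribute names are illustrative):

```python
import math
import networkx as nx

EARTH_R = 6_371_000  # meters

def haversine(u, v):
    # great-circle distance between two nodes, in meters; it never overestimates
    # the remaining road distance, so A* still returns the optimal path
    lat1, lon1 = math.radians(G.nodes[u]["lat"]), math.radians(G.nodes[u]["lon"])
    lat2, lon2 = math.radians(G.nodes[v]["lat"]), math.radians(G.nodes[v]["lon"])
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_R * math.asin(math.sqrt(h))

path = nx.astar_path(G, src, dst, heuristic=haversine, weight="length")
```

If the weights are travel times instead, divide the heuristic by the network's max speed to keep it admissible.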
Thanks for the heads-up on the available optimizations. The “Approximations” comment does not apply to the shortest-path calculation, but rather to the distance and upper-bound time estimates. It is a consequence of enabling routing for points that don't exist as nodes (closest-node approximation).
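(For anyone curious, closest-node snapping is typically just a nearest-neighbour query over the node coordinates; a sketch, with illustrative names:)

```python
import numpy as np
from scipy.spatial import cKDTree

node_ids = list(G.nodes)
coords = np.array([(G.nodes[n]["lat"], G.nodes[n]["lon"]) for n in node_ids])
tree = cKDTree(coords)  # projecting to a planar CRS first makes this metrically exact

def snap(lat, lon):
    _, i = tree.query((lat, lon))  # index of the nearest stored coordinate
    return node_ids[i]
```

The leftover leg between the query point and the snapped node is where the distance/time error comes from.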
> Although the 'routing' here is nx.shortest_path, which is just Dijkstra, so pretty slow compared to other easy to implement routing algorithms
networkx has the advantages of being popular, well documented, and pure Python (less hassle to maintain), with code that is easy to read and modify. but one big downside of being pure Python is fundamentally poor performance: it can't use a CPU efficiently, and the way the graphs are represented means it can't use memory, memory bandwidth, or cache efficiently either.
orthogonally to switching the search algorithm, one quick way to potentially get a large speedup is to try swapping networkx out for rustworkx (or any other graph library with Python bindings and native implementations of its data structures and graph algorithms). sketch below.
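a minimal sketch of what that swap looks like (toy graph; rustworkx also ships a networkx_converter helper if you already hold a networkx graph):

```python
import rustworkx as rx

g = rx.PyGraph()
# add_node returns an integer index; payloads keep the original ids
idx = {name: g.add_node(name) for name in ("a", "b", "c", "d")}
for u, v, w in [("a", "b", 1.5), ("b", "c", 2.0), ("a", "d", 1.0), ("d", "c", 1.2)]:
    g.add_edge(idx[u], idx[v], w)  # edge payload is the weight itself

# the search runs in compiled Rust; weight_fn pulls a float out of each payload
paths = rx.dijkstra_shortest_paths(g, idx["a"], target=idx["c"], weight_fn=float)
print([g[i] for i in paths[idx["c"]]])  # ['a', 'd', 'c']
```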
another thing to check would be to avoid storing auxiliary node/edge attributes in the graph that aren't necessary during search, so that cache and memory bandwidth can be focused on node indices and edge weights.
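e.g. search over a slim weight-only copy and keep the attribute-rich graph around for lookups afterwards (a sketch; "travel_time" is a stand-in for whatever the real weight attribute is):

```python
import networkx as nx

# only the one attribute Dijkstra reads survives into the search graph
slim = nx.DiGraph()
slim.add_weighted_edges_from(
    (u, v, data["travel_time"]) for u, v, data in G.edges(data=True)
)
path = nx.shortest_path(slim, src, dst, weight="weight")
```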
I went down a rabbit hole playing around with this some years ago (using Cython, not Rust). Relatively simple things like "store the graph in an array-oriented way (CSC/CSR sparse matrix format or similar)" and "eliminate all memory allocation and pure Python code from the Dijkstra search, replacing it with simple C code using indices into preallocated arrays" get you pretty far.

Further gains are possible by reviewing the search code to avoid unnecessary branches, investigating variants of the priority queue that maintains partial paths ordered by path distance (I found switching the heap from a binary tree to a 4-ary tree gave a 30% reduction in running time), and seeing whether the graph's nodes can be reindexed so that nodes with similar indices are spatially close and more likely to be in cache together (another ~30% reduction from Hilbert curve ordering). Some of this is quite problem- and data-dependent and not necessarily a good tradeoff for other graphs.

All up, I got around a 30x speedup over baseline networkx for Dijkstra searches computing path distances to all nodes from a few source nodes, on a street network graph with 3.6M nodes and 3.8M edges (big enough not to fit in L3 cache on the CPU I was experimenting with).
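for a rough sense of the array-oriented starting point without writing any Cython, scipy already ships a C Dijkstra over CSR matrices (toy graph below; the heap variant and node-reordering tricks go beyond what scipy exposes and need your own search loop):

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import dijkstra

# edges as parallel arrays: rows[i] -> cols[i] with weight weights[i]
rows = np.array([0, 0, 1, 3])
cols = np.array([1, 3, 2, 2])
weights = np.array([1.5, 1.0, 2.0, 1.2])

graph = csr_matrix((weights, (rows, cols)), shape=(4, 4))

# distances from node 0 to every node, computed entirely in compiled code
dist, pred = dijkstra(graph, directed=True, indices=0, return_predecessors=True)
print(dist)  # [0.  1.5 2.2 1. ]
```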