googlepolylines's Introduction

googlePolylines

A fast and light-weight implementation of Google's polyline encoding algorithm.

Polyline encoding is a lossy compression algorithm that allows you to store a series of coordinates as a single string.
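As a rough illustration of where the loss comes from, here is a base-R sketch of the steps in Google's published algorithm. It is not the package's C++ implementation, and `encode_value()` and `encode_polyline_sketch()` are hypothetical names used only for this example. Each coordinate is scaled by 1e5 and rounded (the lossy step), stored as a difference from the previous point, and written out five bits at a time as printable ASCII.

```r
# Encode one signed integer delta as polyline characters
encode_value <- function(delta) {
  # Fold the sign into the lowest bit (same result as "shift left, invert if negative")
  v <- if (delta < 0) -2 * delta - 1 else 2 * delta
  chars <- character(0)
  # Emit 5 bits at a time, lowest first; adding 32 flags "more chunks follow"
  while (v >= 32) {
    chars <- c(chars, intToUtf8(v %% 32 + 32 + 63))
    v <- v %/% 32
  }
  paste(c(chars, intToUtf8(v + 63)), collapse = "")
}

encode_polyline_sketch <- function(lat, lon) {
  # Rounding to 5 decimal places is the lossy step
  lat_i <- round(lat * 1e5)
  lon_i <- round(lon * 1e5)
  # Store each point as the difference from the previous one (the first from 0)
  d_lat <- diff(c(0, lat_i))
  d_lon <- diff(c(0, lon_i))
  # Interleave latitude and longitude deltas: lat1, lon1, lat2, lon2, ...
  deltas <- as.vector(rbind(d_lat, d_lon))
  paste(vapply(deltas, encode_value, character(1)), collapse = "")
}

# The worked example from Google's algorithm description
encode_polyline_sketch(
  lat = c(38.5, 40.7, 43.252),
  lon = c(-120.2, -120.95, -126.453)
)
# [1] "_p~iF~ps|U_ulLnnqC_mqNvxq`@"
```

Three coordinate pairs collapse into a 27-character string, and any digits beyond the fifth decimal place are gone after the `round()` call.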

Installation

From CRAN

install.packages("googlePolylines")

From github (dev version)

remotes::install_github("SymbolixAU/googlePolylines")

Scope

Because googlePolylines uses Google's polyline encoding algorithm, all functions assume Google Web Mercator projection (WGS 84 / EPSG:3857 / EPSG:900913) for inputs and outputs. Objects that use other projections should be re-projected into EPSG:3857 before using these functions.

googlePolylines supports Simple Feature objects (from library(sf)), data.frames, and vectors of lon/lat coordinates.

Supported sf types

  • POINT
  • MULTIPOINT
  • LINESTRING
  • MULTILINESTRING
  • POLYGON
  • MULTIPOLYGON
  • GEOMETRY

Examples

googlePolylines contains functions to encode coordinates into polylines, and also to parse polylines to and from well-known text format.

encode

library(googlePolylines)
library(sf)

# create data

df <- data.frame(myId = c(1,1,1,1,1,1,1,1,2,2,2,2),
				lineId = c(1,1,1,1,2,2,2,2,1,1,1,2),
				lon = c(-80.190, -66.118, -64.757, -80.190,  -70.579, -67.514, -66.668, -70.579, -70, -49, -51, -70),
				lat = c(26.774, 18.466, 32.321, 26.774, 28.745, 29.570, 27.339, 28.745, 22, 23, 22, 22))

p1 <- as.matrix(df[1:4, c("lon", "lat")])
p2 <- as.matrix(df[5:8, c("lon", "lat")])
p3 <- as.matrix(df[9:12, c("lon", "lat")])

# create `sf` collections

point <- sf::st_sfc(sf::st_point(x = c(df[1,"lon"], df[1,"lat"])))
multipoint <- sf::st_sfc(sf::st_multipoint(x = as.matrix(df[1:2, c("lon", "lat")])))
polygon <- sf::st_sfc(sf::st_polygon(x = list(p1, p2)))
linestring <- sf::st_sfc(sf::st_linestring(p3))
multilinestring <- sf::st_sfc(sf::st_multilinestring(list(p1, p2)))
multipolygon <- sf::st_sfc(sf::st_multipolygon(x = list(list(p1, p2), list(p3))))

# combine all types into one collection

sf <- rbind(
	sf::st_sf(geo = polygon),
	sf::st_sf(geo = multilinestring),
	sf::st_sf(geo = linestring),
	sf::st_sf(geo = point),
	sf::st_sf(geo = multipoint)
	)

sf

# Simple feature collection with 5 features and 0 fields
# geometry type:  GEOMETRY
# dimension:      XY
# bbox:           xmin: -80.19 ymin: 18.466 xmax: -49 ymax: 32.321
# epsg (SRID):    NA
# proj4string:    NA
#                              geo
# 1 POLYGON ((-80.19 26.774, -6...
# 2 MULTILINESTRING ((-80.19 26...
# 3 LINESTRING (-70 22, -49 23,...
# 4          POINT (-80.19 26.774)
# 5 MULTIPOINT (-80.19 26.774, ...

# encode sf objects

encode(sf)

#                                      geo
# 1         POLYGON: ohlbDnbmhN~suq@am{tA...
# 2 MULTILINESTRING: ohlbDnbmhN~suq@am{tA...
# 3      LINESTRING: _{geC~zfjL_ibE_qd_C~...
# 4                     POINT: ohlbDnbmhN...
# 5                MULTIPOINT: ohlbDnbmhN...


# encode data frame as a list of points

encode(df)
# [1] "ohlbDnbmhN~suq@am{tAw`qsAeyhGvkz`@fge}Aw}_Kycty@gc`DesuQvvrLofdDorqGtzzVfkdh@uapB_ibE_qd_C~hbE~reK?~|}rB"

Polyline to well-known text

enc <- encode(sf)
wkt <- polyline_wkt(enc)
wkt
#                                  geo
# 1 POLYGON ((-80.19 26.774, -66.1...
# 2 MULTILINESTRING ((-80.19 26.77...
# 3 LINESTRING (-70 22, -49 23, -5...
# 4          POINT (-80.19 26.774)...
# 5 MULTIPOINT ((-80.19 26.774),(-...

Well-known text to polyline

enc2 <- wkt_polyline(wkt)

Motivation

Encoding coordinates into polylines reduces the size of objects and can speed up plotting in Google Maps and Mapdeck.

library(sf)
library(geojsonsf)
sf <- geojsonsf::geojson_sf("https://raw.githubusercontent.com/SymbolixAU/data/master/geojson/SA1_2016_VIC.json")

encoded <- encode(sf, FALSE)
encodedLite <- encode(sf, TRUE)

vapply(mget(c('sf', 'encoded', 'encodedLite') ), function(x) { format(object.size(x), units = "Kb") }, '')
#           sf      encoded  encodedLite 
# "38750.7 Kb" "14707.9 Kb"  "9649.8 Kb"

library(microbenchmark)
library(sf)
library(geojsonsf)
library(leaflet)
library(googleway)
library(mapdeck)

sf <- geojsonsf::geojson_sf("https://raw.githubusercontent.com/SymbolixAU/data/master/geojson/SA1_2016_VIC.json")

microbenchmark(

  google = {

    ## you need a Google Map API key to use this function
    google_map(key = mapKey) %>%
      add_polygons(data = sf)
  },
  
  mapdeck = {
    mapdeck(token = mapKey) %>%
      add_polygon(data = sf)
  },

  leaflet = {
    leaflet(sf) %>%
      addTiles() %>%
      addPolygons()
  },
  times = 25
)

# Unit: milliseconds
#     expr       min        lq      mean    median        uq       max neval
#   google  530.4193  578.3035  644.9472  606.3328  726.4577  897.9064    25
#  mapdeck  527.7255  577.2322  628.5800  600.7449  682.2697  792.8950    25
#  leaflet 3247.3318 3445.6265 3554.7433 3521.6720 3654.1177 4109.6708    25
 

These benchmarks don't account for the time taken for the browser to render the maps.

googlepolylines's People

Contributors

blacklime, chrismuir, dcooley, techisdead

googlepolylines's Issues

error if sf is not loaded

This errors

library(mapdeck)
df <- roads
encode(df[1:200, ])

But this works (the internals)

library(mapdeck)
df <- roads
encode(df[1:200, ])
geomCol <- googlePolylines:::sfGeometryColumn(df[1:200, ])
lst <- googlePolylines:::rcpp_encodeSfGeometry(df[[geomCol]], FALSE)

Decoding polylines

What's the best object to store multiple decoded polylines? list of lists, lists of data.frames?

Switch to C++14 under Boost 1.75.0

As mentioned in issue 76 at the BH repo, its new version requires a switch of googlePolylines to C++14 and will no longer compile with C++11. I have checked that this works, and can confirm that C++14 is allowed at CRAN. I have not checked if the switch to C++14 causes an issue under the current BH package (using Boost 1.72.0).

It would be great if you could consider upgrading to C++14 and upload an updated package, likely only once CRAN reopens in January. A simple patch is included below.

Or, per your open issue #44, maybe you prefer to no longer use BH which is fine too. Either way, I would love to upgrade BH without breaking any dependents.

Please reach out if you have any questions, and a big Thank You! for maintaining googlePolylines on CRAN.

diff -ru googlePolylines.orig/DESCRIPTION googlePolylines/DESCRIPTION
--- googlePolylines.orig/DESCRIPTION    2020-11-01 06:30:19.000000000 +0100
+++ googlePolylines/DESCRIPTION 2020-12-12 22:35:22.793196326 +0100
@@ -12,7 +12,7 @@
     using the 'Google' polyline encoding algorithm (<https://developers.google.com/maps/documentation/utilities/polylinealgorithm>).
 License: MIT + file LICENSE
 Encoding: UTF-8
-SystemRequirements: C++11
+SystemRequirements: C++14
 Depends: R (>= 3.3.0)
 Imports: Rcpp (>= 0.12.13)
 LinkingTo: Rcpp, BH
diff -ru googlePolylines.orig/src/Makevars googlePolylines/src/Makevars
--- googlePolylines.orig/src/Makevars   2020-05-25 02:21:49.000000000 +0200
+++ googlePolylines/src/Makevars        2020-12-12 22:35:30.685619918 +0100
@@ -1,3 +1,3 @@
-CXX_STD = CXX11
+CXX_STD = CXX14
 
 PKG_CPPFLAGS = -I../inst/include
Only in googlePolylines/src: Makevars~
diff -ru googlePolylines.orig/src/Makevars.win googlePolylines/src/Makevars.win
--- googlePolylines.orig/src/Makevars.win       2020-05-25 02:21:49.000000000 +0200
+++ googlePolylines/src/Makevars.win    2020-12-13 15:33:56.057435649 +0100
@@ -1,3 +1,3 @@
-CXX_STD = CXX11
+CXX_STD = CXX14
 
 PKG_CPPFLAGS = -I../inst/include

remove Boost

Just playing about with the library - no longer need it.

Handling NA input values

I ran into a few instances of encode() and decode() handling NA inputs in interesting ways. Here are some examples:

decode()

googlePolylines::decode(NA_character_)
#> [[1]]
#>      lat   lon
#> 1 -8e-05 1e-05

Should this return NA_real_, or a data frame containing two NA values? Either could be done within the Rcpp function rcpp_decode_polyline() without much trouble or additional code. I'd be happy to submit a PR for this if you'd like.

encode()

Example 1:
NA inputs round-tripped through encode/decode return numeric lat/lon coordinates.

df <- data.frame(lat = c(NA_real_), lon = c(NA_real_))

(res <- googlePolylines::encode(df))
#> [1] ">>"

googlePolylines::decode(res)
#> [[1]]
#>        lat      lon
#> 1 -0.00016 -0.00016

Example 2:
Data frame with four observations round-tripped through encode/decode returns a data frame with only three observations.

df <- data.frame(lat = c(38.5, 40.7, NA_real_, 43.252), 
                 lon = c(-120.2, -120.95, NA_real_, -126.453))

(res <- googlePolylines::encode(df))
#> [1] "_p~iF~ps|U_ulLnnqC_\016\x9e\xd7"

googlePolylines::decode(res)
#> [[1]]
#>       lat      lon
#> 1 38.5000 -120.200
#> 2 40.7000 -120.950
#> 3 40.7024 -120.954
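
Until the package decides how NA should be treated, one caller-side workaround is to drop incomplete coordinate pairs before encoding. The helper below is a hypothetical sketch, not part of googlePolylines, and it assumes the columns are named `lat` and `lon` as in the examples above.

```r
# Hypothetical helper: keep only rows where both coordinates are present
drop_incomplete_coords <- function(df, cols = c("lat", "lon")) {
  keep <- stats::complete.cases(df[, cols, drop = FALSE])
  if (any(!keep)) {
    warning(sum(!keep), " row(s) with NA coordinates dropped before encoding")
  }
  df[keep, , drop = FALSE]
}

df <- data.frame(lat = c(38.5, 40.7, NA_real_, 43.252),
                 lon = c(-120.2, -120.95, NA_real_, -126.453))

clean <- suppressWarnings(drop_incomplete_coords(df))
nrow(clean)
# [1] 3

# The cleaned data frame can then be encoded as usual (only runs if the package is installed)
if (requireNamespace("googlePolylines", quietly = TRUE)) {
  googlePolylines::encode(clean)
}
```

Note that dropping a row joins its neighbours into one continuous line, which may or may not be what the caller wants; splitting the input at NA rows would be the alternative.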

empty geometry causes crash

Reproducible

These all cause RStudio / R to crash

googlePolylines::encode( sf::st_sfc(sf::st_multipoint()) )
googlePolylines::encode( sf::st_sfc(sf::st_multilinestring()) )
googlePolylines::encode( sf::st_sfc(sf::st_multipolygon()) )

non-MULTI* objects are fine

googlePolylines::encode( sf::st_sfc(sf::st_point()) )
googlePolylines::encode( sf::st_sf(geometry = sf::st_sfc(sf::st_point())) )

googlePolylines::encode( sf::st_sfc(sf::st_linestring()) )
googlePolylines::encode( sf::st_sf(geometry = sf::st_sfc(sf::st_linestring())) )

googlePolylines::encode( sf::st_sfc(sf::st_polygon()) )
googlePolylines::encode( sf::st_sf(geometry = sf::st_sfc(sf::st_polygon())) )

make_type

return void as the 'sf_type' is not used.

unnest GEOMETRYCOLLECTIONS

googleway can't handle GeometryCollections (yet). It needs to be split into its constituent geometries. Should this be done here, in geojsonsf, or in googleway?

encode valid range [-9999,9999]

coords need to be in [-9999, 9999]

e <- encodeCoordinates( lon = -9999, lat = 9999 )
e
# [1] "_u`drz@~t`drz@"
decode( e )
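
A pre-check along these lines could flag out-of-range input before it reaches the encoder. `coords_in_range()` is a hypothetical helper, not an existing function in the package. The last line reflects an assumption that the issue does not confirm: that the limit comes from the scaled (1e5) and doubled value needing to fit in a signed 32-bit integer.

```r
# Hypothetical pre-check: TRUE where a lon/lat pair is inside the stated range
coords_in_range <- function(lon, lat, limit = 9999) {
  !is.na(lon) & !is.na(lat) & abs(lon) <= limit & abs(lat) <= limit
}

coords_in_range(lon = c(-9999, 10000, NA), lat = c(9999, 0, 1))
# [1]  TRUE FALSE FALSE

# Assumption: largest magnitude whose scaled, doubled value fits a 32-bit int
.Machine$integer.max / (2 * 1e5)
```

Under that assumption the hard ceiling would be a little above 10737, so 9999 leaves some headroom.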

TODO

  • store Z and M in list-columns

MULTIPOINT

Currently it creates a new polyline for each individual POINT. Should this in fact behave the same as a LINESTRING (given we keep the attribute so we know the geometry type)?

tbl_df causing errors from encode(sf)

This comes from sf::read_sf("/sf/file/.shp"), which builds it using a tibble.

But, sf::st_read() is fine...


TODO

  • test on googleway
  • test on mapdeck
  • include a test for custom classes (and tibble!)

sfc objects

0.6.2 returns an unnamed list for XY, and a named list for XYZ[M] sfc objects.

Should they both be made consistent, and / or return an sfencoded object?

library(sf)

encode( sf::st_sfc(sf::st_point(1:2) ) )

# [[1]]
# [1] "_seK_ibE"
# attr(,"sfc")
# [1] "POINT"

encode( sf::st_sfc(sf::st_point(1:3) ) )

# $XY
# $XY[[1]]
# [1] "_seK_ibE"
# attr(,"sfc")
# [1] "POINT"
# 
# 
# $ZM
# $ZM[[1]]
# [1] "?_}hQ"
# attr(,"zm")
# [1] "XYZ"

header-only

restructure everything

TODO

  • use sfheaders to go to & from sf objects.
  • wait until Rcpp 1.0.4.6 is on CRAN - r-lib/fs#256

encode SF

  • MULTIPOLYGON & POLYGON
  • MULTILINESTRING & LINESTRING
  • MULTIPOINT & POINT
  • separate MULTIPOLYGONs - currently using SPLIT_CHAR
  • keep sf attributes (unless strip = T)

wkt updates

  • wkt_polyline on non-sfencoded objects: wkt_polyline(x, col = ... )
  • polyline_wkt on non-sfencoded objects: polyline_wkt(x, col = ... )

elevation

this needs to be handled

related:


TODO

  • ZM attribute column, with attributes saying which of Z and/or M are included
  • decoding the ZM attribute column
  • POINT
  • MULTIPOINT
  • LINESTRING
  • MULTILINESTRING
  • POLYGON
  • MULTIPOLYGON
  • speed test
  • decode() to return Z/M when decoding the Z/M attributes
  • sfcs converted to sfencoded object, or keep as list? - keep as is
  • use paste0(geomCol, "ZM") (or similar) as column & list name in encoded object?
  • use XY or geometry in encoded sfc object? - defined by the sf object column
  • print method
  • str method
  • as.data.frame removes zm attributes
  • column-name conflicts - e.g., if sf[, 'ZM'] already exists before encoding
  • googleway still works
  • mapdeck still works
  • bump version on merge
  • subset method

print.sfencoded

the 'LITE' object may need a different print method. Or classed as something like sfencodedLite

Multiple geometry columns

library(sf)
sf <- st_sf(a = c("x", "y"), geom = st_sfc(st_point(3:4), st_point(3:4)))
sf[["geom2"]] = st_sfc(st_point(3:4), st_point(3:4))
sf
Simple feature collection with 2 features and 1 field
Active geometry column: geom
geometry type:  POINT
dimension:      XY
bbox:           xmin: 3 ymin: 4 xmax: 3 ymax: 4
epsg (SRID):    NA
proj4string:    NA
  a        geom       geom2
1 x POINT (3 4) POINT (3 4)
2 y POINT (3 4) POINT (3 4)

should encode.sf apply to multiple columns?

sfGeometryColumn.sf <- function(sf) names( which( sapply( sf, function(x) inherits(x, "sfc") ) ) )

Installation error with old g++ and CXX14

I was trying to install this on Debian Jessie (where sadly the libjq-dev package does not exist). I manually downloaded and installed a later one but then had error:

* installing *source* package ‘googlePolylines’ ...
** package ‘googlePolylines’ successfully unpacked and MD5 sums checked
** libs
g++ -I/usr/share/R/include -DNDEBUG -I../inst/include  -I"/usr/local/lib/R/site-library/Rcpp/include" -I"/usr/local/lib/R/site-library/BH/include"   -fpic  -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -D_FORTIFY_SOURCE=2 -g  -c RcppExports.cpp -o RcppExports.o
g++ -I/usr/share/R/include -DNDEBUG -I../inst/include  -I"/usr/local/lib/R/site-library/Rcpp/include" -I"/usr/local/lib/R/site-library/BH/include"   -fpic  -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -D_FORTIFY_SOURCE=2 -g  -c encode.cpp -o encode.o
g++ -I/usr/share/R/include -DNDEBUG -I../inst/include  -I"/usr/local/lib/R/site-library/Rcpp/include" -I"/usr/local/lib/R/site-library/BH/include"   -fpic  -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -D_FORTIFY_SOURCE=2 -g  -c googlePolylines.cpp -o googlePolylines.o
g++ -I/usr/share/R/include -DNDEBUG -I../inst/include  -I"/usr/local/lib/R/site-library/Rcpp/include" -I"/usr/local/lib/R/site-library/BH/include"   -fpic  -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -D_FORTIFY_SOURCE=2 -g  -c wkt.cpp -o wkt.o
wkt.cpp: In function ‘void addLonLatToWKTStream(std::ostringstream&, float, float)’:
wkt.cpp:17:9: error: ‘to_string’ is not a member of ‘std’
   os << std::to_string(lon) << " " << std::to_string(lat);
         ^
wkt.cpp:17:39: error: ‘to_string’ is not a member of ‘std’
   os << std::to_string(lon) << " " << std::to_string(lat);
                                       ^
wkt.cpp: In function ‘void encode_wkt_polygon(const Polygon&, std::ostringstream&)’:
wkt.cpp:301:14: error: ISO C++ forbids declaration of ‘i’ with no type [-fpermissive]
   for (auto& i: pl.inners() ) {
              ^
wkt.cpp:301:17: error: range-based ‘for’ loops are not allowed in C++98 mode
   for (auto& i: pl.inners() ) {
                 ^
wkt.cpp: In instantiation of ‘void encode_wkt_linestring(const LineString&, std::ostringstream&) [with LineString = int; std::ostringstream = std::basic_ostringstream<char>]’:
wkt.cpp:302:32:   required from ‘void encode_wkt_polygon(const Polygon&, std::ostringstream&) [with Polygon = boost::geometry::model::polygon<boost::geometry::model::d2::point_xy<double> >; std::ostringstream = std::basic_ostringstream<char>]’
wkt.cpp:383:32:   required from here
wkt.cpp:265:42: error: no matching function for call to ‘begin(const int&)’
   for (iterator_type it = boost::begin(ls);
                                          ^
wkt.cpp:265:42: note: candidates are:
In file included from /usr/local/lib/R/site-library/BH/include/boost/algorithm/string/trim.hpp:16:0,
                 from /usr/local/lib/R/site-library/BH/include/boost/algorithm/string.hpp:19,
                 from wkt.cpp:5:
/usr/local/lib/R/site-library/BH/include/boost/range/begin.hpp:97:55: note: template<class T> typename boost::range_iterator<C>::type boost::range_adl_barrier::begin(T&)
 inline BOOST_DEDUCED_TYPENAME range_iterator<T>::type begin( T& r )
                                                       ^
/usr/local/lib/R/site-library/BH/include/boost/range/begin.hpp:97:55: note:   template argument deduction/substitution failed:
/usr/local/lib/R/site-library/BH/include/boost/range/begin.hpp: In substitution of ‘template<class T> typename boost::range_iterator<C>::type boost::range_adl_barrier::begin(T&) [with T = const int]’:
wkt.cpp:265:42:   required from ‘void encode_wkt_linestring(const LineString&, std::ostringstream&) [with LineString = int; std::ostringstream = std::basic_ostringstream<char>]’
wkt.cpp:302:32:   required from ‘void encode_wkt_polygon(const Polygon&, std::ostringstream&) [with Polygon = boost::geometry::model::polygon<boost::geometry::model::d2::point_xy<double> >; std::ostringstream = std::basic_ostringstream<char>]’
wkt.cpp:383:32:   required from here
/usr/local/lib/R/site-library/BH/include/boost/range/begin.hpp:97:55: error: no type named ‘type’ in ‘struct boost::range_iterator<const int, void>’
wkt.cpp: In instantiation of ‘void encode_wkt_linestring(const LineString&, std::ostringstream&) [with LineString = int; std::ostringstream = std::basic_ostringstream<char>]’:
wkt.cpp:302:32:   required from ‘void encode_wkt_polygon(const Polygon&, std::ostringstream&) [with Polygon = boost::geometry::model::polygon<boost::geometry::model::d2::point_xy<double> >; std::ostringstream = std::basic_ostringstream<char>]’
wkt.cpp:383:32:   required from here
/usr/local/lib/R/site-library/BH/include/boost/range/begin.hpp:106:61: note: template<class T> typename boost::range_iterator<const T>::type boost::range_adl_barrier::begin(const T&)
 inline BOOST_DEDUCED_TYPENAME range_iterator<const T>::type begin( const T& r )
                                                             ^
/usr/local/lib/R/site-library/BH/include/boost/range/begin.hpp:106:61: note:   template argument deduction/substitution failed:
/usr/local/lib/R/site-library/BH/include/boost/range/begin.hpp: In substitution of ‘template<class T> typename boost::range_iterator<const T>::type boost::range_adl_barrier::begin(const T&) [with T = int]’:
wkt.cpp:265:42:   required from ‘void encode_wkt_linestring(const LineString&, std::ostringstream&) [with LineString = int; std::ostringstream = std::basic_ostringstream<char>]’
wkt.cpp:302:32:   required from ‘void encode_wkt_polygon(const Polygon&, std::ostringstream&) [with Polygon = boost::geometry::model::polygon<boost::geometry::model::d2::point_xy<double> >; std::ostringstream = std::basic_ostringstream<char>]’
wkt.cpp:383:32:   required from here
/usr/local/lib/R/site-library/BH/include/boost/range/begin.hpp:106:61: error: no type named ‘type’ in ‘struct boost::range_iterator<const int, void>’
wkt.cpp: In instantiation of ‘void encode_wkt_linestring(const LineString&, std::ostringstream&) [with LineString = int; std::ostringstream = std::basic_ostringstream<char>]’:
wkt.cpp:302:32:   required from ‘void encode_wkt_polygon(const Polygon&, std::ostringstream&) [with Polygon = boost::geometry::model::polygon<boost::geometry::model::d2::point_xy<double> >; std::ostringstream = std::basic_ostringstream<char>]’
wkt.cpp:383:32:   required from here
wkt.cpp:266:27: error: no matching function for call to ‘end(const int&)’
        it != boost::end(ls);
                           ^
wkt.cpp:266:27: note: candidates are:
In file included from /usr/local/lib/R/site-library/BH/include/boost/algorithm/string/trim.hpp:17:0,
                 from /usr/local/lib/R/site-library/BH/include/boost/algorithm/string.hpp:19,
                 from wkt.cpp:5:
/usr/local/lib/R/site-library/BH/include/boost/range/end.hpp:91:55: note: template<class T> typename boost::range_iterator<C>::type boost::range_adl_barrier::end(T&)
 inline BOOST_DEDUCED_TYPENAME range_iterator<T>::type end( T& r )
                                                       ^
/usr/local/lib/R/site-library/BH/include/boost/range/end.hpp:91:55: note:   template argument deduction/substitution failed:
/usr/local/lib/R/site-library/BH/include/boost/range/end.hpp: In substitution of ‘template<class T> typename boost::range_iterator<C>::type boost::range_adl_barrier::end(T&) [with T = const int]’:
wkt.cpp:266:27:   required from ‘void encode_wkt_linestring(const LineString&, std::ostringstream&) [with LineString = int; std::ostringstream = std::basic_ostringstream<char>]’
wkt.cpp:302:32:   required from ‘void encode_wkt_polygon(const Polygon&, std::ostringstream&) [with Polygon = boost::geometry::model::polygon<boost::geometry::model::d2::point_xy<double> >; std::ostringstream = std::basic_ostringstream<char>]’
wkt.cpp:383:32:   required from here
/usr/local/lib/R/site-library/BH/include/boost/range/end.hpp:91:55: error: no type named ‘type’ in ‘struct boost::range_iterator<const int, void>’
wkt.cpp: In instantiation of ‘void encode_wkt_linestring(const LineString&, std::ostringstream&) [with LineString = int; std::ostringstream = std::basic_ostringstream<char>]’:
wkt.cpp:302:32:   required from ‘void encode_wkt_polygon(const Polygon&, std::ostringstream&) [with Polygon = boost::geometry::model::polygon<boost::geometry::model::d2::point_xy<double> >; std::ostringstream = std::basic_ostringstream<char>]’
wkt.cpp:383:32:   required from here
/usr/local/lib/R/site-library/BH/include/boost/range/end.hpp:100:61: note: template<class T> typename boost::range_iterator<const T>::type boost::range_adl_barrier::end(const T&)
 inline BOOST_DEDUCED_TYPENAME range_iterator<const T>::type end( const T& r )
                                                             ^
/usr/local/lib/R/site-library/BH/include/boost/range/end.hpp:100:61: note:   template argument deduction/substitution failed:
/usr/local/lib/R/site-library/BH/include/boost/range/end.hpp: In substitution of ‘template<class T> typename boost::range_iterator<const T>::type boost::range_adl_barrier::end(const T&) [with T = int]’:
wkt.cpp:266:27:   required from ‘void encode_wkt_linestring(const LineString&, std::ostringstream&) [with LineString = int; std::ostringstream = std::basic_ostringstream<char>]’
wkt.cpp:302:32:   required from ‘void encode_wkt_polygon(const Polygon&, std::ostringstream&) [with Polygon = boost::geometry::model::polygon<boost::geometry::model::d2::point_xy<double> >; std::ostringstream = std::basic_ostringstream<char>]’
wkt.cpp:383:32:   required from here
/usr/local/lib/R/site-library/BH/include/boost/range/end.hpp:100:61: error: no type named ‘type’ in ‘struct boost::range_iterator<const int, void>’
/usr/lib/R/etc/Makeconf:141: recipe for target 'wkt.o' failed
make: *** [wkt.o] Error 1
ERROR: compilation failed for package ‘googlePolylines’

For reference:

g++ (Debian 4.9.2-10+deb8u2) 4.9.2

and Rcpp:

packageDescription("Rcpp")$Version 
[1] "1.0.6"

and my MakeConf

CXX = g++
CXXCPP = $(CXX) -E
CXXFLAGS = -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -D_FORTIFY_SOURCE=2 -g $(LTO)
CXXPICFLAGS = -fpic
CXX1X = g++
CXX1XFLAGS = -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -D_FORTIFY_SOURCE=2 -g
CXX1XPICFLAGS = -fpic
CXX1XSTD =  -std=c++11

I'm no Rcpp user so I just edited my CXX to

CXX = g++ -std=c++1y

and I could compile the library without any issues so far but maybe you know of a better fix.

List objects

this works

point <- sf::st_sfc(sf::st_point(x = c(144, -37)), sf::st_point(x = c(144, -37)))
encode(point)

Need to make a class for the list (or convert to data.frame), and allow polyline_wkt to work.

Also will need to work when [ , , drop = TRUE]

ASAN and UBSAN sanitizers fail

Compiler warning when running sanitisers:

googlePolylines.cpp:100:21: runtime error: left shift of negative value -3780749

Due to

void EncodeSignedNumber(std::ostringstream& os, int val){
  int sgn_num;
  sgn_num = val << 1;
  
  if (sgn_num < 0) {
    sgn_num = ~sgn_num;
  }
  ...
}

Potential solution

void EncodeSignedNumber(std::ostringstream& os, int32_t val){
  uint32_t usgn_num;
  usgn_num = (val < 0) ? ~((~val)+1)+1 : val; 
  usgn_num <<= 1;
  usgn_num = (val < 0) ? (~usgn_num) : usgn_num;
  ...
}

Although this seems completely unnecessary, and the original solution just works...

Google algorithm specification - the 'double flip + 1' step is confusing...
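
For what it's worth, the sign-folding step can be written as plain arithmetic, which may make the specification easier to follow. Since `val << 1` is `2 * val` and `~x` is `-x - 1`, the original code maps a negative `val` to `-2 * val - 1` and a non-negative `val` to `2 * val`. The R sketch below (hypothetical helper names, doubles rather than 32-bit integers) shows that mapping and its inverse without shifting any negative value.

```r
# Arithmetic form of "shift left by one, invert if negative"
zigzag <- function(v) ifelse(v < 0, -2 * v - 1, 2 * v)

# Inverse used when decoding: odd values were negative, even values non-negative
unzigzag <- function(u) ifelse(u %% 2 == 1, -(u + 1) / 2, u / 2)

# Includes the value from the sanitizer message
zigzag(c(-3780749, 3780749, 0, -1))
# [1] 7561497 7561498       0       1

unzigzag(zigzag(c(-3780749, 3780749, 0, -1)))
# [1] -3780749  3780749        0       -1
```

Read this way, the "invert, add one" step in the specification is just two's-complement negation, and in the potential solution above `~((~val)+1)+1` evaluates back to `val`, so the only real change is doing the shift on an unsigned type.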

polyline_wkt loses precision

library(sf)
library(googlePolylines)

coords <- matrix(c(-37.1234567, -37.765321, 144.1234567, 144.7654321), ncol = 2)
sf_line <- sf::st_sf(geometry = sf::st_sfc(sf::st_linestring(coords)))

polyline_wkt(encode(sf_line))[1, ]
# [1] "LINESTRING (-37.1234 144.123, -37.7653 144.765)"
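
The truncated values look roughly like numbers written with about six significant digits rather than five decimal places, although this issue does not confirm the cause. The difference between the two formats can be reproduced in base R with `sprintf()`; the input values here are simply the coordinates above rounded to the five decimals that a polyline stores.

```r
# Values as they would come back from a 5-decimal polyline decode
x <- c(144.12346, -37.76532)

# Six significant digits: matches the shortened WKT values
sprintf("%.6g", x)
# [1] "144.123"  "-37.7653"

# Five decimal places: keeps everything the encoding actually stores
sprintf("%.5f", x)
# [1] "144.12346" "-37.76532"
```

Writing the WKT with a fixed five decimal places would keep it in step with the precision of the encoded polyline.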

Longitude off by a factor of 10?

Hi @dcooley, I'm not sure why this happened and cannot see anything in the code, but heads-up that when decoding a polyline from Valhalla the lon axis was 10x more than what it should have been:

Someone kindly pointed this out and I've just merged their PR: https://github.com/Robinlovelace/rvalhalla/pull/3/files

I imagine this is due to Valhalla's version, which does do something with "10" from the docs but not sure what: https://valhalla.github.io/valhalla/decoding/

This is most likely a non-issue that can just be closed but heads-up in case of use/interest.
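
For anyone hitting the same thing: Valhalla's decoding page describes its polylines as using six decimal digits of precision, while Google's format uses five. If that is the cause here, dividing by 1e5 instead of 1e6 would inflate every coordinate by 10. Below is a base-R sketch of a decoder with a `precision` argument. The function name and the argument are hypothetical and not part of the googlePolylines API.

```r
# Hypothetical decoder with an adjustable precision (5 = Google, 6 = Valhalla)
decode_polyline_sketch <- function(polyline, precision = 5) {
  codes <- utf8ToInt(polyline) - 63
  values <- numeric(0)
  acc <- 0
  shift <- 0
  for (b in codes) {
    # The low 5 bits carry data; 32 or above means another chunk follows
    acc <- acc + (b %% 32) * 2^shift
    shift <- shift + 5
    if (b < 32) {
      # Undo the sign folding: odd values were negative
      delta <- if (acc %% 2 == 1) -(acc + 1) / 2 else acc / 2
      values <- c(values, delta)
      acc <- 0
      shift <- 0
    }
  }
  # Values alternate lat, lon and are differences from the previous point
  m <- matrix(values, ncol = 2, byrow = TRUE)
  data.frame(lat = cumsum(m[, 1]) / 10^precision,
             lon = cumsum(m[, 2]) / 10^precision)
}

s <- "_p~iF~ps|U_ulLnnqC_mqNvxq`@"
decode_polyline_sketch(s, precision = 5)$lon[1]
# [1] -120.2
decode_polyline_sketch(s, precision = 6)$lon[1]
# [1] -12.02
```

The same string gives coordinates that differ by exactly a factor of 10 depending on the precision assumed, on both axes rather than only longitude.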

Decode to sf / constructing sfencoded

Thanks for this cool package! I currently have a use case where I get given a long vector of polylines, and I'm wondering whether it makes sense to have one of:

  1. A version or option for decode that produces an sf object instead of requiring the user to convert the list of dataframes themselves.
  2. A constructor for sfencoded that does the work of creating an encoded column and sets class so that polyline_wkt %>% st_as_sf can be used.

Both of these are certainly convenience functions. I can try a PR in the next few days once I read over the code a bit also.
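
To make option 1 a bit more concrete, here is one possible shape for the conversion, sketched in base R. It assumes `decode()` returns a list of data frames with `lat` and `lon` columns, as in the examples elsewhere in this tracker, and `coords_to_wkt()` is a hypothetical helper. The list of decoded data frames is written out by hand so that the sketch runs without the package.

```r
# Hypothetical helper: one decoded data frame (lat/lon columns) -> WKT LINESTRING
coords_to_wkt <- function(df) {
  pts <- paste(df$lon, df$lat)   # WKT order is "x y", i.e. "lon lat"
  paste0("LINESTRING (", paste(pts, collapse = ", "), ")")
}

# Stand-in for the output of decode() on two polylines
decoded <- list(
  data.frame(lat = c(38.5, 40.7), lon = c(-120.2, -120.95)),
  data.frame(lat = c(-37, -37.1), lon = c(144, 144.1))
)

wkt <- vapply(decoded, coords_to_wkt, character(1))
cat(wkt, sep = "\n")
# LINESTRING (-120.2 38.5, -120.95 40.7)
# LINESTRING (144 -37, 144.1 -37.1)

# With sf installed, the WKT strings can be turned into an sf object
if (requireNamespace("sf", quietly = TRUE)) {
  sf::st_sf(geometry = sf::st_as_sfc(wkt))
}
```

Going through WKT is convenient but not the fastest route; building the geometries directly from the coordinate matrices would avoid the string round trip.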

GeoJSON to encoded polylines

Similar to how geojsonsf encodes GeoJSON: use the same 'geojson_to_' code, but at each iteration of extracting lon/lat, encode it into a polyline.

It would replace this workflow

library(geojsonsf)
library(googlePolylines)

googlePolylines::encode(
  geojsonsf::geojson_sf(geo_melbourne)
)

remove Boost

TODO

  • remove boost/algorithm
  • remove boost/geometry

Refactor encode() Rcpp code

Hey, so I worked some on edits to the Rcpp functions related to encode(), trying to squeeze more speed out of them. I made some minor changes and got ~40% speed up in the benchmarks below. Most of the changes are just building std vectors in a namespace that get over-written in each loop, as opposed to creating multiple NumericVectors in each loop iteration. Per usual, all tests are passing locally.

I wanted to bring this up here because my one open pr is already getting a little overgrown. Let me know if you're interested, I can either push to the open PR, or wait and open a new one later.

Benchmarks

I have no idea if these tests are representative of common use, I essentially lifted them from test-Encode.R and just made the polygon list longer (100,000 elements).

polygons

library(googlePolylines)
library(sf)
polygon <- sf::st_sfc(sf::st_polygon(
  list(matrix(c(144, 144.1, 144.2, 144, -37, -37.1, -37.2, -37), ncol = 2))))
polygon <- rep(polygon, 100000)

Original code

microbenchmark::microbenchmark(encode(polygon), times = 10)
#> Unit: milliseconds
#>             expr     min       lq     mean   median       uq     max neval
#> encode(polygon) 388.162 389.4418 439.4863 450.9892 465.9901 489.101    10

Updated code

microbenchmark::microbenchmark(encode(polygon), times = 10)
#> Unit: milliseconds
#>             expr      min      lq     mean   median       uq      max neval
#> encode(polygon) 281.2151 292.136 300.4025 299.5898 303.0663 323.9693    10

multipolygons

library(googlePolylines)
library(sf)
sf <- sf::st_sf(geo = polygon)
m1 <- matrix(c(144, 144.1, 144.2, 144, -37, -37.1, -37.2, -37), ncol = 2)
m2 <- m1 + 1
multipolygon <- sf::st_sfc(sf::st_multipolygon(list(list(m1, m2))))
multipolygon <- rep(multipolygon, 100000)

Original code

microbenchmark::microbenchmark(encode(multipolygon), times = 10)
#> Unit: milliseconds
#>                  expr     min     lq     mean   median       uq      max neval
#> encode(multipolygon) 638.299 657.18 707.6419 720.7405 744.7296 769.4089    10

Updated code

microbenchmark::microbenchmark(encode(multipolygon), times = 10)
#> Unit: milliseconds
#>                  expr      min       lq     mean   median       uq      max neval
#> encode(multipolygon) 432.4685 446.4945 464.7041 451.4713 464.8576 540.8584    10

Error in rcpp_decode_polyline: basic_string

This is caused by too many backslashes in the string, likely the result of scraping a web page using readLines()

googlePolylines::decode("gw{FcnvxRgA\\\\_F|A{Jj@]@")
# Error in rcpp_decode_polyline(polylines, "coords") : basic_string

The solution appears to be to use scan(, allowEscapes = T) rather than readLines()

GEOMETRYCOLLECTION

A geometry collection is a collection of geometries. While the individual geometries can be encoded, storing the encoded GEOMETRYCOLLECTION geometries as one 'row' of a data.frame requires some thought:

p <- rbind(c(3.2,4), c(3,4.6), c(3.8,4.4), c(3.5,3.8), c(3.4,3.6), c(3.9,4.5))
(mp <- st_multipoint(p))

s1 <- rbind(c(0,3),c(0,4),c(1,5),c(2,5))
(ls <- st_linestring(s1))

s2 <- rbind(c(0.2,3), c(0.2,4), c(1,4.8), c(2,4.8))
s3 <- rbind(c(0,4.4), c(0.6,5))
(mls <- st_multilinestring(list(s1,s2,s3)))

p1 <- rbind(c(0,0), c(1,0), c(3,2), c(2,4), c(1,4), c(0,0))
p2 <- rbind(c(1,1), c(1,2), c(2,2), c(1,1))
pol <-st_polygon(list(p1,p2))
p3 <- rbind(c(3,0), c(4,0), c(4,1), c(3,1), c(3,0))
p4 <- rbind(c(3.3,0.3), c(3.8,0.3), c(3.8,0.8), c(3.3,0.8), c(3.3,0.3))[5:1,]
p5 <- rbind(c(3,3), c(4,2), c(4,3), c(3,3))
(mpol <- st_multipolygon(list(list(p1,p2), list(p3,p4), list(p5))))

(gc <- st_geometrycollection(list(mp, mpol, ls)))

sf <- sf::st_sf(geo = sf::st_sfc(gc))
sf$geo[[1]]
GEOMETRYCOLLECTION (MULTIPOINT (3.2 4, 3 4.6, 3.8 4.4, 3.5 3.8, 3.4 3.6, 3.9 4.5), MULTIPOLYGON (((0 0, 1 0, 3 2, 2 4, 1 4, 0 0), (1 1, 1 2, 2 2, 1 1)), ((3 0, 4 0, 4 1, 3 1, 3 0), (3.3 0.3, 3.3 0.8, 3.8 0.8, 3.8 0.3, 3.3 0.3)), ((3 3, 4 2, 4 3, 3 3))), LINESTRING (0 3, 0 4, 1 5, 2 5))

Does it even make sense to store this as encoded polylines?

decode always prints the input

I updated to the latest version of googlePolylines and now decode always prints the input character string.

For example:

input <- "ohlbDnbmhN~suq@am{tAw`qsAeyhGvkz`@fge}A"
output <- googlePolylines::decode(input)
"ohlbDnbmhN~suq@am{tAw`qsAeyhGvkz`@fge}A"

This is not a problem for a single line, but when you parse lots of lines it fills the console with text.
