
Comments (4)

nrabinowitz commented on July 18, 2024

(As this is a question, not a feature request or bug report, please direct to StackOverflow in the future.)

  • I think there's some confusion here - you can use the polyfill function to return the hexagons within a polygon, but they are all the same resolution (supplied by the caller). You would have to call compact on the resulting hexagons in order to get hexagons in multiple resolutions.

I am not a Hive expert, but there are a few options for using H3 to spatially index data points:

  • The simplest is to use polyfill to fill a given polygon at a resolution that fits your desired precision, and then create a table using the hexagons as a reverse index with rows like h3index, polygon_id. Data points with lat/lon can then be mapped to an H3 index using geoToH3 (ideally at index time), and you can join this field with the polygon table to find the polygon id for a given data point. This is generally very fast, but the reverse index can get very large depending on the size and precision of the polygons you need to index.

  • A slower but more space-efficient option uses compact to index the polygon at multiple resolutions. Assigning a data point to a polygon would then require performing n joins or queries, where n is the number of different resolutions in the compacted set. This might be a better solution if you needed to cover a significant geographic area with high precision (making a standard reverse index very large) but generally only had to handle a single data point at a time (e.g. in a geocoding API).

Obviously if your polygons are stable, you'd be better off doing this at index time, storing the polygon id with the data point, rather than at query time.
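A minimal sketch of the first option (single-resolution reverse index), in Python. To keep it runnable without the H3 library, it assumes a toy square lat/lng grid: `cell_of` stands in for geoToH3 and `fill_box` stands in for polyfill. Both helpers, the box-shaped "polygons" and all names here are hypothetical and only illustrate the table layout and the lookup, not the real H3 API.

```python
def cell_of(lat, lng, res):
    """Toy stand-in for geoToH3: id of the square grid cell containing a point."""
    n = 2 ** res  # cells per axis at this resolution
    x = min(int((lng + 180.0) / 360.0 * n), n - 1)
    y = min(int((lat + 90.0) / 180.0 * n), n - 1)
    return (res, x, y)


def fill_box(lat_min, lat_max, lng_min, lng_max, res):
    """Toy stand-in for polyfill: cells at `res` whose centre lies inside the box."""
    n = 2 ** res
    cells = set()
    for x in range(n):
        for y in range(n):
            centre_lng = (x + 0.5) / n * 360.0 - 180.0
            centre_lat = (y + 0.5) / n * 180.0 - 90.0
            if lat_min <= centre_lat <= lat_max and lng_min <= centre_lng <= lng_max:
                cells.add((res, x, y))
    return cells


RES = 6  # the single resolution chosen by the caller

# Hypothetical "polygons": lat/lng boxes as (lat_min, lat_max, lng_min, lng_max).
POLYGONS = {
    "box_a": (0.0, 45.0, 0.0, 90.0),
    "box_b": (-45.0, 0.0, -90.0, 0.0),
}

# Reverse index: one (cell, polygon_id) row per filled cell.
REVERSE_INDEX = {
    cell: polygon_id
    for polygon_id, box in POLYGONS.items()
    for cell in fill_box(*box, RES)
}


def polygon_for(lat, lng):
    """Index the point at RES and "join" it against the reverse index."""
    return REVERSE_INDEX.get(cell_of(lat, lng, RES))
```

In a database the dictionary would be the h3index, polygon_id table and `polygon_for` would be an equi-join on the cell id. The number of rows grows with the polygon area and the chosen resolution, which is the space cost mentioned above.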

from h3.

nrabinowitz commented on July 18, 2024

As noted, I don't actually know much about Hive :). This tradeoff was something we were considering with Cassandra queries at one point.


dfellis commented on July 18, 2024

@nrabinowitz I don't think the second one would be that much slower. If you take your given lat, lng coordinates and compute the H3 index at all of the possible resolutions (say 6, 7, 8, 9 for your compacted set) and simply query for any of those 4 matching, it would just be four integer compares until it either succeeds or fails, and that would probably actually be faster than a single comparison across the entire uncompacted set.

If it were a normal non-Hive database, you could index the H3 integers with a hash index and it would literally be just 4 hash lookups, giving you the answer in O(1) time instead of O(n), but that's the trade-off you make with the Hadoop ecosystem versus classic DBs (more space, but slower queries).
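A rough sketch of that multi-resolution lookup, in Python. It assumes a toy square grid in which every cell has 4 children, so that it runs without the H3 library: `cell_of` stands in for geoToH3 and `compact` stands in for H3's compact (real H3 cells have 7 children). All names are hypothetical.

```python
def cell_of(lat, lng, res):
    """Toy stand-in for geoToH3: id of the square grid cell containing a point."""
    n = 2 ** res  # cells per axis at this resolution
    x = min(int((lng + 180.0) / 360.0 * n), n - 1)
    y = min(int((lat + 90.0) / 180.0 * n), n - 1)
    return (res, x, y)


def compact(cells):
    """Toy stand-in for compact: merge each complete group of 4 siblings into its parent."""
    cells = set(cells)
    changed = True
    while changed:
        changed = False
        for res, x, y in list(cells):
            if res == 0 or (res, x, y) not in cells:
                continue  # root cell, or already merged during this pass
            px, py = x // 2, y // 2
            siblings = {(res, 2 * px + dx, 2 * py + dy) for dx in (0, 1) for dy in (0, 1)}
            if siblings <= cells:
                cells -= siblings
                cells.add((res - 1, px, py))
                changed = True
    return cells


FINE_RES = 6

# Hypothetical polygon cover at the finest resolution (what polyfill would return).
fine_cells = {(FINE_RES, x, y) for x in range(32, 45) for y in range(32, 44)}

# Compacted cover: fewer cells, spread over several resolutions.
compacted = compact(fine_cells)
resolutions = sorted({res for res, _, _ in compacted})


def covered(lat, lng):
    """One hash lookup per resolution present in the compacted set."""
    return any(cell_of(lat, lng, res) in compacted for res in resolutions)
```

The point is indexed once per resolution in the compacted set and each candidate is checked with a set membership test, so the cost depends on the number of resolutions rather than on the number of cells. In SQL the same idea becomes one join (or one `IN` condition) per resolution.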


harryprince commented on July 18, 2024

@dfellis Joining multiple times seems clumsy on the Hive SQL syntax side; however, so far it seems to be the best solution.

