Comments (4)
(As this is a question, not a feature request or bug report, please direct to StackOverflow in the future.)
- I think there's some confusion here - you can use the `polyfill` function to return the hexagons within a polygon, but they are all at the same resolution (supplied by the caller). You would have to call `compact` on the resulting hexagons in order to get hexagons in multiple resolutions.
I am not a Hive expert, but there are a few options for using H3 to spatially index data points:
- The simplest is to use `polyfill` to fill a given polygon at a resolution that fits your desired precision, and then create a table using the hexagons as a reverse index with rows like `h3index, polygon_id`. Data points with lat/lon can then be mapped to an H3 index using `geoToH3` (ideally at index time), and you can join this field with the polygon table to find the polygon id for a given data point. This is generally very fast, but the reverse index can get very large depending on the size and precision of the polygons you need to index.
- A slower but more space-efficient option uses `compact` to index the polygon at multiple resolutions. Assigning a data point to a polygon would then require performing `n` joins or queries, where `n` is the number of different resolutions in the compacted set. This might be a better solution if you needed to cover a significant geographic area with high precision (making a standard reverse index very large) but generally only had to handle a single data point at a time (e.g. in a geocoding API).
Obviously if your polygons are stable, you'd be better off doing this at index time, storing the polygon id with the data point, rather than at query time.
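The first option (a single-resolution reverse index) can be sketched roughly as follows. This is only an illustration under stated assumptions, not Hive code: `toy_cell` is a hypothetical square-grid stand-in so the sketch runs without the `h3` package, and `build_reverse_index` and `assign_polygon` are names made up for this example. With h3-py, the `point_to_cell` argument would presumably be `h3.geo_to_h3` (v3 API) or `h3.latlng_to_cell` (v4 API), and the cell lists would come from `polyfill`.

```python
import math
from typing import Callable, Dict, Hashable, Iterable, Optional

# (lat, lng, resolution) -> cell id. With h3-py this role would be played by
# h3.geo_to_h3 (v3 API) or h3.latlng_to_cell (v4 API).
PointToCell = Callable[[float, float, int], Hashable]


def toy_cell(lat: float, lng: float, res: int) -> Hashable:
    """Hypothetical stand-in for H3: a square grid with 2**res cells per degree."""
    scale = 2 ** res
    return (res, math.floor(lat * scale), math.floor(lng * scale))


def build_reverse_index(
    polygon_cells: Dict[str, Iterable[Hashable]]
) -> Dict[Hashable, str]:
    """Flatten {polygon_id: cells} into rows of (h3index -> polygon_id)."""
    reverse_index: Dict[Hashable, str] = {}
    for polygon_id, cells in polygon_cells.items():
        for cell in cells:
            # Assumes polygons do not overlap; otherwise the last one wins.
            reverse_index[cell] = polygon_id
    return reverse_index


def assign_polygon(
    lat: float,
    lng: float,
    resolution: int,
    reverse_index: Dict[Hashable, str],
    point_to_cell: PointToCell,
) -> Optional[str]:
    """Index the point at the table's resolution and 'join' on the cell id."""
    return reverse_index.get(point_to_cell(lat, lng, resolution))


# Usage: two single-cell "polygons" filled at resolution 3 of the toy grid.
index = build_reverse_index({
    "a": [toy_cell(0.1, 0.1, 3)],
    "b": [toy_cell(5.0, 5.0, 3)],
})
match = assign_polygon(5.05, 5.05, 3, index, toy_cell)
```

The dictionary plays the part of the `h3index, polygon_id` table; in Hive the same lookup would be an equality join on the cell column.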
from h3.
As noted, I don't actually know much about Hive :). This tradeoff was something we were considering with Cassandra queries at one point.
@nrabinowitz I don't think the second one would be that much slower. If you take your given lat, lng coordinates and compute the H3 index at all of the possible resolutions (say 6, 7, 8, 9 for your compacted set) and simply query for any of those 4 matching, it would just be four integer compares until it either succeeds or fails, and that would probably actually be faster than a single comparison across the entire uncompacted set.
If it was a normal non-Hive database, you could index the H3 integers with a hash index and it would literally be just 4 hash lookups and you'd get the answer in `O(1)` time instead of `O(n)`, but that's the trade-off you make with the Hadoop ecosystem versus classic DBs (more space, but slower queries).
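A minimal sketch of that multi-resolution lookup, assuming the compacted cells are held in an in-memory hash map rather than a Hive table. `toy_cell` is a hypothetical square-grid stand-in so the example runs without the `h3` package, and `lookup_compacted` is a name invented here. With h3-py the `point_to_cell` argument would presumably be `h3.geo_to_h3` (v3 API) or `h3.latlng_to_cell` (v4 API), and the keys would come from `compact`.

```python
import math
from typing import Callable, Dict, Hashable, Iterable, Optional


def toy_cell(lat: float, lng: float, res: int) -> Hashable:
    """Hypothetical stand-in for H3: a square grid with 2**res cells per degree."""
    scale = 2 ** res
    return (res, math.floor(lat * scale), math.floor(lng * scale))


def lookup_compacted(
    lat: float,
    lng: float,
    compacted_index: Dict[Hashable, str],
    resolutions: Iterable[int],
    point_to_cell: Callable[[float, float, int], Hashable],
) -> Optional[str]:
    """Try one hash lookup per resolution present in the compacted set."""
    for res in resolutions:
        polygon_id = compacted_index.get(point_to_cell(lat, lng, res))
        if polygon_id is not None:
            return polygon_id  # stop at the first resolution that matches
    return None


# Usage: polygon "a" stored as one coarse cell plus one finer cell.
compacted = {
    toy_cell(0.3, 0.3, 1): "a",  # resolution-1 cell
    toy_cell(0.6, 0.1, 2): "a",  # resolution-2 cell
}
hit = lookup_compacted(0.7, 0.2, compacted, [1, 2], toy_cell)
miss = lookup_compacted(0.9, 0.9, compacted, [1, 2], toy_cell)
```

The cost per point is bounded by the number of resolutions in the compacted set, independent of how many cells the index holds.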
@dfellis Joining multiple times seems clumsy on the Hive SQL side; however, so far it seems to be the best solution.