Comments (14)
I can reproduce this, but I'm not sure what is going on beyond that. Crossing 180/-180 seems the likely cause. I'll note that searching for "pollen" on https://www.pangaea.de/ and setting the bounding box there also prevents this record from being returned. So I think this might be an API issue, not a pangaear issue.
from pangaear.
Reversing the order of the longitudes and setting count to 60 finds it.
pg_search(query = "pollen", bbox = c(-171.700000, 42.320000, 51.840000, 74.550000), count = 60)
But, then most of the datasets that are pulled together with it are from the west.
Otherwise, even a composite set of queries ("sediment pollen fossil", "sediment pollen", "fossil pollen", and "pollen"), each with count 500 and offsets 0, 500, ...2500, cannot find the dataset.
So it seems to be a 180/-180 issue.
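The composite search described above can be sketched as a paging loop. This is only an illustration of the idea: `paged_search` and the `search` callable are hypothetical stand-ins, not part of pangaear or of the PANGAEA API, and the sketch assumes the search accepts a query string, a count and an offset.

```python
from typing import Callable, Iterable, Iterator, Tuple


def paged_search(
    search: Callable[[str, int, int], list],
    queries: Iterable[str],
    count: int = 500,
    max_offset: int = 2500,
) -> Iterator[Tuple[str, object]]:
    """Yield (query, hit) pairs for every page of every query.

    `search` is a hypothetical callable taking (query, count, offset)
    and returning a list of hits for that page.
    """
    for query in queries:
        for offset in range(0, max_offset + 1, count):
            hits = search(query, count, offset)
            for hit in hits:
                yield query, hit
            if len(hits) < count:
                break  # a short page means there is nothing left for this query
```

Collecting the yielded hits into a set of dataset identifiers makes it easy to check whether a given dataset ever shows up in any page.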
from pangaear.
Thanks @karawoo. I've also tried searching with longitudes in a 0-360 range to see if that works, but it doesn't. It's not clear whether the bbox in the "Coverage" section of a dataset has to be completely encompassed by a search or not. I imagine it does, since this search doesn't find that dataset?
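For reference, converting between the -180/180 and 0-360 longitude conventions is plain modular arithmetic. The helper names below are made up for illustration, and nothing here implies the API accepts 0-360 values (the attempt above suggests it does not).

```python
def to_lon360(lon: float) -> float:
    """Map a longitude in degrees onto the [0, 360) range."""
    return lon % 360.0


def to_lon180(lon: float) -> float:
    """Map a longitude in degrees onto the [-180, 180) range."""
    return (lon + 180.0) % 360.0 - 180.0


# The two longitudes from this thread, expressed in the 0-360 convention.
west_edge = to_lon360(51.84)
east_edge = to_lon360(-171.7)
```

In the 0-360 convention the eastern edge becomes a value just above 188, so a box running from 51.84 eastward no longer has to wrap around.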
from pangaear.
@kbh022 that bbox c(-171.700000, 42.320000, 51.840000, 74.550000)
does seem more correct, since it should be minlon, minlat, maxlon, maxlat.
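A small sketch of the minlon, minlat, maxlon, maxlat ordering follows. It assumes the common convention that a box whose minlon is greater than its maxlon wraps eastward through 180/-180; that convention, the `BBox` class and its methods are illustrative assumptions, not something pangaear defines.

```python
from typing import NamedTuple


class BBox(NamedTuple):
    """Bounding box in the order minlon, minlat, maxlon, maxlat (degrees)."""

    min_lon: float
    min_lat: float
    max_lon: float
    max_lat: float

    def crosses_dateline(self) -> bool:
        # Assumed convention: min_lon > max_lon means the box wraps at 180/-180.
        return self.min_lon > self.max_lon

    def split(self) -> list:
        """Return one or two boxes, none of which wraps."""
        if not self.crosses_dateline():
            return [self]
        return [
            BBox(self.min_lon, self.min_lat, 180.0, self.max_lat),
            BBox(-180.0, self.min_lat, self.max_lon, self.max_lat),
        ]


as_posted = BBox(-171.7, 42.32, 51.84, 74.55)   # the box quoted above
swapped = BBox(51.84, 42.32, -171.7, 74.55)     # same corners, longitudes swapped
```

Under that convention `as_posted` is an ordinary box spanning the Atlantic side, while `swapped` wraps through the date line and splits into two pieces.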
from pangaear.
Hi, this looks like an issue on PANGAEA's side. It happens for datasets whose bbox crosses the date line. Our code handles queries that include the date line correctly, but the combination is broken.
I will work on a fix.
PANGAEA's SOAP API allows three types of search: intersection, fully included, and mean only. The website only offers the first variant, so the bounding boxes of dataset and query need to overlap for a match. Results are ranked by the distance between the center of the search box and the mean point of the dataset, so datasets that overlap more with the search box tend to score higher. This is why you see a different order if you invert the box: the dataset then matches (because of this bug), but its score is very low, as it is far outside the inverted box.
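To make the three search types and the distance ranking concrete, here is a rough sketch. It is not PANGAEA's implementation: boxes are (minlon, minlat, maxlon, maxlat) tuples, a box with minlon greater than maxlon is assumed to wrap through the date line, the haversine formula is an assumed distance measure, and the dataset box and mean point are hypothetical values.

```python
import math


def lon_width(min_lon, max_lon):
    """Eastward width in degrees (boxes are assumed narrower than 360)."""
    return (max_lon - min_lon) % 360.0


def intersects(q, d):
    """Intersection: query box q and dataset box d overlap."""
    start = (d[0] - q[0]) % 360.0  # dataset west edge, measured east of q's west edge
    lon_ok = start <= lon_width(q[0], q[2]) or start + lon_width(d[0], d[2]) >= 360.0
    return lon_ok and q[1] <= d[3] and d[1] <= q[3]


def fully_included(q, d):
    """Fully included: dataset box d lies entirely inside query box q."""
    start = (d[0] - q[0]) % 360.0
    lon_ok = start + lon_width(d[0], d[2]) <= lon_width(q[0], q[2])
    return lon_ok and q[1] <= d[1] and d[3] <= q[3]


def mean_inside(q, mean):
    """Mean only: the dataset's mean point (lon, lat) lies inside q."""
    lon, lat = mean
    return (lon - q[0]) % 360.0 <= lon_width(q[0], q[2]) and q[1] <= lat <= q[3]


def center(q):
    """Center of a box as (lon, lat), with lon folded back into [-180, 180)."""
    lon = q[0] + lon_width(q[0], q[2]) / 2.0
    return ((lon + 180.0) % 360.0 - 180.0, (q[1] + q[3]) / 2.0)


def distance_km(a, b):
    """Great-circle (haversine) distance between two (lon, lat) points."""
    lon1, lat1, lon2, lat2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


query = (51.84, 42.32, -171.7, 74.55)      # wraps through the date line
inverted = (-171.7, 42.32, 51.84, 74.55)   # longitudes swapped
dataset_box = (160.0, 60.0, -165.0, 70.0)  # hypothetical dataset crossing the date line
dataset_mean = (175.0, 65.0)               # hypothetical mean point
```

With these made-up values the dataset intersects both boxes, but its mean point is much farther from the center of `inverted` than from the center of `query`, which is the kind of ranking difference described above.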
from pangaear.
Thanks for this information @uschindler and for working on a fix. I'll add to the package documentation how the bounding box search is done (with intersection).
While you're here, I'm curious whether the Data Warehouse downloads are available programmatically, or only in the web interface?
from pangaear.
The fix does not seem easy. It affects all datasets that cross the date line.
On your second question: the data warehouse is only available to logged-in users, so you need a login token. This will be available soon: users will be able to create an API token (like on GitHub) that can be used to download datasets on behalf of a user. This makes it possible to create and share scripts, such as a pangaear or pangaeapy script, without including a username and password.
The API for the data warehouse is included in our SOAP API; it's not available via REST yet.
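Once such tokens exist, a shared script could read the token from the environment instead of hard-coding credentials. The sketch below only builds a request object and sends nothing; the bearer scheme, the environment variable name and the URL are all assumptions for illustration, not documented PANGAEA behaviour.

```python
import os
import urllib.request


def authorized_request(url: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a request carrying the token as a bearer header.

    The "Bearer" scheme is an assumption; the thread only says a
    GitHub-style API token will be available.
    """
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


# Hypothetical variable name and placeholder URL, for illustration only.
token = os.environ.get("PANGAEA_TOKEN", "")
request = authorized_request("https://example.org/some-dataset", token)
```

Keeping the token in an environment variable means the script itself can be shared without exposing anyone's credentials.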
from pangaear.
Okay, will look out for the warehouse token update
from pangaear.
Just an update: we can't easily fix the date line issue at the moment, as it is a bug in the underlying Elasticsearch engine that has not been fixed yet: elastic/elasticsearch#22564
We may change to polygons, but that slows things down.
from pangaear.
Thanks for the update.
from pangaear.
Hi,
The issue has been fixed in PANGAEA's API. Searching for datasets with bounding boxes crossing the date line is now fully supported. Search precision is 5 km or 2.5% of the size of the shape (if large).
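One possible reading of that precision statement, written out as arithmetic: the tolerance is 5 km for small shapes and 2.5% of the shape size once that exceeds 5 km. Taking the larger of the two is an assumption, as is the function name; the comment above does not spell out the exact rule.

```python
def search_tolerance_km(shape_size_km: float) -> float:
    """Assumed rule: the larger of 5 km and 2.5% of the shape size."""
    return max(5.0, 0.025 * shape_size_km)


small = search_tolerance_km(100.0)    # 2.5% would be 2.5 km, so the 5 km floor applies
large = search_tolerance_km(1000.0)   # 2.5% of 1000 km exceeds the floor
```

Under this reading the two rules meet at a shape size of 200 km, which would be where "large" begins.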
from pangaear.
great, thanks very much!
from pangaear.
working now, closing
from pangaear.