friedrichknuth / gtsa
Methods to stack geospatial rasters and run memory-efficient computations.
License: MIT License
Below are existing efforts I found that could help us discuss and define a clear objective for GTSA and build its core structure during the Hackweek:
The obvious dependencies that are now more stable:
Apart from GeoWombat's Time Series section, I don't see anything that does what GTSA currently does (scalable spatiotemporal prediction). GeoWombat is also the only one providing an interface to ingest raster data, chunk it, and process it. The limitation is that they have to maintain all these aspects at once in a single package, while GTSA can leave the ingestion, chunking, and vector operations to Rioxarray and Geocube for the most part, and focus on making it easier to apply scalable methods on the processing side. I really like their approach of allowing any PyTorch or other algorithm to be passed; we should probably aim for something similar.
So, in terms of package objectives, I see two core aspects:
SatelliteImage class in GeoUtils: https://geoutils.readthedocs.io/en/latest/satimg_class.html, but it'll take a while).
In terms of ideal code structure, I'm not sure what is best... Definitely not a class-based object. I feel that an Xarray accessor could work quite nicely, but we'd need to grasp all the implications for out-of-memory ops.
For instance:
import gtsa
# The package itself would only be called to open the list of files and stack them out-of-memory to a certain disk location
ds = gtsa.open_rasterstack(list_raster_files=..., zarr_file=...)
# (or this could be several functions if needed: define different tiling types? areas with different projections?)
# Then the Xarray accessor would do everything else
# For example, define additional Xarray attributes to ensure the time/space units are known, or to store the covariance of the data in space and time (based on ObsArray, maybe, if it takes off)
ds.gtsa.time_unit
ds.gtsa.space_unit
# For prediction: have a fit/apply function that returns predicted values at new spatiotemporal locations
ds_pred = ds.gtsa.predict(method=..., time_pred=..., x_pred=...)
ds_pred.to_zarr(zarr_file=...)
Do you think that would work (even out-of-memory)?
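To make the accessor idea a bit more concrete, here is a minimal sketch using xarray's accessor registration. Everything in it is an assumption for illustration only: the `GTSAAccessor` class, the `elevation` variable, storing the units in `attrs`, using decimal years as a numeric time coordinate, and a per-pixel polynomial fit standing in for whatever prediction method would actually be passed.

```python
import numpy as np
import xarray as xr


@xr.register_dataset_accessor("gtsa")
class GTSAAccessor:
    """Hypothetical accessor; not an existing GTSA API."""

    def __init__(self, ds):
        self._ds = ds

    @property
    def time_unit(self):
        # Assumption: units live in the dataset attributes.
        return self._ds.attrs.get("time_unit")

    @property
    def space_unit(self):
        return self._ds.attrs.get("space_unit")

    def predict(self, var, time_pred, deg=1):
        """Fit a polynomial along time for each pixel, then evaluate it at new times."""
        fit = self._ds[var].polyfit(dim="time", deg=deg)
        t_new = np.asarray(time_pred, dtype="float64")
        t_new = xr.DataArray(t_new, dims="time", coords={"time": t_new}, name="time")
        pred = xr.polyval(t_new, fit.polyfit_coefficients)
        return pred.to_dataset(name=var)


# Tiny in-memory stack: 3 time steps (decimal years) over a 2 x 2 grid.
t = np.array([2000.0, 2005.0, 2010.0])
data = 2.0 * t[:, None, None] + np.zeros((3, 2, 2))
ds = xr.Dataset(
    {"elevation": (("time", "y", "x"), data)},
    coords={"time": t, "y": [0.0, 1.0], "x": [0.0, 1.0]},
    attrs={"time_unit": "decimal year", "space_unit": "m"},
)

ds_pred = ds.gtsa.predict("elevation", time_pred=[2002.5, 2007.5])
```

The example runs in memory; whether the same pattern stays lazy on a Dask-backed Zarr stack would depend on the method plugged into `predict`, which is exactly the open question above.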
That's all I've got for now!
Currently, data are stored as float64 by default, which is excessive precision for most analyses. Other data types should be offered as an option to reduce the size of the Zarr stack on disk.
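As a rough illustration of the saving, the sketch below casts a small stack from float64 to float32 and compares the in-memory size. The `cast_stack` helper and the `elevation` variable are hypothetical names, not part of GTSA.

```python
import numpy as np
import xarray as xr


def cast_stack(ds, dtype="float32"):
    """Hypothetical helper: cast every data variable of a stack to `dtype`."""
    # On Dask-backed data this cast is lazy, so nothing is loaded here.
    return ds.astype(dtype)


stack64 = xr.Dataset(
    {"elevation": (("time", "y", "x"), np.zeros((10, 4, 4), dtype="float64"))}
)
stack32 = cast_stack(stack64, "float32")

# Compare the array sizes in bytes before and after the cast.
print(stack64["elevation"].nbytes, stack32["elevation"].nbytes)
```

When writing the stack, xarray's `to_zarr` also accepts a per-variable `encoding` dictionary with a `dtype` entry, which may be a cleaner way to expose this as a user option than an explicit cast.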