
geedim's Introduction


geedim

Search, composite, and download Google Earth Engine imagery, without size limits.

Description

geedim provides a command line interface and API for searching, compositing and downloading satellite imagery from Google Earth Engine (EE). It optionally performs cloud/shadow masking, and cloud/shadow-free compositing on supported collections. Images and composites can be downloaded, or exported to Google Drive, an Earth Engine asset or Google Cloud Storage. Images larger than the EE size limit are split and downloaded as separate tiles, then re-assembled into a single GeoTIFF.
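The split-and-reassemble step can be pictured with a minimal sketch. This is not geedim's actual tiling code: the function name `tile_windows` and the sizes used are hypothetical, chosen only to show how an image can be covered by tiles that each stay under a maximum dimension.

```python
from typing import Iterator, Tuple

# (row_offset, col_offset, height, width) of one tile
Window = Tuple[int, int, int, int]


def tile_windows(height: int, width: int, max_tile_dim: int) -> Iterator[Window]:
    """Yield windows covering a height x width image, none larger than max_tile_dim."""
    if max_tile_dim <= 0:
        raise ValueError("max_tile_dim must be positive")
    for row in range(0, height, max_tile_dim):
        for col in range(0, width, max_tile_dim):
            # Edge tiles are clipped so they do not run past the image bounds.
            yield (row, col, min(max_tile_dim, height - row), min(max_tile_dim, width - col))


windows = list(tile_windows(height=2500, width=1200, max_tile_dim=1000))
# Every pixel is covered exactly once, so the tile areas sum to the image area.
assert sum(h * w for _, _, h, w in windows) == 2500 * 1200
```

Each window could then be downloaded independently and written into the matching region of the output file.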

See the documentation site for more detail: https://geedim.readthedocs.io/.

Cloud/shadow support

Any EE imagery can be searched, composited and downloaded by geedim. Cloud/shadow masking, and cloud/shadow-free compositing are supported on the following collections:

EE name                       Description
LANDSAT/LT04/C02/T1_L2        Landsat 4, collection 2, tier 1, level 2 surface reflectance.
LANDSAT/LT05/C02/T1_L2        Landsat 5, collection 2, tier 1, level 2 surface reflectance.
LANDSAT/LE07/C02/T1_L2        Landsat 7, collection 2, tier 1, level 2 surface reflectance.
LANDSAT/LC08/C02/T1_L2        Landsat 8, collection 2, tier 1, level 2 surface reflectance.
LANDSAT/LC09/C02/T1_L2        Landsat 9, collection 2, tier 1, level 2 surface reflectance.
COPERNICUS/S2                 Sentinel-2, level 1C, top of atmosphere reflectance.
COPERNICUS/S2_SR              Sentinel-2, level 2A, surface reflectance.
COPERNICUS/S2_HARMONIZED      Harmonised Sentinel-2, level 1C, top of atmosphere reflectance.
COPERNICUS/S2_SR_HARMONIZED   Harmonised Sentinel-2, level 2A, surface reflectance.

Installation

geedim is a Python 3 package, and requires users to be registered with Google Earth Engine.

It can be installed with pip or conda.

pip

pip install geedim

conda

conda install -c conda-forge geedim

Authentication

Following installation, Earth Engine should be authenticated:

earthengine authenticate

Getting started

Command line interface

geedim command line functionality is accessed through the commands:

  • search: Search for images.
  • composite: Create a composite image.
  • download: Download image(s).
  • export: Export image(s).
  • config: Configure cloud/shadow masking.

Get help on geedim with:

geedim --help

and help on a geedim command with:

geedim <command> --help

Examples

Search for Landsat-8 images, reporting cloudless portions.

geedim search -c l8-c2-l2 -s 2021-06-01 -e 2021-07-01 --bbox 24 -33 24.1 -33.1 --cloudless-portion

Download a Landsat-8 image with cloud/shadow mask applied.

geedim download -i LANDSAT/LC08/C02/T1_L2/LC08_172083_20210610 --bbox 24 -33 24.1 -33.1 --mask

Command pipelines

Multiple geedim commands can be chained together in a pipeline, where image results from the previous command form inputs to the current command. For example, if the composite command is chained with the download command, the created composite image will be downloaded; if the search command is chained with the composite command, the search result images will be composited.

Common command options are also piped between chained commands. For example, if the config command is chained with other commands, the configuration specified with config will be applied to subsequent commands in the pipeline. Many command combinations are possible.

Examples

Composite two Landsat-7 images and download the result:

geedim composite -i LANDSAT/LE07/C02/T1_L2/LE07_173083_20100203 -i LANDSAT/LE07/C02/T1_L2/LE07_173083_20100219 download --bbox 22 -33.1 22.1 -33 --crs EPSG:3857 --scale 30

Composite the results of a Landsat-8 search and download the result.

geedim search -c l8-c2-l2 -s 2019-02-01 -e 2019-03-01 --bbox 23 -33 23.2 -33.2 composite -cm q-mosaic download --scale 30 --crs EPSG:3857

Composite the results of a Landsat-8 search, export to Earth Engine asset, and download the asset image.

geedim search -c l8-c2-l2 -s 2019-02-01 -e 2019-03-01 --bbox 23 -33 23.2 -33.2 composite -cm q-mosaic export --type asset --folder <your cloud project> --scale 30 --crs EPSG:3857 download

Search for Sentinel-2 SR images with a cloudless portion of at least 60%, using the qa mask-method to identify clouds:

geedim config --mask-method qa search -c s2-sr --cloudless-portion 60 -s 2022-01-01 -e 2022-01-14 --bbox 24 -34 24.5 -33.5

API

Example

import geedim as gd

gd.Initialize()  # initialise earth engine

# geojson polygon to search / download
region = {
    "type": "Polygon",
    "coordinates": [[[24, -33.6], [24, -33.53], [23.93, -33.53], [23.93, -33.6], [24, -33.6]]]
}

# make collection and search, reporting cloudless portions
coll = gd.MaskedCollection.from_name('COPERNICUS/S2_SR')
coll = coll.search('2019-01-10', '2019-01-21', region, cloudless_portion=0)
print(coll.schema_table)
print(coll.properties_table)

# create and download an image
im = gd.MaskedImage.from_id('COPERNICUS/S2_SR/20190115T080251_20190115T082230_T35HKC')
im.download('s2_image.tif', region=region)

# composite search results and download
comp_im = coll.composite()
comp_im.download('s2_comp_image.tif', region=region, crs='EPSG:32735', scale=30)

License

This project is licensed under the terms of the Apache-2.0 License.

Contributing

See the documentation for details.

Credits

geedim's People

Contributors

dugalh

geedim's Issues

Export to google cloud storage bucket fails w/ `ee.ee_exception.EEException: Unknown configuration options: {'assetId': 'test'}`.

Thank you for this excellent tool! I was playing around with the new export function/types and ran into a minor issue with export to google cloud bucket. When using export() with ExportType "cloud" and filename e.g. "test" I get:

ee.ee_exception.EEException: Unknown configuration options: {'assetId': 'test'}

It appears that assetId is not an argument of Export.image.toCloudStorage(). I think it should work to set just description here:

https://github.com/dugalh/geedim/blob/7d8ecfcb2e4505192a3e19364fc0afcfda7452ae/geedim/download.py#L805-L808
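The mismatch can be illustrated with a small sketch that builds destination-specific keyword arguments. It is not geedim's code: `export_kwargs` is a hypothetical helper, the export type strings and the asset ID layout are assumptions, and only the parameter names (`description`, `assetId`, `bucket`, `folder`, `fileNamePrefix`) come from the Earth Engine `ee.batch.Export.image.to*()` functions.

```python
def export_kwargs(export_type: str, filename: str, folder: str) -> dict:
    """Build destination-specific keyword arguments for an Earth Engine export task."""
    # 'description' is accepted by toAsset(), toCloudStorage() and toDrive().
    kwargs = {"description": filename}
    if export_type == "asset":
        # Only toAsset() takes 'assetId'. The path layout here is an assumption.
        kwargs["assetId"] = f"projects/{folder}/assets/{filename}"
    elif export_type == "cloud":
        kwargs.update(bucket=folder, fileNamePrefix=filename)
    elif export_type == "drive":
        kwargs.update(folder=folder, fileNamePrefix=filename)
    else:
        raise ValueError(f"Unknown export type: {export_type}")
    return kwargs


# A cloud storage export carries no 'assetId', avoiding the reported exception.
assert "assetId" not in export_kwargs("cloud", "test", "my-bucket")
```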

Option to define nodata value [Feature request]

First, thank you for this awesome lib!
In the current implementation, the nodata value is chosen automatically based on the data type. Letting the user define it manually would be very useful. For example, exporting a uint8 image with valid zero values gives incorrect results, since 0 is used as the nodata value for uint8. I faced a similar problem when exporting Google Dynamic World classes. To work around this limitation, I had to unmask the image with the value 255 (to represent nodata) and pass set_nodata = False to geedim. However, I then lost the ability to set 255 as nodata in the GeoTIFF file metadata.

Unbounded search and download errors

  • Searching an unbounded collection without a region is not possible due to ee.Image.reduceRegion() in MaskedImage._set_region_stats(). The ee.EEException that is raised should be communicated to the user in a more helpful way.
  • MODIS NBAR images are unbounded but not recognised as such by the checks in BaseImage._prepare_for_export(), resulting in a cryptic EE error. The unbounded check should also check for a footprint that is global.
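A global-footprint check of the kind suggested in the second point might look like the sketch below. It assumes the footprint is available as a ring of (longitude, latitude) pairs, possibly containing infinite values; the exact form Earth Engine returns for unbounded images is not confirmed here, and `is_global_footprint` is a hypothetical name rather than part of geedim.

```python
import math
from typing import Sequence


def is_global_footprint(coords: Sequence[Sequence[float]], tol: float = 1e-6) -> bool:
    """Return True if a ring of (lon, lat) pairs is non-finite or spans the whole globe."""
    # float() also accepts strings such as 'Infinity' and '-Infinity'.
    lons = [float(lon) for lon, _ in coords]
    lats = [float(lat) for _, lat in coords]
    if not all(math.isfinite(value) for value in lons + lats):
        return True  # infinite coordinates are treated as unbounded
    spans_lon = min(lons) <= -180 + tol and max(lons) >= 180 - tol
    spans_lat = min(lats) <= -90 + tol and max(lats) >= 90 - tol
    return spans_lon and spans_lat


world = [[-180, -90], [180, -90], [180, 90], [-180, 90], [-180, -90]]
assert is_global_footprint(world)
```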

'BaseImage' object has no attribute 'dowwnload'

Very useful package! I am trying to incorporate it into geemap, but it seems the BaseImage.download() function cannot be placed within a function. It throws an error 'BaseImage' object has no attribute 'dowwnload'. It works fine when it is not placed within a function. It is probably because multi-thread processes can only be run at the top level. Do you have a solution for this?

def download_ee_image(
    image,
    filename,
    scale=None,
    **kwargs,
):

    import geedim as gd
    
    if scale is not None:
        kwargs['scale'] = scale
    
    img = gd.download.BaseImage(image)
    img.dowwnload(filename, **kwargs)

`scale` parameter not working correctly when using `download` function

Hi, first of all, thank you very much for this wonderful repo. I am happy that I am able to download GEE images for my area of interest.

When I try to download Sentinel-2 imagery using the code below, the image downloads but does not have the correct scale.

dataset = ee.ImageCollection('COPERNICUS/S2').filterDate('2021-11-01', '2021-11-30').filterBounds(ee_fc).filter(ee.Filter.lt('CLOUDY_PIXEL_PERCENTAGE',5)).map(maskS2clouds)
dataset = dataset.median().select(['B4', 'B3', 'B2', 'B8', 'B11', 'B12']).clip(ee_fc)

img = geedim.download.BaseImage(dataset)
img.download(scale=10, filename="aoi_s2.tif", dtype='float32', region=ee_fc.geometry(), crs='EPSG:43426')

From this code, I got an image with the following pixel size, which seems strange to me.

Dimensions	X: 441 Y: 188 Bands: 6
Origin	80.4700950046142083,29.1321850009392591
Pixel Size	8.983152841195391012e-05,-8.983152841195391012e-05

Am I doing anything wrong here?
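One way to read the reported pixel size is to convert the requested 10 m scale into degrees at the equator. The arithmetic below uses the WGS84 semi-major axis; it only shows that the number is consistent with a 10 m pixel expressed in degrees, which would suggest (but does not confirm) that the file was written in a geographic CRS rather than a projected one.

```python
import math

WGS84_SEMI_MAJOR_AXIS_M = 6378137.0

# Length of one degree of longitude at the equator, in metres.
metres_per_degree = 2 * math.pi * WGS84_SEMI_MAJOR_AXIS_M / 360

scale_m = 10  # the scale passed to download()
pixel_size_deg = scale_m / metres_per_degree

reported = 8.983152841195391012e-05  # pixel size from the report above
assert math.isclose(pixel_size_deg, reported, rel_tol=1e-9)
```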

download of specific bands by clipping to a region fails

Issue: Downloading a clipped tile by passing the region fails with the error message below:

User limit exceeded, decrease the max_tile_size or max_tile_dim limit

When I set the region to None there is no issue.

Below is the code I use to download the data:

  gee_img = gd.MaskedImage.from_id(fc_df["ID"])
  download_bands_from_gee(gee_img, geom, band_names=['B4','B8','SCL'])
def download_all_bands_from_gee(collection,geom=None, ofilename=None):
    if ofilename is None:
        ofilename = pathlib.Path(collection.id).stem + ".tif"

    collection.download(ofilename, region=geom,overwrite = True)

I search and get the list of tiles and loop through the tiles.



def download_bands_from_gee(collection, geom, band_names=0):
    
    coll = collection.ee_image
    img = coll.select(band_names)
    info = img.getInfo()
    band_name = info["bands"][0]["id"]
    proj_target = img.select(0).projection()
    crs_target = proj_target.getInfo()['crs']
    ofilename = pathlib.Path(collection.id).stem + "_" + band_name + ".tif"
    collection_new = gd.MaskedImage(img)
    
    download_all_bands_from_gee(collection_new, None,  ofilename)

Scale issue

The following code throws an error: EEException: Can't convert a computed geometry to GeoJSON. Use getInfo() instead. However, if the scale is changed to a value larger than 30 (e.g., 60), it works fine.

import ee
import geedim as gd

ee.Initialize()
image = ee.Image('LANDSAT/LC09/C02/T1_L2/LC09_023031_20220617')
region = image.geometry()
image = image.multiply(0.0000275).add(-0.2).set(image.toDictionary())
img = gd.download.BaseImage(image)
img.download('landsat.tif', region=region, scale=30)

Sentinel-2 masking fails when missing cloud data.

I am having trouble using geedim with a specific image from the harmonized Sentinel-2 collection, namely COPERNICUS/S2_SR_HARMONIZED/20230809T201859_20230809T201853_T10WEE. Whenever this image comes up in a search collection, I cannot process the collection (for example, to extract data). It fails with the error: Image.rename: Parameter 'input' is required.

I do not know what is specifically different about this image. I narrowed it down to a problem with the processing geedim applies to the image when using the region parameter, but I cannot find the problem itself.

I am using a conda environment with

python                    3.11.9               he1021f5_0
earthengine-api           0.1.408            pyhd8ed1ab_0    conda-forge
geedim                    1.8.0              pyhd8ed1ab_0    conda-forge

I tested it with a bunch of images in a specific region and different region polygons. I made an example using getInfo() on the resulting image to trigger the processing, but it applies to any other client-side GEE method, including geedim's download():

import ee
import geedim

ee.Initialize()

poly_coords1 = [
    [-121.30224921178123,71.77255565243381],
    [-121.69226386021873,71.58260352221238],
    [-120.87378241490623,71.52698680217173],
    [-120.86828925084373,71.70370209220792],
    [-121.30224921178123,71.77255565243381],
]
poly_coords2 = [
    [-121.65930487584373,71.92795376863671],
    [-121.70874335240623,71.82200371558541],
    [-121.16492011021873,71.84769170326668],
    [-121.45056464146873,71.90237555398673],
    [-121.65930487584373,71.92795376863671],
]
poly_coords3 = [
    [-120.97265936803123,71.98070472145938],
    [-120.98913886021873,71.73784463314388],
    [-120.08276678990623,71.74472884706717],
    [-120.28052069615623,71.98750044350669],
    [-120.97265936803123,71.98070472145938],
]

bounds = [
    ee.Geometry.Polygon(list(poly_coords1), proj=None, evenOdd=False),
    ee.Geometry.Polygon(list(poly_coords2), proj=None, evenOdd=False),
    ee.Geometry.Polygon(list(poly_coords3), proj=None, evenOdd=False),
]

images = [
    "20230801T200901_20230801T200857_T10WEE",
    "20230802T202849_20230802T202847_T10WEE",
    "20230803T195859_20230803T200054_T10WEE",
    "20230804T201851_20230804T201853_T10WEE",
    "20230804T201851_20230804T202447_T10WEE",
    "20230806T200859_20230806T201056_T10WEE",
    "20230807T202851_20230807T203151_T10WEE",
    "20230808T195901_20230808T195945_T10WEE",
    "20230811T200901_20230811T200857_T10WEE",
    "20230812T202849_20230812T202934_T10WEE",
    "20230813T195859_20230813T195900_T10WEE",
    "20230814T201851_20230814T202428_T10WEE",
    "20230809T201859_20230809T201853_T10WEE", # <- this one is broken
]

for i,poly in enumerate(bounds):
    print(f"Polygon {i}:")
    for image_id in images:
        try:
            ( geedim.MaskedImage
                .from_id("COPERNICUS/S2_SR_HARMONIZED/" + image_id, region=poly)
                .ee_image.getInfo()
            )
            print(f" {image_id} -> √")
        except ee.ee_exception.EEException as e:
            print(f" {image_id} -> error: {e}")

which produces

Polygon 0:
 20230801T200901_20230801T200857_T10WEE -> √
 20230802T202849_20230802T202847_T10WEE -> √
 20230803T195859_20230803T200054_T10WEE -> √
 20230804T201851_20230804T201853_T10WEE -> √
 20230804T201851_20230804T202447_T10WEE -> √
 20230806T200859_20230806T201056_T10WEE -> √
 20230807T202851_20230807T203151_T10WEE -> √
 20230808T195901_20230808T195945_T10WEE -> √
 20230811T200901_20230811T200857_T10WEE -> √
 20230812T202849_20230812T202934_T10WEE -> √
 20230813T195859_20230813T195900_T10WEE -> √
 20230814T201851_20230814T202428_T10WEE -> √
 20230809T201859_20230809T201853_T10WEE -> error: Image.rename: Parameter 'input' is required.
Polygon 1:
 20230801T200901_20230801T200857_T10WEE -> √
 20230802T202849_20230802T202847_T10WEE -> √
 20230803T195859_20230803T200054_T10WEE -> √
 20230804T201851_20230804T201853_T10WEE -> √
 20230804T201851_20230804T202447_T10WEE -> √
 20230806T200859_20230806T201056_T10WEE -> √
 20230807T202851_20230807T203151_T10WEE -> √
 20230808T195901_20230808T195945_T10WEE -> √
 20230811T200901_20230811T200857_T10WEE -> √
 20230812T202849_20230812T202934_T10WEE -> √
 20230813T195859_20230813T195900_T10WEE -> √
 20230814T201851_20230814T202428_T10WEE -> √
 20230809T201859_20230809T201853_T10WEE -> error: Image.rename: Parameter 'input' is required.
Polygon 2:
 20230801T200901_20230801T200857_T10WEE -> √
 20230802T202849_20230802T202847_T10WEE -> √
 20230803T195859_20230803T200054_T10WEE -> √
 20230804T201851_20230804T201853_T10WEE -> √
 20230804T201851_20230804T202447_T10WEE -> √
 20230806T200859_20230806T201056_T10WEE -> √
 20230807T202851_20230807T203151_T10WEE -> √
 20230808T195901_20230808T195945_T10WEE -> √
 20230811T200901_20230811T200857_T10WEE -> √
 20230812T202849_20230812T202934_T10WEE -> √
 20230813T195859_20230813T195900_T10WEE -> √
 20230814T201851_20230814T202428_T10WEE -> √
 20230809T201859_20230809T201853_T10WEE -> error: Image.rename: Parameter 'input' is required.

Yearly composite problem

Hello. I have a problem downloading an annual composite for a certain area. After part of the image has downloaded, I get this error:

Error downloading tile: Expected a homogeneous image collection, but an image with incompatible bands was encountered: First image type: 23 bands ([B1, B2, B3, B4, B5, B6, B7, B8, B8A, B9, B11, B12, AOT, WVP, SCL, TCI_R, TCI_G, TCI_B, MSK_CLDPRB, MSK_SNWPRB, QA10, QA20, QA60]). Current image type: 21 bands ([B1, B2, B3, B4, B5, B6, B7, B8, B8A, B9, B11, B12, AOT, WVP, SCL, TCI_R, TCI_G, TCI_B, QA10, QA20, QA60]).

Is there a way around this error?
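The error message lists two band sets that differ only in MSK_CLDPRB and MSK_SNWPRB. One possible workaround, offered as an assumption rather than a confirmed fix, is to restrict every image to the bands they all share before compositing. The hypothetical helper below computes that shared set from the two lists in the message.

```python
from typing import List


def common_bands(*band_lists: List[str]) -> List[str]:
    """Return the bands present in every list, in the order of the first list."""
    if not band_lists:
        return []
    shared = set(band_lists[0]).intersection(*band_lists[1:])
    return [band for band in band_lists[0] if band in shared]


first = ["B1", "B2", "B3", "B4", "B5", "B6", "B7", "B8", "B8A", "B9", "B11", "B12",
         "AOT", "WVP", "SCL", "TCI_R", "TCI_G", "TCI_B", "MSK_CLDPRB", "MSK_SNWPRB",
         "QA10", "QA20", "QA60"]
current = ["B1", "B2", "B3", "B4", "B5", "B6", "B7", "B8", "B8A", "B9", "B11", "B12",
           "AOT", "WVP", "SCL", "TCI_R", "TCI_G", "TCI_B", "QA10", "QA20", "QA60"]

# The shared bands are exactly the 21 bands of the second image type.
assert common_bands(first, current) == current
```

The resulting list could be passed to ee.Image.select() on each image. Whether this fits geedim's cloud masking, which may rely on some of the dropped bands, is not something the report settles.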

BadZipFile: File is not a zip file

I used a library called geemap. One of the functions in this library, geemap.download_ee_image, uses the geedim library to download image results calculated through the Earth Engine API.

However, when I use this function to download the image, I can see the progress bar increasing, but when it reaches a certain download progress, it throws an error.

BadZipFile Traceback (most recent call last)
Cell In[23], line 37
34 temp_image.select(["SR_B1", "SR_B2", "SR_B3", "SR_B4","SR_B5", "SR_B7"])
36 y1=1990+i
---> 37 geemap.download_ee_image(temp_image, filename=f'landsat_image_{y1}.tif', scale=30, region=boundary.geometry(),crs='EPSG:4326')
40 for i in range(0,11):
41 start_time=ee.Date('2013-06-01').advance(i,'year')

File d:\anaconda3\envs\myenv\lib\site-packages\geemap\common.py:12925, in download_ee_image(image, filename, region, crs, crs_transform, scale, resampling, dtype, overwrite, num_threads, max_tile_size, max_tile_dim, shape, scale_offset, unmask_value, **kwargs)
12922 kwargs["scale_offset"] = scale_offset
12924 img = gd.download.BaseImage(image)

12925 img.download(filename, overwrite=overwrite, num_threads=num_threads, **kwargs)

File d:\anaconda3\envs\myenv\lib\site-packages\geedim\download.py:955, in BaseImage.download(self, filename, overwrite, num_threads, max_tile_size, max_tile_dim, **kwargs)
953 logger.info(f'Exception: {str(ex)}\nCancelling...')
954 executor.shutdown(wait=False)
--> 955 raise ex
957 bar.update(bar.total - bar.n) # ensure the bar reaches 100%
958 # populate GeoTIFF metadata

File d:\anaconda3\envs\myenv\lib\site-packages\geedim\download.py:951, in BaseImage.download(self, filename, overwrite, num_threads, max_tile_size, max_tile_dim, **kwargs)
949 try:
950 for future in as_completed(futures):
--> 951 future.result()
952 except Exception as ex:
953 logger.info(f'Exception: {str(ex)}\nCancelling...')

File d:\anaconda3\envs\myenv\lib\concurrent\futures\_base.py:437, in Future.result(self, timeout)
435 raise CancelledError()
436 elif self._state == FINISHED:
--> 437 return self.__get_result()
439 self._condition.wait(timeout)
441 if self._state in [CANCELLED, CANCELLED_AND_NOTIFIED]:

File d:\anaconda3\envs\myenv\lib\concurrent\futures\_base.py:389, in Future.__get_result(self)
387 if self._exception:
388 try:
--> 389 raise self._exception
390 finally:
391 # Break a reference cycle with the exception in self._exception
392 self = None

File d:\anaconda3\envs\myenv\lib\concurrent\futures\thread.py:57, in _WorkItem.run(self)
54 return
56 try:
---> 57 result = self.fn(*self.args, **self.kwargs)
58 except BaseException as exc:
59 self.future.set_exception(exc)

File d:\anaconda3\envs\myenv\lib\site-packages\geedim\download.py:941, in BaseImage.download.<locals>.download_tile(tile)
939 def download_tile(tile):
940 """Download a tile and write into the destination GeoTIFF. """
--> 941 tile_array = tile.download(session=session, bar=bar)
942 with out_lock:
943 out_ds.write(tile_array, window=tile.window)

File d:\anaconda3\envs\myenv\lib\site-packages\geedim\tile.py:121, in Tile.download(self, session, response, bar)
118 zip_buffer.flush()
120 # extract geotiff from zipped buffer into another buffer
--> 121 zip_file = zipfile.ZipFile(zip_buffer)
122 ext_buffer = BytesIO(zip_file.read(zip_file.filelist[0]))
124 # read the geotiff with a rasterio memory file

File d:\anaconda3\envs\myenv\lib\zipfile.py:1269, in ZipFile.__init__(self, file, mode, compression, allowZip64, compresslevel, strict_timestamps)
1267 try:
1268 if mode == 'r':
-> 1269 self._RealGetContents()
1270 elif mode in ('w', 'x'):
1271 # set the modified flag so central directory gets written
1272 # even if no files are added to the archive
1273 self._didModify = True

File d:\anaconda3\envs\myenv\lib\zipfile.py:1336, in ZipFile._RealGetContents(self)
1334 raise BadZipFile("File is not a zip file")
1335 if not endrec:
-> 1336 raise BadZipFile("File is not a zip file")
1337 if self.debug > 1:
1338 print(endrec)

BadZipFile: File is not a zip file

How can I solve this problem? Is there a better way to directly download Earth Engine image objects?
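The traceback shows zipfile.ZipFile being opened on a downloaded buffer that is not a zip archive. A defensive check along the following lines could surface whatever the server actually sent. This is a sketch using only the standard library, not geedim's implementation, and `read_first_zip_member` is a hypothetical name.

```python
import io
import zipfile


def read_first_zip_member(payload: bytes) -> bytes:
    """Return the first file in a zipped payload, or raise showing the payload's text."""
    buffer = io.BytesIO(payload)
    if not zipfile.is_zipfile(buffer):
        # The response may be an error message rather than a zip archive.
        preview = payload[:200].decode("utf-8", errors="replace")
        raise ValueError(f"Response is not a zip archive: {preview!r}")
    with zipfile.ZipFile(buffer) as zip_file:
        return zip_file.read(zip_file.filelist[0])
```

With a check like this, a failed tile would report the start of the response body instead of a bare BadZipFile.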

Support for BIGTIFF/other custom GDAL options

Thanks for the excellent package! I've been putting it to good use lately.

I noticed that if a download results in a combined GeoTIFF >4GB (after 'deflate' compression), writing to the file with rasterio will result in this GDAL error:

ERROR 1: TIFFAppendToStrip:Maximum TIFF file size exceeded. Use BIGTIFF=YES creation option.

From reading https://gdal.org/drivers/raster/gtiff.html#creation-options, I see GDAL defaults to BIGTIFF=IF_NEEDED, which incidentally never results in a BigTIFF file when a compression method is being used.

Since geedim specifies "deflate" compression and other GDAL dataset write options, but not BIGTIFF, lengthy downloads will fail part way through if/when the file reaches ~4GB. I think there probably would be little risk to just setting BIGTIFF='YES' in profile from _prepare_for_download() but perhaps IF_SAFER would be better. It is my understanding that some non-GDAL software still does not support libtiff > 4 / BIGTIFF.

More generally, if the user could decide which (if any) compression method they want to use, or have access to set/modify other GDAL options, then setting BIGTIFF for some data sets/extents would be up to the user.
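As a rough illustration of how a BIGTIFF setting could be chosen up front, the sketch below compares the uncompressed raster size with the 4 GiB classic TIFF limit. It is an assumption-laden heuristic, not geedim behaviour: `bigtiff_option` is a hypothetical helper, and the uncompressed size is only a rough upper bound on the size of a compressed file.

```python
CLASSIC_TIFF_LIMIT = 4 * 1024**3  # bytes; classic TIFF uses 32-bit offsets


def bigtiff_option(width: int, height: int, bands: int, bytes_per_sample: int) -> str:
    """Pick a GDAL BIGTIFF creation option from the uncompressed raster size."""
    uncompressed_size = width * height * bands * bytes_per_sample
    # The compressed size is unknown before writing, so the uncompressed size
    # is used as a conservative estimate.
    return "YES" if uncompressed_size >= CLASSIC_TIFF_LIMIT else "NO"


# A 40000 x 40000 image with 6 float32 bands is about 38 GB uncompressed.
assert bigtiff_option(40000, 40000, bands=6, bytes_per_sample=4) == "YES"
```

Returning NO for small rasters keeps files readable by software without BigTIFF support, which is the compatibility concern raised above.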
