
rio-cogeo's Introduction

rio-cogeo


Cloud Optimized GeoTIFF (COG) creation and validation plugin for Rasterio.



Documentation: https://cogeotiff.github.io/rio-cogeo/

Source Code: https://github.com/cogeotiff/rio-cogeo


Cloud Optimized GeoTIFF

This plugin aims to facilitate the creation and validation of Cloud Optimized GeoTIFF (COG or COGEO). While it respects the COG specifications, this plugin also enforces several features:

  • Internal overviews (User can remove overview with option --overview-level 0)
  • Internal tiles (default profiles have 512x512 internal tiles)

Important: GDAL 3.1 added a new COG driver (doc, discussion). Starting with rio-cogeo version 2.2, the --use-cog-driver option creates COGs using that driver.

Install

$ pip install -U pip
$ pip install rio-cogeo

Or install from source:

$ pip install -U pip
$ pip install git+https://github.com/cogeotiff/rio-cogeo.git

GDAL Version

It is recommended to use GDAL > 2.3.2. Previous versions might not be able to create proper COGs (ref: OSGeo/gdal#754).

More info in #55

More

Blog post on good and bad COG formats: https://medium.com/@_VincentS_/do-you-really-want-people-using-your-data-ec94cd94dc3f

Check out the rio-glui or rio-viz rasterio plugins to explore COGs locally in your web browser.

Contribution & Development

See CONTRIBUTING.md

Changes

See CHANGES.md.

License

See LICENSE

rio-cogeo's People

Contributors

alexismanin, drnextgis, geospatial-jeff, j08lue, kant, kylebarron, mentaljam, mplough-kobold, perliedman, pierotofy, richardscottoz, robmarkcole, rukku, sellersevan, sgillies, vincentsarago

rio-cogeo's Issues

Float32 data promoted to Float64

I used rio-cogeo to convert a single-band GeoTIFF of Type=Float32. To my surprise, the output was a Type=Float64 GeoTIFF. Any idea why that happened and how to keep the data as Float32?

Here's the actual rio-cogeo command I used:

rio cogeo create --cog-profile lzw Southern_California_Topobathy_DEM_1m.tif Southern_California_Topobathy_DEM_1m_COG.tif

And here's the gdalinfo for the INPUT file:

Driver: GTiff/GeoTIFF
Files: Southern_California_Topobathy_DEM_1m.tif
       Southern_California_Topobathy_DEM_1m.tif.ovr
       Southern_California_Topobathy_DEM_1m.tif.aux.xml
Size is 316120, 225217
Coordinate System is:
PROJCS["NAD_1983_NSRS2007_UTM_Zone_11N",
    GEOGCS["GCS_NAD_1983_NSRS2007",
        DATUM["NAD_1983_NSRS2007",
            SPHEROID["GRS_1980",6378137,298.257222101]],
        PRIMEM["Greenwich",0],
        UNIT["degree",0.0174532925199433]],
    PROJECTION["Transverse_Mercator"],
    PARAMETER["latitude_of_origin",0],
    PARAMETER["central_meridian",-117],
    PARAMETER["scale_factor",0.9996],
    PARAMETER["false_easting",500000],
    PARAMETER["false_northing",0],
    UNIT["metre",1,
        AUTHORITY["EPSG","9001"]]]
Origin = (179523.999999998224666,3824832.000000000000000)
Pixel Size = (1.000000000000000,-1.000000000000000)
Metadata:
  AREA_OR_POINT=Area
  DataType=Generic
Image Structure Metadata:
  COMPRESSION=LZW
  INTERLEAVE=BAND
Corner Coordinates:
Upper Left  (  179524.000, 3824832.000) (120d29'26.74"W, 34d30'55.16"N)
Lower Left  (  179524.000, 3599615.000) (120d24'36.73"W, 32d29'15.49"N)
Upper Right (  495644.000, 3824832.000) (117d 2'50.95"W, 34d33'54.85"N)
Lower Right (  495644.000, 3599615.000) (117d 2'47.00"W, 32d32' 1.94"N)
Center      (  337584.000, 3712223.500) (118d44'57.15"W, 33d32'14.24"N)
Band 1 Block=128x128 Type=Float32, ColorInterp=Gray
  Min=-3857.312 Max=3067.828
  Minimum=-3857.312, Maximum=3067.828, Mean=-353.493, StdDev=804.656
  NoData Value=-3.40282306073709653e+38
  Overviews: 79030x56305, 39515x28153, 19758x14077, 9879x7039, 4940x3520, 2470x1760, 1235x880, 618x440, 309x220
  Metadata:
    RepresentationType=ATHEMATIC
    STATISTICS_COVARIANCES=647471.7255957157
    STATISTICS_MAXIMUM=3067.8276367188
    STATISTICS_MEAN=-353.49345874696
    STATISTICS_MINIMUM=-3857.3120117188
    STATISTICS_SKIPFACTORX=1
    STATISTICS_SKIPFACTORY=1
    STATISTICS_STDDEV=804.65627791978

And the gdalinfo for the OUTPUT file:

 Driver: GTiff/GeoTIFF
Files: tmp2cd_5blz.tif
Size is 316120, 225217
Coordinate System is:
PROJCS["NAD_1983_NSRS2007_UTM_Zone_11N",
    GEOGCS["GCS_NAD_1983_NSRS2007",
        DATUM["NAD_1983_NSRS2007",
            SPHEROID["GRS_1980",6378137,298.257222101]],
        PRIMEM["Greenwich",0],
        UNIT["degree",0.0174532925199433]],
    PROJECTION["Transverse_Mercator"],
    PARAMETER["latitude_of_origin",0],
    PARAMETER["central_meridian",-117],
    PARAMETER["scale_factor",0.9996],
    PARAMETER["false_easting",500000],
    PARAMETER["false_northing",0],
    UNIT["metre",1,
        AUTHORITY["EPSG","9001"]]]
Origin = (179523.999999998224666,3824832.000000000000000)
Pixel Size = (1.000000000000000,-1.000000000000000)
Metadata:
  AREA_OR_POINT=Area
Image Structure Metadata:
  INTERLEAVE=BAND
Corner Coordinates:
Upper Left  (  179524.000, 3824832.000) (120d29'26.74"W, 34d30'55.16"N)
Lower Left  (  179524.000, 3599615.000) (120d24'36.73"W, 32d29'15.49"N)
Upper Right (  495644.000, 3824832.000) (117d 2'50.95"W, 34d33'54.85"N)
Lower Right (  495644.000, 3599615.000) (117d 2'47.00"W, 32d32' 1.94"N)
Center      (  337584.000, 3712223.500) (118d44'57.15"W, 33d32'14.24"N)
Band 1 Block=512x512 Type=Float64, ColorInterp=Gray
  NoData Value=-3.40282306073709653e+38
  Overviews: 158060x112609, 79030x56305, 39515x28153, 19758x14077, 9879x7039, 4940x3520, 2470x1760, 1235x880, 618x440

using `cog_translate` on netCDF4 with improper georeferencing

I am trying to use cog_translate to convert some netCDF4 datasets to COG, and it appears these files do not have proper georeferencing.

I have modified cogeo.py to accept vrt_params, so that I can manually pass the src_crs and src_transform via the WarpedVRTReaderBase mixin of WarpedVRT, as you can see here:

https://github.com/ryanjdillon/rio-cogeo/blob/cog_translate_vrt_params/rio_cogeo/cogeo.py#L87-L99

Then I'm passing the following vrt_params:

vrt_params = dict(src_transform=Affine(gdal_geotransform), src_crs='EPSG:4326')

This gets me past the CRS is None error, but it doesn't seem to acknowledge the Affine transform I pass to src_transform, resulting in the following error:

CPLE_AppDefinedError: The transformation is already "north up" or a transformation between pixel/line and georeferenced coordinates cannot be computed for /home/ryan/data/webstep/kvt/NVE_Sorlandet/NVE_Sorlandet_wslev0_xyt_synth.nc4. There is no affine transformation and no GCPs. Specify transformation option SRC_METHOD=NO_GEOTRANSFORM to bypass this check.

Any suggestions on how I might get this to work?
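One thing that may be worth checking (an assumption, not a confirmed diagnosis of the error above): the Affine constructor takes six coefficients in the order (a, b, c, d, e, f), whereas a GDAL geotransform is ordered (c, a, b, f, d, e). The sketch below shows that reordering in plain Python. The helper name gdal_to_affine_coeffs and the sample geotransform are hypothetical.

```python
def gdal_to_affine_coeffs(geotransform):
    """Reorder a GDAL geotransform (c, a, b, f, d, e) into Affine order (a, b, c, d, e, f)."""
    if len(geotransform) != 6:
        raise ValueError("a GDAL geotransform has exactly 6 coefficients")
    c, a, b, f, d, e = geotransform
    return (a, b, c, d, e, f)


# Made-up geotransform: origin at (5.0, 60.0), 0.01 degree pixels, north up.
gdal_geotransform = (5.0, 0.01, 0.0, 60.0, 0.0, -0.01)
coeffs = gdal_to_affine_coeffs(gdal_geotransform)
print(coeffs)
```

With the affine package installed, Affine.from_gdal(*gdal_geotransform) performs the same conversion and returns an Affine object directly.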

ZSTD missing codec

Thanks for this lib/plugin, it's really helpful and avoids having to guess which creation options are good for remote access.

I'm encountering a problem with the zstd cog-profile.

When running:

rio cogeo create -p zstd S2A_MSIL1C_20190112T175711_N0207_R141_T13UFQ_20190112T194934__B01.jp2 out.tiff

I get the following traceback:

Reading input: /home/guillaumelostis/code/s2-cog-benchmark/tiles/S2A_MSIL1C_20190112T175711_N0207_R141_T13UFQ_20190112T194934__B01.jp2
Adding overviews...
Updating dataset tags...
Writing output to: /home/guillaumelostis/code/s2-cog-benchmark/tiles/tiff/zstd.tiff
WARNING:rasterio._env:CPLE_NotSupported in 'ZSTD' is an unexpected value for COMPRESS creation option of type string-select.
Traceback (most recent call last):
  File "/home/guillaumelostis/.venv/bench/bin/rio", line 11, in <module>
    sys.exit(main_group())
  File "/home/guillaumelostis/.venv/bench/lib/python3.6/site-packages/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "/home/guillaumelostis/.venv/bench/lib/python3.6/site-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "/home/guillaumelostis/.venv/bench/lib/python3.6/site-packages/click/core.py", line 1137, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/guillaumelostis/.venv/bench/lib/python3.6/site-packages/click/core.py", line 1137, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/guillaumelostis/.venv/bench/lib/python3.6/site-packages/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/guillaumelostis/.venv/bench/lib/python3.6/site-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/home/guillaumelostis/.venv/bench/lib/python3.6/site-packages/rio_cogeo/scripts/cli.py", line 191, in create
    quiet=quiet,
  File "/home/guillaumelostis/.venv/bench/lib/python3.6/site-packages/rio_cogeo/cogeo.py", line 268, in cog_translate
    copy(tmp_dst, dst_path, copy_src_overviews=True, **dst_kwargs)
  File "/home/guillaumelostis/.venv/bench/lib/python3.6/site-packages/rasterio/env.py", line 445, in wrapper
    return f(*args, **kwds)
  File "rasterio/shutil.pyx", line 139, in rasterio.shutil.copy
  File "rasterio/_err.pyx", line 205, in rasterio._err.exc_wrap_pointer
rasterio._err.CPLE_AppDefinedError: Cannot create TIFF file due to missing codec for ZSTD.

I ran this in a fresh virtual environment where I only ran pip install rio-cogeo.
Is the problem that the GDAL lib that came packaged with the rasterio wheel doesn't support ZSTD? That seems surprising. For reference, the rasterio wheel that was installed is https://files.pythonhosted.org/packages/e1/2e/af9bfa901b890800b25428ab9846fe4e91818a94ede227c1a2d4410bac54/rasterio-1.0.28-cp36-cp36m-manylinux1_x86_64.whl

Or is something wrong with my local setup?
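A possible way to check which codecs a GDAL build accepts: where the GDAL command line tools are available, gdalinfo --format GTiff prints the GTiff driver's creation options as XML, including the allowed COMPRESS values. The sketch below parses such a listing. The helper name supported_compressions is hypothetical and SAMPLE_XML is an abbreviated, made-up listing, not the output of any particular build.

```python
import xml.etree.ElementTree as ET


def supported_compressions(creation_options_xml):
    """Return the values allowed for the COMPRESS creation option."""
    root = ET.fromstring(creation_options_xml)
    for option in root.iter("Option"):
        if option.get("name") == "COMPRESS":
            return [value.text for value in option.iter("Value")]
    return []


# Abbreviated, made-up creation option list for illustration only.
SAMPLE_XML = """
<CreationOptionList>
  <Option name="COMPRESS" type="string-select">
    <Value>NONE</Value>
    <Value>LZW</Value>
    <Value>DEFLATE</Value>
  </Option>
  <Option name="TILED" type="boolean"/>
</CreationOptionList>
"""

codecs = supported_compressions(SAMPLE_XML)
print("ZSTD supported:", "ZSTD" in codecs)
```

If ZSTD is absent from the real listing, the build was compiled without that codec, which would be consistent with the warning in the traceback above.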

add logging for creation steps ?

As mentioned in #46, we have a progress bar only for the first step of COG creation, which is usually the fastest. The later steps (overviews and compression) can hang for a long time, especially with complex compression schemes.
IMO we could add some logging to tell the user what's going on:

actual

$ rio cogeo 2003031.tif test.tif
  [####################################]  100%

future

$ rio cogeo 2003031.tif test.tif
---------------------
Input file: 2003031.tif
Output file:  test.tif
Profile: YCBCR (JPEG)
Options: ...
---------------------
Creating internal tiles...
  [####################################]  100%
Adding overviews...
Writing output file... 

keep 128 as default GDAL_TIFF_OVR_BLOCKSIZE

While working on the web-optimized option (ref: #62), I realized I made a bad choice from the beginning. In

GDAL_TIFF_OVR_BLOCKSIZE=os.environ.get("GDAL_TIFF_OVR_BLOCKSIZE", block_size),
we set GDAL_TIFF_OVR_BLOCKSIZE to be the same as the internal tile size of the high-resolution data. This is usually fine, but for a web-optimized COG it results in GDAL having to fetch more tiles than needed.

👇 Here we created a COG with internal tiles (256px for both the high-resolution data and the overviews) aligned with the web mercator grid (left). But the right image shows that the internal tiles of the overview (level 1) are not aligned with the mercator grid at zoom - 1 (this is in fact normal).
[Screenshot, 2019-03-12 at 23:37:33]

Having internal tiles that are not aligned for overviews is normal, but in the figure 👇, if we try to fetch the data for mercator tile 17-118594-60034, GDAL will fetch two 256x256px internal tiles (0,0 and 1,0) while the mercator tile only covers 1/4 of the area of those tiles.

[Screenshot, 2019-03-12 at 23:29:36]

Because GDAL can merge HTTP calls for adjacent range requests, using a different internal tile size for overviews should do no harm and should improve performance.

⚠️ I'm inclined to update the CLI and add an --overview-tilesize option that defaults to the GDAL_TIFF_OVR_BLOCKSIZE environment variable, with a fallback to 128.

This will be a breaking change and should happen before 1.0.0
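The proposed precedence (explicit option first, then the environment variable, then 128) could be sketched as follows. The function name resolve_overview_blocksize and its parameters are hypothetical and are not taken from the rio-cogeo code base.

```python
import os


def resolve_overview_blocksize(cli_value=None, environ=None):
    """Pick the overview tile size: CLI value, else GDAL_TIFF_OVR_BLOCKSIZE, else 128."""
    environ = os.environ if environ is None else environ
    if cli_value is not None:
        return int(cli_value)
    # Environment values are strings, so cast before handing the value on.
    return int(environ.get("GDAL_TIFF_OVR_BLOCKSIZE", 128))


print(resolve_overview_blocksize(None, {}))
print(resolve_overview_blocksize(None, {"GDAL_TIFF_OVR_BLOCKSIZE": "256"}))
print(resolve_overview_blocksize(512, {"GDAL_TIFF_OVR_BLOCKSIZE": "256"}))
```

Passing the environment as a parameter keeps the function easy to test without touching the real process environment.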

non-ascii character in README prevents installation

Line 227 of the README reads:
⚠️ GDAL>=3 is not yet supported by rasterio

Installation fails due to the non-ascii character:
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 7166: ordinal not in range(128)
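This kind of error typically appears when a setup script reads the README with the platform's default ASCII codec. Assuming that is the cause here, reading the file with an explicit encoding avoids it. The helper read_long_description is hypothetical, and the temporary README only exists to make the sketch self-contained.

```python
import io
import os
import tempfile


def read_long_description(path):
    """Read a README as UTF-8 regardless of the locale's default encoding."""
    with io.open(path, encoding="utf-8") as f:
        return f.read()


with tempfile.TemporaryDirectory() as tmp:
    readme = os.path.join(tmp, "README.md")
    # Write a line containing the same non-ASCII warning sign as the README.
    with io.open(readme, "w", encoding="utf-8") as f:
        f.write(u"\u26a0\ufe0f GDAL>=3 is not yet supported by rasterio\n")
    text = read_long_description(readme)

print(len(text))
```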

How to convert geotiffs programmatically to cogs

After verifying that the command line rio cogeo example.tif example_cog_raw.tif --cog-profile raw runs successfully, I'm trying to use the cog_translate function programmatically.

This snippet is failing:

cog_img = "cog" + "_" + name # name is for instance example.tif
cog_profile = cog_profiles.get('raw')
cog_profile.update(dict(BIGTIFF=os.environ.get("BIGTIFF", "IF_SAFER")))
block_size = 512
config = dict(
            NUM_THREADS=8,
            GDAL_TIFF_INTERNAL_MASK=os.environ.get("GDAL_TIFF_INTERNAL_MASK", True),
            GDAL_TIFF_OVR_BLOCKSIZE=os.environ.get("GDAL_TIFF_OVR_BLOCKSIZE", block_size),
)
cog_translate(
            img,
            cog_img,
            cog_profile,
            None,
            None,
            None,
            6,
            config
)

where img is the binary image file, previously opened with rasterio and checked for whether it is a COG.
The error stack trace is:

File "/Users/geobart/Development/CallForCode/cog-k8s/app/api_views.py", line 87, in post
    block_size = 512
  File "/Users/geobart/pyenv/cog-k8s-2pcfZKQH/lib/python3.6/site-packages/rio_cogeo/cogeo.py", line 52, in cog_translate
    with rasterio.open(src_path) as src:
  File "/Users/geobart/.pyenv/versions/3.6.2/lib/python3.6/contextlib.py", line 81, in __enter__
    return next(self.gen)
  File "/Users/geobart/pyenv/cog-k8s-2pcfZKQH/lib/python3.6/site-packages/rasterio/__init__.py", line 178, in fp_reader
    dataset = memfile.open()
  File "/Users/geobart/pyenv/cog-k8s-2pcfZKQH/lib/python3.6/site-packages/rasterio/env.py", line 360, in wrapper
    return f(*args, **kwds)
  File "/Users/geobart/pyenv/cog-k8s-2pcfZKQH/lib/python3.6/site-packages/rasterio/io.py", line 132, in open
    writer = get_writer_for_driver(driver)
  File "/Users/geobart/pyenv/cog-k8s-2pcfZKQH/lib/python3.6/site-packages/rasterio/io.py", line 179, in get_writer_for_driver
    if driver_can_create(driver):
  File "rasterio/_base.pyx", line 112, in rasterio._base.driver_can_create
  File "rasterio/_base.pyx", line 95, in rasterio._base.driver_supports_mode
AttributeError: 'NoneType' object has no attribute 'encode'

Do you have any hints on the cause of the error and if I'm using cogeo in a wrong way?

Remove useless Boundless=True options to avoid rasterio warning

In https://github.com/mapbox/rio-cogeo/blob/186a7208f0d5ef0c6cfbd87ae6e7d2047acc6076/rio_cogeo/cogeo.py#L72-L85 we are using the boundless=True option when reading the dataset, but it is not needed because windows shouldn't be boundless.

rio cogeo raw.tif cog_ycbcr.tif -p ycbcr
  [------------------------------------]    1%WARNING:rasterio._io:Nonzero values in mask have been converted to 255, see note in rasterio/_io.pyx, read_masks()
WARNING:rasterio._io:Nonzero values in mask have been converted to 255, see note in rasterio/_io.pyx, read_masks()
WARNING:rasterio._io:Nonzero values in mask have been converted to 255, see note in rasterio/_io.pyx, read_masks()
  [#-----------------------------------]    3%WARNING:rasterio._io:Nonzero values in mask have been converted to 255, see note in rasterio/_io.pyx, read_masks()
WARNING:rasterio._io:Nonzero values in mask have been converted to 255, see note in rasterio/_io.pyx, read_masks()
WARNING:rasterio._io:Nonzero values in mask have been converted to 255, see note in rasterio/_io.pyx, read_masks()
...

copy dataset to create valid COG

This is not a bug report, rather a question on the implementation:

When the COG is written, the temporary file gets copied to the defined output in the end:

copy(tmp_dst, dst_path, copy_src_overviews=True, **dst_kwargs)

Is there a specific reason related to the validity of the resulting COG containing overviews? The reason I ask is because I tried to implement a small script which creates COGs (before I discovered this tool) but the output files were not valid COGs as soon as they had overviews.

To me it seems the only difference between the implementations is that here the output is written into a temporary file, then the overviews are generated and finally the temporary file is copied to disk. This seems to be a crucial step for a valid COG as far as I understand, right?

Remove default bidx option and take all dataset bands by default (if possible)

Right now, the bidx option defaults to 1,2,3, which is fine when the dataset is RGB or RGBA. When working on single-band data, the CLI raises an error:

  File "rasterio/_io.pyx", line 233, in rasterio._io.DatasetReaderBase.read
IndexError: band index 2 out of range (not in (1,))

Solution:

  • remove bidx default and take src dataset index as default.

☝️ That said, we should filter the bands to make sure we don't ingest the alpha band (because we have a mask in the output dataset).
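The proposed default could be sketched as below: take every band of the source dataset and skip any band interpreted as alpha. The function default_band_indexes is hypothetical. It works on color interpretation names given as strings, like the colorinterp field printed by rio info, rather than on a rasterio dataset.

```python
def default_band_indexes(colorinterp):
    """Return 1-based indexes of all bands except those interpreted as alpha."""
    return tuple(
        index
        for index, interp in enumerate(colorinterp, start=1)
        if interp.lower() != "alpha"
    )


print(default_band_indexes(["gray"]))
print(default_band_indexes(["red", "green", "blue", "alpha"]))
```

With an open rasterio dataset, the names could be obtained from the name attribute of each entry in src.colorinterp.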

do not copy alpha band

When a source has an alpha band we should be able to detect it and not copy it to the output file because it will be replaced by the internal mask.

Make internal mask an option ?

Right now we create an internal bit mask from alpha/nodata/mask by default, but there are multiple problems with this approach.
https://github.com/mapbox/rio-cogeo/blob/eabf1203be691c2937d89e08a6e62349b594eca1/rio_cogeo/cogeo.py#L75-L86

While we tend to create the simplest COG possible (author opinion here), rio-cogeo should maybe offer more flexibility, especially for compressions like WEBP, which can handle RGBA data without needing an internal mask.

Maybe the first step is to work on the documentation side of rio-cogeo to explain the pros and cons of internal masks and to define good habits for the COG format.

Incorrect context behavior

Why do you add the DatasetReader to the ExitStack, too? Just for cleanup, so it __exit__s as soon as possible? Couldn't that cause trouble if the user does what you envisioned and the DatasetReader exits twice?

with MemoryFile() as memfile:
    with memfile.open(**src_profile) as mem:
         mem.write(data)
         cog_translate(mem, *args, **kwargs)

Originally posted by @j08lue in #93

Add note to help user add statistics to the output dataset

ref #19

import rasterio

with rasterio.open("my-data.tif", "r+") as src_dst:
    for b in src_dst.indexes:
        # Read the band as a masked array so nodata pixels are left out of the statistics.
        band = src_dst.read(indexes=b, masked=True)
        stats = {
            'min': float(band.min()),
            'max': float(band.max()),
            'mean': float(band.mean()),
            'stddev': float(band.std()),
        }
        # Store the statistics as band-level tags.
        src_dst.update_tags(b, **stats)

COGs built from VRT don't validate

After building a COG from a vrt:

rio cogeo create my.vrt my.tif -p jpeg

the COG doesn't validate:

> rio cogeo validate /vsis3/bucket/my.tif
The following errors were found:
- The offset of the IFD for overview of index 1 is 884184, whereas it should be greater than the one of index 0, which is at byte 3167886
- The offset of the IFD for overview of index 2 is 299436, whereas it should be greater than the one of index 1, which is at byte 884184
- The offset of the IFD for overview of index 3 is 149834, whereas it should be greater than the one of index 2, which is at byte 299436
- The offset of the IFD for overview of index 4 is 107736, whereas it should be greater than the one of index 3, which is at byte 149834
- The offset of the IFD for overview of index 5 is 94524, whereas it should be greater than the one of index 4, which is at byte 107736
- The offset of the first block of the smallest overview should be after its IFD
/vsis3/bucket/my.tif is NOT a valid cloud optimized GeoTIFF

The VRT points to a group of 256px tiffs stored in S3.
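The rule behind most of these messages is that each overview's IFD must sit at a greater byte offset than the IFD of the previous, larger level. A minimal sketch of that ordering check follows. The function check_ifd_order is hypothetical and is not the validator's code. The overview offsets are the ones reported above, while the main IFD offset of 8 is an assumption (the value a classic TIFF COG is expected to have).

```python
def check_ifd_order(main_ifd_offset, overview_ifd_offsets):
    """Return one message per overview whose IFD is not after the previous IFD."""
    errors = []
    previous = main_ifd_offset
    for index, offset in enumerate(overview_ifd_offsets):
        if offset <= previous:
            errors.append(
                "overview {}: IFD at byte {} should be after byte {}".format(
                    index, offset, previous
                )
            )
        previous = offset
    return errors


# Overview IFD offsets for indexes 0 to 5, as listed in the validation output.
reported = [3167886, 884184, 299436, 149834, 107736, 94524]
errors = check_ifd_order(8, reported)
for message in errors:
    print(message)
```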

1.0.0 release

Target date: 2019-03-29.

🎉 This is almost ready. We don't have anything to add for the 1.0 release.

I'm targeting Friday the 29th for the official release, to see if we get feedback on the latest beta release.

rio-cogeo/CHANGES.txt

Lines 1 to 6 in 5974cd5

1.0b2 (2019-03-27)
------------------
Breacking Changes:
- Switch from JPEG to DEFLATE as default profile in CLI (#66)

release 1.0b0

With #6 merged, we now have a tool to create and validate Cloud Optimized GeoTIFFs.

I don't have any breaking change in mind, and the only blocked PR, #62, will have to wait until after the 1.0.0 release (if we get some help).

I'm planning to do a 1.0b0 release on Thursday 14th.

Here is the list of changes since last release:

rio-cogeo/CHANGES.txt

Lines 1 to 16 in a6b76c7

Next (TBD)
----------
- add more logging and `--quiet` option (#46)
- add `--overview-blocksize` to set overview's internal tile size (#60)
Bug fixes:
- copy tags and description from input to output (#19)
- copy input mask band to output mask
Breacking Changes:
- rio cogeo now has subcommands: 'create' and 'validate' (#6).
- internal mask creation is now optional (--add-mask).
- internal nodata or alpha channel can be forwarded to the output dataset.
- removed default overview blocksize to be equal to the raw data blocksize (#60)

Vincent

Error when passing options block size and no overview-level passed

rio cogeo lopegeo_mlk_005.tif lopegeo_mlk_005_cogeo.tif --cog-profile deflate --co BLOCKXSIZE=256 --co BLOCKYSIZE=256
Traceback (most recent call last):
  File "/Users/vincentsarago/Workspace/venv/py37/bin/rio", line 11, in <module>
    load_entry_point('rasterio', 'console_scripts', 'rio')()
  File "/Users/vincentsarago/Workspace/venv/py37/lib/python3.7/site-packages/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "/Users/vincentsarago/Workspace/venv/py37/lib/python3.7/site-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "/Users/vincentsarago/Workspace/venv/py37/lib/python3.7/site-packages/click/core.py", line 1137, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/Users/vincentsarago/Workspace/venv/py37/lib/python3.7/site-packages/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/Users/vincentsarago/Workspace/venv/py37/lib/python3.7/site-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/Users/vincentsarago/Workspace/CogeoTiff/rio-cogeo/rio_cogeo/scripts/cli.py", line 108, in cogeo
    config,
  File "/Users/vincentsarago/Workspace/CogeoTiff/rio-cogeo/rio_cogeo/cogeo.py", line 58, in cog_translate
    src_path, min(dst_kwargs["blockxsize"], dst_kwargs["blockysize"])
  File "/Users/vincentsarago/Workspace/CogeoTiff/rio-cogeo/rio_cogeo/utils.py", line 15, in get_maximum_overview_level
    while min(width // overview, height // overview) > minsize:
TypeError: '>' not supported between instances of 'int' and 'str'
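The traceback shows the block size arriving as a string from --co and being compared with an integer. One possible fix is to cast the value before the comparison, as in the sketch below. Only the while condition comes from the traceback. The loop body, the default minsize, and the return value are assumptions about what get_maximum_overview_level does.

```python
def get_maximum_overview_level(width, height, minsize=256):
    """Count how many times the raster can be halved before it fits in minsize."""
    # Creation options given on the command line arrive as strings.
    minsize = int(minsize)
    overview_level = 0
    overview = 1
    while min(width // overview, height // overview) > minsize:
        overview *= 2
        overview_level += 1
    return overview_level


print(get_maximum_overview_level(1024, 1024, "256"))
```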

Configurable minimum internal tile size for valid COG

The COG specification does not set a hard requirement on the internal tile size, but the validation script does.

The internal tile size matters for a COG because it determines whether the COG might need overviews (which are optional).

IMO the internal tile size should be configurable in the validate script: some users could consider a 1024x1024 file without internal tiling or internal overviews a proper COG, while the script raises an error about the missing internal tiling and warns about the missing overviews.

when using jpeg profile silently creates empty output if bit depth != 8

Version:
$ pip freeze | grep rio
rasterio==1.0.22
rio-cogeo==1.0b3

Problem: Creating a JPEG-profile (JPEG-compressed) COG from an unsigned 16-bit-per-pixel TIFF appears to succeed but produces an empty file.

What I expected: an error message indicating this can only work for 8 bits per pixel.

Details:

When doing the same thing with gdal directly I get warnings that JPEG compression is not compatible with 16 bit pixel depth (using UINT16 Tiff bands), but rio cogeo silently swallows these warnings:

gdal_translate temp.tif gdal_cog.tif -co TILED=YES -co COMPRESS=JPEG -co PHOTOMETRIC=YCBCR -co COPY_SRC_OVERVIEWS=YES
Input file size is 4171, 4061
0ERROR 1: JPEGSetupEncode:BitsPerSample 16 not allowed for JPEG
ERROR 1: JPEGSetupEncode:BitsPerSample 16 not allowed for JPEG
ERROR 1: JPEGSetupEncode:BitsPerSample 16 not allowed for JPEG
ERROR 1: JPEGSetupEncode:BitsPerSample 16 not allowed for JPEG
ERROR 1: JPEGSetupEncode:BitsPerSample 16 not allowed for JPEG
ERROR 1: JPEGSetupEncode:BitsPerSample 16 not allowed for JPEG
ERROR 1: JPEGSetupEncode:BitsPerSample 16 not allowed for JPEG
ERROR 1: JPEGSetupEncode:BitsPerSample 16 not allowed for JPEG
ERROR 1: WriteEncodedTile/Strip() failed.
ERROR 1: JPEGSetupEncode:BitsPerSample 16 not allowed for JPEG
ERROR 1: JPEGSetupEncode:BitsPerSample 16 not allowed for JPEG
ERROR 1: WriteEncodedTile/Strip() failed.
ERROR 1: JPEGSetupEncode:BitsPerSample 16 not allowed for JPEG
ERROR 1: JPEGSetupEncode:BitsPerSample 16 not allowed for JPEG
ERROR 1: WriteEncodedTile/Strip() failed.
..ERROR 1: JPEGSetupEncode:BitsPerSample 16 not allowed for JPEG
ERROR 1: JPEGSetupEncode:BitsPerSample 16 not allowed for JPEG
ERROR 1: WriteEncodedTile/Strip() failed.
.10...20.ERROR 1: JPEGSetupEncode:BitsPerSample 16 not allowed for JPEG
ERROR 1: JPEGSetupEncode:BitsPerSample 16 not allowed for JPEG
ERROR 1: WriteEncodedTile/Strip() failed.
..30...40...50...60...70...80...90...100 - done.
ERROR 1: JPEGSetupEncode:BitsPerSample 16 not allowed for JPEG
ERROR 1: JPEGSetupEncode:BitsPerSample 16 not allowed for JPEG
ERROR 1: WriteEncodedTile/Strip() failed.
ERROR 1: JPEGSetupEncode:BitsPerSample 16 not allowed for JPEG
 $ rio cogeo create --cog-profile jpeg temp.tif cog_jpeg.tif
Reading input: /home/hiro/git/tilingtest/temp.tif
  [####################################]  100%
Adding overviews...
Updating dataset tags...
Writing output to: /home/hiro/git/tilingtest/cog_jpeg.tif
$ ls -l cog_jpeg.tif 
-rw-------. 1 hiro hiro 4992 Mar 31 13:02 cog_jpeg.tif
$ ls -l temp.tif
-rw-------. 1 hiro hiro 138236380 Mar 31 12:32 temp.tif
$ rio info temp.tif
{"bounds": [694696.0, 3674712.0, 736406.0, 3715322.0], "colorinterp": ["red", "green", "blue"], "count": 3, "crs": "EPSG:32636", "descriptions": [null, null, null], "driver": "GTiff", "dtype": "uint16", "height": 4061, "indexes": [1, 2, 3], "interleave": "pixel", "lnglat": [35.316978660693565, 33.37281747525934], "mask_flags": [["all_valid"], ["all_valid"], ["all_valid"]], "nodata": null, "res": [10.0, 10.0], "shape": [4061, 4171], "tiled": false, "transform": [10.0, 0.0, 694696.0, 0.0, -10.0, 3715322.0, 0.0, 0.0, 1.0], "units": [null, null, null], "width": 4171}
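The error message the report asks for could come from a check that runs before any data is written. The sketch below mirrors the expectation stated above, that the JPEG profile only works with 8 bits per pixel. The function check_profile_dtype is hypothetical and not part of rio-cogeo, and it takes the dtype as a string such as the dtype field printed by rio info.

```python
def check_profile_dtype(profile_name, dtype):
    """Raise early when the JPEG profile is combined with non 8-bit data."""
    if profile_name.lower() == "jpeg" and dtype != "uint8":
        raise ValueError(
            "the jpeg profile expects uint8 data, got {}".format(dtype)
        )


check_profile_dtype("deflate", "uint16")  # passes silently
try:
    check_profile_dtype("jpeg", "uint16")
except ValueError as exc:
    print("Error:", exc)
```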

expose array+profile API

I often want to write data from numpy arrays as a cloud-optimized GeoTIFF, together with the usual rasterio profile dictionary (transform, crs, width, height, dtype, etc.). It would be super nice to have a Python API for this.

This would require splitting the large cogeo.cog_translate function into two at least, roughly here:

matrix = vrt_dst.read(window=w, indexes=indexes)

or, if that becomes too complicated, create a separate function for this, like

def cog_write(
    src_data,
    src_profile,
    dst_path,
    nodata=None,
    add_mask=None,
    overview_level=None,
    overview_resampling="nearest",
    web_optimized=False,
    latitude_adjustment=True,
    resampling="nearest",
    in_memory=None,
    config=None,
    quiet=False,
):

What do you think? Would this make sense to add to this package? Or will this package anyways be superseded by the new COG driver in GDAL?

[BUG] updating tags after creation breaks COG specification

In 8b24785 we introduced a bug. Reopening the dataset in r+ mode to add the rio_overview tag value seems to mess up the image file directory (IFD) layout.

GDAL's validate_cloud_optimized_geotiff.py then gives:

python validate_cloud_optimized_geotiff.py cogeo.tif
cogeo.tif is NOT a valid cloud optimized GeoTIFF.
The following errors were found:
 - The offset of the main IFD should be 8 for ClassicTIFF or 16 for BigTIFF. It is 7969940 instead
 - The offset of the IFD for overview of index 0 is 3380, whereas it should be greater than the one of the main image, which is at byte 7969940

cc @sgillies @perrygeo

requires COMPRESS=JPEG with COMPRESS=LZW option

I am working with the following file and trying to convert it to cloud-optimized geotiff.

https://s3.amazonaws.com/share-terravion-com/2018_cog/S2A_MSIL2A_20171206T171701_N0206_R112_T15TWK_20171206T190908_MULTIBAND.tif

The following is the command:
rio cogeo S2A_MSIL2A_20171206T171701_N0206_R112_T15TWK_20171206T190908_MULTIBAND.tif S2A_MSIL2A_20171206T171701_N0206_R112_T15TWK_20171206T190908_MULTIBAND_cog.tif --co "COMPRESS=LZW" --co BLOCKXSIZE=256 --co BLOCKYSIZE=256

Here is the output of the error:

  [####################################]  100%             
Traceback (most recent call last):
  File "/usr/local/bin/rio", line 11, in <module>
    sys.exit(main_group())
  File "/usr/local/lib/python2.7/dist-packages/click/core.py", line 722, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/click/core.py", line 697, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python2.7/dist-packages/click/core.py", line 1066, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python2.7/dist-packages/click/core.py", line 895, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python2.7/dist-packages/click/core.py", line 535, in invoke
    return callback(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/rio_cogeo/scripts/cli.py", line 104, in cogeo
    config,
  File "/usr/local/lib/python2.7/dist-packages/rio_cogeo/cogeo.py", line 96, in cog_translate
    copy(mem, dst_path, copy_src_overviews=True, **dst_kwargs)
  File "/usr/local/lib/python2.7/dist-packages/rasterio/env.py", line 402, in wrapper
    return f(*args, **kwds)
  File "rasterio/shutil.pyx", line 112, in rasterio.shutil.copy
  File "rasterio/_err.pyx", line 188, in rasterio._err.exc_wrap_pointer
rasterio._err.CPLE_NotSupportedError: Currently, PHOTOMETRIC=YCBCR requires COMPRESS=JPEG
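The error suggests that the default profile carries PHOTOMETRIC=YCBCR and that this key survives when --co overrides only COMPRESS. One possible way to handle it is to drop the photometric setting whenever the final compression is not JPEG. The function merge_creation_options and the default_profile dictionary below are assumptions made for illustration, not the library's actual profile or merging code.

```python
def merge_creation_options(profile, overrides):
    """Merge user overrides into a profile, dropping YCBCR for non-JPEG output."""
    merged = {key.lower(): value for key, value in profile.items()}
    merged.update({key.lower(): value for key, value in overrides.items()})
    compress = str(merged.get("compress", "")).upper()
    photometric = str(merged.get("photometric", "")).upper()
    if photometric == "YCBCR" and compress != "JPEG":
        # YCBCR is only valid together with JPEG compression.
        del merged["photometric"]
    return merged


# Assumed JPEG-based default profile, for illustration only.
default_profile = {"compress": "JPEG", "photometric": "YCBCR", "tiled": True}
merged = merge_creation_options(
    default_profile, {"COMPRESS": "LZW", "BLOCKXSIZE": "256", "BLOCKYSIZE": "256"}
)
print(merged)
```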

allow non-integer nodata value

rio cogeo lopegeo_mlk_005.tif lopegeo_mlk_005_cogeo.tif --cog-profile deflate --nodata 0.5
Usage: rio cogeo [OPTIONS] INPUT OUTPUT
Try "rio cogeo --help" for help.

Error: Invalid value for "--nodata": 0.5 is not a valid integer
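Accepting values such as 0.5 means parsing the option as a floating point number rather than an integer. A minimal sketch of such a parser follows. The function parse_nodata is hypothetical and independent of how the CLI option is actually declared.

```python
import math


def parse_nodata(value):
    """Parse a nodata option given as text into a float, or None when unset."""
    if value is None:
        return None
    try:
        # float() also accepts "nan", which is a common nodata value.
        return float(str(value).strip())
    except ValueError:
        raise ValueError("{!r} is not a valid nodata value".format(value))


print(parse_nodata("0.5"))
print(math.isnan(parse_nodata("nan")))
```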

Get off Circle CI

It's not economical for our org. I'm going to kill off the circle YAML now and we'll set up Travis in a separate PR.

rio cogeo create fails when converting sentinel-2 preview files

In all sentinel-2 directories, there is a PVI (preview) file in jp2 format that throws a blocksize exceeds raster width error when you try to convert using rio cogeo create [file].

gdalinfo confirms that the blocksize is 192x192 while the band dimension is just 171x171. So the issue seems to be more of a corner case of really small (tiled) jp2s rather than a bug with rio cogeo.

If in SAFE format, the approximate location of this file is:
$(PRODUCT_NAME).SAFE/GRANULE/$(IMAGE_GRANULE)/QI_DATA/TILE_ID..._PVI.jp2

Here's a command to pull a sample preview file from the Sentinel-2 S3 bucket:
aws s3 cp s3://sentinel-s2-l1c/tiles/7/J/FL/2019/2/12/0/preview.jp2 . --request-payer requester

As you can see, rio cogeo create preview.jp2 preview.tif fails.

It looks like this generic error comes when instantiating the DatasetWriterBase class. Is there a way to keep current protections but allow for cog creation on even really small preview files?
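
One way to think about the question: shrink the requested block size until it fits the raster, or skip tiling when the raster is too small. The sketch below assumes TIFF tiles, whose dimensions must be multiples of 16. The function fit_blocksize is hypothetical and is not how rio-cogeo or rasterio handle this case.

```python
def fit_blocksize(width, height, blocksize=512):
    """Return a tile size that fits in the raster, or None if tiling should be skipped."""
    smallest = min(width, height)
    if blocksize <= smallest:
        return blocksize
    # TIFF tile dimensions must be multiples of 16, so round down to one.
    fitted = (smallest // 16) * 16
    return fitted if fitted >= 16 else None


if __name__ == "__main__":
    # The preview file described above: 192x192 blocks on a 171x171 band.
    print(fit_blocksize(171, 171, blocksize=192))
```

For the 171x171 preview file this returns 160; for a raster smaller than 16 pixels on a side it returns None, meaning the file would be written without internal tiles.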

Thanks in advance for any suggestions or advice :)

Rasterio fails when using ZSTD compression

I'm seeing strange behavior when trying to create a COG with ZSTD compression: the Python process gets killed.

rio cogeo raw.tif cog_zstd.tif -p zstd
  [####################################]  100%
Killed

Setting CPL_DEBUG=ON doesn't give more info.

After a little investigation, it seems to be a memory error*, but I can't find the right config to make it pass.

Note:
If I remove copy_src_overviews=True in https://github.com/mapbox/rio-cogeo/blob/7c5b1893a47719cd6ea50d21c272e4b6c04b30af/rio_cogeo/cogeo.py#L100 it doesn't fail, but then the output file is not a COG with overviews.

file used: https://github.com/mapbox/cog_cow_testsuite/blob/master/data/raw.tif

Hangs long after progressbar gets to 100%

  • Using either cog_translate() or the command line, large files (2GB+) take an extremely long time to convert, or don't convert at all and stall the current process.
  • There is no indication of progress, since it hangs after the progress bar hits 100%.
  • Memory and disk space on the machine are not maxed out, so it's probably related to memory management within Python, but I'll have to do some more digging.

Settings are as follows:

from rio_cogeo.cogeo import cog_translate
from rio_cogeo.profiles import cog_profiles

cog_translate(
    src_path=input_filepath,
    dst_path=output_filepath,
    dst_kwargs=cog_profiles.get('deflate'),
    indexes=None,
    nodata=None,
    alpha=None,
    overview_level=6,
    overview_resampling='average',
    config={'NUM_THREADS': 'ALL_CPUS', 'PREDICTOR': '2'}
)
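
To get a feel for how much extra data overview_level=6 asks for, here is a rough sketch that lists the decimation factor and pixel size of each overview. It assumes each level halves the previous one (factors 2, 4, 8, ...) and rounds sizes up. The function overview_sizes and the 40000x30000 raster size are illustrative assumptions, not taken from the report above.

```python
import math


def overview_sizes(width, height, overview_level):
    """List (factor, width, height) for each power-of-two overview level."""
    sizes = []
    for level in range(1, overview_level + 1):
        factor = 2 ** level
        # Round up so a partial block at the edge still gets a pixel.
        sizes.append((factor, math.ceil(width / factor), math.ceil(height / factor)))
    return sizes


if __name__ == "__main__":
    for factor, w, h in overview_sizes(40000, 30000, 6):
        print(f"1/{factor}: {w} x {h}")
```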

Note about Opinions!

While rio-cogeo follows the COG specification, it also makes some choices. By default we decided to enforce internal tiling (tiling is optional for files that are < 1024x1024) and overviews (those are optional).

I personally think those two are important, but the user can still change the defaults by using options (in the CLI).

Let's add a note to the README about the choices made for the user.

release 1.1.0 - July 17th 2019

PR #82 changes the way rio_cogeo.cogeo.cog_translate handles blocksize for small datasets, so this needs a new minor version: 1.1.0
