cogeotiff / rio-cogeo


Cloud Optimized GeoTIFF creation and validation plugin for rasterio

Home Page: https://cogeotiff.github.io/rio-cogeo/

License: BSD 3-Clause "New" or "Revised" License

Python 100.00%

satellite cog geotiff rasterio cogeotiff

rio-cogeo's Introduction

rio-cogeo

Cloud Optimized GeoTIFF (COG) creation and validation plugin for Rasterio.

Documentation: https://cogeotiff.github.io/rio-cogeo/

Source Code: https://github.com/cogeotiff/rio-cogeo

Cloud Optimized GeoTIFF

This plugin aims to facilitate the creation and validation of Cloud Optimized GeoTIFF (COG or COGEO). While it respects the COG specifications, this plugin also enforces several features:

Internal overviews (users can disable overviews with the option --overview-level 0)
Internal tiles (default profiles have 512x512 internal tiles)

Important: GDAL 3.1 added a new COG driver (doc, discussion). Starting with rio-cogeo version 2.2, the --use-cog-driver option creates the COG using that driver.

Install

$ pip install -U pip
$ pip install rio-cogeo

Or install from source:

$ pip install -U pip
$ pip install git+https://github.com/cogeotiff/rio-cogeo.git

GDAL Version

It is recommended to use GDAL > 2.3.2. Previous versions might not be able to create proper COGs (ref: OSGeo/gdal#754).

More info in #55

Blog post on good and bad COG formats: https://medium.com/@_VincentS_/do-you-really-want-people-using-your-data-ec94cd94dc3f

Check out the rio-glui or rio-viz Rasterio plugins to explore COGs locally in your web browser.

Contribution & Development

See CONTRIBUTING.md

Changes

See CHANGES.md.

License

See LICENSE

rio-cogeo's Issues

Float32 data promoted to Float64

I used rio-cogeo to convert a single-band GeoTIFF of Type=Float32. To my surprise, the output was a Type=Float64 GeoTIFF. Any idea why that happened and how to keep the data as Type=Float32?

Here's the actual rio-cogeo command I used:

rio cogeo create --cog-profile lzw Southern_California_Topobathy_DEM_1m.tif Southern_California_Topobathy_DEM_1m_COG.tif

And here's the gdalinfo for the INPUT file:

Driver: GTiff/GeoTIFF
Files: Southern_California_Topobathy_DEM_1m.tif
       Southern_California_Topobathy_DEM_1m.tif.ovr
       Southern_California_Topobathy_DEM_1m.tif.aux.xml
Size is 316120, 225217
Coordinate System is:
PROJCS["NAD_1983_NSRS2007_UTM_Zone_11N",
    GEOGCS["GCS_NAD_1983_NSRS2007",
        DATUM["NAD_1983_NSRS2007",
            SPHEROID["GRS_1980",6378137,298.257222101]],
        PRIMEM["Greenwich",0],
        UNIT["degree",0.0174532925199433]],
    PROJECTION["Transverse_Mercator"],
    PARAMETER["latitude_of_origin",0],
    PARAMETER["central_meridian",-117],
    PARAMETER["scale_factor",0.9996],
    PARAMETER["false_easting",500000],
    PARAMETER["false_northing",0],
    UNIT["metre",1,
        AUTHORITY["EPSG","9001"]]]
Origin = (179523.999999998224666,3824832.000000000000000)
Pixel Size = (1.000000000000000,-1.000000000000000)
Metadata:
  AREA_OR_POINT=Area
  DataType=Generic
Image Structure Metadata:
  COMPRESSION=LZW
  INTERLEAVE=BAND
Corner Coordinates:
Upper Left  (  179524.000, 3824832.000) (120d29'26.74"W, 34d30'55.16"N)
Lower Left  (  179524.000, 3599615.000) (120d24'36.73"W, 32d29'15.49"N)
Upper Right (  495644.000, 3824832.000) (117d 2'50.95"W, 34d33'54.85"N)
Lower Right (  495644.000, 3599615.000) (117d 2'47.00"W, 32d32' 1.94"N)
Center      (  337584.000, 3712223.500) (118d44'57.15"W, 33d32'14.24"N)
Band 1 Block=128x128 Type=Float32, ColorInterp=Gray
  Min=-3857.312 Max=3067.828
  Minimum=-3857.312, Maximum=3067.828, Mean=-353.493, StdDev=804.656
  NoData Value=-3.40282306073709653e+38
  Overviews: 79030x56305, 39515x28153, 19758x14077, 9879x7039, 4940x3520, 2470x1760, 1235x880, 618x440, 309x220
  Metadata:
    RepresentationType=ATHEMATIC
    STATISTICS_COVARIANCES=647471.7255957157
    STATISTICS_MAXIMUM=3067.8276367188
    STATISTICS_MEAN=-353.49345874696
    STATISTICS_MINIMUM=-3857.3120117188
    STATISTICS_SKIPFACTORX=1
    STATISTICS_SKIPFACTORY=1
    STATISTICS_STDDEV=804.65627791978

And the gdalinfo for the OUTPUT file:

 Driver: GTiff/GeoTIFF
Files: tmp2cd_5blz.tif
Size is 316120, 225217
Coordinate System is:
PROJCS["NAD_1983_NSRS2007_UTM_Zone_11N",
    GEOGCS["GCS_NAD_1983_NSRS2007",
        DATUM["NAD_1983_NSRS2007",
            SPHEROID["GRS_1980",6378137,298.257222101]],
        PRIMEM["Greenwich",0],
        UNIT["degree",0.0174532925199433]],
    PROJECTION["Transverse_Mercator"],
    PARAMETER["latitude_of_origin",0],
    PARAMETER["central_meridian",-117],
    PARAMETER["scale_factor",0.9996],
    PARAMETER["false_easting",500000],
    PARAMETER["false_northing",0],
    UNIT["metre",1,
        AUTHORITY["EPSG","9001"]]]
Origin = (179523.999999998224666,3824832.000000000000000)
Pixel Size = (1.000000000000000,-1.000000000000000)
Metadata:
  AREA_OR_POINT=Area
Image Structure Metadata:
  INTERLEAVE=BAND
Corner Coordinates:
Upper Left  (  179524.000, 3824832.000) (120d29'26.74"W, 34d30'55.16"N)
Lower Left  (  179524.000, 3599615.000) (120d24'36.73"W, 32d29'15.49"N)
Upper Right (  495644.000, 3824832.000) (117d 2'50.95"W, 34d33'54.85"N)
Lower Right (  495644.000, 3599615.000) (117d 2'47.00"W, 32d32' 1.94"N)
Center      (  337584.000, 3712223.500) (118d44'57.15"W, 33d32'14.24"N)
Band 1 Block=512x512 Type=Float64, ColorInterp=Gray
  NoData Value=-3.40282306073709653e+38
  Overviews: 158060x112609, 79030x56305, 39515x28153, 19758x14077, 9879x7039, 4940x3520, 2470x1760, 1235x880, 618x440

using `cog_translate` on netCDF4 with improper georeferencing

I am trying to use cog_translate to convert some netCDF4 datasets to COG, and it appears these files do not have proper georeferencing.

I have modified cogeo.py to accept vrt_params, so that I can manually pass the src_crs and src_transform via the WarpedVRTReaderBase mixin of WarpedVRT, as you can see here:

https://github.com/ryanjdillon/rio-cogeo/blob/cog_translate_vrt_params/rio_cogeo/cogeo.py#L87-L99

Then I'm passing the following vrt_params:

vrt_params = dict(src_transform=Affine(gdal_geotransform), src_crs='EPSG:4326')

This gets me past the CRS is None error, but it doesn't seem to acknowledge the Affine transform I pass to src_transform, resulting in the following error:

CPLE_AppDefinedError: The transformation is already "north up" or a transformation between pixel/line and georeferenced coordinates cannot be computed for /home/ryan/data/webstep/kvt/NVE_Sorlandet/NVE_Sorlandet_wslev0_xyt_synth.nc4. There is no affine transformation and no GCPs. Specify transformation option SRC_METHOD=NO_GEOTRANSFORM to bypass this check.

Any suggestions on how I might get this to work?
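One detail that may be worth checking here, although the report does not establish it as the cause, is the coefficient order: a GDAL geotransform and an affine transform hold the same six numbers in a different order. The sketch below shows the reordering in plain Python. The function name is hypothetical; the affine package offers Affine.from_gdal(*geotransform) for the same purpose.

```python
def gdal_to_affine_coefficients(geotransform):
    """Reorder a GDAL geotransform into affine (a, b, c, d, e, f) order.

    GDAL order:   (x_origin, pixel_width, row_rotation,
                   y_origin, column_rotation, pixel_height)
    Affine order: x = a * col + b * row + c
                  y = d * col + e * row + f
    """
    c, a, b, f, d, e = geotransform
    return (a, b, c, d, e, f)


# Example values (made up for illustration): a north-up raster with 0.5 degree pixels.
gdal_geotransform = (5.0, 0.5, 0.0, 60.0, 0.0, -0.5)
coefficients = gdal_to_affine_coefficients(gdal_geotransform)
```

The resulting tuple can be unpacked into Affine(*coefficients); passing the GDAL tuple as a single positional argument would not produce the same transform.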

create conda package

ref: CosmiQ/solaris#163 (comment)

Releases to PyPI are failing!

Both 1.1.1 and 1.1.2 are still not on PyPI because Circle CI seems to get stuck somewhere in the process.

ZSTD missing codec

Thanks for this lib/plugin, it's really helpful and avoids having to guess which creation options are good for remote access.

I'm encountering a problem with the zstd cog-profile.

When running:

rio cogeo create -p zstd S2A_MSIL1C_20190112T175711_N0207_R141_T13UFQ_20190112T194934__B01.jp2 out.tiff

I get the following traceback:

Reading input: /home/guillaumelostis/code/s2-cog-benchmark/tiles/S2A_MSIL1C_20190112T175711_N0207_R141_T13UFQ_20190112T194934__B01.jp2
Adding overviews...
Updating dataset tags...
Writing output to: /home/guillaumelostis/code/s2-cog-benchmark/tiles/tiff/zstd.tiff
WARNING:rasterio._env:CPLE_NotSupported in 'ZSTD' is an unexpected value for COMPRESS creation option of type string-select.
Traceback (most recent call last):
  File "/home/guillaumelostis/.venv/bench/bin/rio", line 11, in <module>
    sys.exit(main_group())
  File "/home/guillaumelostis/.venv/bench/lib/python3.6/site-packages/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "/home/guillaumelostis/.venv/bench/lib/python3.6/site-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "/home/guillaumelostis/.venv/bench/lib/python3.6/site-packages/click/core.py", line 1137, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/guillaumelostis/.venv/bench/lib/python3.6/site-packages/click/core.py", line 1137, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/guillaumelostis/.venv/bench/lib/python3.6/site-packages/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/guillaumelostis/.venv/bench/lib/python3.6/site-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/home/guillaumelostis/.venv/bench/lib/python3.6/site-packages/rio_cogeo/scripts/cli.py", line 191, in create
    quiet=quiet,
  File "/home/guillaumelostis/.venv/bench/lib/python3.6/site-packages/rio_cogeo/cogeo.py", line 268, in cog_translate
    copy(tmp_dst, dst_path, copy_src_overviews=True, **dst_kwargs)
  File "/home/guillaumelostis/.venv/bench/lib/python3.6/site-packages/rasterio/env.py", line 445, in wrapper
    return f(*args, **kwds)
  File "rasterio/shutil.pyx", line 139, in rasterio.shutil.copy
  File "rasterio/_err.pyx", line 205, in rasterio._err.exc_wrap_pointer
rasterio._err.CPLE_AppDefinedError: Cannot create TIFF file due to missing codec for ZSTD.

I run this in a fresh virtual environment in which I ran pip install rio-cogeo.
Is the problem that the GDAL lib that came packaged with the rasterio wheel doesn't support ZSTD? That seems surprising. For reference, the rasterio wheel that was installed is https://files.pythonhosted.org/packages/e1/2e/af9bfa901b890800b25428ab9846fe4e91818a94ede227c1a2d4410bac54/rasterio-1.0.28-cp36-cp36m-manylinux1_x86_64.whl

Or is something wrong with my local setup?

Add `--web` option to create a ZXY-tile-friendly file

We should add an option to warp and align a file to the webmercator grid!

Add warning about the new GDAL COG creation driver

ref: https://gdal.org/drivers/raster/cog.html

Rasterio doesn't support GDAL > 2.4 so we don't need to rush working on this right now.

Add a note about the need for GDAL >= 2.3.2 to create a proper COG with an internal mask

There seems to be a bug in mask creation with older GDAL versions
ref: OSGeo/gdal#754

add logging for creation steps ?

As mentioned in #46, we have a progress bar only for the first step of the COG creation, which is usually the fastest step. The later steps (overview creation and compression) can hang for a long time, especially when working with complex compression.
IMO we could add some logging to tell the user what's going on:

actual

$ rio cogeo 2003031.tif test.tif
  [####################################]  100%

future

$ rio cogeo 2003031.tif test.tif
---------------------
Input file: 2003031.tif
Output file:  test.tif
Profile: YCBCR (JPEG)
Options: ...
---------------------
Creating internal tiles...
  [####################################]  100%
Adding overviews...
Writing output file...

keep 128 as default GDAL_TIFF_OVR_BLOCKSIZE

While working on the web-optimized option (ref: #62), I realized I made a bad choice from the beginning. In

rio-cogeo/rio_cogeo/scripts/cli.py

Line 122 in 983b709

GDAL_TIFF_OVR_BLOCKSIZE=os.environ.get("GDAL_TIFF_OVR_BLOCKSIZE", block_size),

we set GDAL_TIFF_OVR_BLOCKSIZE to the same value as the internal tile size of the high-resolution data. This is usually fine, but in the case of a web-optimized COG it results in GDAL having to fetch more tiles than needed.

👇 Here we created a COG with internal tiles (256px for both the high-resolution data and the overviews) aligned with the web mercator grid (left image). The right image shows that the internal tiles of the overview (level 1) are not aligned with the mercator grid at zoom - 1 (this is in fact normal).

Having overview internal tiles that are not aligned is normal, but as the figure 👇 shows, if we try to fetch the data for mercator tile 17-118594-60034, GDAL will fetch two 256x256px internal tiles (0,0 and 1,0) while the mercator tile covers only 1/4 of the area of those tiles.

Because GDAL can merge HTTP calls for adjacent range requests, using a different internal tile size for overviews should do no harm and should improve performance.

⚠️
I'm inclined to update the CLI and add an --overview-tilesize option that defaults to the GDAL_TIFF_OVR_BLOCKSIZE environment variable, with a fallback to 128.

This will be a breaking change and should happen before 1.0.0
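The proposed lookup order (explicit option, then the GDAL_TIFF_OVR_BLOCKSIZE environment variable, then 128) could be sketched as follows. The function name and its parameters are hypothetical and only illustrate the fallback chain; this is not the code that was eventually merged.

```python
import os


def resolve_overview_blocksize(cli_value=None, environ=None, fallback=128):
    """Pick the overview block size: CLI option, then environment, then fallback."""
    # Allow a custom mapping so the lookup can be exercised without touching os.environ.
    environ = os.environ if environ is None else environ
    if cli_value is not None:
        return int(cli_value)
    # Environment variables are strings, so cast before handing the value to GDAL.
    return int(environ.get("GDAL_TIFF_OVR_BLOCKSIZE", fallback))
```

With this order, a user who never sets anything gets 128px overview tiles regardless of the block size chosen for the full-resolution data.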

non-ascii character in README prevents installation

Line 227 of the README reads:
⚠️ GDAL>=3 is not yet supported by rasterio

Installation fails due to the non-ascii character:
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 7166: ordinal not in range(128)
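Byte 0xe2 is the first byte of the UTF-8 encoding of the warning sign, so the error is consistent with the README being read with an ASCII default encoding. Assuming the README is read at install time to build the long description (the report does not show that code), a minimal sketch of an encoding-safe read looks like this. The function name is hypothetical.

```python
import io


def read_long_description(path="README.md"):
    """Read the README as UTF-8 regardless of the locale's default encoding."""
    # io.open accepts an explicit encoding on both Python 2 and Python 3.
    with io.open(path, encoding="utf-8") as readme:
        return readme.read()
```

Reading with an explicit encoding keeps the emoji in the README without breaking installs on systems whose locale defaults to ASCII.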

How to convert geotiffs programmatically to cogs

After verifying that the command line rio cogeo example.tif example_cog_raw.tif --cog-profile raw runs successfully, I'm trying to use the cog_translate function programmatically.

This snippet is failing:

cog_img = "cog" + "_" + name # name is for instance example.tif
cog_profile = cog_profiles.get('raw')
cog_profile.update(dict(BIGTIFF=os.environ.get("BIGTIFF", "IF_SAFER")))
block_size = 512
config = dict(
            NUM_THREADS=8,
            GDAL_TIFF_INTERNAL_MASK=os.environ.get("GDAL_TIFF_INTERNAL_MASK", True),
            GDAL_TIFF_OVR_BLOCKSIZE=os.environ.get("GDAL_TIFF_OVR_BLOCKSIZE", block_size),
)
cog_translate(
            img,
            cog_img,
            cog_profile,
            None,
            None,
            None,
            6,
            config
)

where img is the binary image file, which was previously opened by rasterio and checked to see whether it is already a COG.
The error stack trace is:

File "/Users/geobart/Development/CallForCode/cog-k8s/app/api_views.py", line 87, in post
    block_size = 512
  File "/Users/geobart/pyenv/cog-k8s-2pcfZKQH/lib/python3.6/site-packages/rio_cogeo/cogeo.py", line 52, in cog_translate
    with rasterio.open(src_path) as src:
  File "/Users/geobart/.pyenv/versions/3.6.2/lib/python3.6/contextlib.py", line 81, in __enter__
    return next(self.gen)
  File "/Users/geobart/pyenv/cog-k8s-2pcfZKQH/lib/python3.6/site-packages/rasterio/__init__.py", line 178, in fp_reader
    dataset = memfile.open()
  File "/Users/geobart/pyenv/cog-k8s-2pcfZKQH/lib/python3.6/site-packages/rasterio/env.py", line 360, in wrapper
    return f(*args, **kwds)
  File "/Users/geobart/pyenv/cog-k8s-2pcfZKQH/lib/python3.6/site-packages/rasterio/io.py", line 132, in open
    writer = get_writer_for_driver(driver)
  File "/Users/geobart/pyenv/cog-k8s-2pcfZKQH/lib/python3.6/site-packages/rasterio/io.py", line 179, in get_writer_for_driver
    if driver_can_create(driver):
  File "rasterio/_base.pyx", line 112, in rasterio._base.driver_can_create
  File "rasterio/_base.pyx", line 95, in rasterio._base.driver_supports_mode
AttributeError: 'NoneType' object has no attribute 'encode'

Do you have any hints on the cause of the error and if I'm using cogeo in a wrong way?

Remove useless Boundless=True options to avoid rasterio warning

In https://github.com/mapbox/rio-cogeo/blob/186a7208f0d5ef0c6cfbd87ae6e7d2047acc6076/rio_cogeo/cogeo.py#L72-L85 we use the boundless=True option when reading the dataset, but it is not needed because the windows should never be boundless.

rio cogeo raw.tif cog_ycbcr.tif -p ycbcr
  [------------------------------------]    1%WARNING:rasterio._io:Nonzero values in mask have been converted to 255, see note in rasterio/_io.pyx, read_masks()
WARNING:rasterio._io:Nonzero values in mask have been converted to 255, see note in rasterio/_io.pyx, read_masks()
WARNING:rasterio._io:Nonzero values in mask have been converted to 255, see note in rasterio/_io.pyx, read_masks()
  [#-----------------------------------]    3%WARNING:rasterio._io:Nonzero values in mask have been converted to 255, see note in rasterio/_io.pyx, read_masks()
WARNING:rasterio._io:Nonzero values in mask have been converted to 255, see note in rasterio/_io.pyx, read_masks()
WARNING:rasterio._io:Nonzero values in mask have been converted to 255, see note in rasterio/_io.pyx, read_masks()
...

Update get_maximum_overview_level to use Rasterio dataset directly

rio-cogeo/rio_cogeo/cogeo.py

Lines 62 to 66 in fb9ceba

    
if overview_level is None:
    overview_level = get_maximum_overview_level(
        src_path, min(int(dst_kwargs["blockxsize"]), int(dst_kwargs["blockysize"]))
    )

We pass the dataset path and then open it again in the get_maximum_overview_level function. This means the dataset is opened twice, which is not really necessary.
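A minimal sketch of a variant that works from the dimensions of an already opened dataset is shown below. The loop condition is the one visible in a traceback elsewhere on this page; the signature, the default minsize, and the counter handling are assumptions.

```python
def get_maximum_overview_level(width, height, minsize=256):
    """Count how many times the raster can be halved before it fits in minsize."""
    overview_level = 0
    overview_factor = 1
    # Keep halving until the smaller dimension is no larger than one internal tile.
    while min(width // overview_factor, height // overview_factor) > minsize:
        overview_factor *= 2
        overview_level += 1
    return overview_level


# With an open Rasterio dataset `src`, the call would become:
# get_maximum_overview_level(src.width, src.height, minsize=512)
```

Because only width and height are needed, the caller can reuse the dataset it already holds instead of passing a path to be reopened.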

copy dataset to create valid COG

This is not a bug report, rather a question on the implementation:

When the COG is written, the temporary file gets copied to the defined output in the end:

rio-cogeo/rio_cogeo/cogeo.py

Line 262 in a6e8a2b

copy(tmp_dst, dst_path, copy_src_overviews=True, **dst_kwargs)

Is there a specific reason related to the validity of the resulting COG containing overviews? The reason I ask is because I tried to implement a small script which creates COGs (before I discovered this tool) but the output files were not valid COGs as soon as they had overviews.

To me it seems the only difference between the implementations is that here the output is written into a temporary file, then the overviews are generated and finally the temporary file is copied to disk. This seems to be a crucial step for a valid COG as far as I understand, right?

Remove default bidx option and take all dataset bands by default (if possible)

Right now, the bidx option defaults to 1,2,3, which is fine when the dataset is RGB or RGBA. When working with single-band data, the CLI raises an error:

  File "rasterio/_io.pyx", line 233, in rasterio._io.DatasetReaderBase.read
IndexError: band index 2 out of range (not in (1,))

Solution:

Remove the bidx default and use the source dataset's band indexes as the default.

☝️ That said, we should filter the indexes to make sure we don't ingest the alpha band (because we have a mask in the output dataset).
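The proposed default could be sketched as a small filter over the color interpretation of each band. The function below is hypothetical and takes plain strings such as those printed under "colorinterp" by rio info; it is not the implementation used by rio-cogeo.

```python
def default_band_indexes(color_interpretations):
    """Return 1-based indexes of all bands except those interpreted as alpha."""
    return tuple(
        index
        for index, interpretation in enumerate(color_interpretations, start=1)
        if interpretation.lower() != "alpha"
    )


single_band = default_band_indexes(["gray"])
rgba = default_band_indexes(["red", "green", "blue", "alpha"])
```

A single-band dataset then yields index 1 only, and an RGBA dataset yields the three color bands while the alpha band is left to the output mask.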

Min/Max values of raster changed after converting to COG

Converted a DEM file with Min: -24.4595 and Max: 24.7207

After converting the raster using rio cogeo create, Min:-28.9916 and Max: 25.2354

do not copy alpha band

When a source has an alpha band we should be able to detect it and not copy it to the output file because it will be replaced by the internal mask.

back up circle-ci

Let's move back to Circle CI.
ref: #43

Make internal mask an option ?

Right now we create an internal bit mask from alpha/nodata/mask by default, but there are multiple problems with this approach.
https://github.com/mapbox/rio-cogeo/blob/eabf1203be691c2937d89e08a6e62349b594eca1/rio_cogeo/cogeo.py#L75-L86

While we tend to create the simplest COG (author's opinion here), rio-cogeo should maybe offer more flexibility, especially for compressions like WEBP, which can handle RGBA data without needing an internal mask.

Maybe the first step is to work on the documentation side of rio-cogeo to explain the pros and cons of internal masks and to define good habits for the COG format.

Incorrect context behavior

Why do you add the DatasetReader to the ExitStack, too? Just for cleanup, so it __exit__s as soon as possible? Couldn't that cause trouble if the user does what you envisioned and the DatasetReader exits twice? For example:

with MemoryFile() as memfile:
    with memfile.open(**src_profile) as mem:
         mem.write(data)
         cog_translate(mem, *args, **kwargs)

Originally posted by @j08lue in #93
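One common way to avoid the double exit described above is to register a dataset on the ExitStack only when the function opened it itself. The sketch below uses a stand-in class instead of a real DatasetReader so that it runs without rasterio; every name in it is hypothetical.

```python
from contextlib import ExitStack


class FakeDataset:
    """Stand-in for an opened dataset that counts how often it is exited."""

    def __init__(self, name):
        self.name = name
        self.exit_count = 0

    def __enter__(self):
        return self

    def __exit__(self, *exc_info):
        self.exit_count += 1
        return False


def process(source, opener=FakeDataset):
    """Accept a path or an open dataset; only close what was opened here."""
    with ExitStack() as stack:
        if isinstance(source, str):
            # We opened it, so the stack is responsible for closing it.
            dataset = stack.enter_context(opener(source))
        else:
            # The caller owns the dataset and will exit it in their own `with`.
            dataset = source
        return dataset.name
```

With this pattern, a dataset passed in from an enclosing with block is exited exactly once, by the caller.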

Add note to help user add statistics to the output dataset

ref #19

import rasterio

# Open in update mode ("r+") so the tags can be written back to the file.
with rasterio.open("my-data.tif", "r+") as src_dst:
    for b in src_dst.indexes:
        # masked=True keeps nodata/masked pixels out of the statistics.
        band = src_dst.read(indexes=b, masked=True)
        stats = {
            'min': float(band.min()),
            'max': float(band.max()),
            'mean': float(band.mean()),
            'stddev': float(band.std()),
        }
        src_dst.update_tags(b, **stats)

Add resampling option for overviews

Right now we use nearest resampling to create the overviews but this could be set by the user
https://github.com/mapbox/rio-cogeo/blob/d05e26b7b9e5bfe5e79ef95e3ab56c82a2e07eb7/rio_cogeo/cogeo.py#L88-L91

COGs built from VRT don't validate

After building a COG from a vrt:

rio cogeo create my.vrt my.tif -p jpeg

the COG doesn't validate:

> rio cogeo validate /vsis3/bucket/my.tif
The following errors were found:
- The offset of the IFD for overview of index 1 is 884184, whereas it should be greater than the one of index 0, which is at byte 3167886
- The offset of the IFD for overview of index 2 is 299436, whereas it should be greater than the one of index 1, which is at byte 884184
- The offset of the IFD for overview of index 3 is 149834, whereas it should be greater than the one of index 2, which is at byte 299436
- The offset of the IFD for overview of index 4 is 107736, whereas it should be greater than the one of index 3, which is at byte 149834
- The offset of the IFD for overview of index 5 is 94524, whereas it should be greater than the one of index 4, which is at byte 107736
- The offset of the first block of the smallest overview should be after its IFD
/vsis3/bucket/my.tif is NOT a valid cloud optimized GeoTIFF

The VRT points to a group of 256px tiffs stored in S3.
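The first five errors all come from the same rule: each overview IFD must sit at a higher byte offset than the previous one. A minimal sketch of that single check, run on the offsets quoted in the report, is given below. The function name and message wording are hypothetical, and the real validator performs more checks than this.

```python
def check_overview_ifd_order(ifd_offsets):
    """Return one message per overview IFD that does not follow the previous one."""
    errors = []
    for index in range(1, len(ifd_offsets)):
        previous, current = ifd_offsets[index - 1], ifd_offsets[index]
        if current <= previous:
            errors.append(
                "overview {} IFD at byte {} should come after overview {} at byte {}".format(
                    index, current, index - 1, previous
                )
            )
    return errors


# Offsets of overviews 0 to 5 as quoted in the validation output above.
reported_offsets = [3167886, 884184, 299436, 149834, 107736, 94524]
problems = check_overview_ifd_order(reported_offsets)
```

Here the offsets are strictly decreasing, so every overview after the first one is flagged.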

Switch from JPEG to DEFLATE for default profile

The JPEG profile only works with RGB byte datasets, while DEFLATE compression always works.

1.0.0 release

Target date: 2019-03-29.

🎉 This is almost ready. We don't have anything to add for the 1.0 release.

I'm targeting Friday 29th for the official release to see if we get feedback on the latest beta release

rio-cogeo/CHANGES.txt

Lines 1 to 6 in 5974cd5

    
1.0b2 (2019-03-27)
------------------
Breaking Changes:
- Switch from JPEG to DEFLATE as default profile in CLI (#66)

release 1.0b0

With #6 merged, we now have a tool to create and validate Cloud Optimized GeoTIFFs.

I don't have any breaking changes in mind, and the only blocked PR, #62, will have to wait until after the 1.0.0 release (if we get some help).

I'm planning to do a 1.0b0 release on Thursday 14th.

Here is the list of changes since last release:

rio-cogeo/CHANGES.txt

Lines 1 to 16 in a6b76c7

    
Next (TBD)
----------
- add more logging and `--quiet` option (#46)
- add `--overview-blocksize` to set overview's internal tile size (#60)

Bug fixes:
- copy tags and description from input to output (#19)
- copy input mask band to output mask

Breaking Changes:
- rio cogeo now has subcommands: 'create' and 'validate' (#6).
- internal mask creation is now optional (--add-mask).
- internal nodata or alpha channel can be forwarded to the output dataset.
- removed default overview blocksize to be equal to the raw data blocksize (#60)

Vincent

Rename `YCbCR` profile to `JPEG`

https://github.com/mapbox/rio-cogeo/blob/dc48a99b23fa1c538cb8b81747c5217fc1f996b7/rio_cogeo/profiles.py#L6

I think using jpeg instead of ycbcr will make more sense for a profile name

cc @perrygeo @sgillies

Set up Travis CI

We're switching from Circle CI. Following up on #42.

Tags are dropped

https://mojodna-temp.s3.amazonaws.com/internal-mask.tif, when processed using rio cogeo, loses TIFFTAG_SOFTWARE=pix4dmapper metadata.

Error when passing options block size and no overview-level passed

rio cogeo lopegeo_mlk_005.tif lopegeo_mlk_005_cogeo.tif --cog-profile deflate --co BLOCKXSIZE=256 --co BLOCKYSIZE=256
Traceback (most recent call last):
  File "/Users/vincentsarago/Workspace/venv/py37/bin/rio", line 11, in <module>
    load_entry_point('rasterio', 'console_scripts', 'rio')()
  File "/Users/vincentsarago/Workspace/venv/py37/lib/python3.7/site-packages/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "/Users/vincentsarago/Workspace/venv/py37/lib/python3.7/site-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "/Users/vincentsarago/Workspace/venv/py37/lib/python3.7/site-packages/click/core.py", line 1137, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/Users/vincentsarago/Workspace/venv/py37/lib/python3.7/site-packages/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/Users/vincentsarago/Workspace/venv/py37/lib/python3.7/site-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/Users/vincentsarago/Workspace/CogeoTiff/rio-cogeo/rio_cogeo/scripts/cli.py", line 108, in cogeo
    config,
  File "/Users/vincentsarago/Workspace/CogeoTiff/rio-cogeo/rio_cogeo/cogeo.py", line 58, in cog_translate
    src_path, min(dst_kwargs["blockxsize"], dst_kwargs["blockysize"])
  File "/Users/vincentsarago/Workspace/CogeoTiff/rio-cogeo/rio_cogeo/utils.py", line 15, in get_maximum_overview_level
    while min(width // overview, height // overview) > minsize:
TypeError: '>' not supported between instances of 'int' and 'str'
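The TypeError arises because values passed with --co arrive as strings, so the minimum block size handed to get_maximum_overview_level is a str. A minimal sketch of one way to normalize them first follows; the helper name is hypothetical, and the lowercase keys match the dst_kwargs lookups in the traceback.

```python
def block_sizes_as_int(creation_options):
    """Return a copy of the options with block sizes cast to int."""
    options = dict(creation_options)
    for key in ("blockxsize", "blockysize"):
        if key in options:
            # "--co BLOCKXSIZE=256" is parsed as the string "256".
            options[key] = int(options[key])
    return options


dst_kwargs = block_sizes_as_int({"blockxsize": "256", "blockysize": "256"})
minsize = min(dst_kwargs["blockxsize"], dst_kwargs["blockysize"])
```

After the cast, the comparison against integer raster dimensions works as intended.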

change band interleave to pixel interleave for packbits profile

PIXEL interleaving in a COG saves some HTTP calls when accessing tiles in the cloud. Band interleaving will still be possible by passing the --co INTERLEAVE=BAND option via the CLI.

zstd?

Is it worth adding a ZSTD profile? It is supported in recent GDAL versions.

https://lists.osgeo.org/pipermail/cog/2018-June/000034.html

datatype option

Following #85, a fix has been pushed to rasterio, so now we need to add an option or a default behaviour to rio-cogeo to handle datatype inheritance...

ref rasterio/rasterio#1768

Configurable minimum internal tile size for valid COG

The COG specification does not set a hard requirement on the internal tile size, but the validation script does.

The internal tile size is important for a COG because it determines whether the COG might need overviews (which are themselves optional).

IMO the internal tile size should be configurable in the validate script, because some users could consider a 1024x1024 file with neither internal tiling nor internal overviews to be a proper COG, while the script will raise an error about the missing internal tiling and warn about the missing overviews.

when using jpeg profile silently creates empty output if bit depth != 8

Version:
$ pip freeze | grep rio
rasterio==1.0.22
rio-cogeo==1.0b3

Problem: Creating a JPEG-profile (JPEG-compressed) COG from a 16-bit unsigned integer TIFF succeeds but produces an empty file.

What I expected: an error message indicating this can only work for 8 bits per pixel.
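The expected behaviour could be sketched as an up-front check that runs before any data is written. The function below is hypothetical; its rule (three bands of uint8) follows the statement elsewhere on this page that the JPEG profile only works with RGB byte datasets, and it is not how rio-cogeo resolved the issue.

```python
def check_jpeg_profile(dtype, band_count):
    """Raise early if the source cannot be written with the JPEG profile."""
    if dtype != "uint8":
        raise ValueError(
            "JPEG profile requires 8 bits per sample, got dtype {!r}".format(dtype)
        )
    if band_count != 3:
        raise ValueError(
            "JPEG profile expects 3 (RGB) bands, got {}".format(band_count)
        )
```

For the file in this report (dtype uint16, count 3 according to rio info), such a check would fail with a clear message instead of producing a 4992-byte output.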

Details:

When doing the same thing with gdal directly I get warnings that JPEG compression is not compatible with 16 bit pixel depth (using UINT16 Tiff bands), but rio cogeo silently swallows these warnings:

gdal_translate temp.tif gdal_cog.tif -co TILED=YES -co COMPRESS=JPEG -co PHOTOMETRIC=YCBCR -co COPY_SRC_OVERVIEWS=YES
Input file size is 4171, 4061
0ERROR 1: JPEGSetupEncode:BitsPerSample 16 not allowed for JPEG
ERROR 1: JPEGSetupEncode:BitsPerSample 16 not allowed for JPEG
ERROR 1: JPEGSetupEncode:BitsPerSample 16 not allowed for JPEG
ERROR 1: JPEGSetupEncode:BitsPerSample 16 not allowed for JPEG
ERROR 1: JPEGSetupEncode:BitsPerSample 16 not allowed for JPEG
ERROR 1: JPEGSetupEncode:BitsPerSample 16 not allowed for JPEG
ERROR 1: JPEGSetupEncode:BitsPerSample 16 not allowed for JPEG
ERROR 1: JPEGSetupEncode:BitsPerSample 16 not allowed for JPEG
ERROR 1: WriteEncodedTile/Strip() failed.
ERROR 1: JPEGSetupEncode:BitsPerSample 16 not allowed for JPEG
ERROR 1: JPEGSetupEncode:BitsPerSample 16 not allowed for JPEG
ERROR 1: WriteEncodedTile/Strip() failed.
ERROR 1: JPEGSetupEncode:BitsPerSample 16 not allowed for JPEG
ERROR 1: JPEGSetupEncode:BitsPerSample 16 not allowed for JPEG
ERROR 1: WriteEncodedTile/Strip() failed.
..ERROR 1: JPEGSetupEncode:BitsPerSample 16 not allowed for JPEG
ERROR 1: JPEGSetupEncode:BitsPerSample 16 not allowed for JPEG
ERROR 1: WriteEncodedTile/Strip() failed.
.10...20.ERROR 1: JPEGSetupEncode:BitsPerSample 16 not allowed for JPEG
ERROR 1: JPEGSetupEncode:BitsPerSample 16 not allowed for JPEG
ERROR 1: WriteEncodedTile/Strip() failed.
..30...40...50...60...70...80...90...100 - done.
ERROR 1: JPEGSetupEncode:BitsPerSample 16 not allowed for JPEG
ERROR 1: JPEGSetupEncode:BitsPerSample 16 not allowed for JPEG
ERROR 1: WriteEncodedTile/Strip() failed.
ERROR 1: JPEGSetupEncode:BitsPerSample 16 not allowed for JPEG

$ rio cogeo create --cog-profile jpeg temp.tif cog_jpeg.tif
Reading input: /home/hiro/git/tilingtest/temp.tif
  [####################################]  100%
Adding overviews...
Updating dataset tags...
Writing output to: /home/hiro/git/tilingtest/cog_jpeg.tif
$ ls -l cog_jpeg.tif 
-rw-------. 1 hiro hiro 4992 Mar 31 13:02 cog_jpeg.tif
$ ls -l temp.tif
-rw-------. 1 hiro hiro 138236380 Mar 31 12:32 temp.tif
$ rio info temp.tif
{"bounds": [694696.0, 3674712.0, 736406.0, 3715322.0], "colorinterp": ["red", "green", "blue"], "count": 3, "crs": "EPSG:32636", "descriptions": [null, null, null], "driver": "GTiff", "dtype": "uint16", "height": 4061, "indexes": [1, 2, 3], "interleave": "pixel", "lnglat": [35.316978660693565, 33.37281747525934], "mask_flags": [["all_valid"], ["all_valid"], ["all_valid"]], "nodata": null, "res": [10.0, 10.0], "shape": [4061, 4171], "tiled": false, "transform": [10.0, 0.0, 694696.0, 0.0, -10.0, 3715322.0, 0.0, 0.0, 1.0], "units": [null, null, null], "width": 4171}

expose array+profile API

I often want to write data from numpy arrays as a cloud-optimized GeoTIFF, together with the usual rasterio profile dictionary (transform, crs, width, height, dtype, etc.). It would be super nice to have a Python API for this.

This would require splitting the large cogeo.cog_translate function into two at least, roughly here:

rio-cogeo/rio_cogeo/cogeo.py

Line 226 in 4c6e890

matrix = vrt_dst.read(window=w, indexes=indexes)

or, if that becomes too complicated, create a separate function for this, like

def cog_write(
    src_data,
    src_profile,
    dst_path,
    nodata=None,
    add_mask=None,
    overview_level=None,
    overview_resampling="nearest",
    web_optimized=False,
    latitude_adjustment=True,
    resampling="nearest",
    in_memory=None,
    config=None,
    quiet=False,
):

What do you think? Would this make sense to add to this package? Or will this package anyways be superseded by the new COG driver in GDAL?

[BUG] updating tags after creation breaks COG specification

In 8b24785 we introduced a bug. Re-opening the dataset with r+ to add the rio_overview tag value seems to mess up the directory.

GDAL's validate_cloud_optimized_geotiff.py then gives:

python validate_cloud_optimized_geotiff.py cogeo.tif
cogeo.tif is NOT a valid cloud optimized GeoTIFF.
The following errors were found:
 - The offset of the main IFD should be 8 for ClassicTIFF or 16 for BigTIFF. It is 7969940 instead
 - The offset of the IFD for overview of index 0 is 3380, whereas it should be greater than the one of the main image, which is at byte 7969940
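The first error refers to the offset stored in the TIFF header, which points to the main IFD. A minimal sketch of how that value can be read from the first bytes of a file is shown below; the function name is hypothetical, and this is an illustration of the header layout, not the validator's own code.

```python
import struct


def first_ifd_offset(header):
    """Return (flavor, offset of the first IFD) from the first 16 bytes of a TIFF."""
    byte_order = header[:2]
    if byte_order == b"II":
        endian = "<"  # little-endian
    elif byte_order == b"MM":
        endian = ">"  # big-endian
    else:
        raise ValueError("not a TIFF file")
    (magic,) = struct.unpack(endian + "H", header[2:4])
    if magic == 42:
        # ClassicTIFF: a 4-byte offset follows the magic number.
        (offset,) = struct.unpack(endian + "I", header[4:8])
        return "ClassicTIFF", offset
    if magic == 43:
        # BigTIFF: offset size (2 bytes) and padding (2 bytes), then an 8-byte offset.
        (offset,) = struct.unpack(endian + "Q", header[8:16])
        return "BigTIFF", offset
    raise ValueError("unknown TIFF magic number {}".format(magic))
```

In a valid COG the result is 8 for ClassicTIFF or 16 for BigTIFF, meaning the main IFD directly follows the header; the file in this report would return 7969940, which suggests the directory was rewritten at the end of the file.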

cc @sgillies @perrygeo

Support rasters that do not fit in memory

As far as I can see from the source code, rio-cogeo uses MemoryFile. Does this mean that this plugin is not suited to cases where the source raster does not fit in memory?

requires COMPRESS=JPEG with COMPRESS=LZW option

I am working with the following file and trying to convert it to cloud-optimized geotiff.

https://s3.amazonaws.com/share-terravion-com/2018_cog/S2A_MSIL2A_20171206T171701_N0206_R112_T15TWK_20171206T190908_MULTIBAND.tif

The following is the command:
rio cogeo S2A_MSIL2A_20171206T171701_N0206_R112_T15TWK_20171206T190908_MULTIBAND.tif S2A_MSIL2A_20171206T171701_N0206_R112_T15TWK_20171206T190908_MULTIBAND_cog.tif --co "COMPRESS=LZW" --co BLOCKXSIZE=256 --co BLOCKYSIZE=256

Here is the output of the error:

  [####################################]  100%             
Traceback (most recent call last):
  File "/usr/local/bin/rio", line 11, in <module>
    sys.exit(main_group())
  File "/usr/local/lib/python2.7/dist-packages/click/core.py", line 722, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/click/core.py", line 697, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python2.7/dist-packages/click/core.py", line 1066, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python2.7/dist-packages/click/core.py", line 895, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python2.7/dist-packages/click/core.py", line 535, in invoke
    return callback(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/rio_cogeo/scripts/cli.py", line 104, in cogeo
    config,
  File "/usr/local/lib/python2.7/dist-packages/rio_cogeo/cogeo.py", line 96, in cog_translate
    copy(mem, dst_path, copy_src_overviews=True, **dst_kwargs)
  File "/usr/local/lib/python2.7/dist-packages/rasterio/env.py", line 402, in wrapper
    return f(*args, **kwds)
  File "rasterio/shutil.pyx", line 112, in rasterio.shutil.copy
  File "rasterio/_err.pyx", line 188, in rasterio._err.exc_wrap_pointer
rasterio._err.CPLE_NotSupportedError: Currently, PHOTOMETRIC=YCBCR requires COMPRESS=JPEG
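The error suggests that a PHOTOMETRIC=YCBCR creation option is still present after COMPRESS is overridden with LZW. One possible guard, sketched below, drops the photometric setting whenever the final compression is not JPEG. `merge_creation_options` is a hypothetical helper and the profile dictionary is an assumption for illustration, not the rio-cogeo implementation.

```python
def merge_creation_options(profile, overrides):
    """Merge user overrides into a profile and drop incompatible options.

    Keys are compared case-insensitively, since `--co` values are usually
    given in upper case while profile keys are lower case.
    """
    merged = {key.lower(): value for key, value in profile.items()}
    merged.update({key.lower(): value for key, value in overrides.items()})

    compress = str(merged.get("compress", "")).upper()
    photometric = str(merged.get("photometric", "")).upper()
    # GDAL's GTiff driver only accepts PHOTOMETRIC=YCBCR together with JPEG
    if photometric == "YCBCR" and compress != "JPEG":
        merged.pop("photometric")
    return merged


# Assumed JPEG-style profile, overridden the same way as in the command above
jpeg_like = {"compress": "JPEG", "photometric": "YCbCr", "blockxsize": 512}
options = merge_creation_options(jpeg_like, {"COMPRESS": "LZW", "BLOCKXSIZE": 256})
```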

allow non-integer nodata value

rio cogeo lopegeo_mlk_005.tif lopegeo_mlk_005_cogeo.tif --cog-profile deflate --nodata 0.5
Usage: rio cogeo [OPTIONS] INPUT OUTPUT
Try "rio cogeo --help" for help.

Error: Invalid value for "--nodata": 0.5 is not a valid integer
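The option rejects 0.5 because it is parsed as an integer. A minimal sketch of a float-based parser follows; `parse_nodata` is a hypothetical name, and how it would be wired into the CLI is left open.

```python
import math


def parse_nodata(value):
    """Parse a --nodata string into a float, or return None when unset.

    float() accepts "0.5", "-9999", "1e-3" and "nan", and raises
    ValueError for anything that is not a number.
    """
    if value is None:
        return None
    return float(value.strip())
```

Note that a NaN nodata value never compares equal to itself, so callers would need `math.isnan` rather than `==` to detect it.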

Internal mask might not be a good solution for non-byte dataset

ref:

Basically there are 2 bugs:

alpha band (internal mask is converted to alpha band by rasterio.vrt.WarpedVRT) is not fetched when the datatype is not Byte or Uint16

performance is really slow when using non-byte data and internal masking (vrt)

Get off Circle CI

It's not economical for our org. I'm going to kill off the circle YAML now and we'll set up Travis in a separate PR.

[rio cogeo validate] add option to force overview

Could be useful to have an option to make overviews mandatory

rio cogeo create fails when converting sentinel-2 preview files

In all sentinel-2 directories, there is a PVI (preview) file in jp2 format that throws a blocksize exceeds raster width error when you try to convert using rio cogeo create [file].

gdalinfo confirms that the blocksize is 192x192 while the band dimension is just 171x171. So the issue seems to be more of a corner case of really small (tiled) jp2s rather than a bug with rio cogeo.

If in SAFE format, the approximate location of this file is:
$(PRODUCT_NAME).SAFE/GRANULE/$(IMAGE_GRANULE)/QI_DATA/TILE_ID..._PVI.jp2

Here's a command to pull a random preview file out of the sentinel s3:
aws s3 cp s3://sentinel-s2-l1c/tiles/7/J/FL/2019/2/12/0/preview.jp2 . --request-payer requester

As you can see, rio cogeo create preview.jp2 preview.tif fails.

It looks like this generic error comes when instantiating the DatasetWriterBase class. Is there a way to keep current protections but allow for cog creation on even really small preview files?
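One possible approach, sketched here under assumptions, is to fall back to an untiled output when the raster is smaller than a single block. `fit_block_size` is a hypothetical helper, the 512 default is taken from the README's note on default profiles, and this is not necessarily how rio-cogeo handles the case.

```python
def fit_block_size(profile, width, height):
    """Return a copy of `profile` whose tiling is compatible with the raster size.

    If either block dimension exceeds the raster, the block size options
    are removed and tiling is disabled; otherwise the profile is unchanged.
    """
    fitted = dict(profile)
    block_x = int(fitted.get("blockxsize", 512))
    block_y = int(fitted.get("blockysize", 512))
    if block_x > width or block_y > height:
        # The raster is smaller than one tile: write it untiled instead
        fitted.pop("blockxsize", None)
        fitted.pop("blockysize", None)
        fitted["tiled"] = False
    return fitted


# The 171x171 preview from this report, with an assumed 512x512 profile
small = fit_block_size({"tiled": True, "blockxsize": 512, "blockysize": 512}, 171, 171)
```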

Thanks in advance for any suggestions or advice :)

Rasterio fails when using ZSTD compression

I'm seeing strange behavior when trying to create a COG with ZSTD compression: the Python process gets killed.

rio cogeo raw.tif cog_zstd.tif -p zstd
  [####################################]  100%
Killed

Setting CPL_DEBUG=ON doesn't give more info.

After a little investigation, it seems to be a memory error*, but I can't find the right config to make it pass.

Note:
If I remove copy_src_overviews=True in https://github.com/mapbox/rio-cogeo/blob/7c5b1893a47719cd6ea50d21c272e4b6c04b30af/rio_cogeo/cogeo.py#L100 it doesn't fail but then the file is not a COG with overview.

file used: https://github.com/mapbox/cog_cow_testsuite/blob/master/data/raw.tif

Hangs long after progressbar gets to 100%

Using either cog_translate() or the command line, large files (2GB+) take an extremely long time to convert, or don't convert at all and stall the current process.
There is no indication of progress, since it hangs after the progress bar hits 100%.
Memory and disk space on the machine are not maxed out, so it's probably related to memory management within Python, but I'll have to do some more digging.

Settings are as follows:

cog_translate(
    src_path=input_filepath,
    dst_path=output_filepath,
    dst_kwargs=cog_profiles.get('deflate'),
    indexes=None,
    nodata=None,
    alpha=None,
    overview_level=6,
    overview_resampling='average',
    config={'NUM_THREADS': 'ALL_CPUS', 'PREDICTOR': '2'}
)

Note about Opinions!

While rio-cogeo follows the COG specification, it also makes some choices of its own. By default we decided to enforce internal tiling (tiling is optional for files smaller than 1024x1024) and overviews (which are optional).

I personally think those two are important, but the user can still change the defaults by using options (in the CLI).

Let's add a note to the README about the choices made on the user's behalf.

release 1.1.0 - July 17th 2019

PR #82 changes the way rio_cogeo.cogeo.cog_translate handles blocksize for small datasets, so this needs a new minor version: 1.1.0

	if overview_level is None:
	    overview_level = get_maximum_overview_level(
	        src_path, min(int(dst_kwargs["blockxsize"]), int(dst_kwargs["blockysize"]))
	    )
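The snippet above passes the smaller of the two block sizes to get_maximum_overview_level. As a rough sketch of how such a helper could derive a level count, the function below keeps halving the raster until its smaller side no longer exceeds the minimum size. `max_overview_level` is a hypothetical stand-in that takes explicit dimensions; it is not the rio-cogeo implementation, whose signature and exact stopping rule may differ.

```python
def max_overview_level(width, height, minsize=512):
    """Count how many times the raster can be halved while its smaller
    side is still larger than `minsize`."""
    level = 0
    factor = 1
    while min(width // factor, height // factor) > minsize:
        factor *= 2  # each overview halves the previous resolution
        level += 1
    return level
```

Under this rule a raster smaller than `minsize` gets no overviews, which matches the idea that a dataset fitting in a single block does not need them.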

	1.0b2 (2019-03-27)
	------------------

	Breaking Changes:

	- Switch from JPEG to DEFLATE as default profile in CLI (#66)

	Next (TBD)
	----------
	- add more logging and `--quiet` option (#46)
	- add `--overview-blocksize` to set overview's internal tile size (#60)

	Bug fixes:

	- copy tags and description from input to output (#19)
	- copy input mask band to output mask

	Breaking Changes:

	- rio cogeo now has subcommands: 'create' and 'validate' (#6).
	- internal mask creation is now optional (--add-mask).
	- internal nodata or alpha channel can be forwarded to the output dataset.
	- removed default overview blocksize to be equal to the raw data blocksize (#60)
