Comments (22)

rouault commented on June 4, 2024

It is not obvious to me that there is a GDAL issue. @vincentsarago, in your `gdalwarp tiny.tif tiny_cog.tif` command, please add the -overwrite flag to make sure you aren't warping into an existing output file that happens to be Float64. For me, `gdalwarp in_float32.tif out.tif -overwrite` produces a Float32 out.tif.

from rio-cogeo.

vincentsarago commented on June 4, 2024

@rsignell-usgs this is a pure rasterio problem. I've opened a ticket at rasterio/rasterio#1744 and hope it gets resolved. I'm going to close this issue but will follow up on the rasterio repo.

rsignell-usgs commented on June 4, 2024

@vincentsarago , yes, in fact I just used gdal_translate as you suggested:

gdal_translate Southern_California_Topobathy_DEM_1m.tif   \
                       Southern_California_Topobathy_DEM_1m_cog.tif   \
                       -co TILED=YES -co COPY_SRC_OVERVIEWS=YES -co COMPRESS=LZW -co BIGTIFF=YES

Followed of course by:

rio cogeo validate Southern_California_Topobathy_DEM_1m_cog.tif

😸
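
For anyone scripting these two steps, here is a minimal Python sketch that assembles the same command lines as argument lists. The helper names (`gdal_translate_args`, `validate_args`) and the `in.tif` / `out_cog.tif` file names are made up for illustration; only the flags shown above are used, and actually running the commands assumes GDAL and rio-cogeo are on the PATH.

```python
import shlex

# Creation options copied from the gdal_translate call above.
COG_OPTIONS = {
    "TILED": "YES",
    "COPY_SRC_OVERVIEWS": "YES",
    "COMPRESS": "LZW",
    "BIGTIFF": "YES",
}


def gdal_translate_args(src, dst, options=COG_OPTIONS):
    """Build the gdal_translate argument list (hypothetical helper)."""
    args = ["gdal_translate", src, dst]
    for key, value in options.items():
        args += ["-co", f"{key}={value}"]
    return args


def validate_args(path):
    """Build the `rio cogeo validate` argument list (hypothetical helper)."""
    return ["rio", "cogeo", "validate", path]


if __name__ == "__main__":
    # Only print the commands; hand the lists to subprocess.run(..., check=True)
    # to actually execute them.
    print(shlex.join(gdal_translate_args("in.tif", "out_cog.tif")))
    print(shlex.join(validate_args("out_cog.tif")))
```

Keeping the creation options in a dict makes it easy to drop `BIGTIFF=YES` for small files without retyping the whole command.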

BTW, I didn't know about the super cool service here:
https://remotepixel.ca/cogeo.html?url=https://esip-pangeo-uswest2.s3-us-west-2.amazonaws.com/sciencebase/Southern_California_Topobathy_DEM_1m_cog.tif&rescale=-50,2000&color_map=terrain#12.91/33.75184/-117.57867
Two questions:

  1. Is it using the different overviews as you zoom in?
  2. Where can I find the documentation for this RESTful API (what the parameter options are)?

DanSchoppe commented on June 4, 2024

I've experienced the same. I'll dig up the details of my input file as well.

vincentsarago commented on June 4, 2024

👋 @rsignell-usgs @DanSchoppe
I wasn't able to reproduce the issue on my mac with rasterio from wheels (gdal 2.4.1)

Can you share your config, please? To be honest, this is most likely a rasterio/GDAL bug, but I'll dig into it.

rsignell-usgs commented on June 4, 2024

I've also got gdal=2.4.1, but I'm on linux, and I installed rio-cogeo=1.1.0 using conda install -c conda-forge rio-cogeo into an existing conda environment which sometimes can be problematic. I'll try creating a new environment and see if that helps.

vincentsarago commented on June 4, 2024

not sure conda has anything to do with that, maybe linux 🤔

vincentsarago commented on June 4, 2024

@rsignell-usgs could you check what happens when you run `gdal_translate Southern_California_Topobathy_DEM_1m.tif Southern_California_Topobathy_DEM_1m_lzw.tif -co COMPRESS=LZW`?

vincentsarago commented on June 4, 2024

I'm pretty sure the change happens here:

copy(tmp_dst, dst_path, copy_src_overviews=True, **dst_kwargs)

but I still haven't been able to reproduce the error.
@rsignell-usgs could you share the data?

rsignell-usgs commented on June 4, 2024

The overviews are definitely different:

Source:

Overviews: 79030x56305, 39515x28153, 19758x14077, 9879x7039, 4940x3520, 2470x1760, 1235x880, 618x440, 309x220

Destination:

  Overviews: 158060x112609, 79030x56305, 39515x28153, 19758x14077, 9879x7039, 4940x3520, 2470x1760, 1235x880, 618x440

I'm guessing this is why rio-cogeo is still crunching away. The source file is 230GB, and the destination is currently 600GB and still growing.

I can share the data, but perhaps I should cut a smaller piece out first! 😸

vincentsarago commented on June 4, 2024

> The source file is 230GB, and the destination is currently 600GB and still growing.

😱

@rsignell-usgs

> The overviews are definitely different

The overviews are different because the overview level is calculated from the internal tile size (by default 512) and the size of your raster.
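
To make the relationship between raster size, tile size and overviews concrete, here is a small Python sketch. It is not rio-cogeo's actual code: `overview_sizes` simply divides by powers of two and rounds up, and `levels_until_tile` is a hypothetical rule ("keep halving until the shorter side fits in one tile") used only for illustration. The 158060x112609 base size is taken from the overview lists posted above; with factors 2 to 512 it yields the same nine sizes as the source file's list.

```python
def overview_sizes(width, height, levels):
    """Sizes for decimation factors 2, 4, ..., 2**levels, rounding up."""
    sizes = []
    for level in range(1, levels + 1):
        factor = 2 ** level
        sizes.append(((width + factor - 1) // factor,
                      (height + factor - 1) // factor))
    return sizes


def levels_until_tile(width, height, tile_size=512):
    """Hypothetical rule: halve until the shorter side fits in one tile."""
    level = 0
    while min(width, height) / 2 ** level > tile_size:
        level += 1
    return level


if __name__ == "__main__":
    # Base size borrowed from the overview lists in this thread.
    for w, h in overview_sizes(158060, 112609, 9):
        print(f"{w}x{h}")
```

The point is only that the number and size of the overviews follow from the raster dimensions and the tile size, so a tool that recomputes them may not end up with the same list as the input file.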

If your input file already has overviews, you don't need rio-cogeo and should use GDAL directly:

gdal_translate in.tif out.tif -co TILED=YES -co COPY_SRC_OVERVIEWS=YES -co COMPRESS=LZW

Because rio-cogeo is going to reprocess the overviews, the process is going to be time-consuming.

On another note, can I ask what you are creating such a huge COG for?

Sadly, the schema we use for COG creation is not ideal for such a big raster. @rouault has made a nice COG driver (it will ship in GDAL 3.1, I think) which will work best with a file like yours: https://gdal.org/drivers/raster/cog.html

If you can't wait for GDAL 3.1, I'd definitely cut this raster into pieces, and if it's for dynamic tiling, maybe have a look at: https://github.com/cogeotiff/rio-tiler-mosaic

rsignell-usgs commented on June 4, 2024

I'm interested in this huge COG because I want to show how effective it is to access huge COG from S3 using rasterio with dask on AWS. Basically use a workflow like the one in this excellent blog post by @scottyhq.

Unfortunately I'm out the door for a week of vacation, but will pick this up when I return. Thanks for the help!

vincentsarago commented on June 4, 2024

> I'm interested in this huge COG because I want to show how effective it is to access huge COG from S3 using rasterio with dask on AWS. Basically use a workflow like the one in this excellent blog post by @scottyhq.

Well, because the header of your file will be enormous (> 16 KB), I guess it won't be as performant as it should be (this is why the new COG driver was created). It will still work, but every time dask tries to read some data it will need multiple S3 get_range calls to find out where to read within the raster.
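
A rough back-of-the-envelope sketch of why the header grows: a tiled TIFF stores one offset and one byte count per tile. The Python below assumes 512x512 tiles and 8 bytes per entry (a BigTIFF-style assumption; real entry sizes depend on how the file was written), and uses the 158060x112609 dimensions reported earlier in the thread for a single level of a single band. `tile_index_bytes` is a made-up helper name.

```python
import math


def tile_index_bytes(width, height, tile_size=512, bytes_per_entry=8):
    """Estimate the bytes needed for the tile offsets plus tile byte counts."""
    tiles = math.ceil(width / tile_size) * math.ceil(height / tile_size)
    # One offset entry and one byte-count entry per tile.
    return tiles * 2 * bytes_per_entry


if __name__ == "__main__":
    estimate = tile_index_bytes(158060, 112609)
    print(estimate)              # 1087680
    print(estimate > 16 * 1024)  # True
```

Under these assumptions, a single level of that size already needs about 1 MB of tile index, far more than the 16 KB figure mentioned above, which is why extra range requests are needed before any pixel data can be located.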

Because we couldn't wait for GDAL 3.1 (and we didn't know about the COG driver at the time), we created https://github.com/developmentseed/mosaicjson-spec and https://github.com/cogeotiff/rio-tiler-mosaic to be able to easily create a virtual mosaic of COGs... but maybe this is better suited to dynamic tiling.

rsignell-usgs commented on June 4, 2024

Oh. Well maybe I’ll try Zarr then!

rsignell-usgs commented on June 4, 2024

I've snipped out a tiny piece of my input file and made it available here:
https://esip-pangeo-uswest2.s3-us-west-2.amazonaws.com/sciencebase/tiny.tif

If I convert it using:

rio cogeo create -p lzw tiny.tif output.tif

the data gets promoted from Float32 to Float64

vincentsarago commented on June 4, 2024

thanks @rsignell-usgs I'll have a look

vincentsarago commented on June 4, 2024

@rsignell-usgs
I was able to reproduce the issue, and I think this is on the GDAL side:

$ gdalinfo tiny.tif -json | jq -r '.bands[0].type'
Float32

$ gdalwarp tiny.tif tiny_cog.tif
Processing tiny.tif [1/1] : 0Using internal nodata values (e.g. -3.40282e+38) for image tiny.tif.
...10...20...30...40...50...60...70...80...90...100 - done.

$ gdalinfo tiny_cog.tif -json | jq -r '.bands[0].type'
Float64

This only happens when using Warp (or WarpedVRT in rasterio). We may want to open an issue on GDAL

If I try to force the output file to be in Float32 (as the input file) rasterio errors with ValueError: the array's dtype 'float64' does not match the file's dtype 'float32' because the WarpedVRT creates a float64 array

🤔
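
The same before/after check can be scripted without jq. The Python sketch below reads the `type` field of each band from `gdalinfo -json` output and reports bands whose type changed. The function names are invented for this example, and the two JSON strings are hand-written stand-ins trimmed to the one field used here, not real gdalinfo output.

```python
import json


def band_types(gdalinfo_json):
    """Return the data type of each band from `gdalinfo -json` output."""
    info = json.loads(gdalinfo_json)
    return [band["type"] for band in info.get("bands", [])]


def promoted_bands(src_json, dst_json):
    """List (band_number, src_type, dst_type) for bands whose type changed."""
    pairs = zip(band_types(src_json), band_types(dst_json))
    return [(i, s, d) for i, (s, d) in enumerate(pairs, start=1) if s != d]


if __name__ == "__main__":
    # Hand-written stand-ins for `gdalinfo tiny.tif -json` and
    # `gdalinfo tiny_cog.tif -json`.
    src = '{"bands": [{"band": 1, "type": "Float32"}]}'
    dst = '{"bands": [{"band": 1, "type": "Float64"}]}'
    print(promoted_bands(src, dst))  # [(1, 'Float32', 'Float64')]
```

In practice the two strings would come from running gdalinfo on the input and output files, for example via `subprocess.run(["gdalinfo", path, "-json"], capture_output=True, text=True)`.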

rsignell-usgs commented on June 4, 2024

@vincentsarago , where should I raise this GDAL issue? (I'm back from vacation)

vincentsarago commented on June 4, 2024

Thanks @rouault, you are right: using -overwrite leads to a Float32 file.

I guess this is more on the rasterio side then. I'll try to dig deeper.

vincentsarago commented on June 4, 2024

@rsignell-usgs I've seen that you succeeded in creating your COG (ref: https://nbviewer.jupyter.org/gist/rsignell-usgs/0f96bb9c0ca34a5dd0fc8131a7bbae1c). I wonder if you ended up using pure GDAL commands?

BTW the data looks great https://remotepixel.ca/cogeo.html?url=https://esip-pangeo-uswest2.s3-us-west-2.amazonaws.com/sciencebase/Southern_California_Topobathy_DEM_1m_cog.tif&rescale=-50,2000&color_map=terrain#12.91/33.75184/-117.57867

sgillies commented on June 4, 2024

@vincentsarago @rouault I think I found the root over in rasterio/rasterio#1744 (comment).

vincentsarago commented on June 4, 2024

> BTW, I didn't know about the super cool service here:
> https://remotepixel.ca/cogeo.html?url=https://esip-pangeo-uswest2.s3-us-west-2.amazonaws.com/sciencebase/Southern_California_Topobathy_DEM_1m_cog.tif&rescale=-50,2000&color_map=terrain#12.91/33.75184/-117.57867
> Two questions:
>
> Is it using the different overlays as you zoom in?

Yes, this uses https://github.com/cogeotiff/rio-tiler, which fetches the correct overview (I think when you say overlays you mean the internal overview pyramid).

> Where can I find the documentation for this RESTful API (what the parameter options are)?

The remotepixel tiler is open source and the code is at https://github.com/RemotePixel/remotepixel-tiler, but the part that reads any COG comes from https://github.com/vincentsarago/lambda-tiler

lambda-tiler provides automatic documentation, like https://cogeo.remotepixel.ca/docs

Note: RemotePixel is a personal project, so it's not really meant to be used as a production endpoint. Deploying your own endpoint is pretty easy using https://github.com/vincentsarago/lambda-tiler
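
As a quick illustration of the parameters visible in the viewer link quoted above, this Python sketch splits that URL into its query parameters and fragment. It only dissects the link; it does not document the tiler API itself (see the docs link for that). Reading the fragment as zoom/latitude/longitude is a guess based on the values, and `viewer_params` is a made-up helper name.

```python
from urllib.parse import parse_qs, urlsplit

# The viewer link from this thread, split across lines for readability.
VIEWER_URL = (
    "https://remotepixel.ca/cogeo.html"
    "?url=https://esip-pangeo-uswest2.s3-us-west-2.amazonaws.com/sciencebase/"
    "Southern_California_Topobathy_DEM_1m_cog.tif"
    "&rescale=-50,2000&color_map=terrain#12.91/33.75184/-117.57867"
)


def viewer_params(url):
    """Return (query parameters, fragment) of a viewer link."""
    parts = urlsplit(url)
    params = {key: values[0] for key, values in parse_qs(parts.query).items()}
    return params, parts.fragment


if __name__ == "__main__":
    params, fragment = viewer_params(VIEWER_URL)
    print(params["url"])        # the COG being displayed
    print(params["rescale"])    # -50,2000
    print(params["color_map"])  # terrain
    print(fragment)             # 12.91/33.75184/-117.57867 (assumed zoom/lat/lon)
```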
