Comments (10)

bretttully commented on June 2, 2024

It actually looks like the error might be on the save, not on the read. The following fails.

import io
import os
from pathlib import Path

import pyogrio
import pyogrio.raw
from shapely import Polygon

import geopandas as gpd
from geopandas.testing import assert_geodataframe_equal

os.environ["PYOGRIO_USE_ARROW"] = "1"
gpd.options.io_engine = "pyogrio"
gpd.show_versions()

data = gpd.GeoDataFrame(
    [
        {"foo": 1, "bar": "a", "geometry": Polygon([(0, 0), (0, 1), (1, 1)])},
        {"foo": 2, "bar": "b", "geometry": Polygon([(0, 0), (0, 2), (2, 2)])},
        {"foo": 3, "bar": "c", "geometry": Polygon([(0, 0), (0, 3), (3, 3)])},
    ],
    geometry="geometry",
    crs="EPSG:4326",
)

outpath = Path("tmp.gpkg")
if outpath.exists():
    outpath.unlink()
data.to_file(outpath, layer="geometry", driver="GPKG")
assert outpath.exists()
bytestr_from_file = outpath.read_bytes()

with io.BytesIO() as stream:
    data.to_file(stream, layer="geometry", driver="GPKG")
    bytestr = stream.getvalue()
assert bytestr == bytestr_from_file, f"{len(bytestr)=} != {len(bytestr_from_file)=}"

AssertionError: len(bytestr)=0 != len(bytestr_from_file)=98304

from geopandas.

m-richards commented on June 2, 2024

Thanks @bretttully for the report. This is currently the case: pyogrio can't write to a BytesIO object, see geopandas/pyogrio#249 (and the discussion in #2875). We should note this as a difference between fiona and pyogrio that could break users' code in 1.0.

bretttully commented on June 2, 2024

Oh, thanks @m-richards -- that would be a fairly large regression for us... We could work around it by writing to a temp file and then reading the bytes back in, but that wouldn't be great.
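
For readers who need the temp-file workaround mentioned here, a minimal sketch follows. It is an illustration only, not code from this thread: the helper name `write_to_bytes` and its `suffix` parameter are hypothetical, and the helper deliberately takes any callable that writes to a path so that it does not depend on a particular geopandas or pyogrio version.

```python
import tempfile
from pathlib import Path
from typing import Callable


def write_to_bytes(write: Callable[[Path], object], suffix: str = ".gpkg") -> bytes:
    """Run `write(path)` against a temporary file and return the file's bytes.

    `write_to_bytes` is a hypothetical helper, not a geopandas or pyogrio API.
    """
    # A temporary directory (rather than an open temporary file) avoids
    # handing the writer a path that is already open in this process.
    with tempfile.TemporaryDirectory() as tmpdir:
        path = Path(tmpdir) / f"out{suffix}"
        write(path)
        # Read the bytes before the directory is cleaned up on exit.
        return path.read_bytes()
```

With the GeoDataFrame from the report, this would presumably be called as `write_to_bytes(lambda p: data.to_file(p, layer="geometry", driver="GPKG"))`. The cost is a round trip through disk, which is the drawback noted in the comment.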

martinfleis commented on June 2, 2024

Thanks @bretttully, this is good feedback to have! I suppose you're not the only one using BytesIO objects as intermediate files.

@jorisvandenbossche @brendan-ward @theroggy what is the feasibility of getting this into pyogrio 0.8 before geopandas 1.0 lands?

brendan-ward commented on June 2, 2024

I've been looking into this based on how it is implemented in Fiona / rasterio and working toward a potential PR. Not sure about the timing because there are some complexities here to work out (GPKG append / add layers to memory stream). Will continue the discussion on the pyogrio side.

martinfleis commented on June 2, 2024

@bretttully can you post the output of geopandas.show_versions() from an environment where this actually works when using Fiona?

bretttully commented on June 2, 2024

SYSTEM INFO
-----------
python     : 3.11.9 | packaged by conda-forge | (main, Apr 19 2024, 18:36:13) [GCC 12.3.0]
executable : /opt/conda/bin/python
machine    : Linux-4.14.336-257.562.amzn2.x86_64-x86_64-with-glibc2.35

GEOS, GDAL, PROJ INFO
---------------------
GEOS       : 3.12.1
GEOS lib   : None
GDAL       : 3.8.4
GDAL data dir: /opt/conda/share/gdal
PROJ       : 9.3.1
PROJ data dir: /opt/conda/share/proj

PYTHON DEPENDENCIES
-------------------
geopandas  : 0.14.3
numpy      : 1.26.4
pandas     : 2.2.2
pyproj     : 3.6.1
shapely    : 2.0.4
fiona      : 1.9.6
geoalchemy2: None
geopy      : 2.4.1
matplotlib : 3.8.4
mapclassify: 2.6.1
pygeos     : None
pyogrio    : 0.7.2
psycopg2   : 2.9.9 (dt dec pq3 ext lo64)
pyarrow    : 13.0.0
rtree      : 1.2.0

Code:

import io
from pathlib import Path

import geopandas as gpd
from geopandas.testing import assert_geodataframe_equal
from shapely import Polygon

gpd.show_versions()

data = gpd.GeoDataFrame(
    [
        {"foo": 1, "bar": "a", "geometry": Polygon([(0, 0), (0, 1), (1, 1)])},
        {"foo": 2, "bar": "b", "geometry": Polygon([(0, 0), (0, 2), (2, 2)])},
        {"foo": 3, "bar": "c", "geometry": Polygon([(0, 0), (0, 3), (3, 3)])},
    ],
    geometry="geometry",
    crs="EPSG:4326",
)

outpath = Path("tmp.gpkg")
if outpath.exists():
    outpath.unlink()
data.to_file(outpath, layer="geometry", driver="GPKG")
assert outpath.exists()
bytestr_from_file = outpath.read_bytes()

with io.BytesIO() as stream:
    data.to_file(stream, layer="geometry", driver="GPKG")
    bytestr = stream.getvalue()
assert len(bytestr) == len(bytestr_from_file), f"{len(bytestr)=} != {len(bytestr_from_file)=}"


with io.BytesIO(bytestr) as stream:
    data2 = gpd.read_file(stream, driver="GPKG")
assert_geodataframe_equal(data, data2)

bretttully commented on June 2, 2024

Note the change of assert bytestr == bytestr_from_file to assert len(bytestr) == len(bytestr_from_file) -- I forgot sqlite puts the timestamp in the file.

jorisvandenbossche commented on June 2, 2024

Thanks @bretttully, I can indeed reproduce that: it works with fiona (both with released geopandas and with geopandas main), and as we know it does not yet work with pyogrio (geopandas/pyogrio#249).

From a quick test, current fiona does not allow appending (mode="a") when writing to a file-like object.

Fiona allows you to write for a multi-file driver like Shapefile, but then reading the resulting bytes doesn't work (at least not easily by just passing a stream):

In [13]: with io.BytesIO() as stream:
    ...:     data.to_file(stream, driver="ESRI Shapefile", engine="fiona")
    ...:     bytestr = stream.getvalue()
    ...: 

In [14]: with io.BytesIO(bytestr) as stream:
    ...:     data2 = gpd.read_file(stream, engine="fiona")
    ...: 
---------------------------------------------------------------------------
CPLE_OpenFailedError                      Traceback (most recent call last)
...

File fiona/ogrext.pyx:143, in fiona.ogrext.gdal_open_vector()

DriverError: '/vsimem/9d4fe4810f7c446898a9875a739fbebf' not recognized as a supported file format.

brendan-ward commented on June 2, 2024

This is now implemented in pyogrio 0.8.0; wheels are on PyPI / conda-forge.
(Note: appending to an existing in-memory GPKG and writing multiple layers are not yet supported.)
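
Code that must run against both older and newer pyogrio releases could gate the in-memory path on the installed version. The sketch below is only an illustration under that assumption: `supports_bytesio_write` and `MIN_PYOGRIO_FOR_BYTESIO` are hypothetical names, and the 0.8 threshold is taken from the comment above rather than from the pyogrio changelog.

```python
import re

# Threshold taken from the comment above (pyogrio 0.8.0); an assumption here.
MIN_PYOGRIO_FOR_BYTESIO = (0, 8)


def supports_bytesio_write(pyogrio_version: str) -> bool:
    """Return True if the given pyogrio version string is at least 0.8.

    Hypothetical helper: only the leading "major.minor" part is compared,
    so a pre-release such as "0.8.0.dev0" also counts as 0.8.
    """
    match = re.match(r"(\d+)\.(\d+)", pyogrio_version)
    if match is None:
        raise ValueError(f"unrecognised version string: {pyogrio_version!r}")
    major_minor = (int(match.group(1)), int(match.group(2)))
    return major_minor >= MIN_PYOGRIO_FOR_BYTESIO
```

Passing `pyogrio.__version__` to this function would let calling code choose between writing to a BytesIO object directly and falling back to a temporary file.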
