Comments (10)
It actually looks like the error might be on the save, not on the read. The following fails.
import io
import os
from pathlib import Path
import pyogrio
import pyogrio.raw
from shapely import Polygon
import geopandas as gpd
from geopandas.testing import assert_geodataframe_equal
os.environ["PYOGRIO_USE_ARROW"] = "1"
gpd.options.io_engine = "pyogrio"
gpd.show_versions()
data = gpd.GeoDataFrame(
    [
        {"foo": 1, "bar": "a", "geometry": Polygon([(0, 0), (0, 1), (1, 1)])},
        {"foo": 2, "bar": "b", "geometry": Polygon([(0, 0), (0, 2), (2, 2)])},
        {"foo": 3, "bar": "c", "geometry": Polygon([(0, 0), (0, 3), (3, 3)])},
    ],
    geometry="geometry",
    crs="EPSG:4326",
)
outpath = Path("tmp.gpkg")
if outpath.exists():
    outpath.unlink()
data.to_file(outpath, layer="geometry", driver="GPKG")
assert outpath.exists()
bytestr_from_file = outpath.read_bytes()
with io.BytesIO() as stream:
    data.to_file(stream, layer="geometry", driver="GPKG")
    bytestr = stream.getvalue()
assert bytestr == bytestr_from_file, f"{len(bytestr)=} != {len(bytestr_from_file)=}"
AssertionError: len(bytestr)=0 != len(bytestr_from_file)=98304
from geopandas.
Thanks @bretttully for the report, this is currently the case: pyogrio can't write to a BytesIO object, see geopandas/pyogrio#249 (and the discussion in #2875). We should note this as a difference between fiona and pyogrio that could break people's code in 1.0.
Oh, thanks @m-richards -- that would be a fairly large regression for us... We could work around it by writing to a temp file and then reading the bytes back in, but that wouldn't be great.
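For reference, the temp-file workaround mentioned above could look roughly like the sketch below. The helper name write_via_tempfile and its parameters are made up for illustration and are not part of geopandas or pyogrio; the only assumption is that the writer callable accepts a filesystem path, as data.to_file does in the script at the top of the thread.

```python
import tempfile
from pathlib import Path
from typing import Callable


def write_via_tempfile(write: Callable[[Path], object], suffix: str = ".gpkg") -> bytes:
    """Call `write(path)` on a temporary path and return the bytes it produced.

    Hypothetical helper, not part of geopandas or pyogrio.
    """
    with tempfile.TemporaryDirectory() as tmpdir:
        path = Path(tmpdir) / f"out{suffix}"
        # The callable does the actual writing, for example:
        #   lambda p: data.to_file(p, layer="geometry", driver="GPKG")
        write(path)
        # Read the file back before the temporary directory is removed.
        return path.read_bytes()
```

With the GeoDataFrame from the report, the call would be write_via_tempfile(lambda p: data.to_file(p, layer="geometry", driver="GPKG")). The cost is an extra round trip through the disk, which is why it is a fallback rather than a replacement for writing to a stream.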
Thanks @bretttully, this is good feedback to have! I suppose you're not the only one using BytesIO as intermediate files.
@jorisvandenbossche @brendan-ward @theroggy what is the feasibility of getting this into pyogrio 0.8 before geopandas 1.0 lands?
I've been looking into this based on how it is implemented in Fiona / rasterio and working toward a potential PR. Not sure about the timing because there are some complexities here to work out (GPKG append / add layers to memory stream). Will continue the discussion on the pyogrio side.
@bretttully can you post the output of geopandas.show_versions() of an environment where this actually works, when using Fiona?
SYSTEM INFO
-----------
python : 3.11.9 | packaged by conda-forge | (main, Apr 19 2024, 18:36:13) [GCC 12.3.0]
executable : /opt/conda/bin/python
machine : Linux-4.14.336-257.562.amzn2.x86_64-x86_64-with-glibc2.35
GEOS, GDAL, PROJ INFO
---------------------
GEOS : 3.12.1
GEOS lib : None
GDAL : 3.8.4
GDAL data dir: /opt/conda/share/gdal
PROJ : 9.3.1
PROJ data dir: /opt/conda/share/proj
PYTHON DEPENDENCIES
-------------------
geopandas : 0.14.3
numpy : 1.26.4
pandas : 2.2.2
pyproj : 3.6.1
shapely : 2.0.4
fiona : 1.9.6
geoalchemy2: None
geopy : 2.4.1
matplotlib : 3.8.4
mapclassify: 2.6.1
pygeos : None
pyogrio : 0.7.2
psycopg2 : 2.9.9 (dt dec pq3 ext lo64)
pyarrow : 13.0.0
rtree : 1.2.0
Code:
import io
from pathlib import Path
import geopandas as gpd
from geopandas.testing import assert_geodataframe_equal
from shapely import Polygon
gpd.show_versions()
data = gpd.GeoDataFrame(
    [
        {"foo": 1, "bar": "a", "geometry": Polygon([(0, 0), (0, 1), (1, 1)])},
        {"foo": 2, "bar": "b", "geometry": Polygon([(0, 0), (0, 2), (2, 2)])},
        {"foo": 3, "bar": "c", "geometry": Polygon([(0, 0), (0, 3), (3, 3)])},
    ],
    geometry="geometry",
    crs="EPSG:4326",
)
outpath = Path("tmp.gpkg")
if outpath.exists():
    outpath.unlink()
data.to_file(outpath, layer="geometry", driver="GPKG")
assert outpath.exists()
bytestr_from_file = outpath.read_bytes()
with io.BytesIO() as stream:
    data.to_file(stream, layer="geometry", driver="GPKG")
    bytestr = stream.getvalue()
assert len(bytestr) == len(bytestr_from_file), f"{len(bytestr)=} != {len(bytestr_from_file)=}"
with io.BytesIO(bytestr) as stream:
    data2 = gpd.read_file(stream, driver="GPKG")
assert_geodataframe_equal(data, data2)
Note the change of assert bytestr == bytestr_from_file to assert len(bytestr) == len(bytestr_from_file) -- I forgot sqlite puts the timestamp in the file.
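To illustrate why the byte-for-byte comparison had to be relaxed, here is a small stand-alone sketch using only the standard library. It builds two SQLite files that differ only in a stored timestamp. The table name contents and the column last_change are stand-ins chosen for this example; they are an assumption about the kind of metadata a GPKG writer records, not a reproduction of what GDAL actually writes.

```python
import sqlite3
import tempfile
from contextlib import closing
from pathlib import Path


def make_db(path: Path, timestamp: str) -> bytes:
    """Create a one-row SQLite file holding `timestamp` and return its raw bytes."""
    with closing(sqlite3.connect(str(path))) as conn:
        # Hypothetical table; it only mimics a file that records its write time.
        conn.execute("CREATE TABLE contents (name TEXT, last_change TEXT)")
        conn.execute("INSERT INTO contents VALUES (?, ?)", ("geometry", timestamp))
        conn.commit()
    return path.read_bytes()


with tempfile.TemporaryDirectory() as tmpdir:
    # Same schema and row layout, timestamps one second apart.
    first = make_db(Path(tmpdir) / "a.db", "2024-05-01T10:00:00Z")
    second = make_db(Path(tmpdir) / "b.db", "2024-05-01T10:00:01Z")

same_length = len(first) == len(second)
same_bytes = first == second
```

The two files should have equal length but unequal content, which is the situation the relaxed assertion accounts for. Comparing the data after a round trip, as the script above does with assert_geodataframe_equal, is the more meaningful check.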
Thanks @bretttully, I can indeed reproduce that: it works with fiona (both with released geopandas and with geopandas main), and as we know it does not yet work with pyogrio (geopandas/pyogrio#249).
From a quick test, current fiona does not allow appending (mode="a") when writing to a file-like object.
Fiona allows you to write for a multi-file driver like Shapefile, but then reading the resulting bytes doesn't work (at least not easily by just passing a stream):
In [13]: with io.BytesIO() as stream:
    ...:     data.to_file(stream, driver="ESRI Shapefile", engine="fiona")
    ...:     bytestr = stream.getvalue()
    ...:

In [14]: with io.BytesIO(bytestr) as stream:
    ...:     data2 = gpd.read_file(stream, engine="fiona")
    ...:
---------------------------------------------------------------------------
CPLE_OpenFailedError Traceback (most recent call last)
...
File fiona/ogrext.pyx:143, in fiona.ogrext.gdal_open_vector()
DriverError: '/vsimem/9d4fe4810f7c446898a9875a739fbebf' not recognized as a supported file format.
This is now implemented in pyogrio 0.8.0; wheels are on PyPI / conda-forge.
(Note: appending to an existing GPKG in memory and writing multiple layers are not yet supported.)
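For code that has to run against both older and newer pyogrio releases, a version gate is one way to choose between writing to a stream directly and falling back to a temporary file. The sketch below is only an illustration: the function names are hypothetical, and the (0, 8) threshold is taken from the version mentioned in this thread.

```python
import re


def parse_release(version: str) -> tuple[int, ...]:
    """Return the leading numeric components of a version string as a tuple."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first non-numeric component, e.g. "dev0"
        parts.append(int(match.group()))
    return tuple(parts)


def supports_in_memory_write(pyogrio_version: str) -> bool:
    """Hypothetical check: True for pyogrio 0.8 or later, per this thread.

    Pre-releases such as "0.8.0rc1" are treated like the final release.
    """
    return parse_release(pyogrio_version)[:2] >= (0, 8)
```

In practice the argument would be pyogrio.__version__, so the 0.7.2 environment reported above would take the temp-file path and a 0.8.0 environment would write to the BytesIO object directly.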