juliaearth / geotables.jl
Geospatial tables compatible with the GeoStats.jl framework
License: MIT License
We have an example.parquet file and need to test the GeoParquet.jl backend with it. I remember spotting some issues with the loaded geometries from this file.
I have a problem loading a GeoPackage dataset (https://drive.google.com/file/d/16--kcGrC56zayrK04-Brmfo3ImOwtQ5k/view?usp=sharing). The dataset can be loaded using ArchGDAL.
import ArchGDAL as AG
using Chain # provides the @chain macro
using DataFrames
stations = @chain "stations.gpkg" begin
AG.read()
AG.getlayer(0)
DataFrame()
end
However, I cannot load it using GeoTables.
using GeoTables
@time stations = GeoTables.load("stations.gpkg")
The call above never finishes executing. I am using the latest version of GeoTables.jl.
We need to add performance tests to make sure that we are not materializing geometries unintentionally.
Currently we use Shapefile.jl and GeoJSON.jl to write geotables to disk because they both implement the Tables.jl interface. Perhaps ArchGDAL.jl also allows writing Tables.jl tables to disk, in which case we should fall back to it.
This package will be more useful when GeoTable supports writing features.
Tables.jl provides an implementation guide: https://tables.juliadata.org/stable/#Tables.isrowtable
I remember seeing a GeoTables.gadm function, but that doesn't seem to be around anymore. Doing
using GADM
using GeoTables
bra = GADM.get("BRA"; depth=2)
GeoTable(bra) # or
GeoTable(bra.geom)
throws the following error:
ERROR: type NamedTuple has no field geometry
Stacktrace:
[1] getproperty
@ ./Base.jl:37 [inlined]
[2] getcolumn
@ ~/.julia/packages/Tables/NSGZI/src/Tables.jl:102 [inlined]
[3] GeoTable(table::@NamedTuple{…})
@ GeoTables ~/.julia/packages/GeoTables/XDtOm/src/abstractgeotable.jl:52
[4] top-level scope
@ REPL[49]:1
What would be the recommended way to get a GeoTable from GADM data?
We refactored the Chain interface and some of the tests are now breaking. We need to take a closer look and fix them.
This issue is used to trigger TagBot; feel free to unsubscribe.
If you haven't already, you should update your TagBot.yml to include issue comment triggers.
Please see this post on Discourse for instructions and more details.
If you'd like for me to do this for you, comment TagBot fix on this issue.
I'll open a PR within a few hours, please be patient!
We need to add examples with all file formats to precompile code during package installation.
Some other Tables.jl implementations allow for a view when working in VS Code, especially DataFrames.jl. If you use the Julia extension for VS Code and a DataFrame object is loaded into memory, it appears in the Julia REPL Workspace with a little eyeball icon next to it. Clicking the icon loads the table into a viewer in the VS Code editor, where it can be explored visually. This is super useful when just trying to look over the whole data set for sanity checks or discussions with colleagues. I am not sure whether this is implemented on the tables side or the VS Code side. My inclination is that this feature should work with all Tables.jl implementations.
The tests in the "save" test set are failing with some combinations of file formats. We need to fix them all.
I see you're using GeoInterface to convert the Shapefile geometries to Meshes. Note that that code should also work the same on other geometries, such as the ArchGDAL geometries used in https://github.com/evetion/GeoDataFrames.jl.
Since GDAL has much better IO coverage than the Shapefile format, which is the only one currently supported here, it would be nice to add a conversion function. No additional dependencies would be needed; users could just write
using GeoDataFrames, GeoTables
df = GeoDataFrames.read("rivers.gpkg")
table = GeoTable(df) # or another method
The build is failing due to an issue related to DataDeps.jl.
Sometimes the data generated by other GIS software are invalid with first(vertices) != last(vertices). We need to check if this condition is satisfied and repeat the first vertex manually when the backend package doesn't do that for us.
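The check described above can be sketched in plain Julia. This is only an illustration under stated assumptions: `closering` is a hypothetical helper, not a GeoTables.jl or Meshes.jl function, and the ring is assumed to be a vector of coordinate tuples rather than a backend geometry type.

```julia
# Hypothetical helper (not part of GeoTables.jl): make sure a ring given as a
# vector of coordinate tuples ends with the same vertex it starts with.
function closering(vertices::AbstractVector)
    # nothing to repeat in an empty ring
    isempty(vertices) && return vertices
    # already closed: return the input unchanged
    first(vertices) == last(vertices) && return vertices
    # otherwise repeat the first vertex at the end (allocates a new vector)
    vcat(vertices, [first(vertices)])
end

open_ring = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]
closed_ring = closering(open_ring)
```

Returning the input untouched when it is already closed keeps the helper safe to apply to every ring, whether or not the backend package has already repeated the first vertex.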
MWE:
julia> georef((; x=[missing]), [(0, 0)])
Error showing value of type GeoTable{Meshes.PointSet{2, Float64}, NamedTuple{(:x,), Tuple{Vector{Missing}}}}:
ERROR: MethodError: no method matching nameof(::Type{Union{}})
Hi @juliohm!
I was updating PrettyTables.jl to use Tables.subset when available, which should greatly improve the speed, and I think I found a problem in the implementation here.
Looking at the documentation, Tables.subset(table, 1) should return the first row of the table. Hence, the code:
julia> r = Tables.subset(table, 1)
julia> e = Tables.getcolumn(r, 1)
should return the element at position (1, 1).
Look at what happens with GeoTables:
julia> table = georef((a=[1 2 3; 4 5 6; 7 8 9],))
9×2 GeoTable over 3×3 CartesianGrid{2,Float64}
┌─────────────┬─────────────────────────────────────────┐
│      a      │                geometry                 │
│ Categorical │               Quadrangle                │
│  [NoUnits]  │                                         │
├─────────────┼─────────────────────────────────────────┤
│      1      │ Quadrangle((0.0, 0.0), ..., (0.0, 1.0)) │
│      4      │ Quadrangle((1.0, 0.0), ..., (1.0, 1.0)) │
│      7      │ Quadrangle((2.0, 0.0), ..., (2.0, 1.0)) │
│      2      │ Quadrangle((0.0, 1.0), ..., (0.0, 2.0)) │
│      5      │ Quadrangle((1.0, 1.0), ..., (1.0, 2.0)) │
│      8      │ Quadrangle((2.0, 1.0), ..., (2.0, 2.0)) │
│      3      │ Quadrangle((0.0, 2.0), ..., (0.0, 3.0)) │
│      6      │ Quadrangle((1.0, 2.0), ..., (1.0, 3.0)) │
│      9      │ Quadrangle((2.0, 2.0), ..., (2.0, 3.0)) │
└─────────────┴─────────────────────────────────────────┘
julia> r = Tables.subset(table, 1); Tables.getcolumn(r, 1)
9×2 GeoTable over 3×3 CartesianGrid{2,Float64}
┌─────────────┬─────────────────────────────────────────┐
│      a      │                geometry                 │
│ Categorical │               Quadrangle                │
│  [NoUnits]  │                                         │
├─────────────┼─────────────────────────────────────────┤
│      1      │ Quadrangle((0.0, 0.0), ..., (0.0, 1.0)) │
│      4      │ Quadrangle((1.0, 0.0), ..., (1.0, 1.0)) │
│      7      │ Quadrangle((2.0, 0.0), ..., (2.0, 1.0)) │
│      2      │ Quadrangle((0.0, 1.0), ..., (0.0, 2.0)) │
│      5      │ Quadrangle((1.0, 1.0), ..., (1.0, 2.0)) │
│      8      │ Quadrangle((2.0, 1.0), ..., (2.0, 2.0)) │
│      3      │ Quadrangle((0.0, 2.0), ..., (0.0, 3.0)) │
│      6      │ Quadrangle((1.0, 2.0), ..., (1.0, 3.0)) │
│      9      │ Quadrangle((2.0, 2.0), ..., (2.0, 3.0)) │
└─────────────┴─────────────────────────────────────────┘
The answer here should be the (1, 1) element in the table instead of the entire table.
If I try to print r, I get the error:
julia> r
Error showing value of type GeoTables.SubGeoTable{GeoTable{Meshes.CartesianGrid{2, Float64}, NamedTuple{(:a,), Tuple{Vector{Int64}}}}, Int64}:
ERROR: ArgumentError: 'NamedTuple{(:a,), Tuple{Int64}}' iterates 'Int64' values, which doesn't satisfy the Tables.jl `AbstractRow` interface
Notice that everything works as expected for DataFrames:
julia> using DataFrames
julia> table = DataFrame([1 2 3; 4 5 6; 7 8 9],:auto)
3×3 DataFrame
 Row │ x1     x2     x3
     │ Int64  Int64  Int64
─────┼─────────────────────
   1 │     1      2      3
   2 │     4      5      6
   3 │     7      8      9
julia> r = Tables.subset(table, 1); Tables.getcolumn(r, 1)
1
julia> r = Tables.subset(table, 1); Tables.getcolumn(r, 2)
2
julia> r = Tables.subset(table, 1); Tables.getcolumn(r, 3)
3
Thus, without fixing this problem, GeoTables will print wrong results with the next PrettyTables.jl version.
The question is: is this a bug in GeoTables.jl or a wrong interpretation of Tables.jl by me?
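For reference, the same check can be sketched on a plain column table, with no GeoTables.jl or DataFrames.jl involved. This assumes a Tables.jl version that provides Tables.subset and relies on its generic fallback for column-access tables; the names `cols` and `firstrow` are only illustrative.

```julia
using Tables

# A NamedTuple of vectors is a valid Tables.jl column table.
cols = (x1 = [1, 4, 7], x2 = [2, 5, 8], x3 = [3, 6, 9])

# With a single integer index, Tables.subset is documented to return a row.
r = Tables.subset(cols, 1)

# Column access on that row yields scalars, i.e. the (1, j) elements.
firstrow = [Tables.getcolumn(r, j) for j in 1:3]
```

If the interpretation above is right, a table type that defines its own Tables.subset method would be expected to behave the same way for an integer index.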
As a side note, look how much faster we are now when printing a huge DataFrame with middle cropping:
# Before
julia> df = DataFrame(rand(10000,10000),:auto)
julia> @time display(df)
4.802733 seconds (190.15 M allocations: 8.217 GiB, 9.32% gc time)
# After
julia> @time display(df)
0.010285 seconds (124.94 k allocations: 5.454 MiB)
It should
The gadm() function has been removed, so there should be an explanation of how to achieve the same functionality in the current version. In particular, the following does not work:
using GADM
using GeoTables
table = GADM.get(country; depth=depth)
gtable = GeoTable(table)