Comments (6)

kylebarron commented on June 10, 2024

The GeoJsonReader accepts any input that implements Read. So you can pass in a File or, preferably, a BufReader<File>.

from geozero.

nk9 commented on June 10, 2024

Thanks for the quick reply. I found this issue and adapted his code to do what I needed, along with this bit of their docs. I'm thinking that the answer to my question is "geojson is a higher-level library, and is probably what you want to use for simply iterating through features in a geojson file." Is that safe to say?

Update: Thanks for the note about BufReader. That is an order of magnitude faster than just passing a File directly!
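A rough way to see why the BufReader helps, assuming the parser pulls input in very small pieces (here, one byte at a time): count how many reads reach the underlying source with and without buffering. CountingReader, read_bytewise and count_underlying_reads are hypothetical names made up for this sketch, and an in-memory byte slice stands in for the File.

```rust
use std::io::{self, BufReader, Read};

/// Hypothetical wrapper that counts how often `read` reaches the inner source.
/// With a real `File`, each of these calls would be a system call.
struct CountingReader<R> {
    inner: R,
    calls: usize,
}

impl<R: Read> Read for CountingReader<R> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        self.calls += 1;
        self.inner.read(buf)
    }
}

/// Consume a reader one byte at a time and return the number of bytes seen.
fn read_bytewise<R: Read>(mut reader: R) -> io::Result<usize> {
    let mut byte = [0u8; 1];
    let mut total = 0;
    while reader.read(&mut byte)? == 1 {
        total += 1;
    }
    Ok(total)
}

/// Returns (reads hitting the source directly, reads hitting it through BufReader).
fn count_underlying_reads(data: &[u8]) -> io::Result<(usize, usize)> {
    let mut direct = CountingReader { inner: data, calls: 0 };
    read_bytewise(&mut direct)?;

    let mut buffered = CountingReader { inner: data, calls: 0 };
    // BufReader sits between the byte-wise consumer and the counted source.
    read_bytewise(BufReader::new(&mut buffered))?;

    Ok((direct.calls, buffered.calls))
}
```

Calling count_underlying_reads on a few kilobytes of data should show one underlying read per byte in the direct case and only a handful in the buffered case, since BufReader refills its internal buffer in large chunks.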

Update 2: I've achieved another order of magnitude speed-up by using rayon and placing the entire load_geojson function inside par_iter(). Down to 2.2 seconds to read 280 files containing 175k features.

For posterity, here's what I ended up with:

use geo::{MultiLineString, MultiPolygon};
use geojson::{Feature, GeoJson, Value};
use std::fs::File;
use std::io::BufReader;
use std::path::PathBuf;

fn load_geojson(path: &PathBuf) -> Result<(), Box<dyn std::error::Error>> {
    let file = File::open(path)?;
    let reader = BufReader::new(file);
    let geojson = GeoJson::from_reader(reader)?;

    match geojson {
        GeoJson::FeatureCollection(collection) => {
            for feature in collection.features {
                let _ = process_feature(&feature)?;
            }
        }
        _ => println!("Unsupported GeoJSON type"),
    }

    Ok(())
}

fn process_feature(feature: &Feature) -> Result<(), Box<dyn std::error::Error>> {
    // String value of the "name" property, or an empty string
    let name = feature
        .property("name")
        .and_then(|v| v.as_str())
        .map_or(String::from(""), |s| s.to_string());

    // Return an error instead of panicking when a feature has no geometry
    let geom = feature.geometry.as_ref().ok_or("feature has no geometry")?;

    match &geom.value {
        Value::MultiPolygon(_) => {
            let p: MultiPolygon<f64> = geom.value.clone().try_into().unwrap();
            println!("{p:?}");
        }
        Value::MultiLineString(_) => {
            let p: MultiLineString<f64> = geom.value.clone().try_into().unwrap();
            println!("{p:?}");
        }
        _ => panic!("not a recognized feature type"),
    };

    Ok(())
}
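The rayon step from Update 2 isn't shown above. As a stand-in that needs no extra crate, here is a minimal sketch of the same idea using std::thread::scope rather than rayon's par_iter(): split the list of paths into chunks and load each chunk on its own thread. load_all and feature_count_stub are hypothetical names; the stub only pretends to load a file so the sketch runs without any data on disk.

```rust
use std::path::{Path, PathBuf};
use std::thread;

/// Hypothetical stand-in for a real loader such as `load_geojson`.
/// It just reports the length of the file name instead of reading anything.
fn feature_count_stub(path: &Path) -> Result<usize, String> {
    path.file_name()
        .map(|name| name.len())
        .ok_or_else(|| format!("{} has no file name", path.display()))
}

/// Run `load` on every path, spreading the paths over up to `workers` threads.
/// Results are returned in the same order as `paths`.
fn load_all(
    paths: &[PathBuf],
    workers: usize,
    load: fn(&Path) -> Result<usize, String>,
) -> Vec<Result<usize, String>> {
    if paths.is_empty() {
        return Vec::new();
    }
    // Ceiling division so every path lands in exactly one chunk.
    let workers = workers.max(1);
    let chunk_len = (paths.len() + workers - 1) / workers;

    thread::scope(|scope| {
        // Spawn one scoped thread per chunk; scoped threads may borrow `paths`.
        let handles: Vec<_> = paths
            .chunks(chunk_len)
            .map(|chunk| {
                scope.spawn(move || chunk.iter().map(|p| load(p)).collect::<Vec<_>>())
            })
            .collect();

        // Joining in spawn order keeps the output aligned with the input.
        handles
            .into_iter()
            .flat_map(|handle| handle.join().expect("worker thread panicked"))
            .collect()
    })
}
```

rayon adds work stealing and a shared thread pool on top of this, so it is likely the better choice when files vary a lot in size; the sketch only shows the shape of the per-file parallelism.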

kylebarron commented on June 10, 2024

I'm thinking that the answer to my question is "geojson is a higher-level library, and is probably what you want to use for simply iterating through features in a geojson file." Is that safe to say?

I'd say the opposite. geojson is lower-level in the sense that you have to handle specifics about GeoJSON to handle input data. geozero is higher-level in the sense that GeoJSON input is just one type of input, but can export to any consumer. For example in geoarrow-rs GeoJsonReader::process just works even though geozero has no knowledge of the GeoArrow output format.
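To make the "one input, any consumer" idea concrete, here is a toy sketch of an event-style interface. This is not geozero's actual API: FeatureSink, process_csv_points, WktSink and NameSink are all hypothetical names, and a "name,x,y" text format stands in for GeoJSON. The point is only that the reading side pushes events and never needs to know what the consumer builds from them.

```rust
/// Hypothetical event interface: a source pushes each feature into a sink.
trait FeatureSink {
    fn point(&mut self, name: &str, x: f64, y: f64);
}

/// One possible input format: "name,x,y" lines.
/// It knows nothing about what the sink produces.
fn process_csv_points(input: &str, sink: &mut dyn FeatureSink) -> Result<(), String> {
    for (index, line) in input.lines().enumerate() {
        let line_no = index + 1;
        let fields: Vec<&str> = line.split(',').map(str::trim).collect();
        match fields.as_slice() {
            [name, x, y] => {
                let x: f64 = x.parse().map_err(|e| format!("line {line_no}: {e}"))?;
                let y: f64 = y.parse().map_err(|e| format!("line {line_no}: {e}"))?;
                sink.point(name, x, y);
            }
            _ => return Err(format!("line {line_no}: expected 3 fields")),
        }
    }
    Ok(())
}

/// Consumer 1: keeps geometries as WKT strings and drops the attributes.
struct WktSink {
    lines: Vec<String>,
}

impl FeatureSink for WktSink {
    fn point(&mut self, _name: &str, x: f64, y: f64) {
        self.lines.push(format!("POINT({x} {y})"));
    }
}

/// Consumer 2: keeps only the name attribute and drops the geometry.
struct NameSink {
    names: Vec<String>,
}

impl FeatureSink for NameSink {
    fn point(&mut self, name: &str, _x: f64, _y: f64) {
        self.names.push(name.to_string());
    }
}
```

Adding a third output format under this design means writing another sink; process_csv_points stays untouched, which is roughly the property described above for GeoJsonReader::process and GeoArrow.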

I'd say the real issue is that there's no "default" library in georust for handling geometries with attributes. You can parse to geo structs but then you lose the associated attributes. This is a main feature of geoarrow though; being really optimized about both the geometries and their attributes.

nk9 commented on June 10, 2024

OK, that's interesting to know. If this is the higher-level library, I think it's even more important to have some sample code of loading a GeoJSON file (ideally from disk, but at least from a string) and iterating its features. I never got that to work with geozero.

As for geometries and attributes, the geojson::Feature struct seems to handle that pretty well with feat.property("prop_key") and feat.geometry upon pulling them out of the file. But you're right, I had to create my own struct to store the converted geometry along with the properties I wanted. Providing a ready-made struct for that purpose would be a nice QoL improvement.

kylebarron commented on June 10, 2024

I never got that to work with geozero.

You'd need to impl your own GeozeroDatasource. It's a bit of work, which is why it's not done often; people usually convert to an existing representation rather than creating their own.

As for geometries and attributes, the geojson::Feature struct seems to handle that pretty well with feat.property("prop_key") and feat.geometry upon pulling them out of the file

Sure, but that's storing attributes in the GeoJSON model, which is quite restrictive. For example, you can't store a date time in GeoJSON; you can only store a string.
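As a small illustration of that restriction: a date arrives from GeoJSON as a plain string, so any typing has to be recovered by the consumer. The sketch below is hypothetical throughout (Date, Attr, parse_iso_date and type_property are made-up names) and assumes dates are written as YYYY-MM-DD with nothing else attached.

```rust
/// Hypothetical calendar date recovered from a string property.
#[derive(Debug, PartialEq)]
struct Date {
    year: i32,
    month: u32,
    day: u32,
}

/// Hypothetical typed attribute: richer than what the GeoJSON model can express.
#[derive(Debug, PartialEq)]
enum Attr {
    Text(String),
    Date(Date),
}

/// Parse "YYYY-MM-DD". Only a loose range check is done here;
/// it does not validate the day against the month or handle time zones.
fn parse_iso_date(s: &str) -> Option<Date> {
    let mut parts = s.splitn(3, '-');
    let year: i32 = parts.next()?.parse().ok()?;
    let month: u32 = parts.next()?.parse().ok()?;
    let day: u32 = parts.next()?.parse().ok()?;
    if (1..=12).contains(&month) && (1..=31).contains(&day) {
        Some(Date { year, month, day })
    } else {
        None
    }
}

/// Upgrade a raw string property to a typed attribute where possible,
/// falling back to plain text when it does not look like a date.
fn type_property(raw: &str) -> Attr {
    match parse_iso_date(raw) {
        Some(date) => Attr::Date(date),
        None => Attr::Text(raw.to_string()),
    }
}
```

A typed representation can carry the Date variant directly, whereas round-tripping through GeoJSON would flatten it back into text.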

Providing a ready-made struct for that purpose would be a nice QoL improvement.

It's not geozero's concern to provide those structs. Geozero focuses only on conversions between representations. I'm building a representation around Arrow, which enables storing properties quite efficiently, but does incur some large dependencies.

nk9 commented on June 10, 2024

OK, well I filed this bug to document that I wanted to do something I thought was very common, and yet could find no sample code on how to do it. I've solved my problem at this point. If you want people to actually use this library for parsing features and properties out of geojson files, especially given that it's nonobvious how to do that, then having some sample code would really help and I'd suggest this bug should stay open.

But if people looking for a simple way to iterate through features/properties in a GeoJSON file should just use geojson instead, which already has sample code for this, then that's fine too. I'd still suggest that it would be friendly to new devs to put a pointer about this in the docs, but that's up to you.

Thanks for engaging with me on this, just trying to make things a little easier for the next guy or gal. :-)

