Comments (7)

djohnson729 commented on August 30, 2024

At a high level, this request involves a couple of major enhancements to MrGeo:

  • Support for image querying
  • More robust map algebra syntax that allows for arrays, looping and conditional statements

Ideally, the core code for storing and querying of image metadata would be handled outside of MrGeo so that the MrGeo code base can remain focused on image processing. However, the ability to query for images and perform map algebra based analytics on the results should be part of the core MrGeo. One approach to this is to create a query abstraction layer that allows for querying external databases to get a list of images matching the query criteria. It is also possible that the abstraction layer could include a method for informing the external database(s) when images are created (ingested) or deleted so that the database can be updated accordingly. It might make sense to include that interface in the data provider architecture, but that needs to be investigated. MrGeo should discover implementors of this interface through the Java Service Provider Interface (SPI) so that plugins can be provided dynamically in the user environment.

We would need to create a new map algebra function for querying images which would use the image query interface to perform the actual query (thus initiating the query on the external database). We would like to make the query generic in the sense that the caller could specify not only a query expression, but also a set of attributes to be returned for each match.
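To make the idea concrete, here is a minimal sketch of what such a query interface might look like. Every name in it (`ImageQueryProvider`, `queryAll`, `imageCreated`, `imageDeleted`, the query expression, the attribute names) is hypothetical and chosen for illustration only; none of it is existing MrGeo API. The only real library call is `java.util.ServiceLoader`, the standard Java SPI discovery mechanism.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.ServiceLoader;

// Hypothetical sketch: none of these names exist in MrGeo.
class ImageQuerySketch {

    /** Hypothetical SPI that a plugin for an external metadata store would implement. */
    interface ImageQueryProvider {
        // Returns one attribute map per matching image, limited to the requested attributes.
        List<Map<String, Object>> query(String expression, List<String> attributes);

        // Optional hooks so the external store can be told about ingested or deleted images.
        default void imageCreated(String imageName) { }
        default void imageDeleted(String imageName) { }
    }

    /** Runs the same query against every provider it is given and concatenates the results. */
    static List<Map<String, Object>> queryAll(Iterable<ImageQueryProvider> providers,
                                              String expression,
                                              List<String> attributes) {
        List<Map<String, Object>> results = new ArrayList<>();
        for (ImageQueryProvider provider : providers) {
            results.addAll(provider.query(expression, attributes));
        }
        return results;
    }

    public static void main(String[] args) {
        // ServiceLoader discovers implementations registered on the classpath
        // (via META-INF/services); with no plugin installed it yields nothing.
        ServiceLoader<ImageQueryProvider> discovered =
            ServiceLoader.load(ImageQueryProvider.class);
        List<Map<String, Object>> matches =
            queryAll(discovered, "cloudCover < 10", List.of("name", "bounds"));
        System.out.println(matches.size() + " matching image(s)");
    }
}
```

Returning a list of attribute maps is one possible shape for the "maps of attributes" that map algebra would then need to iterate over.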

Map algebra would require significant modification to support this use case. The concept of arrays and something like "objects" or maybe "maps of attributes" would have to be added since that would be the resulting data from the query. Also required is more robust syntax that allows for looping and conditional statements. This would enable writing map algebra to properly process the results from the query.

As a side note, the results of the query operation seem very similar in nature to vector data, in that each result can have a spatial attribute and a set of other numeric or string attributes. So maybe this could be viewed as a vector query operation that returns a list of matching features, with map algebra modified to be able to access the feature attributes.

from mrgeo.

techchrj commented on August 30, 2024

I also believe that this request will require the following things to be able to handle the metadata query abstraction:

  1. Shouldn’t MrGeo handle the storing of metadata? It already handles the tiling of the data and storage into Accumulo or HDFS. I would think the storing of the image metadata (for querying later) would just be another option, like the ability to skip the building of the image pyramid.
  2. A web based interface, something like REST, is needed to be able to query MrGeo for data and return the results. This may or may not be handled outside of MrGeo, but I thought I would mention it here for completeness.
  3. Need the ability to support images stored in HDFS or a NoSQL database (like Accumulo or HBase). MrGeo already supports these as data providers, so I would think this would be trivial. Does map algebra already handle being able to pull the image data from either one of these?
  4. Since it will be web based, speed is of the essence. Need to look into utilizing Apache Spark in the design/implementation of data retrieval.
  5. Doesn’t MrGeo already handle image querying (to some degree), via the bounding box option?
  6. Based on your write up above, it sounds like the map algebra functionality additions will take some time.

djohnson729 commented on August 30, 2024

Answers to previous comment:

  1. We believe it is important to leave the MrGeo code base focused on its core mission. Image querying is an important feature, but it is not necessary to that core mission; it is rather a very useful add-on. There is a lot of logic involved in populating a metadata data store, given the fact that there aren't any standards for storing metadata in imagery (with the possible exception of NITF). So we would like to keep that metadata logic outside of "MrGeo proper" and instead define a service provider interface through which MrGeo can query those outside data sources. This keeps a clean delineation between MrGeo and metadata storage/querying, and it provides a lot of flexibility to customers to store their metadata how they want.
  2. Agreed. For metadata, this would be provided by the metadata storage solution. And if such a REST API existed, the MrGeo service provider plugin for that data source could make use of that REST API for querying. MrGeo has a REST interface for a number of things, which is now documented here.
  3. By "images", I assume you are referring to ingested images, aka MrsPyramids, rather than source files (such as tiffs)? Both HDFS and Accumulo data providers support storing MrsPyramid images.
  4. We agree. We have begun using Apache Spark for new functionality within MrGeo. In fact, we are making some changes to the image ingest processing right now and changing it to use Spark instead of map/reduce as part of those changes. We will continue migrating to Spark moving forward. We have a prototype of the CostDistance capability implemented in Spark as well, but that is not yet ready to be rolled out.
  5. I'm not sure which bounding box option you are referring to? In map algebra, there is no query capability. This new feature will provide that capability so that analytics (which are done via map algebra scripting) can be performed against images that match a query.
  6. Yes agreed, and we are looking at leveraging Scala for parsing and executing map algebra. We began working on that some time ago, but that work took a back seat to other priorities since that time.

techchrj commented on August 30, 2024

What is the timeline for rolling out the Spark processing?

By "images", I was referring to ingested images, so your answer clears it up.

When I referenced the bounding box option, I was referring to the option that is available from the command line when using the export functionality. Doesn't that bounding box option imply some kind of query capability? If so, can it be extended to work with what we are discussing in this topic?

Understand your rationale as to why you want to keep MrGeo code focused on its core mission and add query as a feature add-on. I forgot about all of the non-standards when it comes to imagery and metadata.

djohnson729 commented on August 30, 2024

We are working on Spark processing right now, and we're hitting some roadblocks, so it's hard to say how much longer it will take. But it is actively being worked on.

The bounding box argument in the export is only used to limit the region of a MrsPyramid that is exported to tiff (or jpeg, etc...). It doesn't perform a query to see which images intersect that bounding box. You have to tell export which image you want to export from.
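To illustrate the difference, a spatial query would test each image's bounds against the requested box and return the names of those that overlap, rather than clipping a single named image. The sketch below assumes a hypothetical in-memory catalog and a hypothetical `Bounds` type (west, south, east, north in degrees); neither is MrGeo's own class, and the image names are made up. It ignores boxes that cross the antimeridian.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of a bounding box query over an image catalog.
class BoundsQuerySketch {

    /** Hypothetical bounds type; not MrGeo's own class. */
    static final class Bounds {
        final double w, s, e, n;

        Bounds(double w, double s, double e, double n) {
            this.w = w;
            this.s = s;
            this.e = e;
            this.n = n;
        }

        // Two boxes overlap when their extents overlap on both axes.
        boolean intersects(Bounds o) {
            return w <= o.e && o.w <= e && s <= o.n && o.s <= n;
        }
    }

    /** Returns the names of all catalog images whose bounds overlap the query box. */
    static List<String> imagesIntersecting(Map<String, Bounds> catalog, Bounds query) {
        List<String> names = new ArrayList<>();
        for (Map.Entry<String, Bounds> entry : catalog.entrySet()) {
            if (entry.getValue().intersects(query)) {
                names.add(entry.getKey());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        Map<String, Bounds> catalog = new LinkedHashMap<>();
        catalog.put("elevation", new Bounds(-10, -10, 10, 10));
        catalog.put("landcover", new Bounds(20, 20, 30, 30));
        // Prints [elevation]: only that image overlaps the query box.
        System.out.println(imagesIntersecting(catalog, new Bounds(5, 5, 25, 8)));
    }
}
```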

techchrj commented on August 30, 2024

Where do we go with this from here?

djohnson729 commented on August 30, 2024

@ttislerdg wrote a GeoServer plugin for MrGeo that resolves this. It is accessible at https://github.com/ngageoint/mrgeo-geoserver-plugin
