Comments (4)

karthik commented on June 13, 2024

We could use this issue to highlight other rOpenSci packages. Download data that cover these different types of coverage and document those. Also what other major issues need to be worked on before this milestone?

from eml.

emhart commented on June 13, 2024

I think this is a great idea and something to incorporate into spocc (née occdat). Right now it's written with S4 classes, but we can use reml to write the dataset as EML that automatically incorporates those elements. I'll open an issue on the other package.

cboettig commented on June 13, 2024

Writing coverage nodes

A coverage node for the dataset can be generated either with the constructor function eml_coverage, for commonly specified formats, or with the richer but lower-level constructors, new("coverage", ...), new("geographicCoverage", ...), etc. eml_write technically takes the S4 object as an argument, though users may find it easier to:

  1. call the constructor inline in the function call:
eml_write(dat, metadat,
          coverage =
            eml_coverage("Sarracenia purpurea",
                         dates = c("2012-06-01", "2012-06-01"),
                         geographic_description = "Harvard Forest Greenhouse, Tom Swamp Tract (Harvard Forest)",
                         NSEWbox = c(north = 42.55, south = 42.42, east = -72.10, west = -72.29,
                                     min_alt = 160, max_alt = 330)))
  2. use the manual constructors, which accomplish the same thing and also allow the more complex definitions that EML permits. For instance, the temporalCoverage node above can be generated by calling the constructor and then writing into the desired slots:
temporal = new("temporalCoverage")
temporal@rangeOfDates@beginDate@calendarDate = "2012-06-01"
temporal@rangeOfDates@endDate@calendarDate = "2012-06-01"

We can then use slotNames to see what other options are available. For instance, slotNames(temporal@rangeOfDates@beginDate) shows us that we can have a "time" as well as a "calendarDate". This is more flexible, but requires a bit more exploring or familiarity with S4 structures than the simple eml_coverage function shown above (along with its associated documentation). Not sure if it is worth extending our eml_coverage function to let a user construct arbitrary coverage nodes without needing @ references.

  3. extract a coverage node from an existing file, as shown below, and simply pass that to eml_write to reuse it in a new file.
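
The slot-by-slot approach from the second option can be tried without the package installed. The following is only a minimal sketch: the classes dateNode, rangeOfDates and temporalCoverage defined here are hypothetical stand-ins that mimic the nesting described above, not the package's actual class definitions.

```r
library(methods)

# Hypothetical stand-in classes (assumption: the real ones have more slots).
setClass("dateNode",
         representation(calendarDate = "character", time = "character"))
setClass("rangeOfDates",
         representation(beginDate = "dateNode", endDate = "dateNode"))
setClass("temporalCoverage",
         representation(rangeOfDates = "rangeOfDates"))

# new() fills nested S4 slots with empty prototypes, so we can
# assign directly into the leaves with chained @ references.
temporal <- new("temporalCoverage")
temporal@rangeOfDates@beginDate@calendarDate <- "2012-06-01"
temporal@rangeOfDates@endDate@calendarDate <- "2012-06-01"

# Explore which fields a node offers.
slotNames(temporal@rangeOfDates@beginDate)
```

Calling slotNames at each level is the "exploring" step mentioned above; the chained @ assignments are what a helper like eml_coverage hides from the user.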

Extract coverage nodes

Not sure what the preferred format for extracting coverage metadata from an EML file is. Currently, we have an accessor method to extract the coverage node:

eml <- eml_read("my_eml.xml") 
coverage <- coverage(eml)

This is equivalent to eml@dataset@coverage, so it is really just a convenience function. We do define a pretty-print method based on yaml, showing only non-empty fields:

> coverage
coverage(eml)
geographicCoverage:
  geographicDescription: Harvard Forest Greenhouse, Tom Swamp Tract (Harvard Forest)
  boundingCoordinates:
    westBoundingCoordinate: '-72.29'
    eastBoundingCoordinate: '-72.1'
    northBoundingCoordinate: '42.55'
    southBoundingCoordinate: '42.42'
    boundingAltitudes:
      altitudeMinimum: '160'
      altitudeMaximum: '330'
      altitudeUnits: meter
temporalCoverage:
  rangeOfDates:
    beginDate:
      calendarDate: '2012-06-01'
    endDate:
      calendarDate: '2013-12-31'
taxonomicCoverage:
  taxonomicClassification:
    taxonRankName: genus
    taxonRankValue: Sarracenia
    taxonomicClassification:
      taxonRankName: species
      taxonRankValue: purpurea 

Because it may also be useful to have the list format used by eml_config, this coverage element can currently be coerced into that list format:

> as(coverage, "list")
$scientific_names
[1] "Sarracenia purpurea"

$dates
[1] "2012-06-01" "2012-06-01"

$geographic_description
[1] "Harvard Forest Greenhouse, Tom Swamp Tract (Harvard Forest)"

$NSEWbox
  north   south    east    west min_alt max_alt 
  42.55   42.42  -72.10  -72.29  160.00  330.00 

Note that this is not a generic list conversion, but rather always into these four summary elements for convenience. Users wanting the full structure should probably subset from the S4 class directly. These are the very same short-hand arguments used by the constructor function eml_coverage, e.g. one can do:

cov_list <- as(coverage, "list")
S4 <- do.call(eml_coverage, cov_list)

which returns the original S4 version of the coverage element.
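
The round trip can be illustrated with a self-contained sketch. Everything here is an assumption made for illustration: toyCoverage and toy_coverage are hypothetical stand-ins for the real coverage class and eml_coverage, flattened to the four summary elements so that the example runs with base R alone.

```r
library(methods)

# Hypothetical flat stand-in for the coverage class.
setClass("toyCoverage",
         representation(scientific_names = "character",
                        dates = "character",
                        geographic_description = "character",
                        NSEWbox = "numeric"))

# Hypothetical constructor playing the role of eml_coverage.
toy_coverage <- function(scientific_names, dates,
                         geographic_description, NSEWbox) {
  new("toyCoverage",
      scientific_names = scientific_names,
      dates = dates,
      geographic_description = geographic_description,
      NSEWbox = NSEWbox)
}

# Not a generic conversion: always the same four named elements,
# chosen to match the constructor's argument names.
setAs("toyCoverage", "list", function(from) {
  list(scientific_names = from@scientific_names,
       dates = from@dates,
       geographic_description = from@geographic_description,
       NSEWbox = from@NSEWbox)
})

original <- toy_coverage("Sarracenia purpurea",
                         dates = c("2012-06-01", "2012-06-01"),
                         geographic_description = "Harvard Forest Greenhouse",
                         NSEWbox = c(north = 42.55, south = 42.42,
                                     east = -72.10, west = -72.29))

cov_list <- as(original, "list")
rebuilt <- do.call(toy_coverage, cov_list)
```

Because the list names match the constructor's arguments exactly, do.call needs no extra mapping step; that matching is the design choice that makes the coercion reversible.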

Open for feedback on interface choices here. I'm always divided on how to write these helper functions so that they are intuitive but still flexible, instead of making the user manually navigate the data structure with all the @s.

cboettig commented on June 13, 2024

Coverage nodes are implemented. Basic writing of a coverage node is shown in the "Advanced writing EML" vignette.
