dlme-metadata's Introduction

DLME Metadata

This repo contains records for the DLME, organized (loosely) by major content type. Each content type has its own directory, which also holds the scripts for processing metadata from the various hosts.

Maps

Map data for the DLME project, including metadata records from Harvard, Princeton, MIT, and Stanford.

Coins

Data from the American Numismatic Society's coin collection, downloaded in CSV format from:

http://numismatics.org/search/query.csv?q=department_facet:%22Islamic%22

The data is available under the Open Database License (ODbL): https://opendatacommons.org/licenses/odbl/

dlme-metadata's People

Contributors

waynegraham, cmharlow

dlme-metadata's Issues

Aga Khan Visual Archive

The Aga Khan Visual Archive is a rich collection to target, but it does not provide an easily accessible interface for grabbing the records. We need to contact folks at MIT to see if we can get a dump of these records. Otherwise, develop a crawler that starts at

https://dome.mit.edu/handle/1721.3/45936/browse?type=dateissued

and retrieves records like this:

https://dome.mit.edu/handle/1721.3/37656?show=full

These are Dublin Core (DC) metadata records, with thumbnails at

https://dome.mit.edu/bitstream/handle/1721.3/37656/126635_cp.jpg?sequence=1

Dome is built on DSpace, so hopefully there is at least an OAI-PMH endpoint.
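
If an OAI-PMH endpoint does turn out to be available, harvesting could look roughly like the Python sketch below. It only builds ListRecords request URLs and reads record identifiers and the resumption token out of a response. The base URL and set name are not known from this issue and would have to be supplied; the function names list_records_url and parse_page are made up for illustration.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Namespace defined by the OAI-PMH 2.0 protocol.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def list_records_url(base_url, set_spec=None, token=None):
    """Build a ListRecords URL asking for Dublin Core (oai_dc) records.

    base_url and set_spec are assumptions: the real endpoint and set
    name for the collection must be confirmed with the repository.
    """
    if token:
        # A resumptionToken is an exclusive argument in OAI-PMH, so it
        # is sent without metadataPrefix or set.
        params = {"verb": "ListRecords", "resumptionToken": token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": "oai_dc"}
        if set_spec:
            params["set"] = set_spec
    return base_url + "?" + urlencode(params)


def parse_page(xml_text):
    """Return (record identifiers, next resumption token or None)."""
    root = ET.fromstring(xml_text)
    ids = [el.text for el in root.findall(".//oai:header/oai:identifier", OAI_NS)]
    token_el = root.find(".//oai:resumptionToken", OAI_NS)
    token = token_el.text if token_el is not None and token_el.text else None
    return ids, token
```

A harvester would fetch list_records_url(...), pass the body to parse_page, and repeat with the returned token until it comes back as None.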

Princeton Movie Posters

Get the entire collection of movie posters from Princeton, not just the sample MODS records already in the repository.

Penn Museum Egyptian CSV Error

Processing the expanded Penn Museum Egyptian CSV for DLME raises an error:

CSV::MalformedCSVError: Unquoted fields do not allow \r or \n (line 2).
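
One common cause of this kind of error is a file with mixed or bare carriage-return line endings, although the actual cause in this file has not been confirmed. Under that assumption, a pre-processing step could normalize the line endings before parsing. The sketch below is in Python rather than Ruby, assumes the file is UTF-8, and uses made-up function names.

```python
import csv
import io


def normalize_newlines(raw: bytes) -> str:
    """Decode the file and turn \r\n and bare \r into \n.

    Assumes UTF-8 input. Line breaks inside quoted fields are also
    rewritten to \n.
    """
    text = raw.decode("utf-8")
    return text.replace("\r\n", "\n").replace("\r", "\n")


def read_rows(raw: bytes):
    """Parse the normalized text into a list of rows."""
    buffer = io.StringIO(normalize_newlines(raw), newline="")
    return list(csv.reader(buffer))
```

The normalized text could be written back out and handed to the existing processing scripts unchanged.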

Weed Spatial Data

The spatial data came from a bounding-box query of the MENA region and includes some items that are not relevant to this project. Go through the results and weed out the vector datasets.
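
If the records are GeoBlacklight JSON documents, the weeding could be partly automated by filtering on geometry type. The Python sketch below assumes each record carries a layer_geom_type_s field with values such as "Point", "Line", or "Polygon" for vector layers; that field name and those values are assumptions about the metadata schema and should be checked against the actual records.

```python
# Geometry types treated as vector data (an assumption about the schema).
VECTOR_TYPES = {"Point", "Line", "Polygon"}


def drop_vector_layers(records, field="layer_geom_type_s"):
    """Keep only records whose geometry type is not a vector type.

    Records without the field are kept so they can be reviewed by hand.
    """
    return [r for r in records if r.get(field) not in VECTOR_TYPES]
```

Anything that survives the filter would still deserve a manual pass, since geometry type alone does not say whether an item is interesting.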

American Numismatic Society Metadata Issue

The Department column in the downloaded file is set to "Islamic". It needs to be changed to "American Numismatic Society" as part of the download/massage process.
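
A minimal sketch of that massage step, written in Python and operating on the CSV text after download. The function name set_department is made up for illustration; the column name and replacement value come from the description above.

```python
import csv
import io


def set_department(csv_text, column="Department",
                   value="American Numismatic Society"):
    """Return the CSV text with every row's `column` set to `value`."""
    reader = csv.DictReader(io.StringIO(csv_text, newline=""))
    if column not in (reader.fieldnames or []):
        raise KeyError(f"column {column!r} not found in CSV header")
    out = io.StringIO(newline="")
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames,
                            lineterminator="\n")
    writer.writeheader()
    for row in reader:
        row[column] = value  # overwrite "Islamic" (or anything else)
        writer.writerow(row)
    return out.getvalue()
```

Raising on a missing column makes a change in the upstream export visible instead of silently writing an unmodified file.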

Map IDs

Right now the map functionality comes from a spatial query against Stanford's GeoBlacklight instance (https://earthworks.stanford.edu/?bbox=28.959961+20.715015+47.592773+38.891033&per_page=100), manually paging through the maps within a bounding box (28.959961 20.715015, 47.592773 38.891033). Specific maps are then "found" with mechanize and added to an array of ids to look up in the OpenGeoData layers information.

This is an unscalable system that will break.

We need a system that will do the following:

  • Work with a group to identify an appropriate bounding box for maps
    • Include time-boxed expansions (e.g. Italy, Spain, Japan, etc.)
  • Create a more efficient method of retrieving pointers to resources than looking up the JSON identifiers
  • Provide a method for synchronizing the metadata records between institutions
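
As a small step away from manual paging, the bounding-box query described above could be generated programmatically. The Python sketch below rebuilds the Earthworks search URL from a bounding box and a page number. The page parameter is an assumption about how the search interface paginates, and search_url is a made-up name.

```python
from urllib.parse import urlencode

EARTHWORKS = "https://earthworks.stanford.edu/"

# Bounding box from the current query: west, south, east, north.
MENA_BBOX = (28.959961, 20.715015, 47.592773, 38.891033)


def search_url(bbox, page=1, per_page=100):
    """Build a bounding-box search URL for one page of results."""
    params = {
        "bbox": " ".join(str(coord) for coord in bbox),
        "per_page": per_page,
    }
    if page > 1:
        # Assumed pagination parameter; confirm against the live site.
        params["page"] = page
    return EARTHWORKS + "?" + urlencode(params)
```

Keeping the bounding box as data also makes the time-boxed expansions easier to express: each one becomes another tuple passed to the same function.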
