rero-mef's Issues

gnd_pid / pid

The JSON instance of a MEF GND person has two properties for the same identifier:

  • gnd_pid
  • pid

gnd_pid is not in the JSON schema.

The same problem applies to BNF, RERO and VIAF persons.

The model should be simplified so that a single pid property is used for persons from every source.
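
As a rough illustration of the proposed simplification, the sketch below folds source-specific keys into a single `pid`. The helper name `normalize_pid` is hypothetical and not part of rero-mef, and every key name other than `gnd_pid` is an assumption, since the issue names only that one.

```python
# Assumed source-specific key names; only gnd_pid is named in the issue.
SOURCE_PID_KEYS = ("gnd_pid", "bnf_pid", "rero_pid", "viaf_pid")


def normalize_pid(record):
    """Return a copy of record that carries a single 'pid' property."""
    result = dict(record)
    for key in SOURCE_PID_KEYS:
        if key in result:
            value = result.pop(key)
            # Keep an existing pid, but refuse to hide conflicting values.
            if result.setdefault("pid", value) != value:
                raise ValueError(
                    f"{key}={value!r} conflicts with pid={result['pid']!r}"
                )
    return result


# Made-up identifier, for illustration only.
print(normalize_pid({"gnd_pid": "12345", "pid": "12345"}))
```

Raising on a conflict rather than picking one value is a design choice for the sketch; a migration might prefer to log and continue.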

Variant names for persons are processed incorrectly

Compare the record of Reginon de Prüm, 0840?-0915:

Under "Andere Namen" (other names), the source record lists the following:

Regino, Abbas Prumiensis
Regino
Prüm, Regino von
Réginon, d'Altripp
Réginon, de Spire
Réginon, de Prüm
Regino, Abbas
Reginone, da Prüm
Rhegino, Prumensis
Rhegino, Prumiensis
Regino, Prumensis
Regino, Monachus Prumiensis
Regino, of Prüm
Regino, von Prüm

But in MEF, we end up with:

            "variant_name_for_person": [
              "Regino",
              "Regino",
              "Prüm, Regino von",
              "Réginon",
              "Réginon",
              "Réginon",
              "Regino",
              "Reginone",
              "Rhegino",
              "Rhegino",
              "Regino",
              "Regino",
              "Regino",
              "Regino"
            ]
          }

This results in something like the following in RERO ILS:

[Screenshot: capture-19-07-09-14-04-55]
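
The lost text is always the part after the comma, which suggests that a qualifier is dropped during conversion. Here is a minimal sketch of the expected behaviour, assuming the variants come from MARC 21 authority 400 fields where `$a` holds the name and `$c` the qualifier (the issue does not say which subfields are involved). The function name `variant_names` and the input shape, a list of subfield dicts, are hypothetical.

```python
def variant_names(fields):
    """Build variant names from subfield dicts, keeping qualifiers."""
    names = []
    for subfields in fields:
        # Join the name ($a) with its qualifier ($c) when one is present.
        parts = [subfields[code] for code in ("a", "c") if subfields.get(code)]
        name = ", ".join(parts)
        # Skip empty values and exact duplicates, preserving order.
        if name and name not in names:
            names.append(name)
    return names


fields = [
    {"a": "Regino", "c": "Abbas Prumiensis"},
    {"a": "Regino"},
    {"a": "Prüm, Regino von"},
    {"a": "Réginon", "c": "de Prüm"},
    {"a": "Regino"},  # hypothetical duplicate, to show de-duplication
]
print(variant_names(fields))
```

With the qualifier kept, "Regino, Abbas Prumiensis" and "Regino" stay distinct instead of collapsing into repeated "Regino" entries.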

JSON reference does not have the right hostname

The $ref values in the MEF records do not point to the right host.

For example, here the actual host is mefdev.test.rero.ch:

{
  "created": "2019-11-21T10:12:41.374924+00:00",
  "id": 1562319,
  "links": {
    "self": "https://mefdev.test.rero.ch/api/mef/1562319"
  },
  "metadata": {
    "$schema": "http://mef.test.rero.ch/schemas/authorities/mef-person-v0.0.1.json",
    "bnf": {
      "$ref": "http://mef.test.rero.ch/api/bnf/11889763"
    },
    "pid": "1562319",
    "rero": {
      "$ref": "http://mef.test.rero.ch/api/rero/A002955934"
    },
    "viaf_pid": "2465924"
  },
  "updated": "2019-11-21T10:12:41.374924+00:00"
}
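
To make the expected result concrete, the sketch below rewrites the scheme and host of every `$ref` in a record while leaving the path untouched. The helper `rewrite_refs` is hypothetical, and whether the real fix belongs in record generation or in configuration is not stated in the issue; this only illustrates a post-hoc rewrite.

```python
from urllib.parse import urlsplit, urlunsplit


def rewrite_refs(data, scheme, host):
    """Return a copy of data with every '$ref' URL pointed at scheme://host."""
    if isinstance(data, dict):
        out = {}
        for key, value in data.items():
            if key == "$ref" and isinstance(value, str):
                parts = urlsplit(value)
                # Swap scheme and host, keep path, query and fragment.
                out[key] = urlunsplit(
                    (scheme, host, parts.path, parts.query, parts.fragment)
                )
            else:
                out[key] = rewrite_refs(value, scheme, host)
        return out
    if isinstance(data, list):
        return [rewrite_refs(item, scheme, host) for item in data]
    return data


metadata = {
    "bnf": {"$ref": "http://mef.test.rero.ch/api/bnf/11889763"},
    "rero": {"$ref": "http://mef.test.rero.ch/api/rero/A002955934"},
    "pid": "1562319",
}
print(rewrite_refs(metadata, "https", "mefdev.test.rero.ch"))
```

The function deliberately leaves `$schema` alone, because the issue only mentions `$ref`.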

Initial Update

The bot created this issue to inform you that pyup.io has been set up on this repo.
Once you have closed it, the bot will open pull requests for updates as soon as they are available.

The last generated Mef record is listed twice in the csv files

When the CLI is run with "bulk_load" to create the "mef" CSV files, the last generated MEF record is listed twice in those files.

pipenv run invenio records bulk_load ${AGENCY} ${INPUT_DIR} ${OUTPUT_DIR} --json_file viaf.txt --csv_pidstore_file ${AGENCY}_pidstore.csv --csv_metadata_file ${AGENCY}_metadata.csv --rero_pids rero.pids
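
A quick way to confirm the symptom on a generated file is to compare its last two non-empty lines. The sketch below is only a check and does not diagnose the cause; the function name `repeated_last_line` is hypothetical and the sample rows are made up.

```python
import io


def repeated_last_line(lines):
    """Return True when the last two non-empty lines are identical."""
    rows = [line.rstrip("\n") for line in lines if line.strip()]
    return len(rows) >= 2 and rows[-1] == rows[-2]


# In-memory stand-in for a generated CSV file (made-up rows).
sample = io.StringIO("1\t{}\n2\t{}\n2\t{}\n")
print(repeated_last_line(sample))
```

For a real file, pass an open file object, for example `repeated_last_line(open(path, encoding="utf-8"))`.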

Review marctojson code

Review the code for the marctojson module.

  • use of click
  • base class
  • integration into dojson
  • ...

csv creation

If you use a regular expression repeatedly, it is better to compile it once. But why not simply use split('\t') (line 368) and split('|') (before line 380) in utils.py?

Line 305 in utils.py is not necessary if you use split.
I do not understand the purpose of line 389 in utils.py.
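
To illustrate the review comment, the sketch below splits a line both ways. The sample line and its column layout are made up and are not taken from utils.py.

```python
import re

# Made-up example line: tab-separated columns, with '|' inside one column.
line = "123\tA|B|C\t2019-11-21\n"

# Regex variant: compile once if the pattern is reused for many lines.
TAB = re.compile(r"\t")
columns_re = TAB.split(line.rstrip("\n"))

# Plain string methods give the same result for fixed separators.
columns = line.rstrip("\n").split("\t")
values = columns[1].split("|")

assert columns == columns_re
print(columns, values)
```

For fixed single-character separators, `str.split` is simpler to read and needs no pattern object.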
