e-ark-aip's People

Contributors

carlwilson, jmaferreira, karinbredenberg, kuldaraas, shsdev

e-ark-aip's Issues

Pointless recommendation or wrong example

We recommended to use "as a prefix an internationally recognized standard identifier for the institution from which the SIP originates. This may lead to problems with smaller institutions, which do not have any such internationally recognized standard identifier. We propose in that case, to start the prefix with the internationally recognized standard identifier of the institution, where the AIP is created, augmented by an identifier for the institution from which the SIP originates."

This sentence doesn't make sense, because the example given right after it ignores the entire recommendation:

/mets/@OBJID="urn:uuid:123e4567-e89b-12d3-a456-426655440000"

I would remove this entire paragraph. The recommendation should be to use an international standard scheme for identifiers (not institution IDs).
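To illustrate the identifier form used in the example above, here is a minimal sketch using Python's standard `uuid` module. The helper names `new_objid` and `is_urn_uuid` are hypothetical and not part of the specification; the check only confirms that a value looks like `urn:uuid:<UUID>`.

```python
import uuid


def new_objid() -> str:
    """Return a fresh identifier in urn:uuid form (hypothetical helper)."""
    return uuid.uuid4().urn


def is_urn_uuid(value: str) -> bool:
    """Return True if value has the form 'urn:uuid:<UUID>'."""
    prefix = "urn:uuid:"
    if not value.startswith(prefix):
        return False
    try:
        # uuid.UUID raises ValueError for strings it cannot parse.
        uuid.UUID(value[len(prefix):])
    except ValueError:
        return False
    return True
```

An identifier built this way needs no institutional prefix, which is the point made in this issue.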

Does updating original submission lead to forks?

Review comment (Ch. 5.2.3): Does updating the original submission actually mean that we can eventually have different forks of the metadata? E.g. when first creating a new representation of the package or its metadata, and then updating the original submission?

AIP export functionality does not reduce costs

Review comment (Chapter 3): We believe that AIP export functionality does not really reduce costs, since the destination repository needs to handle the exported AIP in the same way as a new SIP to update its internal data structures. Further, the destination repository may need two ingest workflows, one for SIPs and one for AIPs, which most likely increases development costs.

Incorrect file name used in figure 11

Review comment: Figure 11 (rep-001): In METS.xml: "/submission/representations/file.odt" should be "/submission/representations/rep-001/file.odt". Same with "rep-002".

All extra elements for SIP, AIP and DIP should be included in CSIP

In Sweden we have not created separate FGSs for SIP, AIP and DIP. We recommend that all extra elements for SIP, AIP and DIP should be included in CSIP, but as optional information. Then, documents are needed that describe how to create a SIP, AIP, DIP and how to use the elements at EU level and that it is possible to adapt for local use in different countries.

Remove references to E-ARK or add a section explaining what E-ARK was

Section 2 starts with the following sentence "This AIP format specification is based on E-ARK deliverable, D4.4 “Final version of SIP-AIP conversion component”. It relates to part A of this deliverable which is the AIP format specification. "

Mentioning this without a chapter describing what E-ARK was is, in my view, pointless. This is a spec, not a project deliverable. Context must be given.

METS Profile based requirements definition

Requirements related to the AIP format are not derived from a METS profile. Requirements need to be defined in a METS profile, and the profile must be used for generating the requirements tables to be included in the AIP specification.

There might be a wrong enforcement in the Requirement 14 (page 16)

Requirement 14. The root directory of the package MUST contain a “submission” directory which is a container for the original submission and might eventually contain SIP updates which are submitted after the AIP was created.

I might be wrong here... but shouldn’t this be a COULD? I read before that the submission folder is optional.
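The difference between the two readings can be sketched as a small check. This is only an illustration under stated assumptions: the function name `check_submission_dir` and the `required` flag are hypothetical, and the only thing taken from the requirement is the directory name "submission".

```python
from pathlib import Path
from typing import List


def check_submission_dir(package_root: Path, required: bool = True) -> List[str]:
    """Return findings about the 'submission' directory of a package.

    required=True models the MUST reading of Requirement 14;
    required=False models the optional reading raised in this issue.
    """
    if (package_root / "submission").is_dir():
        return []
    level = "ERROR" if required else "INFO"
    return [f"{level}: no 'submission' directory in {package_root.name}"]
```

Under the MUST reading a package without the directory fails validation; under the optional reading the same package only produces an informational note.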

Need for support of versions and variants of AIP

The specification states that an AIP is continuously supplied with preservation metadata and that this is what distinguishes the package type from SIP / DIP. In practical archival care, we currently manage generations of AIPs, where preservation actions, for example conversion to a preservation format, lead to the creation of a new IP whose relation to the original IP needs to be preserved. Generations are preserved when necessary and, where appropriate, discarded. There is also a need for variants of AIP (AIC / AIU). We have not studied the specifications in detail but would like to emphasize the need for support for different versions and variants of AIP.

Explain why we need a Package manifest

See section 5.4.1 on page 43.

Provide information on why exactly we should care to implement this! Just add a paragraph that explains the motivation to do it.

Enhance the introduction with an explanation of the purpose of the document and the AIP

In this document we fail to explain the main purpose of having an AIP format, which is to mitigate a potential preservation risk of institutional or repository meltdown and implement in a simple way a repository succession strategy.

Repository systems are not expected to implement this AIP format; however, they are expected to be able to generate it in a simple way for a set of AIPs.

Proposal of adding Retention/Disposal elements: Proposed element Preservation date

Proposed element Preservation date. Date after which disposal of the AIP shall occur if PreservationStatus="DI". That is, preservation shall only occur up to and including this date, i.e. the package shall be retained until this date. E.g. "2020-01-01" means that the AIP shall be destroyed (directly) after this date (or retained until this date, depending on one's viewpoint). Element type: YYYY-MM-DD. Shall be given if PreservationStatus="DI". Otherwise it is optional.
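The proposed rule can be sketched as follows. This is an illustration only: the function name `disposal_due` and its parameters are hypothetical, and the sketch assumes that "after this date" means strictly later than the given date.

```python
from datetime import date
from typing import Optional


def disposal_due(preservation_status: str,
                 preservation_date: Optional[date],
                 today: date) -> bool:
    """Return True if the package is due for disposal on 'today'."""
    if preservation_status != "DI":
        # The date is optional and has no disposal effect for other statuses.
        return False
    if preservation_date is None:
        raise ValueError("Preservation date is required when PreservationStatus is 'DI'")
    # Retained up to and including the date, disposed of after it.
    return today > preservation_date
```

A value in the proposed YYYY-MM-DD form can be parsed with `date.fromisoformat("2020-01-01")` before being passed in.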

Directory structure is too complex

Review comment: the directory structure is too complex. For example, representations in /AIP/submission/representations are at a different level than those in /AIP/representations.

Reference to the wrong section

In the sentence "We will explain in section 3.3.1 in more detail how the referencing of METS.xml files must be implemented if this alternative is chosen." (page 13), I don’t think this section number is correct. I’m not sure which section is the correct one.

Consistent use of symbols in identifiers

Review comment (Page 37): Using a plus sign (+) instead of a colon (:) might not be a good idea, since in URL query strings and form-encoded data sent over HTTP the space character is usually encoded as a plus sign. REST interfaces typically used in machine-to-machine communication use the HTTP protocol.
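The concern can be reproduced with Python's standard `urllib.parse`. The identifier below is a made-up value for illustration, not one taken from the specification.

```python
from urllib.parse import parse_qs, quote, unquote_plus

# Hypothetical identifier that uses '+' where a URN would use ':'.
identifier = "urn+uuid+123e4567-e89b-12d3-a456-426655440000"

# Form-style decoding turns every '+' into a space.
decoded = unquote_plus(identifier)

# The same happens when the raw value is placed in a query string.
query = parse_qs("id=" + identifier)

# Percent-encoding the value before sending it avoids the ambiguity.
escaped = quote(identifier, safe="")
round_trip = parse_qs("id=" + escaped)
```

Here `decoded` and `query["id"][0]` both contain spaces instead of plus signs, while `round_trip["id"][0]` equals the original identifier. Clients that forget to percent-encode the value would therefore silently change it.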

Missing scope of AIP specification

The main purpose of having an AIP format is not explained sufficiently (in "Scope of this document").

  • Mitigate a potential preservation risk of institutional or repository meltdown and implement in a simple way a repository succession strategy.

  • Repository systems are not expected to implement this AIP format; however, they are expected to be able to generate it in a simple way for a set of AIPs.

Originally raised by @jmaferreira

Other characterisation output possible?

Review comment (Page 31): AIP-PREMIS-CHARACTERIZATION: It is stated that JHOVE output could be embedded, but could similar output from other software optionally be embedded too?

Sentence is difficult to understand

When an AIP is created during ingest, it receives an unalterable identifier, which defines the AIP as one consistent logical entity. This identifier is also used to derive the name of the physical storage container.

This is so short that it hardly deserves its own section. Nonetheless, what exactly does this sentence mean? Does it mean that we should name the folder with the same name as the identifier? If so, just give that example.

Uniform use of folder and directory

The words "folder" and "directory" should be used uniformly. Technically, "folder" is different from "directory" (a folder may be a physical or a virtual folder, whereas a directory is always a physical directory in a file system).

Proposal of adding Information security classification: Proposed element Classification

Proposed element Classification. Security classification of information in package. P=Publik (eng. public or declassified), BS=Begränsat Skyddsvärde (eng. restricted), HS=Högt Skyddsvärde (eng. confidential), EJ KLASSAT=Ej Klassat, okänd klassning (eng. unclassified, unknown). Element type: String. Allowed values: P, BS, HS, EJ KLASSAT. "EJ KLASSAT" is default. Values could be extended to support different organisations' requirements.
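The proposed vocabulary and its default can be sketched as an enumeration. The class and function names are hypothetical; only the four values and the default come from the proposal above.

```python
from enum import Enum
from typing import Optional


class Classification(Enum):
    """Allowed values of the proposed Classification element."""
    PUBLIC = "P"
    RESTRICTED = "BS"
    CONFIDENTIAL = "HS"
    UNCLASSIFIED = "EJ KLASSAT"


def parse_classification(value: Optional[str] = None) -> Classification:
    """Map an element value to a Classification, applying the default."""
    if value is None or value == "":
        # "EJ KLASSAT" is the proposed default.
        return Classification.UNCLASSIFIED
    # Raises ValueError for values outside the allowed list.
    return Classification(value)
```

An organisation extending the vocabulary would add members to the enumeration rather than accept arbitrary strings.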

PREMIS profile requirements

Requirements related to preservation metadata are not formally defined. A PREMIS profile needs to be created which allows validators to include the verification of these requirements.

Documents for the different IPs should be standardized

The documents for the different IPs should be standardized so that they are structured in a similar way and the references should keep the same ID for the same element (CSIP7 should always be the same element in all documents).

Better examples are needed on the file IDs

Example on page 27 for a file ID

ID77146c6c-c8c3-4406-80b5-b3b41901f9d0

This example does not follow the recommendation provided earlier. It should be
urn:uuid:77146c6c-c8c3-4406-80b5-b3b41901f9d0

IPs or SIPs?

Requirement 16. If the “submission” folder contains one or several sub-folders, the sub-folders MUST contain IPs.

Should it be IPs or SIPs?

In a physical AIP, were representations ever considered as being the root?

Hi,

I think the E-ARK specs are really wonderful. One thing I'm trying to get my head around is that the root of the E-ARK AIP is the Intellectual Entity. I see why this makes sense from a logical sense, and even from a physical sense. However, I think that there is a potential valid reason for having a representation as the root.
For example:

  • An archive receives a DCP of a film work. This is representation 1. A year later, they receive another representation in the form of a DCDM. This would require a new version of the AIP to accommodate both representations, and the updating of metadata for the package as a whole.
  • If the representations could be stored separately and be their own root, contain their own descriptive metadata, they could exist independently of each other and there would be less intervention required to add new representations as there would be less metadata to update.
  • The cons here would be a duplication of metadata if descriptive metadata is stored in each representation, and if descriptive metadata is updated upstream, it would then have to be updated in two representation packages. Also, the first representation would still probably need to be updated to reflect that a sibling representation exists.

So I suppose my question is, was it ever considered that a representation could be a root, and were there reasons other than the ones I've listed that might have accounted for the physical AIP always having an IE at its root? The more I think about it, having the physical AIP reflect the logical AIP makes a lot of sense, and it maps to the PREMIS object types quite well. My main concern is that some archives may rely on basic tape storage, which doesn't allow for ease of updating packages, so E-ARK might be out of reach as a result.

Best,

Kieran O'Leary
National Library of Ireland

Updating AIP if overwriting is practically impossible

Review comment: Updating AIP conforming e-ark specification: How about a case where AIPs exist in a tape library, but overwriting is practically impossible, since that action reduces the lifecycle of the tapes? Usually changes are saved as physically separate incrementals in this situation.

Support for AIP updates when ingesting SIPs that refer to existing AIPs

The use case is as follows:

Sometimes it is useful to resubmit a particular SIP that will replace the information placed before in a given AIP or update the representations of that AIP with additional files. For that, a repository should receive a SIP with the status UPDATE that identifies the AIP or the ID of the previous SIP submitted. The repository should be able to understand that it is an AIP UPDATE request by looking at the SIP status and identify the AIP to which the update should be performed.

In order to support this feature, the AIP spec should be updated to describe what should happen at the AIP level if an update is received via the ingest process.

@luis100 may be consulted to provide more details on current implementations.

This issue is related to DILCISBoard/E-ARK-SIP#6
