
ocfl-py's People

Contributors

awoods, bcail, zimeon

ocfl-py's Issues

Storage Root `extensions/` directory flagged as validation error

When using a custom storage layout extension, the configuration of the extension is placed in the storage root as:
[storage-root]/extensions/NNNN-flat-omit-prefix-storage-layout/config.json

The ocfl-validate utility flags this as an error:

[E072] OCFL storage root hierarchy include directory /extensions/NNNN-flat-omit-prefix-storage-layout with at least one file but no object declaration. Such additional files are not allowed (see https://ocfl.io/1.0/spec/#E072)

Per the spec, I believe this should be valid.
See: https://ocfl.io/1.0/spec/#storage-root-extensions
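
A minimal sketch of the behaviour requested here, assuming a plain `os.walk` over the storage root rather than ocfl-py's actual traversal code. The function name `find_stray_dirs` is hypothetical. It skips the top-level `extensions` directory and only reports other directories that hold files without an object declaration.

```python
import os


def find_stray_dirs(root):
    """Return directories under root that hold files but no object declaration.

    Hypothetical helper: the top-level `extensions` directory is skipped,
    since storage root extensions are expected to live there.
    """
    stray = []
    for dirpath, dirnames, filenames in os.walk(root):
        if dirpath == root:
            # Do not descend into the storage root's own extensions directory.
            if "extensions" in dirnames:
                dirnames.remove("extensions")
            continue
        if any(name.startswith("0=ocfl_object_") for name in filenames):
            # This is an object root; do not look inside the object itself.
            dirnames[:] = []
            continue
        if filenames:
            stray.append(os.path.relpath(dirpath, root))
    return stray
```

With this approach a `config.json` under `extensions/` never reaches the E072 check, while files in any other non-object directory are still reported.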

Fix handling of spec version in standalone inventory check

To replicate example:

(py38) simeon@RottenApple ocfl-py> ./ocfl-validate.py extra_fixtures/1.1/good-objects/empty_fixity
INFO:ocfl.object:OCFL object at extra_fixtures/1.1/good-objects/empty_fixity is VALID
(py38) simeon@RottenApple ocfl-py> ./ocfl-validate.py extra_fixtures/1.1/good-objects/empty_fixity/inventory.json
[E038] OCFL Object standalone inventory `type` attribute has wrong value (see https://ocfl.io/1.0/spec/#E038)
INFO:ocfl.object:Standalone OCFL inventory at extra_fixtures/1.1/good-objects/empty_fixity/inventory.json is INVALID

The inventory is good; it is just v1.1, so the standalone check should not assume v1.0.
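
One way to avoid assuming a spec version is to read it from the inventory's `type` attribute. This is a sketch only; the function names and the `known_versions` parameter are hypothetical and not part of ocfl-py.

```python
import re


def spec_version_from_type(type_uri):
    """Extract the spec version (e.g. '1.1') from an inventory `type` URI, or None."""
    if not isinstance(type_uri, str):
        return None
    match = re.fullmatch(r"https://ocfl\.io/(\d+\.\d+)/spec/#inventory", type_uri)
    return match.group(1) if match else None


def check_standalone_type(inventory, known_versions=("1.0", "1.1")):
    """Return ['E038'] if the type is missing, malformed, or an unknown version."""
    version = spec_version_from_type(inventory.get("type"))
    if version is None or version not in known_versions:
        return ["E038"]
    return []
```

A standalone v1.1 inventory then passes the `type` check, and the detected version can be used to select the rest of the validation rules.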

Handle case of file specified in fixity that is missing

Current code fails on OCFL/fixtures#68

ocfl-validate.py 1.0/bad-objects/E092_E093_content_path_does_not_exist
Traceback (most recent call last):
  File "/Users/simeon/.python_venv/py38/lib/python3.8/site-packages/fs/osfs.py", line 359, in openbin
    binary_file = io.open(
FileNotFoundError: [Errno 2] No such file or directory: b'/Users/simeon/src/ocfl-py/fixtures/1.0/bad-objects/E092_E093_content_path_does_not_exist/v1/content/bonus.txt'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "../ocfl-validate.py", line 46, in <module>
    if obj.validate(path,
  File "/Users/simeon/src/ocfl-py/ocfl/object.py", line 525, in validate
    passed = validator.validate(objdir)
  File "/Users/simeon/src/ocfl-py/ocfl/validator.py", line 108, in validate
    self.validate_content(inventory, all_versions, prior_manifest_digests)
  File "/Users/simeon/src/ocfl-py/ocfl/validator.py", line 306, in validate_content
    content_digest = file_digest(filepath, digest_type=digest_algorithm, pyfs=self.obj_fs)
  File "/Users/simeon/src/ocfl-py/ocfl/digest.py", line 51, in file_digest
    return _file_digest(pyfs, filename, hashlib.md5())
  File "/Users/simeon/src/ocfl-py/ocfl/digest.py", line 27, in _file_digest
    _fs_digest(pyfs, filename, digester)
  File "/Users/simeon/src/ocfl-py/ocfl/digest.py", line 11, in _fs_digest
    with pyfs.openbin(filename, 'r') as fh:
  File "/Users/simeon/.python_venv/py38/lib/python3.8/site-packages/fs/osfs.py", line 359, in openbin
    binary_file = io.open(
  File "/Users/simeon/.python_venv/py38/lib/python3.8/site-packages/fs/error_tools.py", line 90, in __exit__
    reraise(fserror, fserror(self._path, exc=exc_value), traceback)
  File "/Users/simeon/.python_venv/py38/lib/python3.8/site-packages/six.py", line 702, in reraise
    raise value.with_traceback(tb)
  File "/Users/simeon/.python_venv/py38/lib/python3.8/site-packages/fs/osfs.py", line 359, in openbin
    binary_file = io.open(
fs.errors.ResourceNotFound: resource 'v1/content/bonus.txt' not found
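
The traceback shows the digest helper raising when the file is absent. A sketch of a more forgiving approach follows. It uses only the standard library (plain `open` and `FileNotFoundError`) instead of the pyfs calls in the traceback, and the function names are hypothetical. Which error code to report for each missing path is left to the caller.

```python
import hashlib


def safe_file_digest(path, algorithm="sha512"):
    """Return the hex digest of the file at path, or None if it does not exist."""
    digester = hashlib.new(algorithm)
    try:
        with open(path, "rb") as fh:
            # Read in chunks so large content files are not loaded at once.
            for chunk in iter(lambda: fh.read(65536), b""):
                digester.update(chunk)
    except FileNotFoundError:
        return None
    return digester.hexdigest()


def missing_content_paths(paths):
    """Return the subset of paths that could not be opened for digesting."""
    return [p for p in paths if safe_file_digest(p) is None]
```

The validator can then turn each missing path into a coded validation error rather than an uncaught exception.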

Correct handling of E103 fixture

Fixture OCFL/fixtures#101 has spec versions 1.1, 1.0, 1.1 for v1, v2, v3. ocfl-py reports E038a instead of an error about the spec version going backwards:

> ./ocfl-validate.py fixtures/1.1/bad-objects/E103_older_spec_v2
[E038a] OCFL Object v2 inventory `type` attribute has wrong value (expected https://ocfl.io/1.1/spec/#inventory, got https://ocfl.io/1.0/spec/#inventory) (see https://ocfl.io/1.0/spec/#E038)
INFO:ocfl.object:OCFL object at fixtures/1.1/bad-objects/E103_older_spec_v2 is INVALID
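
A sketch of a dedicated check for this case, assuming the spec version of each version directory has already been extracted. The function name and the message wording are hypothetical; the E103 code is taken from the fixture name.

```python
def check_spec_version_order(spec_versions):
    """Report versions whose spec version is older than a preceding one.

    spec_versions is a list of (version_dir, spec_version) pairs in
    version order, e.g. [("v1", "1.1"), ("v2", "1.0")].
    """
    errors = []
    highest = None
    for vdir, spec in spec_versions:
        current = tuple(int(part) for part in spec.split("."))
        if highest is not None and current < highest:
            errors.append(
                f"E103: {vdir} uses spec version {spec}, "
                "older than a preceding version"
            )
        else:
            highest = current
    return errors
```

For the fixture's sequence this yields a single message naming v2, which seems closer to the intended diagnosis than a `type` mismatch.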

Catch inconsistent id error

See OCFL/fixtures#87 in which the id in v1 does not match the id in the root/v2 inventory.

My validator fails to report this error:

(py38) simeon@RottenApple fixtures> ../ocfl-validate.py 1.0/bad-objects/E037_inconsistent_id
INFO:ocfl.object:OCFL object at 1.0/bad-objects/E037_inconsistent_id is VALID
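
The missing check could look roughly like the following, assuming the root inventory and each version inventory have already been parsed into dicts. The function name is hypothetical; the E037 code comes from the fixture name.

```python
def check_consistent_ids(root_inventory, version_inventories):
    """Compare the `id` of each version inventory against the root inventory.

    version_inventories maps a version directory name (e.g. "v1") to its
    parsed inventory dict.
    """
    root_id = root_inventory.get("id")
    errors = []
    for vdir, inventory in sorted(version_inventories.items()):
        if inventory.get("id") != root_id:
            errors.append(
                f"E037: {vdir} inventory id {inventory.get('id')!r} "
                f"does not match root inventory id {root_id!r}"
            )
    return errors
```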

Support standalone inventory validation

While looking into OCFL/spec#532 it was really annoying to have to create an object with a bad inventory rather than just playing with a standalone inventory. It would be useful if ocfl-validate.py validated a standalone inventory as well as object and storage roots.
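
A sketch of how the script might decide what kind of thing it has been given. The function name is hypothetical; the only facts relied on are that an object root carries a `0=ocfl_object_...` declaration file and a storage root carries a `0=ocfl_...` declaration file.

```python
import os


def classify_path(path):
    """Guess whether path is a standalone inventory, an object, or a storage root."""
    if os.path.isfile(path):
        # A plain file is treated as a standalone inventory.
        return "inventory"
    names = os.listdir(path) if os.path.isdir(path) else []
    # Test the object declaration first, because its name also starts with "0=ocfl_".
    if any(name.startswith("0=ocfl_object_") for name in names):
        return "object"
    if any(name.startswith("0=ocfl_") for name in names):
        return "storage_root"
    return "unknown"
```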

ocfl_layout.json created with invalid key "key"

An OCFL store created with the --disposition flag set is invalid, as the ocfl_layout.json still has "key" rather than "extension"

Steps to reproduce:

% ocfl-store.py --init --root my_repo --disposition pairtree
INFO:root:Created OCFL storage root my_repo
% ocfl-validate.py my_repo
INFO:root:Storage root structure is INVALID (OCFL storage root my_repo has layout file that can't be read (Storage root my_repo has layout file doesn't have required extension and description string entries))
INFO:root:Objects checked: 0 / 0 are VALID
INFO:root:Storage root my_repo is INVALID
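
For reference, the validator's message implies a check along these lines. This is a sketch based only on the message text above (required `extension` and `description` string entries), not on the ocfl-py source, and the function name is hypothetical.

```python
import json


def check_layout(text):
    """Return an error string for ocfl_layout.json content, or None if acceptable."""
    try:
        layout = json.loads(text)
    except ValueError:
        return "layout file is not valid JSON"
    if not isinstance(layout, dict):
        return "layout file is not a JSON object"
    for key in ("extension", "description"):
        if not isinstance(layout.get(key), str):
            return f"layout file lacks required string entry {key!r}"
    return None
```

A layout file written with `"key"` in place of `"extension"` fails this check, which matches the output shown.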

Setup expectations for ocfl-py

When running ocfl-validate.py on an OCFL object, I get the following error:

FileNotFoundError: [Errno 2] No such file or directory: '/home/awoods/.local/lib/python3.7/site-packages/ocfl/data/validation-errors.json'

Is there some additional setup command I should be running in order for ocfl-py to have the entire environment it is expecting?

good-storage-roots/reg-extension-dir-root is not valid

One of the fixtures, good-storage-roots/reg-extension-dir-root, is not valid as expected because the storage layout doesn't conform with the declared extension.

  • the sha256 of the object id ark:123/abc is a4781783dcec... not b02c71b67afe...
  • the declared layout extension requires lowercase hex strings but uppercase are used.

I think the correct storage location for ark:123/abc is a47/817/83d/cec/ark%3a123%2fabc
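
The expected location can be recomputed with a short script. This is a sketch under stated assumptions: a tuple size of 3 and 4 tuples (inferred from the shape of the path above), lowercase hex, and percent-encoding of every byte outside `A-Z a-z 0-9 - _`. The function names and parameter names are hypothetical.

```python
import hashlib

SAFE_CHARS = set(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_"
)


def encode_id(object_id):
    """Percent-encode an object id using lowercase hex digits."""
    out = []
    for byte in object_id.encode("utf-8"):
        char = chr(byte)
        out.append(char if char in SAFE_CHARS else f"%{byte:02x}")
    return "".join(out)


def hashed_ntuple_path(object_id, tuple_size=3, number_of_tuples=4):
    """Build <tuple>/<tuple>/.../<encoded id> from the sha256 of the id."""
    digest = hashlib.sha256(object_id.encode("utf-8")).hexdigest()  # lowercase hex
    tuples = [
        digest[i * tuple_size:(i + 1) * tuple_size]
        for i in range(number_of_tuples)
    ]
    return "/".join(tuples + [encode_id(object_id)])


if __name__ == "__main__":
    print(hashed_ntuple_path("ark:123/abc"))
```

Running it on `ark:123/abc` prints the path to compare against the fixture.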

Catch error case E010_missing_versions

Currently reported as valid:

> ./ocfl-validate.py fixtures/1.0/bad-objects/E010_missing_versions
[W010] OCFL Object v3 SHOULD have an inventory file but does not (see https://ocfl.io/1.0/spec/#W010)
INFO:ocfl.object:OCFL object at fixtures/1.0/bad-objects/E010_missing_versions is VALID

There is some debate about the error code (see OCFL/fixtures#79), but per the current spec this object should fail validation.
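
One possible shape for the missing check is to compare the version directories found on disk with the versions declared in the root inventory. This is a sketch only: the function name is hypothetical, zero-padded version names are not handled, and the choice of error code is left open given the debate mentioned above.

```python
import os
import re


def compare_versions(objdir, inventory):
    """Return (declared but not on disk, on disk but not declared) version names."""
    on_disk = {
        name
        for name in os.listdir(objdir)
        if re.fullmatch(r"v\d+", name)
        and os.path.isdir(os.path.join(objdir, name))
    }
    declared = set(inventory.get("versions", {}))
    return sorted(declared - on_disk), sorted(on_disk - declared)
```

Either list being non-empty would be grounds for reporting the object as invalid rather than only emitting a warning.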

Question: how do I add a new version of a content file to an existing object?

If I have an existing object, and now I want to add a new version of a content file, what method do I call? Seems like Object.add_version expects a fully-created ocfl version directory, but I just want to add a new content file, without building a new ocfl version. Or do I have to create a new ocfl version first, and then add it to the object?

Thanks.

Handle case of multiple conformance declarations more gracefully

Current code throws an error without an E0XX code:

ocfl-py> ./ocfl-validate.py extra_fixtures/1.1/bad-objects/E003_two_declarations
ERROR:ocfl-validate:Bad path extra_fixtures/1.1/bad-objects/E003_two_declarations (more than one 0= declaration file)

(Have added extra_fixtures/1.1/bad-objects/E003_two_declarations in add_v1_1_support branch, see OCFL/spec#581)
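
A sketch of reporting this as a coded validation error instead of a bad-path error. The function name is hypothetical and the E003 code is taken from the fixture name; only the multiple-declaration case is handled here.

```python
def check_single_declaration(filenames):
    """Return a coded error if more than one `0=` declaration file is present."""
    declarations = sorted(name for name in filenames if name.startswith("0="))
    if len(declarations) > 1:
        return [
            "E003: more than one 0= declaration file "
            f"({', '.join(declarations)})"
        ]
    return []
```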

Missing file when creating OCFL object from directory

I am trying to create an OCFL object from a bag that was generated from some online content. I used the following command,

python3 ocfl-py/ocfl-object.py --create --srcdir 5 --id 5:5 --objdir ./ocfl_obj --name name --message hello --address [email protected]

which generated this OCFL object. The issue is that one of the files from the bag (5/data/6/5/10/OBJ.jpg) is not being added into the corresponding directory (ocfl_obj/v1/content/data/6/5/10) in the OCFL object.
Any guidance on why this might be or how to resolve?

Persistent errors for user description.

I am trying to validate a simple OCFL object but I keep getting this warning and error regardless of the object I am trying to validate:

[E056a] OCFL Object root inventory includes a fixity key with value that isn't a JSON object (see https://ocfl.io/1.0/spec/#E056)
[W009] OCFL Object root inventory v1 version block user description SHOULD be a mailto: or person identifier URI (see https://ocfl.io/1.0/spec/#W009)
INFO:ocfl.object:OCFL object at ../root/test%3Acommitmsg/ is INVALID

this is the v1 inventory file in question

{
    "id": "test:commitmsg",
    "type": "https://ocfl.io/1.0/spec/#inventory",
    "digestAlgorithm": "sha512",
    "head": "v1",
    "manifest": {
        "0c1dc1ece20a18bd5ce7b41efe8e29e2e5710553b4716324415b26f48262d95888240d1169f6dd6dfd790dfd108aef91f6bfc99fbf2f0e54ae0d9eb75f77c054": [
            "v1/content/README.md"
        ]
    },
    "versions": {
        "v1": {
            "created": "2021-07-07T13:55:50.992Z",
            "message": "This is my message",
            "user": {
                "name": "myName",
                "address": "[email protected]"
            },
            "state": {
                "0c1dc1ece20a18bd5ce7b41efe8e29e2e5710553b4716324415b26f48262d95888240d1169f6dd6dfd790dfd108aef91f6bfc99fbf2f0e54ae0d9eb75f77c054": [
                    "README.md"
                ]
            },
            "type": ""
        }
    },
    "fixity": null
}

I am especially confused by the W009 warning, since the user address is a valid mailto.
I am new to working with OCFL objects so this may be a naive question, but does anyone have an idea what is causing these issues or how to resolve them?
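
Both messages appear consistent with the inventory shown: `fixity` is `null` rather than a JSON object, and W009 asks for a URI, so an address given as a bare email without the `mailto:` scheme would trigger it. The sketch below reproduces those two checks. It is an assumption about what the validator tests, not a copy of ocfl-py's code; the function name and the example address are hypothetical.

```python
import re

# A URI starts with a scheme such as "mailto:" or "https:".
URI_SCHEME = re.compile(r"^[A-Za-z][A-Za-z0-9+.\-]*:")


def check_inventory_basics(inventory):
    """Return message codes for a non-object fixity and non-URI user addresses."""
    messages = []
    if "fixity" in inventory and not isinstance(inventory["fixity"], dict):
        messages.append("E056a")
    for vdir, version in inventory.get("versions", {}).items():
        address = version.get("user", {}).get("address")
        if isinstance(address, str) and not URI_SCHEME.match(address):
            messages.append(f"W009 ({vdir})")
    return messages
```

Under these assumptions, omitting the `fixity` key (or making it `{}`) and writing the address as `mailto:...` would clear both messages.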

Improve error reporting for the case of bad digestAlgorithm

See OCFL/fixtures#85

Use of E039 should probably be another E025 error.

I want to tidy up the confusing errors caused because the validator rejects md5 and then carries on with sha512 as the default for other checks:

(py38) simeon@RottenApple fixtures> ../ocfl-validate.py 1.0/bad-objects/E025_E001_wrong_digest_algorithm/
[E025a] OCFL Object root inventory manifest block includes a digest (d2c79c8519af858fac2993c2373b5203) that doesn't have the correct form for the sha512 algorithm (see https://ocfl.io/1.0/spec/#E025)
[E025a] OCFL Object root inventory manifest block includes a digest (fccd3f96d461f495a3bef31dc1d28f01) that doesn't have the correct form for the sha512 algorithm (see https://ocfl.io/1.0/spec/#E025)
[E039] OCFL Object root inventory `digestAlgorithm` attribute not an allowed value (got 'md5') (see https://ocfl.io/1.0/spec/#E039)
[E050d] OCFL Object root inventory v1 version state block includes a bad digest (d2c79c8519af858fac2993c2373b5203) (see https://ocfl.io/1.0/spec/#E050)
[E050d] OCFL Object root inventory v1 version state block includes a bad digest (fccd3f96d461f495a3bef31dc1d28f01) (see https://ocfl.io/1.0/spec/#E050)
[E058a] OCFL Object root inventory is missing sidecar digest file at inventory.json.sha512 (see https://ocfl.io/1.0/spec/#E058)
INFO:ocfl.object:OCFL object at 1.0/bad-objects/E025_E001_wrong_digest_algorithm/ is INVALID

Either the validator should short-circuit at finding a bad digestAlgorithm or else make other checks more about consistency given a disallowed algorithm.
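
The short-circuit option might look like this. It is a sketch, not ocfl-py's code: the function name is hypothetical, and E025 is used for the bad algorithm as suggested above. Digest form checks run only once the algorithm is known to be allowed.

```python
import string

# Expected hex digest lengths for the algorithms allowed as digestAlgorithm.
DIGEST_LENGTHS = {"sha512": 128, "sha256": 64}


def validate_manifest_digests(inventory):
    """Check digestAlgorithm first; only then check manifest digest forms."""
    algorithm = inventory.get("digestAlgorithm")
    if algorithm not in DIGEST_LENGTHS:
        # Stop here so later checks do not fall back to a default algorithm.
        return [f"E025: digestAlgorithm {algorithm!r} is not an allowed value"]
    expected = DIGEST_LENGTHS[algorithm]
    errors = []
    for digest in inventory.get("manifest", {}):
        well_formed = len(digest) == expected and all(
            char in string.hexdigits for char in digest
        )
        if not well_formed:
            errors.append(f"E025a: manifest digest {digest} has wrong form for {algorithm}")
    return errors
```

For the fixture above this would produce one message about `md5` instead of the cascade of E025a, E050d and E058a messages.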

Add support for OCFL 1.1

With OCFL/fixtures#95 we have a new test tree for v1.1 that can have divergent tests from v1.0. Currently ocfl-py will barf if it gets anything that doesn't declare itself to be v1.0.

concurrency/locking

Is there any support for protecting objects from conflicting concurrent requests in this package? Or is any planned?
