zimeon / ocfl-py
OCFL tools in Python
License: MIT License
When using a custom storage layout extension, the configuration of the extension is placed in the storage root as:
[storage-root]/extensions/NNNN-flat-omit-prefix-storage-layout/config.json
The ocfl-validate utility flags this as an error:
[E072] OCFL storage root hierarchy include directory /extensions/NNNN-flat-omit-prefix-storage-layout with at least one file but no object declaration. Such additional files are not allowed (see https://ocfl.io/1.0/spec/#E072)
Per the spec, I believe this should be valid.
See: https://ocfl.io/1.0/spec/#storage-root-extensions
See OCFL/fixtures#91
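One possible way to avoid the false E072 is sketched below, assuming the validator walks the storage root hierarchy one directory at a time. The helper name `dirs_to_check` is hypothetical and is not part of ocfl-py; it simply prunes the top-level `extensions` directory before any stray-file check is applied.

```python
import os


def dirs_to_check(storage_root):
    """Yield (relative directory, file names) pairs below the storage root,
    skipping the top-level extensions directory.

    Hypothetical helper for illustration only, not ocfl-py's actual code.
    """
    for dirpath, dirnames, filenames in os.walk(storage_root):
        rel = os.path.relpath(dirpath, storage_root)
        if rel == ".":
            # Prune extensions/ at the root only: it holds extension config,
            # not OCFL objects, so it should not be descended into.
            dirnames[:] = [d for d in dirnames if d != "extensions"]
            continue
        yield rel, sorted(filenames)
```

Only directories returned by this walk would then be tested for files without an object declaration.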
To replicate example:
(py38) simeon@RottenApple ocfl-py> ./ocfl-validate.py extra_fixtures/1.1/good-objects/empty_fixity
INFO:ocfl.object:OCFL object at extra_fixtures/1.1/good-objects/empty_fixity is VALID
(py38) simeon@RottenApple ocfl-py> ./ocfl-validate.py extra_fixtures/1.1/good-objects/empty_fixity/inventory.json
[E038] OCFL Object standalone inventory `type` attribute has wrong value (see https://ocfl.io/1.0/spec/#E038)
INFO:ocfl.object:Standalone OCFL inventory at extra_fixtures/1.1/good-objects/empty_fixity/inventory.json is INVALID
Inventory is good, it is just v1.1
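A minimal sketch of how a standalone inventory check could accept more than one spec version, using the `type` URI pattern seen in the validator output. The function name `inventory_spec_version` and the `supported` tuple are assumptions for illustration, not ocfl-py's API.

```python
import re

# Pattern of the inventory `type` URI, capturing the spec version
INVENTORY_TYPE = re.compile(r"https://ocfl\.io/(\d+\.\d+)/spec/#inventory")


def inventory_spec_version(inventory, supported=("1.0", "1.1")):
    """Return the spec version named by the inventory `type`, or None
    if the value is missing, malformed or names an unsupported version.

    Hypothetical helper for illustration only.
    """
    match = INVENTORY_TYPE.fullmatch(str(inventory.get("type", "")))
    if match and match.group(1) in supported:
        return match.group(1)
    return None
```

A validator could use the returned version to select the matching rule set instead of assuming v1.0.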
In order to be able to write up an implementation note describing how the creation of a new version may be used to update the digestAlgorithm (OCFL/spec#430), it would be helpful to have code that implements it...
OCFL objects with an id such as "URN-3:HUL.DRS.OBJECT:100775205" are flagged with validation warnings. I believe these should be valid, as URNs are valid URIs and the above URN-3 scheme is a valid URN.
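As a rough illustration, the check below tests only whether an id starts with something shaped like a URI scheme (a letter followed by letters, digits, "+", "-" or ".", then a colon, per RFC 3986). The name `looks_like_uri` is hypothetical, and this is a sketch of a lenient test rather than how ocfl-py decides to warn.

```python
import re

# RFC 3986 scheme syntax: ALPHA *( ALPHA / DIGIT / "+" / "-" / "." ) ":"
URI_SCHEME = re.compile(r"[A-Za-z][A-Za-z0-9+.\-]*:")


def looks_like_uri(identifier):
    """Return True if identifier begins with a syntactically valid URI scheme.

    Hypothetical helper: it does not validate the rest of the URI.
    """
    return URI_SCHEME.match(identifier) is not None
```

Under this test "URN-3:HUL.DRS.OBJECT:100775205" passes, because "URN-3" satisfies the scheme grammar.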
See OCFL/fixtures#84, in which the v2 inventory manifest does not include the fact that the duplicate v1/content/file-3.txt is present with digest b3b2... calculated.
Current code fails on OCFL/fixtures#68
ocfl-validate.py 1.0/bad-objects/E092_E093_content_path_does_not_exist
Traceback (most recent call last):
  File "/Users/simeon/.python_venv/py38/lib/python3.8/site-packages/fs/osfs.py", line 359, in openbin
    binary_file = io.open(
FileNotFoundError: [Errno 2] No such file or directory: b'/Users/simeon/src/ocfl-py/fixtures/1.0/bad-objects/E092_E093_content_path_does_not_exist/v1/content/bonus.txt'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "../ocfl-validate.py", line 46, in <module>
    if obj.validate(path,
  File "/Users/simeon/src/ocfl-py/ocfl/object.py", line 525, in validate
    passed = validator.validate(objdir)
  File "/Users/simeon/src/ocfl-py/ocfl/validator.py", line 108, in validate
    self.validate_content(inventory, all_versions, prior_manifest_digests)
  File "/Users/simeon/src/ocfl-py/ocfl/validator.py", line 306, in validate_content
    content_digest = file_digest(filepath, digest_type=digest_algorithm, pyfs=self.obj_fs)
  File "/Users/simeon/src/ocfl-py/ocfl/digest.py", line 51, in file_digest
    return _file_digest(pyfs, filename, hashlib.md5())
  File "/Users/simeon/src/ocfl-py/ocfl/digest.py", line 27, in _file_digest
    _fs_digest(pyfs, filename, digester)
  File "/Users/simeon/src/ocfl-py/ocfl/digest.py", line 11, in _fs_digest
    with pyfs.openbin(filename, 'r') as fh:
  File "/Users/simeon/.python_venv/py38/lib/python3.8/site-packages/fs/osfs.py", line 359, in openbin
    binary_file = io.open(
  File "/Users/simeon/.python_venv/py38/lib/python3.8/site-packages/fs/error_tools.py", line 90, in __exit__
    reraise(fserror, fserror(self._path, exc=exc_value), traceback)
  File "/Users/simeon/.python_venv/py38/lib/python3.8/site-packages/six.py", line 702, in reraise
    raise value.with_traceback(tb)
  File "/Users/simeon/.python_venv/py38/lib/python3.8/site-packages/fs/osfs.py", line 359, in openbin
    binary_file = io.open(
fs.errors.ResourceNotFound: resource 'v1/content/bonus.txt' not found
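The traceback shows the digest calculation raising on a content path that is listed in the manifest but absent on disk. A sketch of one way to turn that into a reportable condition follows. It uses plain `open` and `FileNotFoundError` to stay self-contained; with PyFilesystem the exception to catch would be the `fs.errors.ResourceNotFound` seen above. The name `safe_file_digest` is hypothetical.

```python
import hashlib


def safe_file_digest(filepath, digest_type="sha512"):
    """Return the hex digest of the file at filepath, or None if it is missing.

    Hypothetical helper: a caller could map None to a validation error
    for the missing content path instead of letting the exception escape.
    """
    digester = hashlib.new(digest_type)
    try:
        with open(filepath, "rb") as fh:
            # Read in chunks so large content files are not loaded whole
            for chunk in iter(lambda: fh.read(65536), b""):
                digester.update(chunk)
    except FileNotFoundError:
        return None
    return digester.hexdigest()
```

The validator would then log an error for the path and continue with the remaining manifest entries.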
Fixture OCFL/fixtures#101 has 1.1, 1.0, 1.1 spec versions for v1, v2 and v3. ocfl-py gives a bad error message:
> ./ocfl-validate.py fixtures/1.1/bad-objects/E103_older_spec_v2
[E038a] OCFL Object v2 inventory `type` attribute has wrong value (expected https://ocfl.io/1.1/spec/#inventory, got https://ocfl.io/1.0/spec/#inventory) (see https://ocfl.io/1.0/spec/#E038)
INFO:ocfl.object:OCFL object at fixtures/1.1/bad-objects/E103_older_spec_v2 is INVALID
https://ocfl.io/draft/spec/#W010 - should be a test case in fixtures too
See OCFL/fixtures#65 which currently gets incorrectly reported as valid
From OCFL/fixtures#82, the error for this should be E017 instead of E018.
The ocfl_layout.json was removed from 1.0 because it was unclear and can easily be added back later. See OCFL/spec#481
See OCFL/fixtures#87, in which the id in v1 does not match the id in the root/v2 inventory.
My validator fails to report this error:
(py38) simeon@RottenApple fixtures> ../ocfl-validate.py 1.0/bad-objects/E037_inconsistent_id
INFO:ocfl.object:OCFL object at 1.0/bad-objects/E037_inconsistent_id is VALID
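The missing check could look roughly like the sketch below, which compares the `id` in each version directory inventory against the root inventory. The function name `inconsistent_ids` and the dict-of-inventories argument are assumptions for illustration, not ocfl-py's internal structures.

```python
def inconsistent_ids(root_inventory, version_inventories):
    """Return the sorted names of version directories whose inventory `id`
    differs from the `id` in the root inventory.

    version_inventories maps a version directory name such as "v1" to its
    parsed inventory. Hypothetical helper for illustration only.
    """
    root_id = root_inventory.get("id")
    return sorted(
        vdir
        for vdir, inventory in version_inventories.items()
        if inventory.get("id") != root_id
    )
```

Any name returned would be reported as an inconsistent id error for that version.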
Both @awoods and @pwinckles expected warnings to be shown by default. Should make that the case and then have -q to quiet them.
While looking into OCFL/spec#532 it was really annoying to have to create an object with a bad inventory rather than just playing with a standalone inventory. It would be useful if ocfl-validate.py validated a standalone inventory as well as object and storage roots.
An OCFL store created with the --disposition flag set is invalid, as the ocfl_layout.json still has "key" rather than "extension"
Steps to reproduce:
% ocfl-store.py --init --root my_repo --disposition pairtree
INFO:root:Created OCFL storage root my_repo
% ocfl-validate.py my_repo
INFO:root:Storage root structure is INVALID (OCFL storage root my_repo has layout file that can't be read (Storage root my_repo has layout file doesn't have required extension and description string entries))
INFO:root:Objects checked: 0 / 0 are VALID
INFO:root:Storage root my_repo is INVALID
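The validator message says the layout file needs `extension` and `description` string entries. A small sketch of that check follows, which shows why a file still written with "key" is rejected. The function name `layout_problems` is hypothetical.

```python
import json


def layout_problems(layout_text):
    """Return a list of problems found in ocfl_layout.json content.

    An empty list means both required string entries are present.
    Hypothetical helper for illustration only.
    """
    try:
        layout = json.loads(layout_text)
    except ValueError:
        return ["not valid JSON"]
    if not isinstance(layout, dict):
        return ["not a JSON object"]
    problems = []
    for key in ("extension", "description"):
        if not isinstance(layout.get(key), str):
            problems.append("missing or non-string '%s'" % key)
    return problems
```

For example, `layout_problems('{"key": "pairtree", "description": "x"}')` reports the missing `extension` entry, which matches the failure seen above.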
See OCFL/spec#476
Need a fixture and code to check fixity values that exist in prior version inventories (c.f. #40 for manifest digest checks)
When running ocfl-validate.py on an OCFL object, I get the following error:
FileNotFoundError: [Errno 2] No such file or directory: '/home/awoods/.local/lib/python3.7/site-packages/ocfl/data/validation-errors.json'
Is there some additional setup command I should be running in order for ocfl-py to have the entire environment it is expecting?
Following on from #79 and the spec changes in #IIIF/spec:580, need to update code to use E111 to replace E055 and allow fixity as an empty JSON object
See OCFL/fixtures#71
See OCFL/fixtures#67
See error case OCFL/fixtures#78
One of the fixtures, good-storage-roots/reg-extension-dir-root, is not valid as expected because the storage layout doesn't conform with the declared extension.
ark:123/abc is a4781783dcec..., not b02c71b67afe...
I think the correct storage location for ark:123/abc is a47/817/83d/cec/ark%3a123%2fabc
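To check a storage path of this shape by hand, the sketch below builds a hashed n-tuple path from an object id. The digest algorithm (sha256), the tuple size of 3 and the count of 4 tuples are assumptions chosen to mirror the path quoted above; the fixture's own extension config is the authority. The function name `hashed_ntuple_path` is hypothetical.

```python
import hashlib
import re


def hashed_ntuple_path(identifier, tuple_size=3, number_of_tuples=4,
                       algorithm="sha256"):
    """Return '<tuple>/<tuple>/.../<encoded id>' for an object identifier.

    The leading tuples are slices of the hex digest of the id. The final
    segment is the id with every character outside A-Z, a-z, 0-9, '-' and
    '_' percent-encoded using lowercase hex. All defaults are assumptions.
    """
    digest = hashlib.new(algorithm, identifier.encode("utf-8")).hexdigest()
    tuples = [
        digest[i * tuple_size:(i + 1) * tuple_size]
        for i in range(number_of_tuples)
    ]
    encoded = re.sub(
        r"[^A-Za-z0-9_-]",
        lambda m: "".join("%%%02x" % b for b in m.group(0).encode("utf-8")),
        identifier,
    )
    return "/".join(tuples + [encoded])
```

Calling `hashed_ntuple_path("ark:123/abc")` gives a path whose last segment is `ark%3a123%2fabc`; the leading tuples depend on the digest algorithm the extension config actually declares.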
Currently reported as valid:
> ./ocfl-validate.py fixtures/1.0/bad-objects/E010_missing_versions
[W010] OCFL Object v3 SHOULD have an inventory file but does not (see https://ocfl.io/1.0/spec/#W010)
INFO:ocfl.object:OCFL object at fixtures/1.0/bad-objects/E010_missing_versions is VALID
Some debate about error code OCFL/fixtures#79 but per current spec should fail
If I have an existing object, and now I want to add a new version of a content file, what method do I call? Seems like Object.add_version expects a fully-created ocfl version directory, but I just want to add a new content file, without building a new ocfl version. Or do I have to create a new ocfl version first, and then add it to the object?
Thanks.
See OCFL/fixtures#75
See OCFL/fixtures#52 . Need to update both validation code, error description and test code when fixture updated by OCFL/fixtures#58
See the pyfilesystem2 branch for work to change over to use PyFilesystem for all file access. This should enable the code to work with regular OS filesystems, S3 and Zipped filesystems among others.
Current code throws an error without an E0XX code:
ocfl-py> ./ocfl-validate.py extra_fixtures/1.1/bad-objects/E003_two_declarations
ERROR:ocfl-validate:Bad path extra_fixtures/1.1/bad-objects/E003_two_declarations (more than one 0= declaration file)
(Have added extra_fixtures/1.1/bad-objects/E003_two_declarations in the add_v1_1_support branch, see OCFL/spec#581)
See OCFL/spec#232 which is reflected in the current spec
I am trying to create an OCFL object from a bag that was generated from some online content. I used the following command,
python3 ocfl-py/ocfl-object.py --create --srcdir 5 --id 5:5 --objdir ./ocfl_obj --name name --message hello --address [email protected]
which generated this OCFL object. The issue is that one of the files from the bag (5/data/6/5/10/OBJ.jpg) is not being added into the corresponding directory (ocfl_obj/v1/content/data/6/5/10) in the OCFL object.
Any guidance on why this might be or how to resolve?
Current error messages are not very clear.
I am trying to validate a simple OCFL object but I keep getting this warning and error regardless of the object I am trying to validate:
[E056a] OCFL Object root inventory includes a fixity key with value that isn't a JSON object (see https://ocfl.io/1.0/spec/#E056)
[W009] OCFL Object root inventory v1 version block user description SHOULD be a mailto: or person identifier URI (see https://ocfl.io/1.0/spec/#W009)
INFO:ocfl.object:OCFL object at ../root/test%3Acommitmsg/ is INVALID
this is the v1 inventory file in question
{
  "id": "test:commitmsg",
  "type": "https://ocfl.io/1.0/spec/#inventory",
  "digestAlgorithm": "sha512",
  "head": "v1",
  "manifest": {
    "0c1dc1ece20a18bd5ce7b41efe8e29e2e5710553b4716324415b26f48262d95888240d1169f6dd6dfd790dfd108aef91f6bfc99fbf2f0e54ae0d9eb75f77c054": [
      "v1/content/README.md"
    ]
  },
  "versions": {
    "v1": {
      "created": "2021-07-07T13:55:50.992Z",
      "message": "This is my message",
      "user": {
        "name": "myName",
        "address": "[email protected]"
      },
      "state": {
        "0c1dc1ece20a18bd5ce7b41efe8e29e2e5710553b4716324415b26f48262d95888240d1169f6dd6dfd790dfd108aef91f6bfc99fbf2f0e54ae0d9eb75f77c054": [
          "README.md"
        ]
      },
      "type": ""
    }
  },
  "fixity": null
}
I am especially confused by the W009 error since the user address is a valid mailto.
I am new to working with OCFL objects so this may be a naive issue, but does anyone have an idea what is causing these issues or how to resolve them?
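Both messages can be reproduced with a pair of simple checks, sketched below. The E056a text is consistent with `"fixity": null`, since null is not a JSON object, so dropping the key would avoid that particular check. The W009 text asks for a mailto: or person identifier URI, so a bare email address without the `mailto:` prefix would trigger it; the address in the issue is obscured, so that part is a guess. The function name `check_fixity_and_user` is hypothetical.

```python
import re

# A URI must start with a scheme followed by a colon (RFC 3986)
URI_SCHEME = re.compile(r"[A-Za-z][A-Za-z0-9+.\-]*:")


def check_fixity_and_user(inventory):
    """Return (code, message) pairs for two of the checks discussed above.

    Hypothetical helper: the codes mirror the validator output, but the
    logic is an illustration, not ocfl-py's implementation.
    """
    issues = []
    if "fixity" in inventory and not isinstance(inventory["fixity"], dict):
        issues.append(("E056", "fixity is present but is not a JSON object"))
    for vname, version in inventory.get("versions", {}).items():
        address = version.get("user", {}).get("address")
        if address is not None and not URI_SCHEME.match(address):
            issues.append(("W009", "%s user address has no URI scheme" % vname))
    return issues
```

An inventory with the `fixity` key removed and an address written as `mailto:` followed by the email address produces no issues from this sketch.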
See OCFL/fixtures#85
Use of E039 should probably be another E025 error.
Want to tidy up all sorts of confusing errors caused because the validator rejects md5 and then carries on with sha512 as the default for other checks:
(py38) simeon@RottenApple fixtures> ../ocfl-validate.py 1.0/bad-objects/E025_E001_wrong_digest_algorithm/
[E025a] OCFL Object root inventory manifest block includes a digest (d2c79c8519af858fac2993c2373b5203) that doesn't have the correct form for the sha512 algorithm (see https://ocfl.io/1.0/spec/#E025)
[E025a] OCFL Object root inventory manifest block includes a digest (fccd3f96d461f495a3bef31dc1d28f01) that doesn't have the correct form for the sha512 algorithm (see https://ocfl.io/1.0/spec/#E025)
[E039] OCFL Object root inventory `digestAlgorithm` attribute not an allowed value (got 'md5') (see https://ocfl.io/1.0/spec/#E039)
[E050d] OCFL Object root inventory v1 version state block includes a bad digest (d2c79c8519af858fac2993c2373b5203) (see https://ocfl.io/1.0/spec/#E050)
[E050d] OCFL Object root inventory v1 version state block includes a bad digest (fccd3f96d461f495a3bef31dc1d28f01) (see https://ocfl.io/1.0/spec/#E050)
[E058a] OCFL Object root inventory is missing sidecar digest file at inventory.json.sha512 (see https://ocfl.io/1.0/spec/#E058)
INFO:ocfl.object:OCFL object at 1.0/bad-objects/E025_E001_wrong_digest_algorithm/ is INVALID
Either the validator should short-circuit at finding a bad digestAlgorithm or else make other checks more about consistency given a disallowed algorithm.
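The second option could be sketched as follows: report the disallowed algorithm once, then check digest form against the algorithm the inventory declares rather than against sha512. The names `DIGEST_FORM`, `ALLOWED` and `check_manifest_digests` are hypothetical, and the use of E025 for the disallowed algorithm follows the suggestion made earlier rather than current ocfl-py behavior.

```python
import re

# Expected hex digest form per algorithm
DIGEST_FORM = {
    "md5": r"[0-9a-fA-F]{32}",
    "sha256": r"[0-9a-fA-F]{64}",
    "sha512": r"[0-9a-fA-F]{128}",
}
# Values permitted for digestAlgorithm in an OCFL inventory
ALLOWED = ("sha512", "sha256")


def check_manifest_digests(inventory):
    """Return (code, message) pairs for digestAlgorithm and manifest digests.

    Digest form is tested against the declared algorithm, so one disallowed
    algorithm yields one error rather than a cascade.
    """
    algorithm = inventory.get("digestAlgorithm")
    errors = []
    if algorithm not in ALLOWED:
        errors.append(("E025", "digestAlgorithm %r is not allowed" % algorithm))
    form = DIGEST_FORM.get(algorithm)
    if form is None:
        # Unknown algorithm: digest form cannot be checked at all
        return errors
    for digest in inventory.get("manifest", {}):
        if not re.fullmatch(form, digest):
            errors.append(("E025", "digest %s has wrong form for %s" % (digest, algorithm)))
    return errors
```

With the md5 inventory from the fixture above, this sketch would report the disallowed algorithm once and nothing about the two 32-character digests.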
With OCFL/fixtures#95 we have a new test tree for v1.1 that can have divergent tests from v1.0. Currently ocfl-py will barf if it gets anything that doesn't declare itself to be v1.0
See OCFL/fixtures#63
Is there any support for protecting objects from conflicting concurrent requests in this package? Or is any planned?