pln's Introduction

PKP Preservation Network Plugin for OJS

About

This plugin provides a means for OJS to preserve content in the PKP Preservation Network (PKP PN). The plugin checks for new and modified content and, provided the PN's terms of use are met, communicates with the PN's staging server to preserve your published content automatically.

If you need support for older OJS releases, see the available branches.

Installation Instructions

We recommend installing this plugin using the Plugin Gallery within OJS. Log in with administrator privileges, navigate to Settings > Website > Plugins, and choose the Plugin Gallery. Find the PN Plugin there and install it.

If you need to install it manually, there are two ways:

  • Download the latest release (check that it is compatible with your OJS version).
  • Download the code from GitHub (make sure to grab the code from the right branch), then run composer install in the main plugin folder.

After downloading, create the folder plugins/generic/pln and place the plugin files in it.

Run the command php lib/pkp/tools/installPluginVersion.php plugins/generic/pln/version.xml from the main OJS folder. This ensures the plugin is installed or upgraded properly (e.g. new fields might be added to the database).

After installing and enabling the plugin, open its settings.

Then read and accept all terms of use, and click the Save button.

License

This plugin is licensed under the GNU General Public License v3. See the file LICENSE for the complete terms of this license.

System Requirements

  • OJS 3.5.0 or greater.
  • CURL support for PHP.
  • ZipArchive support for PHP.

Note

The primary difference between this plugin and the existing LOCKSS preservation mechanism in OJS is that the PN requires no registration or involvement with the network: as long as you agree to the network's terms of use, you can preserve your journal's content.

Contact/Support

If you have issues, please use the PKP support forum (https://forum.pkp.sfu.ca/c/questions/5). The issue tracker (https://github.com/pkp/pln/issues) is reserved for triaged issues.

Setting up the deposit server

By default, the plugin deposits to https://pkp-pn.lib.sfu.ca. Journal managers can change the URL on the plugin settings page. The default URL can also be set in the OJS config.inc.php file by adding this configuration:

; Change the default Preservation Network URL
[lockss]
pln_url = https://example.com

You will need to clear the data caches after adding or changing this setting. There is a link to clear the caches at Site Administration > Administration.
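The lookup order implied above (a URL saved on the plugin settings page, then the pln_url entry in config.inc.php, then the built-in default) can be sketched as follows. This is an illustrative Python sketch, not the plugin's PHP code: the function name resolve_pln_url and the journal_setting parameter are hypothetical, and it assumes the relevant part of the config file can be read as plain INI text.

```python
import configparser

# Default deposit URL named in this section.
BUILT_IN_DEFAULT = "https://pkp-pn.lib.sfu.ca"


def resolve_pln_url(config_text, journal_setting=None):
    """Pick the deposit URL: journal setting, then config file, then default."""
    if journal_setting:
        return journal_setting
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    # fallback covers both a missing [lockss] section and a missing key
    return parser.get("lockss", "pln_url", fallback=BUILT_IN_DEFAULT)


example_config = """
; Change the default Preservation Network URL
[lockss]
pln_url = https://example.com
"""

print(resolve_pln_url(example_config))
```

With the example configuration this prints https://example.com. With an empty configuration and no journal setting, it falls back to the built-in default.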

Build Instructions

(These instructions are only necessary if you are working with the plugin manually. If you are installing the plugin using the Plugin Gallery, they are not necessary.)

  • Clone the repository containing the code.
  • Run OJS's php tools/upgrade.php upgrade.
  • Run composer install from a console inside the cloned pln folder. (This produces a vendor folder containing the plugin's dependencies.)
  • Enable the PN plugin.

Other useful hints / Troubleshooting hints

  • The plugin depends on two database tables: pln_deposits and pln_deposit_objects. If those tables are not present in your database, the plugin wasn't installed properly; refer to the previous sections for help.

  • Ensure the plugin is creating daily log files in the scheduledTaskLogs folder within the OJS files directory. Files named PKPPLNDepositorTask-*id*-*datestamp* should be present. If they are absent, the task is probably not being executed daily, or there might be permission issues preventing their creation.

  • Make sure that crontab or the web-based task scheduler is enabled and properly configured. To configure crontab on a *nix system, add the line * * * * * php lib/pkp/tools/scheduler.php run >> /dev/null 2>&1 to the crontab. To configure the web-based task schedule runner, make sure that task_runner is set to On in the application's config.inc.php file.

  • Every log file should end with an entry like the following: [*date time*] [Notice] Task process stopped. If that entry is absent, the process was halted unexpectedly due to errors; check the server/PHP error log for more information.

  • If an issue fails to be packaged, try exporting it through the Native XML plugin at Tools > Import/Export, which should display some hints about what went wrong.

  • Whenever something doesn't work as expected, always check the error log for clues. If nothing helps, report your problem in the forum.
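The log check described in the hints above can be automated. The following is a minimal Python sketch, not something shipped with the plugin: the function name unfinished_logs is hypothetical, and the demo file names and timestamps are invented. It lists the depositor logs whose last non-empty line is not the "Task process stopped" notice.

```python
import tempfile
from pathlib import Path

# Text that, per the hints above, should close every log file.
STOP_MARKER = "[Notice] Task process stopped"


def unfinished_logs(log_dir):
    """Return names of depositor logs whose last non-empty line lacks the stop marker."""
    suspicious = []
    for path in sorted(Path(log_dir).glob("PKPPLNDepositorTask-*")):
        text = path.read_text(errors="replace")
        lines = [line.strip() for line in text.splitlines() if line.strip()]
        # Tolerate a trailing period after the marker.
        if not lines or not lines[-1].rstrip(".").endswith(STOP_MARKER):
            suspicious.append(path.name)
    return suspicious


# Demo with two made-up log files (names and timestamps are invented).
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "PKPPLNDepositorTask-1-20240101.log").write_text(
        "[2024-01-01 00:00:05] [Notice] Task process stopped.\n"
    )
    Path(tmp, "PKPPLNDepositorTask-2-20240102.log").write_text(
        "[2024-01-02 00:00:01] [Notice] Example entry before a crash\n"
    )
    result = unfinished_logs(tmp)

print(result)
```

In a real installation, log_dir would point at the scheduledTaskLogs folder inside the OJS files directory.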

Original authors

pln's People

Contributors

jonasraoni, defstat, asmecher, ckamburov, josekarvalho, jirrka, jordilacruz, bsvvi, tigran54, primoz-svetek, ajnyga, diegojmacedo, ppv1979, cjwetherington, teismann, t-fildishevska, bibliothekswelt, neffe, vormia, kant, alexandrafo, drugurkocak, touhidurabir, saalam, pilasou, mooselybased, osmndrmz, shabilullah, mirkospiroski, mpbraendle

pln's Issues

Retry the deposits automatically

Some deposits that failed due to local issues (harvest-error, xml-error, bag-error, ...) might be retried automatically after the application is upgraded.

  • Store the application version when creating the package
  • If a deposit failed with a recoverable error, attempt to resubmit automatically once the application has been upgraded
  • Existing deposits with a recoverable error might skip the version check if the field is empty
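The retry rule outlined in the list above could look roughly like this. It is a Python sketch under stated assumptions, not the plugin's implementation: the names should_retry and parse_version are hypothetical, and the set of recoverable states contains only the three examples named in this issue (the original list ends with "...").

```python
import re

# Example set only: the issue names these three states and then "...".
RECOVERABLE_STATES = {"harvest-error", "xml-error", "bag-error"}


def parse_version(version):
    """Turn a version such as '3.3.0.10' or '3.4.0-5' into a comparable tuple."""
    return tuple(int(part) for part in re.findall(r"\d+", version))


def should_retry(state, packaged_version, current_version):
    """Decide whether a failed deposit deserves an automatic retry."""
    if state not in RECOVERABLE_STATES:
        return False
    if not packaged_version:
        # Deposit created before the version was stored: skip the version check.
        return True
    return parse_version(current_version) > parse_version(packaged_version)


print(should_retry("harvest-error", "3.3.0.9", "3.3.0.10"))
```

Comparing numeric tuples rather than raw strings matters here: as strings, "3.3.0.9" would sort after "3.3.0.10".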

Class 'APP\notification\Notification' not found

After installing the plugin manually according to instructions, plugin gallery stuck at loading.
From the console I'm getting this error:

"GET https://myjournal.com/index.php/tech/$$$call$$$/grid/settings/plugins/settings-plugin-grid/fetch-grid?_=1655654133139 500"
And from php logs I see this:

"PHP message: PHP Fatal error:  Uncaught Error: Class 'APP\notification\Notification' not found in /var/www/vhosts/myjournal.com/httpdocs/plugins/generic/pln-main/PLNPlugin.inc.php:72
Stack trace:
#0 /var/www/vhosts/myjournal.com/httpdocs/plugins/generic/pln-main/index.php(13): require_once()
#1 /var/www/vhosts/myjournal.com/httpdocs/lib/pkp/classes/plugins/PluginRegistry.inc.php(241): include('/var/www/vhosts...')
#2 /var/www/vhosts/myjournal.com/httpdocs/lib/pkp/classes/plugins/PluginRegistry.inc.php(126): PluginRegistry::_instantiatePlugin('generic', 'plugins/generic', 'pln-main')
#3 /var/www/vhosts/myjournal.com/httpdocs/lib/pkp/classes/controllers/grid/plugins/PluginGridHandler.inc.php(155): PluginRegistry::loadCategory('generic')
#4 /var/www/vhosts/myjournal.com/httpdocs/controllers/grid/settings/plugins/SettingsPluginGridHandler.inc.php(36): PluginGridHandler->loadCategoryData(Object(Request), 'generic', Array)
#5 /var/www/vhosts/myjournal.com/httpdocs/lib/pkp/classes/controllers/grid/CategoryGridHa" while reading response header from upstream, client: 95.91.235.178, server: myjournal.com, request: "GET /index.php/tech/$$$call$$$/grid/settings/plugins/settings-plugin-grid/fetch-grid?_=1655648407476 HTTP/2.0", upstream: "fastcgi://unix:/var/www/vhosts/system/myjournal.com/php-fpm.sock:", host: "myjournal.com", referrer: "https://myjournal.com/index.php/tech/management/settings/website"

What can be the cause of this error?

Plugin is unable to cleanup after journal is removed

There's some code to cleanup data here:

HookRegistry::register('JournalDAO::deleteJournalById', array($this, 'callbackDeleteJournalById'));

But it depends on a non-existent hook, and at the administrative part (/admin/contexts) where the journal is supposed to be removed, standard plugins are not loaded, so there's no way to listen for the right hook (Context::delete).

The cleanup might be done inside the scheduled task.
Another possibility would be to investigate the behavior of enabling the "site-wide" setting, which would be the only way to get the plugin loaded at the right time.

Error when trying to process removed issue

If an issue is removed from the journal before being uploaded to the staging server, it remains in the plugin's deposit list, even though it is never going to be finished.

  • Given that it's not possible to remove an issue from the Preservation Network, I think it might make sense to keep removed issues in the list of deposits of the plugin.
  • For this specific case, I think it makes sense to remove the item, or at least set up a more helpful error message to let the user know what has blocked the deposit.

Ensure the deposit URL is synchronized

If a journal deposited a package using the URL https://test-domain-abc, got a harvest error, and then moved to https://test-domain-xyz, the deposit needs to be resent.

Perhaps a variation of this issue: #22

Deposits incorrectly displayed as completed

A forum user who had a private (behind login) journal had two deposits appearing as completed, even though they were never sent to the staging server.
Once the journal was made public and the deposits were resent, the situation was normalized.

Given the number of occurrences, it makes sense to attempt an "auto-fix", basically re-check the status of every deposit against the staging server, and also to do some sanity checks (e.g. check if the package exists before depositing).

Source:
https://forum.pkp.sfu.ca/t/ojs-3-3-0-10-pkp-pn-bad-request-error-take-2/73286/11
https://forum.pkp.sfu.ca/t/status-pendente-plugin-pkp-pn/76262
https://forum.pkp.sfu.ca/t/pkp-pn-preservation/76776/10

Improve the status and error handling/notification

It looks like users have no clue about what's happening in the plugin: whether something is missing to make it work, when it was last executed, whether it's executing correctly (and if not, they can't see the errors), etc.

Part of this task should be probably synchronized with the pkppln repository, which could probably forward some helpful (and not compromising) error messages/hints, e.g. "failed to collect the deposit due to a request failure accessing the URL xxx".

We can also try to add some helpful instructions.

The main objective is to decrease the noise with error reports from users.

PHP warning about missing autoload.php file

PHP Warning: require_once(/ojs/journal1/plugins/generic/pln/classes/../vendor/autoload.php): Failed to open stream: No such file or directory in /ojs/journal1/plugins/generic/pln/classes/DepositPackage.php on line 234

I see this error repeated in the Apache logs, and I think it's just a path issue with the plugin not finding autoload.php on line https://github.com/pkp/pln/blob/stable-3_4_0/classes/DepositPackage.php#L234 when it tries to create a package. I'm not sure if it's a big deal, but I thought I would mention it.

The journal is on OJS version 3.4.0-5 with version v3_0_0-0 of the PKP|PN plugin and PHP 8.1.

Improve handling of big deposits

The Preservation Network has a hard limit on deposit size, which is currently 1GB. Given that some deposits exceed this threshold, it makes sense to find a solution to accommodate them.

Solution
Adding support for depositing individual submissions sounds like a good solution. It should substantially decrease the size of deposits, while also making room for the preservation of preprints and continuous publications.
The issue/volume number and the publication order within the issue can be attached as metadata.

Other improvements/ideas

  • Update the native plugin to upload only the latest revision of the submission
  • Allow the user to set up rules (e.g. ignore video files, files bigger than X, remove biggest files until it fits, etc), which might be applied when the deposit exceeds the limit
  • Allow the user to manually select which files/submissions they want to skip
  • Breaking the package in pieces is probably not doable (if it is, it will require some research)
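One of the rules mentioned above, "remove biggest files until it fits", can be sketched as follows. This is a hypothetical Python illustration, not plugin code: the function name fit_to_limit and the file names are invented, and treating 1GB as 1,000,000,000 bytes is an assumption.

```python
# Assumption: the 1GB limit is treated as 10**9 bytes for this sketch.
DEPOSIT_LIMIT_BYTES = 1_000_000_000


def fit_to_limit(files, limit=DEPOSIT_LIMIT_BYTES):
    """Drop the largest files one by one until the total size fits the limit.

    `files` maps a file name to its size in bytes.
    Returns (kept_files, skipped_names).
    """
    kept = dict(files)
    skipped = []
    for name in sorted(files, key=files.get, reverse=True):
        if sum(kept.values()) <= limit:
            break
        skipped.append(name)
        del kept[name]
    return kept, skipped


# Demo with made-up sizes and a small limit to keep the numbers readable.
kept, skipped = fit_to_limit({"a.pdf": 600, "b.mp4": 900, "c.xml": 100}, limit=1000)
print(kept, skipped)
```

In the demo the total is 1600, so the 900-byte file is dropped first, which leaves 700 and ends the loop. A real rule set would also need to report the skipped files to the user, since they would not be preserved.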

Remove the PEAR/Archive_Tar requirement

Some users have been complaining in the forum about a missing requirement "Your system must have a tar executable".
Given that the package whikloj/bagittools already pulls in the PEAR package through Composer, we don't need to check this requirement, and the message might be removed to avoid confusion.

Trigger resubmission when a submission has been removed from the issue

At this moment, when a submission is removed from an issue, the code is not able to detect the change and resubmit the issue.

The code detects changes to the issue by comparing an internal last modified date to the last modified date of the issue and of the submissions assigned to it. If a submission is removed from the issue, the last modified date of the issue isn't updated, and the removed submission is no longer checked because we don't keep track of it.

  • It makes sense to include all possible "last modified dates" in the existing code.
  • Listening for publish/unpublish events is an option, but listening for hooks that happen before inserting/updating sounds like a better option
  • The application (for now just OJS) can be updated to reflect such updates in a better way.
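As a purely illustrative alternative to the options listed above (this fingerprint technique is not one of them, and it is not what the plugin does), a removal can be detected by storing a fingerprint of the issue's submission IDs and dates alongside the deposit. The function name issue_fingerprint and the data shapes below are hypothetical.

```python
import hashlib


def issue_fingerprint(issue_modified, submissions):
    """Hash the issue date plus every (submission id, last modified) pair.

    `submissions` maps a submission id to its last modified date string.
    Removing a submission changes the hash even if no date changed.
    """
    parts = [issue_modified]
    parts += [f"{sid}:{submissions[sid]}" for sid in sorted(submissions)]
    return hashlib.sha256("|".join(parts).encode("utf-8")).hexdigest()


before = issue_fingerprint("2024-01-01", {10: "2024-01-01", 11: "2024-01-02"})
after = issue_fingerprint("2024-01-01", {10: "2024-01-01"})
print(before != after)
```

The deposit would be resubmitted whenever the stored fingerprint differs from the freshly computed one.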

Decrease the disk space consumption

At this moment, the generated deposit is kept on the user's installation until the preservation is completed.
The plugin must release the local files as soon as it gets a confirmation that the deposit has been harvested.

Missing mapping for the processing state: harvest-error

When there's a failure in the harvesting process at the staging server, the plugin shows an "Unknown processing state harvest-error" message.

The error isn't unknown, so the message can be improved to state that perhaps there was a communication error (the staging server was unable to access the journal to retrieve the deposit).
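A minimal Python sketch of the suggested mapping is shown below. It is hypothetical: the name describe_state and the wording of the friendly message are invented for illustration, and only the fallback text mirrors the message quoted in this issue.

```python
# Hypothetical message table; only "harvest-error" comes from this issue.
STATE_MESSAGES = {
    "harvest-error": (
        "The staging server was unable to access the journal to retrieve "
        "the deposit (possible communication error)."
    ),
}


def describe_state(state):
    """Return a helpful message for known states, else the generic fallback."""
    return STATE_MESSAGES.get(state, f"Unknown processing state {state}")


print(describe_state("harvest-error"))
print(describe_state("something-new"))
```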

PHP 8.0.25 complains about non-static method getting called statically

There are errors in the plugin cron jobs that look related to using PHP 8.0. We're on the most recent version of the plugin (2.0.4-2).

PHP Fatal error: Uncaught Error: Non-static method PKPApplication::getRequest() cannot be called statically in /ojs/plugins/generic/pln/classes/DepositPackage.inc.php:133
Stack trace:
#0 /ojs/plugins/generic/pln/classes/DepositPackage.inc.php(556): DepositPackage->generateAtomDocument()
#1 /ojs/plugins/generic/pln/classes/tasks/Depositor.inc.php(225): DepositPackage->packageDeposit()
#2 /ojs/plugins/generic/pln/classes/tasks/Depositor.inc.php(127): Depositor->_processNeedPackaging(Object(Journal))
#3 /ojs/lib/pkp/classes/scheduledTask/ScheduledTask.inc.php(146): Depositor->executeActions()
#4 /ojs/lib/pkp/plugins/generic/acron/PKPAcronPlugin.inc.php(258): ScheduledTask->execute()
#5 [internal function]: PKPAcronPlugin->shutdownFunction()
#6 {main}
 thrown in /ojs/plugins/generic/pln/classes/DepositPackage.inc.php on line 133

Fatal error when running under MariaDB

The error happens at the last step, where the plugin does some cleanup (removing non-existent issues/deposits), which should no longer be needed after foreign keys were added to the entities in the 3.4+ releases.

Source: https://forum.pkp.sfu.ca/t/pkp-pn-impossible-to-deposit-most-of-our-issues/88989/3

Fatal error:  Uncaught PDOException: SQLSTATE[42000]: Syntax error or access violation: 1064 You have an error in your SQL syntax; check the manual that corresponds to your MariaDB server version for the right syntax to use near 'as `do` where `do`.`journal_id` not in (select `j`.`journal_id` from `journal...' at line 1 in /var/www/lib/pkp/lib/vendor/laravel/framework/src/Illuminate/Database/Connection.php:566
Stack trace:
#0 /var/www/lib/pkp/lib/vendor/laravel/framework/src/Illuminate/Database/Connection.php(566): PDO->prepare()
#1 /var/www/lib/pkp/lib/vendor/laravel/framework/src/Illuminate/Database/Connection.php(753): Illuminate\\Database\\Connection->Illuminate\\Database\\{closure}()
#2 /var/www/lib/pkp/lib/vendor/laravel/framework/src/Illuminate/Database/Connection.php(720): Illuminate\\Database\\Connection->runQueryCallback()
#3 /var/www/lib/pkp/lib/vendor/lar...'

Missing notifications/invalid call to Plugin::manage()

Even though this isn't going to be called, the call is using invalid variables/arguments.

return parent::manage($verb, $args, $message, $messageParams);

I guess this was supposed to fire a notification, so as part of this task, the notification might be implemented properly.

At this moment, the plugin notifications are unhelpful/annoying, due to a bug where the text content is absent.

Source:
https://forum.pkp.sfu.ca/t/null-notifications-what-are-those-supposed-to-be/64832

Investigate if the plugin is responsible for empty issues

Recent deposits received by the staging server, using the latest plugin release (2.0.4-2) and different OJS versions, ended up uploading an empty XML issue (basically it just had the <?xml version="1.0"?>).

OJS versions that appeared in the list: 3.3.0.3, 3.3.0.5, 3.3.0.6, 3.3.0.7, 3.3.0.8, 3.3.0.9, 3.3.0.10
There are ~800 deposits with this issue (around 10% of the total of deposits from those OJS versions), the first arrived on 2021-07-29 (which matches the latest release date of the plugin).

It might be helpful to check if there's a journal which we have access to in the list to simulate the issue.

Extra reference:
https://forum.pkp.sfu.ca/t/pkp-pn-plugin-error-posible-solution/71944/11

Open discussion about deposit limits

  • The current deposit limit is 1GB.
  • We need to check the internal reasoning for the limit. Is it an anti-abuse value? An old file system limit? An external limitation (e.g. LOCKSS)? ...
  • If LOCKSS is unable to handle large files, should we handle this condition (e.g. break the package in pieces)?

Also important: deposits larger than the limit shouldn't be sent.

Improve the status interface

At this moment, it's cumbersome to understand whether and which deposits succeeded or failed: there's no filtering, sorting, or statistics.

Add resilience to the deposit task

At this moment, any failure on the process leads to an interruption of the task.

We should skip the bad issues and export whatever we can.
Given that skipping might fail under fatal errors, a simpler workaround would be to set a "processed date" and sort the pending deposits by this date.

Adding a new database field is a blocker for the stable branch, so the export_deposit_error field might be reused for the same purpose.
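The "processed date" workaround described above can be sketched in Python as follows. This is an assumption-laden illustration, not the plugin's code: the names processing_order and run_once, the dictionary keys, and the export callback are all hypothetical.

```python
from datetime import datetime


def processing_order(deposits):
    """Never-processed deposits first, then the least recently processed."""
    return sorted(
        deposits,
        key=lambda d: (d["processed_at"] is not None, d["processed_at"] or datetime.min),
    )


def run_once(deposits, export, now):
    """Try every pending deposit, skipping the ones that fail."""
    for deposit in processing_order(deposits):
        # Stamp before exporting: if a fatal error kills the whole task here,
        # this deposit still moves to the back of the queue on the next run.
        deposit["processed_at"] = now
        try:
            export(deposit)
        except Exception as error:
            deposit["error"] = str(error)
```

Stamping the date before the export is the key design choice: a deposit that crashes the process can no longer block the ones queued behind it.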

Test journals can overwrite information in the PKP PN server

Describe the bug
If a user sets up a clone of the journal for testing purposes, using another domain, the behavior of the staging server/plugin seems to be a little permissive/undefined, as it ends up overwriting the backend information with the data from the test domain.
Following the same idea, it is probably also willing to ingest undesired/test content.

To Reproduce
Spin up a clone of the journal in another domain, and once it tries to contact the staging server, it will end up overwriting the information in the administrative panel (e.g. journal's URL).

Notes

  • We can detect URL changes, but it's not up to us to say which one refers to the production journal.
  • In order to ensure a journal administrator has control over the journal, we might send a "beacon" to it from the PKP staging server (similar to the way Google validates your login by asking permission from another device); we just need to ensure this can't be abused.
  • Overwriting the main host must be an explicit action (e.g. "The PKP PN already knows about your journal, but under a different URL. Would you like to update it to use the URL xyz?")
  • The plugin could offer an option to disable itself once the user has flagged that the journal is using a test domain.
  • Given that users might clone a journal instead of creating a new instance from zero, perhaps it's useful to offer an option to also reset the PKP PN GUID
  • Check if it makes sense to have a list of acceptable URLs in the PKP PN backend (probably not)
  • Check if PKP PN backend is too permissive (accepts deposits from any domain), and make it stricter (once we assure the user has been using a newer plugin/protocol version)

What application are you using?
OJS 3.3

Additional information
https://forum.pkp.sfu.ca/t/problem-the-pkp-pln-does-not-know-about-this-journal-yet/72678/17

Database is missing a field

Users that had the plugin before the introduction of a new field ended up with a broken plugin after the upgrade.

Source: https://forum.pkp.sfu.ca/t/pkp-pn-behaves-strange-in-version-3-1-2-4/67420/20

Requirements

  • Make a list of all added/removed fields across the plugin releases, and ensure the database is updated (there was just one case, the field was also moved from one table to another)
  • As there's no upgrade/migration for plugins, a possible solution to avoid re-checking if the database structure is healthy is to create a flag/lock file.
  • Backport the fix to the broken branches (anything after that commit)

Drop "PublishedArticle" from the getObjectType()

According to comments in the code, it's a "Legacy (OJS pre-3.2)", therefore a migration can be written to get rid of this value in the database and its related handling in the code.

As part of this task, foreign keys might be added. The cascade rule for the delete should probably be "set null", to avoid blocking removals while also allowing the plugin to do its cleanup.

Display the GUID instead of the local ID

The UI displays the local deposit ID, which isn't useful for us or for the user.

Given the table has a GUID, we can discard the auto-increment ID and display it, which will be useful for locating a specific deposit at the staging server.

Drop support for depositing individual submissions

Besides depositing full issues, the plugin also supports depositing individual submissions, which depends on the value of the object_type setting.

Problem:

  • The setting isn't configurable, and its default value is to deposit issues. Therefore, it's basically alive just to support a legacy workflow, which will end up not being well tested and lead to zombie code.
  • The current implementation attempts to group submissions in groups of 20 for each deposit. If the threshold isn't met, the deposit will never be created.

Notes:

  • At the Preservation Network there are currently 571 deposits out of 137790 (0.4%), that perhaps are using this setting (they have neither an issue nor a volume number).
  • Dropping this code will also allow merging the database entities of the plugin into a single one.
  • This task should also cover the #63

Enable external access to the plugin URLs for private journals

Describe the problem you would like to solve
When a journal is protected by login or non-public (e.g. perhaps preparing to retire), the staging server is unable to request the URL gateway/plugin/PLNGatewayPlugin

Describe the solution you'd like
Allow any custom URL of the plugin to pass through.

Who is asking for this feature?
Forum: https://forum.pkp.sfu.ca/t/ojs-3-3-0-10-pkp-pn-bad-request-error/73058/8

Review/refactoring meta issue

  • Review user issues in the forum
  • Review the "Refresh" button: perhaps it can happen automatically
  • Replace serialize/unserialize by JSON
  • Remove deprecated code (Config::getVar('i18n', 'client_charset'), AppLocale, etc)
  • Attempt to use the newer jobs module and get rid of Acron/TaskScheduler checks
  • Add auto-formatting for the code
  • include_once('Archive/Tar.php') not needed (#25)
  • Use namespaces
  • Use PHP 8 features
  • Replace the binary flags by fields: it's complex to view/deal with, better to get it replaced by something simpler (e.g. state machine)
  • Remove non-visited code branches
  • Attempt to use events to detect modifications on the issues/submissions (useful only if the code which looks for modifications is heavy)
  • Review states by checking the ones available in pkppln and locks-o-matic: (#20 + #18)
  • Remove non-used locale keys
  • Update README with better instructions (how-to images, troubleshooting, etc) and link/reuse the content inside the plugin
  • Rename PLN to PN (renaming the folder might be problematic)
  • The forum often directs users to reset deposits. Sometimes that doesn't make sense, so this feature should be better controlled.

OJS User - Author - "Country" field

OJS 3.3.0.10
When creating a user who is "Author" (which is unlike all other user roles), the "Country" field is mandatory.
When it is empty, we get the message: "This field is required."
However, in rare cases where a journal or author is not interested in using this field, it should be left blank.
How can we fix this problem?

Plugin installed but not visible

Dear all,

I have a problem with the plugin. Our OJS 3.1.2 says the plugin (v2.0.1.1) is installed but it is not visible in the Installed plugins tab at all. I also see an Archiving tab where when I click enable PKP PN, it refreshes and nothing happens.

Can you assist to resolve the issue?

Plugin is exporting failed exports

The plugin is currently proceeding with the export, even when a known error happened at the "packaging" process.

Simulation

  • Publish an issue
  • Physically remove one submission file (or use any other technique to break the Native XML plugin export)
  • Run the scheduled task for the PKP PN
  • The issue will be exported

Add tests

Given the number of issues and the noise an error in the plugin might create, it's better to ensure we're testing the plugin's basic functionality.

Requirements:

  • Mock API of the staging server
  • Ensure bad states coming from the API are also covered
