
contrib's Introduction

iRODS

The Integrated Rule-Oriented Data System (iRODS) is open source data management software used by research, commercial, and governmental organizations worldwide.

iRODS is released as a production-level distribution aimed at deployment in mission critical environments. It virtualizes data storage resources, so users can take control of their data, regardless of where and on what device the data is stored.

The development infrastructure supports exhaustive testing on supported platforms; plugin support for microservices, storage resources, authentication mechanisms, network protocols, rule engines, new API endpoints, and databases; and extensive documentation, training, and support services.

Core Competencies

  • iRODS implements data virtualization, allowing access to distributed storage assets under a unified namespace, and freeing organizations from getting locked in to single-vendor storage solutions.
  • iRODS enables data discovery using a metadata catalog that describes every data object, collection, and every storage resource in the iRODS Zone.
  • iRODS automates data workflows, with a rule engine framework that permits any action to be initiated by any trigger on any server or client in the Zone.
  • iRODS enables secure collaboration, so users only need to log in to their home Zone to access data hosted on a remote Zone.

History

iRODS has a 25+ year history of funded projects.

Funders have included DARPA, NSF, DOD, DOE, LC, NARA, NASA, NOAA, USPTO, and LLNL.

https://irods.org/history

License

iRODS is released under a 3-clause BSD License.

Reporting Security Vulnerabilities

See SECURITY.md for details.

Links to elsewhere...

contrib's People

Contributors

adetorcy, beppodb, donpellegrino, justinkylejames, kellerb, matthewturk, pansanel, paulvanschayck, swooshycueb, tempoz, trel


contrib's Issues

update 3.x microservices to work with 4.2+

Docker image directory structure for volume mounting

Hi there,

I was wondering if there is a suggested way of mounting docker volume(s) to the iRODS docker container (iCat + iDrop) to keep the data persistent.

I checked the dockerfile, but couldn't figure out the directory structure of the docker image. I'm hoping there is a suggestion or some documentation for this.

Also, is there an up-to-date docker image we can use? From this dockerfile, it seems to be at version 4.1.3, and this image I found on DockerHub is at version 4.0.3. I'm not sure whether the directory structure changes across versions.

Rule registry

Provide support for a registry of rules. The metadata is available in the iCAT catalog for RULE_BASE_NAME, which defines the rule set, and for RULE_NAME, RULE_ID, RULE_BODY, RULE_RECOVERY, RULE_INPUT_PARAMS, RULE_OUTPUT_PARAMS, RULE_DESCR_1, RULE_DESCR_2, RULE_EVENT, RULE_DOLLAR_VARS, RULE_ICAT_ELEMENTS, RULE_VERSION.

Micro-services to manipulate the registry are:

  • msiAdmReadRulesFromFileIntoStruct
  • msiAdmInsertRulesFromStructIntoDB
  • msiAdmRetrieveRulesFromDBIntoStruct
  • msiAdmWriteRulesFromStructIntoFile

msisync_to_archive does not work for normal rods user anymore.

We did a put of a file as a normal rods user:

09:35 irodstest2.storage.sara.nl:/home/robertv
robertv$ iput -R eudat rpmbuild/RPMS/noarch/irods-eudat-b2safe-dpm-client-1.0-0.noarch.rpm tokkie.rpm

09:35 irodstest2.storage.sara.nl:/home/robertv
robertv$ ils -l
/bob/home/robertv:
 ...
  robertv           0 eudat;eudatCache        16372 2015-12-03.09:35 & tokkie.rpm

There is a rule fired which replicates it to archive:

09:35 irodstest2.storage.sara.nl:/home/robertv
robertv$ iqstat -a
id     name
10437
                        #writeLine("serverLog","filePath: $filePath");
                        *CompoundRescName="eudat"
                        *CacheRescName   ="*CompoundRescName;eudatCache";
                        *ArchiveRescName ="*CompoundRescName;eudatPnfs";
                        writeLine("serverLog","Execute command to replicate (in resource *CompoundRescName) $objPath ($filePath) to *ArchiveRescName, because of put");
                        msisync_to_archive("*CacheRescName", $filePath, $objPath );
                |

This now fails if it is a normal rodsuser and NOT the rodsadmin.

Dec  3 09:37:12 pid:17810 NOTICE: writeLine: inString = Execute command to replicate (in resource eudat) /bob/home/robertv/tokkie.rpm (/var/lib/eudatCache/home/robertv/tokkie.rpm) to eudat;eudatPnfs, because of put
XXXX - last: eudatCache
XXXX - prev: eudat
XXXX - resolve
XXXX - success
XXXX - auto_repl :: off
Dec  3 09:37:12 pid:17810 NOTICE: rsDataObjRepl - Failed to replicate data object.
msisync_to_archive - fileModified failed [[-]   iRODS/server/drivers/src/fileDriver.cpp:723:fileModified :  status [CAT_INSUFFICIENT_PRIVILEGE_LEVEL]  errno [] -- message [fileModified - Failed to call modified interface.]
        [-]     libcompound.cpp:496:repl_object :  status [CAT_INSUFFICIENT_PRIVILEGE_LEVEL]  errno [] -- message [Failed to replicate the data object [/bob/home/robertv/tokkie.rpm] for operation [sync_object]]

] - [-830000]
Dec  3 09:37:12 pid:17810 ERROR: executeRuleAction Failed for msisync_to_archive status = -830000 CAT_INSUFFICIENT_PRIVILEGE_LEVEL
Dec  3 09:37:12 pid:17810 NOTICE: executeRuleBody: Microservice or Action msisync_to_archive Failed with status -830000
Dec  3 09:37:12 pid:17810 DEBUG: execMicroService3: error when executing microservice
line 6, col 3
                        msisync_to_archive("*CacheRescName", $filePath, $objPath );
                        ^

Dec  3 09:37:12 pid:17810 NOTICE: postProcRunRuleExec: exec of freq: 1h DOUBLE UNTIL SUCCESS OR 6 TIMES
Dec  3 09:37:12 pid:17810 NOTICE: modExeInfoForRepeat: rulId=10437,opStatus=-830000,nextRepeatStatus=4
Dec  3 09:37:12 pid:17810 NOTICE: Rule id 10437 set to run again at 1449135432 (frequency 2h DOUBLE UNTIL SUCCESS OR 5 TIMES. ORIGINAL TIMES=6 seconds)
Dec  3 09:37:12 pid:17810 NOTICE: Agent exiting with status = -830000

This works for the rodsadmin. Can this be fixed again?

In a previous version this worked.

Greetings,

Robert Verkerk

Encryption micro-service

Given an encryption key and a choice of encryption algorithm, a micro-service is needed to apply the encryption algorithm to a specified file, and update the file size.

new microservice to rebalance a single data object

From @rwmoore:

Current resources:

demoResc
LTLResc:passthru
└── LTLRepl:replication
    ├── LTLRenci:unixfilesystem
    └── LTLSils:unixfilesystem

I put a file on demoResc, and tried to use msiDataObjRepl() to replicate the file. Using

  • "destRescName=LTLRenci" gives an error "DIRECT_CHILD_ACCESS"
  • "destRescName=LTLRepl" gives an error "DIRECT_CHILD_ACCESS"
  • "destRescName=LTLResc" works. Two files are created, one on LTLRenci and one on LTLSils.

The problem occurs when a file is created on LTLSils through execution of a script for encryption. I want to replicate the encrypted file to LTLRenci, but cannot specify the destination resource.

What is the correct way to replicate a file that already exists on one of the resources specified by a replication node? Previously I invoked a reBalance operation on the replication node. But I only needed to reBalance a single file.

irods_audit_elk_stack container RabbitMQ test user missing

The irods_audit_elk_stack image has a RabbitMQ user named test added to it when the image is built. The Dockerfile uses rabbitmqctl to add the user to a temporarily running RabbitMQ broker. Unfortunately, this user doesn't exist (isn't accessible?) in a container instantiated from this image.

RabbitMQ persists its data, like users, keyed to its server's node name. By default, the node name is rabbitmq@$(hostname). When the test user is created and persisted at build time, the host name is something random, and very likely different from the host name of a container launched from the image. As a result, the RabbitMQ broker running in the container does not know about the test user that was created when the image was built.

Fortunately, the node name is configurable using the NODENAME parameter in the file /etc/rabbitmq/rabbitmq-env.conf. If NODENAME is set in this file with the host name localhost, e.g., NODENAME=rabbitmq@localhost, before rabbitmq-server is started during the image build, the node name will be rabbitmq@localhost when the test user is created. When the container is started, the node name will be the same, and the broker will know about the test user.

Here's a modified version of the RabbitMQ configuration command that sets the node name to localhost.

# Install RabbitMQ plugins and create administrator account
RUN rabbitmq-plugins enable \
        rabbitmq_amqp1_0 \
        rabbitmq_management && \
    echo 'NODENAME=rabbitmq@localhost' > /etc/rabbitmq/rabbitmq-env.conf && \
    chmod 755 /etc/rabbitmq/rabbitmq-env.conf && \
    /etc/init.d/rabbitmq-server start && \
    rabbitmqctl add_user test test && \
    rabbitmqctl set_user_tags test administrator && \
    rabbitmqctl set_permissions -p / test ".*" ".*" ".*" && \
    /etc/init.d/rabbitmq-server stop
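
The mismatch described in this issue can be illustrated with a small shell sketch. It assumes, following the issue text rather than RabbitMQ's own documentation, that the node name is NODENAME when set and rabbitmq@<hostname> otherwise. The function node_name and both host names are hypothetical.

```shell
# Sketch only: models the node-name rule as stated in the issue.
# $1 = host name, $2 = optional pinned NODENAME value.
node_name() {
    host="$1"
    pinned="${2:-}"
    if [ -n "$pinned" ]; then
        echo "$pinned"
    else
        echo "rabbitmq@${host}"
    fi
}

build_host="0a1b2c3d4e5f"   # hypothetical host name during image build
run_host="f5e4d3c2b1a0"     # hypothetical host name of a started container

# Without NODENAME the two names differ, so data keyed to the
# build-time node name is not found at run time.
echo "build: $(node_name "$build_host")  run: $(node_name "$run_host")"

# With NODENAME pinned, both resolve to the same node name.
echo "build: $(node_name "$build_host" rabbitmq@localhost)  run: $(node_name "$run_host" rabbitmq@localhost)"
```

Pinning the name trades a host-specific identity for a stable one, which seems acceptable for a single-broker container like this one.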

ELK stack container: elasticsearch will not start

Elasticsearch will not start in the ELK stack container.

[2024-05-09T18:37:12,243][ERROR][o.e.b.Elasticsearch      ] [irods-elk-node] fatal exception while booting Elasticsearch
org.apache.lucene.store.AlreadyClosedException: Underlying file changed by an external force at 2024-05-09T18:37:12.237667121Z, (lock=NativeFSLock(path=/var/lib/elasticsearch/node.lock,impl=sun.nio.ch.FileLockImpl[0:9223372036854775807 exclusive valid],creationTime=2024-05-09T18:22:16.853023408Z))
        at [email protected]/org.apache.lucene.store.NativeFSLockFactory$NativeFSLock.ensureValid(NativeFSLockFactory.java:179)
        at [email protected]/org.elasticsearch.env.NodeEnvironment.assertEnvIsLocked(NodeEnvironment.java:1285)
        at [email protected]/org.elasticsearch.env.NodeEnvironment.nodeDataPaths(NodeEnvironment.java:1044)
        at [email protected]/org.elasticsearch.env.NodeEnvironment.assertCanWrite(NodeEnvironment.java:1461)
        at [email protected]/org.elasticsearch.env.NodeEnvironment.<init>(NodeEnvironment.java:305)

Micro-service to add token names

The iadmin command supports the addition of token names to a token namespace. An equivalent operation is needed from a micro-service.

iadmin
at tokenNamespace Name [Value1] [Value2] [Value3] (add token)
rt tokenNamespace Name [Value1] (remove token)

User permissions for updating metadata

A way is needed to set an access control that grants users permission to manipulate metadata on a file. This would be a lower permission than write, but higher than read.

msisync_to_archive removal from contrib

Hi,

The function msisync_to_archive is now part of iRODS 4.1.8. Will it be removed from contrib once iRODS 4.1.8 is released? Otherwise we will have a conflict if we install both 4.1.8 and the contrib package. Which one will it use then?

Greetings,

Robert

new api plugin to list operations per plugin type

To automate the documentation of all dynamic PEPs, a list of operations, per plugin type, should be generated and sent to stdout as JSON.

  • walk API table
  • load all resources
  • get array of operations
  • dump microservice table
  • dump all plugins
    • network
    • auth
    • database
    • resource
    • api
    • microservice
    • rule engine (needs introspection operation (list_rules) )
      • not unique-ified
      • boxed by language / instance
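
As a rough illustration of the output this issue asks for, the sketch below prints one JSON object mapping a plugin type to an array of operation names. The function ops_json and the operation names op_a and op_b are placeholders, not taken from iRODS, and the JSON layout is only one possible shape.

```shell
# Sketch only: emit {"<plugin_type>": ["op1", "op2", ...]} on stdout.
# Arguments are used verbatim; no JSON escaping is attempted.
ops_json() {
    type_name="$1"
    shift
    printf '{"%s": [' "$type_name"
    sep=""
    for op in "$@"; do
        printf '%s"%s"' "$sep" "$op"
        sep=", "
    done
    printf ']}\n'
}

# Placeholder operation names for a "resource" plugin type.
ops_json resource op_a op_b
```

A real implementation would presumably run inside the server, where the plugin tables are available, rather than in a shell script.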

Microservice registry

A method is needed to track the set of micro-services that are being used. The iCAT catalog has state variables defined for
MSRVC_ID, MSRVC_NAME, MSRVC_VERSION, MSRVC_LANGUAGE, MSRVC_CREATE_TIME, MSRVC_MODIFY_TIME

Associated micro-services are:
msiAdmReadMSrvcsFromFileIntoStruct
msiAdmInsertMSrvcsFromStructIntoDB
msiAdmRetrieveMSrvcsFromDBIntoStruct
msiAdmWriteMSrvcsFromStructIntoFile

ELK stack container: switch to distro-provided Java runtime

Temurin was chosen as the Java runtime for the elk stack, as Temurin is the successor to AdoptOpenJDK, which tweaked the JVM in ways that were advantageous for our use case. However, with the transition to Eclipse Foundation stewardship, the goals of the project have changed, so Temurin does not actually carry forward these changes to the JVM, and is, in fact, a pretty vanilla distribution of OpenJDK.
Therefore, we may as well be using the distro-provided packages for the Java runtime. It would eliminate the need to use Adoptium's apt repository, which is behind a cloudflare gateway that sometimes gets in the way of package downloads.

hostsname being derived from $HOSTNAME doesn't work well on boot2docker

Boot2docker containers are available through an internal 192.168 network. But $HOSTNAME resolves to the Mac OSX host's name. Container environment variable $hostsname is set to $HOSTNAME, and the .groovy config file for iDrop-web is populated with this variable as a prefix. Thus, attempts to access iDrop Web fail to find the container.

Either update documentation or update how .groovy file is modified.

stop using systemd in elk stack container

With #26, our elk container will be updated to ES 8. Starting with ES 7, elasticsearch, kibana, and logstash have switched from init.d scripts to systemd unit files.
In order to get #26 done in a timely manner, we followed ubi8-init's pattern for running systemd in a docker container.
However, systemd is a bit overkill for our purposes, and using it requires passing --privileged when running the container.

irods_audit_elk_stack Dockerfile not building

Mostly got there...

Current build failure on step 33/37:

47.50 + /etc/init.d/kibana start
47.74  * Starting Kibana Server
137.3  * Kibana Server appears to be running, but healthcheck failed: Kibana responded with HTTP status 503
137.3    ...fail!
------
Dockerfile:161
--------------------
 160 |     COPY kibana/irods_dashboard.ndjson /var/lib/irods-elk/irods_dashboard.ndjson
 161 | >>> RUN ES_JAVA_OPTS="-Xms512m -Xmx512m" /etc/init.d/elasticsearch start && \
 162 | >>>     curl -sLSf -XPUT "http://localhost:9200/irods_audit?pretty=true" \
 163 | >>>         -H 'Content-Type: application/json' \
 164 | >>>         --data @/var/lib/irods-elk/irods_audit.index.json \
 165 | >>>     && \
 166 | >>>     curl -sLSf -X GET "http://localhost:9200/irods_audit/_settings?pretty=true&human=true" && \
 167 | >>>     curl -sLSf -X GET "http://localhost:9200/irods_audit/_mapping?pretty=true&human=true" && \
 168 | >>>     /etc/init.d/kibana start && \
 169 | >>>     curl -sLSf -X POST "http://localhost:5601/api/saved_objects/_import" \
 170 | >>>         -H "kbn-xsrf: true" \
 171 | >>>         --form file=@/var/lib/irods-elk/irods_dashboard.ndjson \
 172 | >>>     && echo && \
 173 | >>>     /etc/init.d/kibana stop && \
 174 | >>>     /etc/init.d/elasticsearch stop
 175 |     SHELL [ "/bin/bash", "-c" ]
--------------------
ERROR: failed to solve: process "/bin/bash -x -c ES_JAVA_OPTS=\"-Xms512m -Xmx512m\" /etc/init.d/elasticsearch start &&     curl -sLSf -XPUT \"http://localhost:9200/irods_audit?pretty=true\"         -H 'Content-Type: application/json'         --data @/var/lib/irods-elk/irods_audit.index.json     &&     curl -sLSf -X GET \"http://localhost:9200/irods_audit/_settings?pretty=true&human=true\" &&     curl -sLSf -X GET \"http://localhost:9200/irods_audit/_mapping?pretty=true&human=true\" &&     /etc/init.d/kibana start &&     curl -sLSf -X POST \"http://localhost:5601/api/saved_objects/_import\"         -H \"kbn-xsrf: true\"         --form file=@/var/lib/irods-elk/irods_dashboard.ndjson     && echo &&     /etc/init.d/kibana stop &&     /etc/init.d/elasticsearch stop" did not complete successfully: exit code: 154

ELK stack container: Investigate using GraalVM CE

Temurin was chosen as the Java runtime for the elk stack, as Temurin is the successor to AdoptOpenJDK, which tweaked the JVM in ways that were advantageous for our use case. However, with the transition to Eclipse Foundation stewardship, the goals of the project have changed, so Temurin does not actually carry forward these changes to the JVM, and is, in fact, a pretty vanilla distribution of OpenJDK.
Therefore, we may as well be using the distro-provided packages for the Java runtime. I've opened #36 to make this change, but given that performance has become a concern for the audit plugin, it still may be worth using a Java runtime more tailored to our needs. After doing a little bit of research, it appears that GraalVM CE may be the way to go.

iRODS major mode for Emacs

It would be nice to have an Emacs major mode that includes at least color syntax highlighting for the iRODS rules language. I have a first draft of this that I can contribute.

ELK stack container: can't find docker run instructions

Hello,

I tried to follow https://github.com/irods/contrib/tree/main/irods_audit_elk_stack and build the container, but unfortunately I'm unable to find the steps for running this docker image.

If I simply run the command below, it starts all the services, but they are exposed only within the container. I need to map the ports to the host.

docker run <image-name>
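
One possible way to publish the container's ports on the host is docker's -p host:container option. The sketch below only builds and prints such a command for review. The port list is an assumption: 5601 (Kibana) and 9200 (Elasticsearch) appear in the project's Dockerfile commands, while 5672 and 15672 are RabbitMQ's default AMQP and management ports. The helper elk_run_cmd and the image tag are hypothetical.

```shell
# Sketch only: print a `docker run` command that maps each given
# container port to the same port number on the host.
# $1 = image name, remaining arguments = ports to publish.
elk_run_cmd() {
    image="$1"
    shift
    cmd="docker run -d"
    for port in "$@"; do
        cmd="${cmd} -p ${port}:${port}"
    done
    echo "${cmd} ${image}"
}

# Hypothetical image tag; substitute the tag used at build time.
elk_run_cmd irods_audit_elk_stack 5601 9200 5672 15672
```

If the image in use runs systemd, additional options such as --privileged may be needed, so the printed command is a starting point rather than a complete invocation.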

I'm also looking for the steps to configure the audit plugin in /etc/irods/server_config.json. I installed the package "irods-rule-engine-plugin-audit-amqp" on the iRODS server, which is running 4.3.1.

Or, for the iRODS audit configuration, do I need to follow the steps from this article: https://slides.com/irods/ugm2018-getting-started ?

Please let me know.

Thanks
Jay

Investigate swapping JDK for JRE in elk stack container

With #26, we are now using Temurin's JDK instead of the JDK/JRE provided by the distribution or the JDK/JRE bundled with elasticsearch.

The decision not to use elasticsearch's bundled JDK/JRE was made for two reasons:

  • To de-bloat the container image. Having multiple JDK/JRE installations uses a lot of space.
  • To get everything using the same JDK/JRE

Temurin was chosen over the distro-provided JDK/JRE for a few reasons:

  • The Hotspot AdoptOpenJDK flavor of JVM handles memory pressure very well.
  • The AdoptOpenJDK flavors of JVM work well in containers.
  • I've just generally had better experience with AdoptOpenJDK flavors of JVM than any others.

Given that we are no longer installing plugins in logstash, it is likely that we do not need a full JDK.
However, after the AdoptOpenJDK working group was absorbed into the Eclipse foundation and became Adoptium, JRE-only packages (and OpenJ9 packages) were no longer provided in the apt repositories.
Eclipse provides a focal-based docker image containing JRE-only Temurin 17 [Dockerfile], but it is not set up to work properly with Ubuntu's java-common system so it cannot be used.
Fortunately, Adoptium has a couple of blog posts that might help us out:
