Code Monkey home page Code Monkey logo

Comments (25)

Craigacp avatar Craigacp commented on May 16, 2024 1

I've updated PR #197 with the fix discussed in opensearch-project/OpenSearch#1649.

from tribuo.

Craigacp avatar Craigacp commented on May 16, 2024 1

The fix has ben merged into main and we've released v4.1.1 today. It'll take several days to get it on Maven Central, there's an internal process we need to go through as we don't have automatic deployments setup yet.

from tribuo.

Craigacp avatar Craigacp commented on May 16, 2024 1

4.2.0 has been tagged, and both 4.1.1 and 4.2.0 are now available in Maven Central. The search indices are slowly updating, but the jars themselves are available.

from tribuo.

Craigacp avatar Craigacp commented on May 16, 2024

Is there a specific security.policy file that triggers this or is it any installation of a SecurityManager?

from tribuo.

Craigacp avatar Craigacp commented on May 16, 2024

Also can you paste the stack trace you get? Tribuo doesn't use the default ForkJoinPool anywhere so I'm surprised we're hitting this issue.

from tribuo.

Craigacp avatar Craigacp commented on May 16, 2024

I ran a quick check and ended up with this stack trace, which is what I assume you're getting.

java.security.AccessControlException: access denied ("java.lang.RuntimePermission" "modifyThread")

	at java.base/java.security.AccessControlContext.checkPermission(AccessControlContext.java:472)
	at java.base/java.security.AccessController.checkPermission(AccessController.java:897)
	at java.base/java.lang.SecurityManager.checkPermission(SecurityManager.java:322)
	at java.base/java.util.concurrent.ForkJoinPool.checkPermission(ForkJoinPool.java:667)
	at java.base/java.util.concurrent.ForkJoinPool.<init>(ForkJoinPool.java:2320)
	at java.base/java.util.concurrent.ForkJoinPool.<init>(ForkJoinPool.java:2165)
	at org.tribuo.clustering.kmeans.KMeansTrainer.train(KMeansTrainer.java:197)
	at org.tribuo.clustering.kmeans.KMeansTrainer.train(KMeansTrainer.java:298)

It's documented that the ForkJoinPool needs this permission in the javadoc for it's constructor (https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/util/concurrent/ForkJoinPool.html#%3Cinit%3E(int)) so this isn't unexpected behaviour. The solution is to modify the security policy to grant Tribuo's jar the following permission:

grant codebase "file:/path/to/tribuo-clustering-kmeans-4.1.0.jar" {
        permission java.lang.RuntimePermission "modifyThread";
};

We'll update the docs to note that this permission is necessary when running with a security manager.

For single threaded operation we could potentially modify the code not to use a ForkJoinPool, but it would require extensive refactoring to migrate multithreaded use over to a ThreadPoolExecutor and the performance may suffer.

Some of the modes of KNNModel also use a ForkJoinPool, we'll add a note to the docs there as well. There are modes of that model which use a regular executor service so those should be unaffected.

from tribuo.

ylwu-amzn avatar ylwu-amzn commented on May 16, 2024

@Craigacp Thanks a lot! BTW, are you running on Java 8?

from tribuo.

Craigacp avatar Craigacp commented on May 16, 2024

I think there are deployments on 8 and 11, and we run the CI on 8, 11 and 17. Not sure if any deployments are using a SecurityManager though.

Did granting that privilege fix your issue?

from tribuo.

ylwu-amzn avatar ylwu-amzn commented on May 16, 2024

No, this is the stack trace

java.security.AccessControlException: access denied ("java.lang.RuntimePermission" "modifyThreadGroup")
        at java.security.AccessControlContext.checkPermission(AccessControlContext.java:472) ~[?:?]
        at java.security.AccessController.checkPermission(AccessController.java:1036) ~[?:?]
        at java.lang.SecurityManager.checkPermission(SecurityManager.java:408) ~[?:?]
        at org.opensearch.secure_sm.SecureSM.checkThreadGroupAccess(SecureSM.java:205) ~[opensearch-secure-sm-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
        at org.opensearch.secure_sm.SecureSM.checkAccess(SecureSM.java:160) ~[opensearch-secure-sm-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
        at java.lang.ThreadGroup.checkAccess(ThreadGroup.java:311) ~[?:?]
        at java.lang.Thread.<init>(Thread.java:421) ~[?:?]
        at java.lang.Thread.<init>(Thread.java:707) ~[?:?]
        at java.lang.Thread.<init>(Thread.java:540) ~[?:?]
        at java.util.concurrent.ForkJoinWorkerThread.<init>(ForkJoinWorkerThread.java:100) ~[?:?]
        at java.util.concurrent.ForkJoinPool$DefaultForkJoinWorkerThreadFactory$1.run(ForkJoinPool.java:720) ~[?:?]
        at java.util.concurrent.ForkJoinPool$DefaultForkJoinWorkerThreadFactory$1.run(ForkJoinPool.java:717) ~[?:?]
        at java.security.AccessController.doPrivileged(AccessController.java:391) ~[?:?]
        at java.util.concurrent.ForkJoinPool$DefaultForkJoinWorkerThreadFactory.newThread(ForkJoinPool.java:716) ~[?:?]
        at java.util.concurrent.ForkJoinPool.createWorker(ForkJoinPool.java:1329) ~[?:?]
        at java.util.concurrent.ForkJoinPool.tryAddWorker(ForkJoinPool.java:1353) ~[?:?]
        at java.util.concurrent.ForkJoinPool.signalWork(ForkJoinPool.java:1478) ~[?:?]
        at java.util.concurrent.ForkJoinPool.externalPush(ForkJoinPool.java:1912) ~[?:?]
        at java.util.concurrent.ForkJoinPool.externalSubmit(ForkJoinPool.java:1930) ~[?:?]
        at java.util.concurrent.ForkJoinPool.submit(ForkJoinPool.java:2506) ~[?:?]
        at org.tribuo.clustering.kmeans.KMeansTrainer.train(KMeansTrainer.java:189) ~[tribuo-clustering-kmeans-4.0.2.jar:?]
        at org.tribuo.clustering.kmeans.KMeansTrainer.train(KMeansTrainer.java:253) ~[tribuo-clustering-kmeans-4.0.2.jar:?]

But looks like caused by org.opensearch.secure_sm.SecureSM, will take a look at that part

from tribuo.

Craigacp avatar Craigacp commented on May 16, 2024

Do you have the part of the stack trace which has a Tribuo class in it?

from tribuo.

Craigacp avatar Craigacp commented on May 16, 2024

I looked into the SecureSM class a little bit, and that notes that it has tighter thread group restrictions than the default SecurityManager - https://github.com/opensearch-project/OpenSearch/blob/main/libs/secure-sm/src/main/java/org/opensearch/secure_sm/SecureSM.java#L62. That's presumably why I didn't see this when I was testing before as I didn't realise ElasticSearch/OpenSearch had it's own implementation of SecurityManager.

Try giving the Tribuo jar modifyThreadGroup too:

grant codebase "file:/path/to/tribuo-clustering-kmeans-4.1.0.jar" {
        permission java.lang.RuntimePermission "modifyThread";
        permission java.lang.RuntimePermission "modifyThreadGroup";
};

from tribuo.

ylwu-amzn avatar ylwu-amzn commented on May 16, 2024

Do you have the part of the stack trace which has a Tribuo class in it?

Updated the original comment by adding the Tribuo class part. We are using
compile group: 'org.tribuo', name: 'tribuo-clustering-kmeans', version: '4.0.2'

Try giving the Tribuo jar modifyThreadGroup too:

I tried, not work. Like @spbjss said "the security policies doesn’t get propagated to ForkJoinPool worker threads if a SecurityManager is present"

from tribuo.

Craigacp avatar Craigacp commented on May 16, 2024

Well, I've managed to get it to work using Tribuo 4.0.2 and the built in Java SecurityManager. The policy needs to be granted to all the code in the stack trace as the FJP creation isn't inside AccessController.doPrivileged. I successfully ran one of the unit tests using the following policy (saved as tribuo.policy):

grant codeBase "file:/path/to/tribuo/test/classes" {
        permission java.lang.RuntimePermission "modifyThread";
        permission java.lang.RuntimePermission "modifyThreadGroup";
};

grant codeBase "file:/path/to/tribuo/Clustering/KMeans/target/tribuo-clustering-kmeans-4.0.2-jar-with-dependencies.jar" {
        permission java.lang.RuntimePermission "modifyThread";
        permission java.lang.RuntimePermission "modifyThreadGroup";
        permission java.lang.RuntimePermission "accessDeclaredMembers";
        permission java.lang.reflect.ReflectPermission "suppressAccessChecks";
        permission java.util.logging.LoggingPermission "control";
        permission java.io.FilePermission "<<ALL FILES>>", "read";
        permission java.util.PropertyPermission "*", "read,write";
};

The extra permissions are used to allow our reflective provenance system to work, and could be minimized further depending on which file locations are used. I used the following command line:

tribuo$ java -cp Clustering/KMeans/target/tribuo-clustering-kmeans-4.0.2-jar-with-dependencies.jar:. -Djava.security.manager -Djava.security.policy=tribuo.policy TestKMeans

Whenever I try to activate SecureSM on JVM startup in Java 11 it causes the JVM to crash. Can you provide me a JVM startup string which successfully enables the SecureSM?

I was using the following command line:

tribuo$ java -cp opensearch-secure-sm-1.2.0.jar:Clustering/KMeans/target/tribuo-clustering-kmeans-4.0.2-jar-with-dependencies.jar:. -Djava.security.manager=org.opensearch.secure_sm.SecureSM -Djava.security.policy=tribuo.policy TestKMeans

from tribuo.

Craigacp avatar Craigacp commented on May 16, 2024

Looks like adding System.setSecurityManager(new org.opensearch.secure_sm.SecureSM()); to the start of my program correctly installs SecureSM and then I can replicate the issue. This seems to be a specific interaction between SecureSM and ForkJoinPool, which isn't present when using the default Java SecurityManager, so it's really a problem for the developers of SecureSM, or OpenJDK's core-libs team.

Is single threaded execution sufficient when running under SecureSM? We could modify KMeansTrainer so it doesn't use a ForkJoinPool for a single thread, which should allow it to run, but as I mentioned above rewriting it to use a different threading mechanism for multi-threaded execution would be a breaking change and take some time. Given SecurityManager is deprecated for removal in Java 17, I'm not sure rearchitecting Tribuo to allow running with a non-standard SecurityManager like SecureSM is worthwhile.

from tribuo.

ylwu-amzn avatar ylwu-amzn commented on May 16, 2024

Thanks @Craigacp.
Yes, SecureSM has stricter check than default SecurityManager. Kmeans will break as we are using SecureSM.

I think it's reasonable to remove ForkJoinPool for a single thread. But we may need to look for other ML lib if single-thread Kmeans doesn't perform well. So it will be good if you can replace ForkJoinPool for multi-thread as well, so we can save effort to integrate another ML lib.

from tribuo.

Craigacp avatar Craigacp commented on May 16, 2024

Ok, I've got a branch which doesn't use the FJP in single threaded computation - 4c62478. Do you know if the data you are clustering is sparse or not? If it's actually dense, then there are further performance optimisations which will let Tribuo use dense vectors rather than sparse ones which we can apply. We did this for linear sgd models in 4.1, but haven't got around to applying that optimisation everywhere yet.

We can probably backport the single threaded fix into 4.1.1 which will arrive around the same time 4.2.0 lands, but it's not going to be backported to 4.0.2.

from tribuo.

ylwu-amzn avatar ylwu-amzn commented on May 16, 2024

Thanks a lot! I think that's ok to not backport to 4.0.2, we can use 4.1.1, or upgrade to 4.2.0 if it works.

from tribuo.

ylwu-amzn avatar ylwu-amzn commented on May 16, 2024

@Craigacp , thanks for adding custom thread factory. Is this still true for your release plan?

We can probably backport the single threaded fix into 4.1.1 which will arrive around the same time 4.2.0 lands, but it's not going to be backported to 4.0.2.

from tribuo.

Craigacp avatar Craigacp commented on May 16, 2024

Yes, that's still the plan. I also implemented the dense vector optimization for KMeansTrainer, but I'm undecided if that should go into 4.2.0 as it makes a backwards incompatible change to a protected method signature. I've not seen anyone subclass KMeansTrainer, but that doesn't mean it hasn't happened.

from tribuo.

ylwu-amzn avatar ylwu-amzn commented on May 16, 2024

Got it. If the BWC is a problem, how about split the PR so you can merge custom thread factory fix first? You can have some time to rethink how to support BWC and send a separate PR. One way I can think is adding a new method and deprecate current one.

from tribuo.

Craigacp avatar Craigacp commented on May 16, 2024

Sorry, I was unclear, the performance optimization is in a separate branch. We will merge the thread factory fix first and that will definitely be in 4.2.0 and 4.1.1.

from tribuo.

ylwu-amzn avatar ylwu-amzn commented on May 16, 2024

Oh, I misunderstood, thought the optimization is in the same PR. That sounds good. The performance optimization is not blocking issue for us now. Will be good if we can have that in next release 4.2.0, but respect your plan.

from tribuo.

ylwu-amzn avatar ylwu-amzn commented on May 16, 2024

@Craigacp seems v4.1.1 still not ready, I checked this https://mvnrepository.com/artifact/org.tribuo/tribuo-clustering-kmeans

from tribuo.

Craigacp avatar Craigacp commented on May 16, 2024

Yeah, sorry it got held up a little bit by some internal process. Still working through the 4.2.0 release process as well, which should hopefully be tagged next week, though the binaries for that will take longer as everyone is on vacation between Christmas & New Year.

from tribuo.

ylwu-amzn avatar ylwu-amzn commented on May 16, 2024

Sure, thanks a lot. No worry, not so urgent. Happy holiday!

from tribuo.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.