Code Monkey home page Code Monkey logo

Comments (9)

hzongaro avatar hzongaro commented on July 20, 2024 1

It is unknown why a node of this form remained unoptimized in the trace file in Issue #19525..

@knn-k, I wonder whether the shift was introduced or the second operand was folded to zero after the last pass of Tree Simplifier or Value Propagation.

@dylanjtuttle, may I ask you to take a look at the opportunity illustrated in Example 3 that's been missed by the optimizer? This is something that the various handlers for shift operations in OMRSimplifierHandlers.cpp and VPHandlers.cpp need to be enhanced to catch.

from openj9.

knn-k avatar knn-k commented on July 20, 2024

Example 1 - Shift a value by 0 bits: (Seen only in a traceCG file in the next comment)

------------------------------
 n73616n  (  0)  treetop                                                                              [       0x12b14d250] bci=[191,12,255] rc=0 vc=8 vn=- li=3696 udi=- nc=1
 n73617n  (  2)    ishl                                                                               [       0x12b14d2a0] bci=[191,11,255] rc=2 vc=8 vn=- li=3696 udi=- nc=2
 n73591n  (  2)      ==>iadd (in GPR_4663)
 n73736n  (  3)      ==>iconst 0 (X==0 X>=0 X<=0 )
------------------------------

Example 2 - Shift a constant value by 0 bits: (Seen only in a traceCG file in the next comment)

------------------------------
 n73622n  (  0)  iRegStore x8  (privatizedInlinerArg )                                                [       0x12b14d430] bci=[191,28,257] rc=0 vc=8 vn=- li=3696 udi=- nc=1 flg=0x2000
 n73623n  (  5)    iadd                                                                               [       0x12b14d480] bci=[191,28,257] rc=5 vc=8 vn=- li=3696 udi=- nc=2
 n73624n  (  1)      ishl                                                                             [       0x12b14d4d0] bci=[191,27,257] rc=1 vc=8 vn=- li=3696 udi=- nc=2
 n73625n  (  1)        iconst 2 (X!=0 X>=0 )                                                          [       0x12b14d520] bci=[191,22,257] rc=1 vc=8 vn=- li=3696 udi=- nc=0 flg=0x104
 n73736n  (  2)        ==>iconst 0 (X==0 X>=0 X<=0 )
 n73595n  (  5)      ==>iloadi (in GPR_4664) (X>=0 cannotOverflow )
------------------------------

Example 3 - Shift constant 0:

------------------------------
 n73170n  (  0)  ArrayCopyBNDCHK [#1]                                                                 [       0x12b1446f0] bci=[200,16,5251] rc=0 vc=8 vn=- li=3664 udi=- nc=2
 n73171n  (  3)    ishl                                                                               [       0x12b144740] bci=[200,16,5251] rc=3 vc=8 vn=- li=3664 udi=- nc=2
 n73169n  (  3)      ==>iconst 0 (X==0 X>=0 X<=0 )
 n73150n  (  4)      ==>iloadi (in GPR_4865) (cannotOverflow )
 n73169n  (  3)    ==>iconst 0 (X==0 X>=0 X<=0 )
------------------------------

from openj9.

knn-k avatar knn-k commented on July 20, 2024

A traceCG file for the method on AArch64 macOS with MathLoadTest_bigdecimal_CS_5m_0: getCtor.txt.gz

I tried to generate a full trace file, but I couldn't make the test fail.

from openj9.

knn-k avatar knn-k commented on July 20, 2024

I got a traceFull file after all for the method, but it is too large (33MB after compressed by gzip) to attach here.
JIT compilation failed with EXCEPTION THROWN (std::bad_alloc) during the code generation. That seems to be the reason why I couldn't make the test fail when I was running with traceFull.

There are many ArrayCopyBNDCHK nodes with ishl that is shown as "Example 3" above.

n26493n   ArrayCopyBNDCHK [#1]                                                                [       0x145aa48d0] bci=[171,16,5251] rc=0 vc=10 vn=- li=1580 udi=- nc=2
n18812n     ishl                                                                              [       0x14246e7e0] bci=[171,16,5251] rc=3 vc=10 vn=- li=1580 udi=- nc=2
n26491n       ==>iconst 0
n18670n       ==>iloadi
n26491n     ==>iconst 0

from openj9.

knn-k avatar knn-k commented on July 20, 2024

The ArrayCopyBNDCHK node above is for the System.arraycopy() in String.getBytes(), where n26491n is srcPos and n18670n is coder in the Java code.
https://github.com/ibmruntimes/openj9-openjdk-jdk17/blob/217e7d19465aba6a73ccbe34ec372546340362f5/src/java.base/share/classes/java/lang/String.java#L5251

This ArrayCopyBNDCHK node can be removed because it compares 0 with 0.

from openj9.

knn-k avatar knn-k commented on July 20, 2024

I tried running MathLoadTest_bigdecimal_CS_5m_0 on ppc64le Linux with the this option: -Xjit:{java/lang/Class.getConstructor(*}(traceFull,log=getConstructor.txt,optlevel=scorching),scratchSpaceLimitKBForHotCompilations=2097152

It failed in the optimization phase with bad_alloc as shown below. The size of this JIT trace file is as large as 215MB.
It is unknown what the final IL tree would be like.

<optimization id=112 name=partialRedundancyElimination method=java/lang/Class.getConstructor([Ljava/lang/Class;)Ljava/lang/reflect/Constructor;>
Performing 112: partialRedundancyElimination

=== EXCEPTION THROWN (std::bad_alloc) ===
</jitlog>

https://na.artifactory.swg-devops.com/artifactory/sys-rt-generic-local/hyc-runtimes-jenkins.swg-devops.com/Grinder/41057/system_test_output.tar.gz

from openj9.

knn-k avatar knn-k commented on July 20, 2024

I got a traceFull log without bad_alloc on AArch64 macOS.
https://na.artifactory.swg-devops.com/artifactory/sys-rt-generic-local/hyc-runtimes-jenkins.swg-devops.com/Grinder/41058/system_test_output.tar.gz

It contains ArrayCopyBNDCHK nodes like this that can be optimized out. (Example 3 above)

------------------------------
 n22856n  (  0)  ArrayCopyBNDCHK [#1]                                                                 [       0x130f8d7f0] bci=[106,16,5251] rc=0 vc=10 vn=- li=989 udi=- nc=2
 n12022n  (  3)    ishl                                                                               [       0x124965d80] bci=[106,16,5251] rc=3 vc=10 vn=- li=989 udi=- nc=2
 n22854n  (  4)      ==>iconst 0 (X==0 X>=0 X<=0 )
 n11880n  (  3)      ==>iloadi (in GPR_1067) (cannotOverflow )
 n22854n  (  4)    ==>iconst 0 (X==0 X>=0 X<=0 )
------------------------------

The ishl node n12022n first appears after "id=15 name=inlining" as one of child nodes of a call to arraycopy, and the node does not get optimized.

n12030n     call  java/lang/System.arraycopy(Ljava/lang/Object;ILjava/lang/Object;II)V[#1967  final native static Method] [flags 0x20500 0x0 ] ()  [       0x124966000] bci=[106,27,5251] rc=1 vc=42 vn=- li=- udi=- nc=5 flg=0x20
n12019n       aloadi  java/lang/String.value [B[#1095  final Shadow +8] [flags 0x400a0607 0x0 ]  [       0x124965c90] bci=[106,10,5251] rc=1 vc=0 vn=- li=- udi=- nc=1
n12018n         aload  <temp slot 5>[#1933  Auto] [flags 0x7 0x0 ] (X!=0 )                    [       0x124965c40] bci=[106,9,5251] rc=1 vc=0 vn=- li=- udi=- nc=0 flg=0x4
n12022n       ishl                                                                            [       0x124965d80] bci=[106,16,5251] rc=1 vc=0 vn=- li=- udi=- nc=2
n12167n         iconst 0 (X==0 X>=0 X<=0 )                                                    [       0x124968ad0] bci=[100,3,1724] rc=1 vc=0 vn=- li=- udi=- nc=0 flg=0x302
n12021n         iload  <temp slot 9>[#1979  Auto] [flags 0x20000003 0x0 ]                     [       0x124965d30] bci=[106,14,5251] rc=1 vc=0 vn=- li=- udi=- nc=0
n12023n       aload  <temp slot 8>[#1978  Auto] [flags 0x20000007 0x0 ]                       [       0x124965dd0] bci=[106,17,5251] rc=1 vc=0 vn=- li=- udi=- nc=0
(Omitted other child nodes)

from openj9.

knn-k avatar knn-k commented on July 20, 2024

There is an ishl node that shifts by 0 bits (Example 1 above) in the traceFull file, but that is simplified in "id=46 name=treeSimplification" as expected.

Trees after loopCanonicalization (id=45):

n22937n   ArrayCopyBNDCHK [#1]                                                                [       0x130f8f140] bci=[143,21,5251] rc=0 vc=8155 vn=- li=- udi=- nc=2
n14961n     ishl                                                                              [       0x1300f3430] bci=[143,21,5251] rc=4 vc=8155 vn=- li=- udi=- nc=2
n14959n       iload  <temp slot 6>[#2294  Auto] [flags 0x3 0x0 ] (cannotOverflow )            [       0x1300f3390] bci=[143,18,5251] rc=1 vc=8155 vn=- li=- udi=2038 nc=0 flg=0x1000
n14960n       iconst 0 (X==0 X>=0 X<=0 cannotOverflow )                                       [       0x1300f33e0] bci=[143,19,5251] rc=1 vc=8155 vn=- li=- udi=- nc=0 flg=0x1302
n22936n     iconst 0 (X==0 X>=0 X<=0 )                                                        [       0x130f8f0f0] bci=[143,27,5251] rc=1 vc=8155 vn=- li=- udi=- nc=0 flg=0x302

Trees after treeSimplification (id=46):

n22937n   ArrayCopyBNDCHK [#1]                                                                [       0x130f8f140] bci=[143,21,5251] rc=0 vc=8169 vn=- li=- udi=- nc=2
n14959n     ==>iload
n22936n     iconst 0 (X==0 X>=0 X<=0 )                                                        [       0x130f8f0f0] bci=[143,27,5251] rc=1 vc=8169 vn=- li=- udi=- nc=0 flg=0x302

It is unknown why a node of this form remained unoptimized in the trace file in Issue #19525..

from openj9.

knn-k avatar knn-k commented on July 20, 2024

@hzongaro FYI.
Can anybody in the optimizer team look at the Example 3 case? Thanks.

from openj9.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.