Content-Length: 309115 | pFad | http://github.com/github/codeql/pull/20092

C6 Java: Improve more join-orders by aschackmull · Pull Request #20092 · github/codeql · GitHub
Skip to content

Java: Improve more join-orders #20092

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

aschackmull
Copy link
Contributor

Follow-up to #20088 with a few more cases.

Commit 1: The double SSA-to-use occurrences were being joined, which meant a large fanout. On the other hand, we couldn't really go from b to len either, since that's an even larger fanout. The solution was to factor out the two functional joins from len to arr and to b, since that allows a two-column join.
Before:

[2025-07-18 15:24:29] Evaluated non-recursive predicate _#Bound::Bound.getExpr/0#dispred#43d310d3Merge_#Expr::FieldAccess.getField/0#dispred#29ef4aa0Merge_#__#shared@3fd433m3 in 62ms (size: 186).
Evaluated relational algebra for predicate _#Bound::Bound.getExpr/0#dispred#43d310d3Merge_#Expr::FieldAccess.getField/0#dispred#29ef4aa0Merge_#__#shared@3fd433m3 with tuple counts:
           92543    ~0%    {5} r1 = JOIN `_#Expr::ArrayAccess.getArray/0#dispred#b90c658aMerge_#Expr::ArrayAccess.getIndexExpr/0#dispred#345f6__#shared` WITH `project##RangeAnalysis::bounded/5#7594cba0Merge` ON FIRST 1 OUTPUT Lhs.0, Rhs.1, Rhs.2, Lhs.1, Lhs.2
           92543    ~0%    {4}    | JOIN WITH `project##RangeAnalysis::bounded/5#7594cba0Merge` ON FIRST 3 OUTPUT Lhs.4, Lhs.3, Lhs.1, Lhs.2
        49840023    ~5%    {4}    | JOIN WITH `cached_SsaImpl::getAUse/1#ca5d2675` ON FIRST 1 OUTPUT Rhs.1, Lhs.1, Lhs.2, Lhs.3
            6670    ~0%    {4}    | JOIN WITH `#Expr::VarAccess.getQualifier/0#dispred#2b0f1cd1Merge_10#join_rhs` ON FIRST 1 OUTPUT Lhs.2, Rhs.1, Lhs.1, Lhs.3
             448    ~0%    {3}    | JOIN WITH `Bound::Bound.getExpr/0#dispred#43d310d3` ON FIRST 2 OUTPUT Lhs.1, Lhs.2, Lhs.3
             448  ~154%    {3}    | JOIN WITH `Expr::FieldAccess.getField/0#dispred#29ef4aa0` ON FIRST 1 OUTPUT Rhs.1, Lhs.1, Lhs.2
                           return r1

After:

[2025-07-18 15:34:43] Evaluated non-recursive predicate ArrayIndexOutOfBounds::ssaArrayLengthBound/2#5b8b1599@454c43ro in 1ms (size: 173).
Evaluated relational algebra for predicate ArrayIndexOutOfBounds::ssaArrayLengthBound/2#5b8b1599@454c43ro with tuple counts:
        181548   ~2%    {2} r1 = SCAN `Expr::FieldAccess.getField/0#dispred#29ef4aa0` OUTPUT In.1, In.0
           322   ~0%    {1}    | JOIN WITH JDK::ArrayLengthField#0681d50f ON FIRST 1 OUTPUT Lhs.1
           322   ~0%    {2}    | JOIN WITH `Expr::VarAccess.getQualifier/0#dispred#2b0f1cd1` ON FIRST 1 OUTPUT Rhs.1, Lhs.0
           272   ~0%    {2}    | JOIN WITH `#SSA::SsaVariable.getAUse/0#dispred#55da2912Merge_10#join_rhs` ON FIRST 1 OUTPUT Lhs.1, Rhs.1
           272  ~58%    {2}    | JOIN WITH `#Bound::Bound.getExpr/0#dispred#43d310d3Merge_10#join_rhs` ON FIRST 1 OUTPUT Lhs.1, Rhs.1
                        return r1
[2025-07-18 15:34:43] Evaluated non-recursive predicate _ArrayIndexOutOfBounds::ssaArrayLengthBound/2#5b8b1599__#Expr::ArrayAccess.getArray/0#dispred#b90c65__#shared@1e987dll in 1ms (size: 186).
Evaluated relational algebra for predicate _ArrayIndexOutOfBounds::ssaArrayLengthBound/2#5b8b1599__#Expr::ArrayAccess.getArray/0#dispred#b90c65__#shared@1e987dll with tuple counts:
        92543  ~1%    {5} r1 = JOIN `_#Expr::ArrayAccess.getArray/0#dispred#b90c658aMerge_#Expr::ArrayAccess.getIndexExpr/0#dispred#345f6__#shared` WITH `project##RangeAnalysis::bounded/5#7594cba0Merge` ON FIRST 1 OUTPUT Lhs.2, Rhs.1, Lhs.1, Lhs.0, Rhs.2
          186  ~0%    {4}    | JOIN WITH `ArrayIndexOutOfBounds::ssaArrayLengthBound/2#5b8b1599` ON FIRST 2 OUTPUT Lhs.3, Lhs.1, Lhs.4, Lhs.2
                      return r1

Commit 2: Three TCs in one predicate was a bit much. Also the constraining part was the nested try statements, while the other two TCs were pure fanout, so the result looked a bit large. Therefore I added some manual magic in the form of mayThrow, and split the TCs a bit so they weren't all in one predicate.
Before:

[2025-07-18 15:46:08] Evaluated non-recursive predicate PartiallyMaskedCatch::caughtInside/3#0bbcdbb3#bfb@a42903fj in 29ms (size: 95737).
Evaluated relational algebra for predicate PartiallyMaskedCatch::caughtInside/3#0bbcdbb3#bfb@a42903fj with tuple counts:
             182   ~2%    {2} r1 = JOIN `_##Statement::Stmt.getEnclosingStmt/0#dispred#2e155224MergePlus#fb#sinkBound#2#3_#Statement::TryStmt__#shared` WITH `doublyBoundedFastTC:#Statement::Stmt.getEnclosingStmt/0#dispred#2e155224Merge_10#higher_order_body:_##Statement::Stmt.getEnclosingStmt/0#dispred#2e155224MergePlus#fb#sinkBound#3#higher_order_body:##Statement::Stmt.getEnclosingStmt/0#dispred#2e155224MergePlus#fb#sourceBound#5#6` ON FIRST 1 OUTPUT Rhs.1, Lhs.1
             182   ~0%    {3}    | JOIN WITH `Statement::TryStmt.getBlock/0#dispred#152b3c98` ON FIRST 1 OUTPUT Lhs.0, Lhs.1, Rhs.1
             204   ~6%    {3}    | JOIN WITH `project#PartiallyMaskedCatch::caughtType/2#02780726#2` ON FIRST 1 OUTPUT Rhs.1, Lhs.1, Lhs.2
             201   ~1%    {3}    | JOIN WITH `m#PartiallyMaskedCatch::caughtInside/3#0bbcdbb3#bfb#mcpe_rt` ON FIRST 1 OUTPUT Lhs.2, Lhs.1, Lhs.0
             536   ~0%    {3}    | JOIN WITH `boundedFastTC:#Statement::Stmt.getEnclosingStmt/0#dispred#2e155224Merge_10#higher_order_body:_##Statement::Stmt.getEnclosingStmt/0#dispred#2e155224MergePlus#fb#sinkBound#3_#Statement::TryStmt.g__#higher_order_body` ON FIRST 1 OUTPUT Lhs.1, Rhs.1, Lhs.2
                      
            2872   ~0%    {2} r2 = SCAN `##Type::RefType.hasSubtype/1#dispred#fe22f67eMergePlus#bf` OUTPUT In.1, In.0
            2267   ~8%    {2}    | JOIN WITH `m#PartiallyMaskedCatch::caughtInside/3#0bbcdbb3#bfb#mcpe_rt` ON FIRST 1 OUTPUT Lhs.1, Lhs.0
         2809940   ~2%    {2}    | JOIN WITH `project#PartiallyMaskedCatch::caughtType/2#02780726#2_10#join_rhs` ON FIRST 1 OUTPUT Rhs.1, Lhs.1
         2809940   ~0%    {3}    | JOIN WITH `Statement::TryStmt.getBlock/0#dispred#152b3c98` ON FIRST 1 OUTPUT Rhs.1, Lhs.1, Lhs.0
        16667546   ~2%    {3}    | JOIN WITH `boundedFastTC:#Statement::Stmt.getEnclosingStmt/0#dispred#2e155224Merge_10#higher_order_body:_#Statement::TryStmt.getBlock/0#dispred#152b3c98Merge__##Type::RefType.hasSubtype/1#dispred#fe22f67e__#higher_order_body` ON FIRST 1 OUTPUT Lhs.2, Lhs.1, Rhs.1
           95217   ~1%    {3}    | JOIN WITH `doublyBoundedFastTC:#Statement::Stmt.getEnclosingStmt/0#dispred#2e155224Merge:_##Statement::Stmt.getEnclosingStmt/0#dispred#2e155224MergePlus#fb#sourceBound#5#6_#Statement::TrySt__#higher_order_body:##Statement::Stmt.getEnclosingStmt/0#dispred#2e155224MergePlus#fb#sinkBound#3` ON FIRST 1 OUTPUT Rhs.1, Lhs.1, Lhs.2
           95217   ~0%    {3}    | JOIN WITH `#Statement::TryStmt.getBlock/0#dispred#152b3c98Merge_10#join_rhs` ON FIRST 1 OUTPUT Rhs.1, Lhs.1, Lhs.2
           95217   ~6%    {3}    | JOIN WITH `##Statement::Stmt.getEnclosingStmt/0#dispred#2e155224MergePlus#fb#sinkBound#2#3` ON FIRST 1 OUTPUT Lhs.0, Lhs.2, Lhs.1
                      
           95753   ~6%    {3} r3 = r1 UNION r2
                          return r3

After:

[2025-07-18 16:04:17] Evaluated non-recursive predicate PartiallyMaskedCatch::mayThrow/2#b9e3a2a3@ad0cfa4p in 7ms (size: 130082).
Evaluated relational algebra for predicate PartiallyMaskedCatch::mayThrow/2#b9e3a2a3@ad0cfa4p with tuple counts:
         13656  ~10%    {2} r1 = SCAN `Statement::ThrowStmt.getExpr/0#dispred#9a277d20` OUTPUT In.1, In.0
         13656   ~0%    {2}    | JOIN WITH `Expr::Expr.getType/0#dispred#ac12d976` ON FIRST 1 OUTPUT Rhs.1, Lhs.1
         13656   ~9%    {2}    | JOIN WITH @reftype ON FIRST 1 OUTPUT Lhs.1, Lhs.0
                    
        205479   ~1%    {2} r2 = JOIN `#Exception::Exception.getType/0#dispred#a4e76769Merge_10#join_rhs` WITH @reftype ON FIRST 1 OUTPUT Lhs.1, Lhs.0
        205479   ~0%    {2}    | JOIN WITH `#Member::Callable.getAnException/0#dispred#bdcfa39eMerge_10#join_rhs` ON FIRST 1 OUTPUT Rhs.1, Lhs.1
        118877   ~1%    {2}    | JOIN WITH `#Expr::Call.getCallee/0#dispred#3c1718adMerge_10#join_rhs` ON FIRST 1 OUTPUT Rhs.1, Lhs.1
        118877   ~5%    {2}    | JOIN WITH `Expr::Call.getEnclosingStmt/0#dispred#46ad9d6f` ON FIRST 1 OUTPUT Rhs.1, Lhs.1
                    
        132533   ~4%    {2} r3 = r1 UNION r2
                        return r3
[2025-07-18 16:04:17] Evaluated non-recursive predicate PartiallyMaskedCatch::nestedTry/2#f9e65d35@fa65ebum in 0ms (size: 678).
Evaluated relational algebra for predicate PartiallyMaskedCatch::nestedTry/2#f9e65d35@fa65ebum with tuple counts:
         10713  ~0%    {2} r1 = SCAN `Statement::TryStmt.getBlock/0#dispred#152b3c98` OUTPUT In.1, In.0
        100781  ~0%    {2}    | JOIN WITH `##Statement::Stmt.getEnclosingStmt/0#dispred#2e155224MergePlus#fb#flipped` ON FIRST 1 OUTPUT Rhs.1, Lhs.1
           678  ~0%    {2}    | JOIN WITH Statement::TryStmt#15005db3 ON FIRST 1 OUTPUT Lhs.1, Lhs.0
                       return r1
[2025-07-18 16:04:17] Evaluated non-recursive predicate PartiallyMaskedCatch::caughtBy/3#10b80aa2@dca1ee0k in 1ms (size: 13382).
Evaluated relational algebra for predicate PartiallyMaskedCatch::caughtBy/3#10b80aa2@dca1ee0k with tuple counts:
         8221   ~0%    {3} r1 = JOIN `Statement::TryStmt.getBlock/0#dispred#152b3c98` WITH `project#PartiallyMaskedCatch::caughtType/2#02780726#2` ON FIRST 1 OUTPUT Lhs.1, Lhs.0, Rhs.1
                   
        74479   ~0%    {3} r2 = JOIN r1 WITH `##Statement::Stmt.getEnclosingStmt/0#dispred#2e155224MergePlus#fb#flipped` ON FIRST 1 OUTPUT Rhs.1, Lhs.2, Lhs.1
         7333   ~0%    {3}    | JOIN WITH `PartiallyMaskedCatch::mayThrow/2#b9e3a2a3` ON FIRST 2 OUTPUT Lhs.2, Lhs.0, Lhs.1
                   
        74479   ~5%    {3} r3 = JOIN r1 WITH `##Statement::Stmt.getEnclosingStmt/0#dispred#2e155224MergePlus#fb#flipped` ON FIRST 1 OUTPUT Rhs.1, Lhs.1, Lhs.2
        18840   ~2%    {4}    | JOIN WITH `PartiallyMaskedCatch::mayThrow/2#b9e3a2a3` ON FIRST 1 OUTPUT Lhs.2, Rhs.1, Lhs.1, Lhs.0
         6190   ~0%    {3}    | JOIN WITH `##Type::RefType.hasSubtype/1#dispred#fe22f67eMergePlus#bf` ON FIRST 2 OUTPUT Lhs.2, Lhs.3, Lhs.1
                   
        13523   ~3%    {3} r4 = r2 UNION r3
                       return r4
[2025-07-18 16:04:17] Evaluated non-recursive predicate PartiallyMaskedCatch::caughtInside/3#0bbcdbb3#bff@ad2e64ft in 0ms (size: 176).
Evaluated relational algebra for predicate PartiallyMaskedCatch::caughtInside/3#0bbcdbb3#bff@ad2e64ft with tuple counts:
        317  ~0%    {2} r1 = JOIN `project#PartiallyMaskedCatch::caughtType/2#02780726#4` WITH `PartiallyMaskedCatch::nestedTry/2#f9e65d35` ON FIRST 1 OUTPUT Rhs.1, Lhs.0
        176  ~0%    {3}    | JOIN WITH `PartiallyMaskedCatch::caughtBy/3#10b80aa2` ON FIRST 1 OUTPUT Lhs.1, Rhs.1, Rhs.2
                    return r1

@Copilot Copilot AI review requested due to automatic review settings July 18, 2025 14:30
@aschackmull aschackmull requested a review from a team as a code owner July 18, 2025 14:30
@github-actions github-actions bot added the Java label Jul 18, 2025
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR improves join order optimization in Java CodeQL queries by refactoring two predicates to reduce computational complexity and improve performance. The changes focus on breaking down expensive multi-way joins into smaller, more efficient operations.

  • Factor out expensive join operations into separate predicates with pragma[nomagic] annotations
  • Split complex transitive closure operations to avoid large fanout issues
  • Introduce intermediate predicates to enable more efficient two-column joins

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File Description
ArrayIndexOutOfBounds.ql Extracts array length bound logic into separate ssaArrayLengthBound predicate to optimize join ordering
PartiallyMaskedCatch.ql Refactors complex predicate into three smaller predicates (mayThrow, caughtBy, nestedTry) to reduce transitive closure complexity

@aschackmull aschackmull added the no-change-note-required This PR does not need a change note label Jul 18, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Java no-change-note-required This PR does not need a change note
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant








ApplySandwichStrip

pFad - (p)hone/(F)rame/(a)nonymizer/(d)eclutterfier!      Saves Data!


--- a PPN by Garber Painting Akron. With Image Size Reduction included!

Fetched URL: http://github.com/github/codeql/pull/20092

Alternative Proxies:

Alternative Proxy

pFad Proxy

pFad v3 Proxy

pFad v4 Proxy