- 适用于 JDK 23 的 GraalVM(最新版)
- 适用于 JDK 24 的 GraalVM(抢先体验版)
- 适用于 JDK 21 的 GraalVM
- 适用于 JDK 17 的 GraalVM
- 存档
- 开发构建版
- Truffle 语言实现框架
- Truffle 分支插桩
- 动态对象模型
- 静态对象模型
- 解释器代码的宿主优化
- Truffle 函数内联方法
- Truffle 解释器性能分析
- Truffle Interop 2.0
- 语言实现
- 使用 Truffle 实现新语言
- Truffle 语言和工具迁移到 Java 模块
- Truffle 原生函数接口
- Truffle 解释器优化
- 选项
- 栈上替换
- Truffle 字符串指南
- 特化直方图
- 测试 DSL 特化
- 基于 Polyglot API 的 TCK
- Truffle 编译队列方法
- Truffle 库指南
- Truffle AOT 概述
- Truffle AOT 编译
- 辅助引擎缓存
- Truffle 语言安全点教程
- 单态化
- 拆分算法
- 单态化用例
- 向运行时报告多态特化
Truffle 解释器优化
本文档讨论了用于优化或调试 Truffle 解释器以获得最佳时间性能的工具。
策略 #
-
使用分析器对应用程序进行采样并识别负责的编译单元。使用采样延迟 (
--cpusampler.Delay=MILLISECONDS
) 仅在预热后进行分析。请参阅性能分析指南。 -
了解正在编译的内容并查找去优化。主要在解释器中运行的方法可能存在去优化问题。
-
在仍然显示性能问题的情况下,尽可能简化代码。
-
启用性能警告并列出边界调用。
- 转储负责编译单元的 Graal 图,并查看
After TruffleTier
阶段。- 查看
After TruffleTier
和After PartialEscape
阶段的 Graal 图,并检查它是否符合您的预期。如果其中存在您不希望存在的节点,请考虑如何防止包含它们。如果其中存在比您想要的更复杂的节点,请考虑如何添加生成更简单代码的特化。如果您认为基准测试中应该存在的节点不存在,请考虑如何使值动态化,以免它们被优化掉。
- 查看
-
在 Graal IR 中搜索
Invoke
节点。不代表访客语言调用的Invoke
节点应该被特化掉。这可能并非总是可行的,例如,如果该方法执行 I/O。 -
搜索控制流拆分(红线)并调查它们是由访客应用程序引起的控制流还是仅仅是语言实现的产物。如果可能,应避免后者。
- 搜索线性代码中的间接寻址(
Load
和LoadIndexed
)并尝试最小化代码。热路径上的代码越少越好。
Truffle 编译器选项 #
最新的专家和内部选项的完整列表可以在选项指南中找到。
观察编译 #
本节概述了大多数可用的命令行选项以观察编译。
注意:大多数选项还需要设置额外的 --experimental-options
标志。
--engine.TraceCompilation
命令在每次编译方法时打印一行
[engine] opt done id=244 EqualityConstraint.execute |Tier 1|Time 268( 220+47 )ms|AST 17|Inlined 0Y 2N|IR 238/ 437|CodeSize 1874|Timestamp 758868036671903|Src octane-deltablue.js:528
以下是这些日志中提供的信息的快速概述
id
- 调用目标的唯一标识符。Tier
- 为哪个编译层安排了目标。Time
- 编译持续了多长时间,Truffle 层(主要是部分评估)和 Graal 层之间有所区别。AST
- 目标的非平凡节点数。Inlined
- 内联了多少调用以及内联后剩余多少调用。IR
- 部分评估和编译后 Graal 节点数。CodeSize
- 为调用目标生成的代码的大小。Timestamp
- 事件发生的时间,由System.nanoTime()
报告。Src
- 调用目标的缩写源代码部分。
--engine.TraceCompilationDetails
命令在编译排队、取消排队、启动或完成时打印一行
[engine] opt queued id=237 BinaryConstraint.output |Tier 1|Count/Thres 25/ 25|Queue: Size 1 Change +1 Load 0.06 Time 0us|Timestamp 758865671350686|Src octane-deltablue.js:416
[engine] opt start id=237 BinaryConstraint.output |Tier 1|Priority 25|Rate 0.000000|Queue: Size 0 Change +0 Load 0.06 Time 0us|Timestamp 758865708273384|Src octane-deltablue.js:416
[engine] opt queued id=239 OrderedCollection.size |Tier 1|Count/Thres 25/ 25|Queue: Size 1 Change +1 Load 0.06 Time 0us|Timestamp 758865727664193|Src octane-deltablue.js:71
[engine] opt queued id=52 Array.prototype.push |Tier 1|Count/Thres 25/ 50|Queue: Size 2 Change +1 Load 0.13 Time 0us|Timestamp 758865744191674|Src <builtin>:1
... more log ...
[engine] opt start id=239 OrderedCollection.size |Tier 1|Priority 181875|Rate 0.000001|Queue: Size 11 Change -1 Load 0.63 Time 575us|Timestamp 758866381654116|Src octane-deltablue.js:71
[engine] opt done id=237 BinaryConstraint.output |Tier 1|Time 717( 654+64 )ms|AST 19|Inlined 0Y 0N|IR 143/ 220|CodeSize 882|Timestamp 758866435391354|Src octane-deltablue.js:416
[engine] opt start id=236 BinaryConstraint.input |Tier 1|Priority 144000|Rate 0.000001|Queue: Size 10 Change -1 Load 0.56 Time 48us|Timestamp 758866452554530|Src octane-deltablue.js:409
... more log ...
[engine] opt queued id=239 OrderedCollection.size |Tier 2|Count/Thres 8750/ 8125|Queue: Size 18 Change +1 Load 0.81 Time 0us|Timestamp 758867756295139|Src octane-deltablue.js:71
[engine] opt queued id=237 BinaryConstraint.output |Tier 2|Count/Thres 8499/ 8750|Queue: Size 19 Change +1 Load 0.88 Time 0us|Timestamp 758867758263099|Src octane-deltablue.js:416
[engine] opt start id=244 EqualityConstraint.execute |Tier 1|Priority 2618289|Rate 0.000015|Queue: Size 19 Change -1 Load 0.88 Time 180us|Timestamp 758867767116908|Src octane-deltablue.js:528
... more log ...
[engine] opt done id=246 OrderedCollection.at |Tier 2|Time 89( 80+9 )ms|AST 15|Inlined 0Y 0N|IR 50/ 110|CodeSize 582|Timestamp 758873628915755|Src octane-deltablue.js:67
[engine] opt start id=237 BinaryConstraint.output |Tier 2|Priority 173536|Rate 0.000054|Queue: Size 18 Change -1 Load 0.94 Time 18us|Timestamp 758873629411012|Src octane-deltablue.js:416
[engine] opt queued id=238 Planner.addPropagate |Tier 2|Count/Thres 9375/ 8750|Queue: Size 19 Change +1 Load 0.88 Time 0us|Timestamp 758873663196884|Src octane-deltablue.js:696
[engine] opt queued id=226 Variable.addConstraint |Tier 2|Count/Thres 8771/ 9375|Queue: Size 20 Change +1 Load 0.94 Time 0us|Timestamp 758873665823697|Src octane-deltablue.js:556
[engine] opt done id=293 change |Tier 1|Time 167( 130+37 )ms|AST 60|Inlined 0Y 6N|IR 576/ 1220|CodeSize 5554|Timestamp 758873669483749|Src octane-deltablue.js:867
[engine] opt start id=270 Plan.execute |Tier 2|Priority 157871|Rate 0.000072|Queue: Size 19 Change -1 Load 1.00 Time 17us|Timestamp 758873669912101|Src octane-deltablue.js:778
[engine] opt done id=237 BinaryConstraint.output |Tier 2|Time 58( 52+6 )ms|AST 19|Inlined 0Y 0N|IR 103/ 181|CodeSize 734|Timestamp 758873687678394|Src octane-deltablue.js:416
... more log ...
[engine] opt unque. id=304 Date.prototype.valueOf |Tier 2|Count/Thres 80234/ 3125|Queue: Size 4 Change 0 Load 0.31 Time 0us|Timestamp 758899904132076|Src <builtin>:1|Reason Target inlined into only caller
以下是这些日志中添加的信息的快速概述
Count/Thres
- 目标的调用和循环计数是多少,以及将编译添加到队列所需的阈值是多少。Queue: Size
- 编译队列中有多少编译。Queue: Change
- 此事件如何影响编译队列(例如,某些事件可以清除队列中不需要的编译任务)。Queue: Load
- 队列是否过载/欠载的指标。正常负载用 1 表示,小于 1 表示欠载,大于 1 表示过载。Queue: Time
- 事件持续了多长时间。Reason
- 运行时报告的事件原因。
--engine.TraceCompilationAST
命令为每次编译打印 Truffle AST
[engine] opt AST OrderedCollection.size <split-57429b3a> |ASTSize 10/ 10 |Calls/Thres 10559/ 3 |CallsAndLoop/Thres 10559/ 1000
FunctionRootNode
body = FunctionBodyNode
body = DualNode
left = JSWriteCurrentFrameSlotNodeGen
rhsNode = JSPrepareThisNodeGen
operandNode = AccessThisNode
right = TerminalPositionReturnNode
expression = PropertyNode
target = PropertyNode
target = JSReadCurrentFrameSlotNodeGen
cache = PropertyGetNode
cacheNode = ObjectPropertyGetNode
receiverCheck = ShapeCheckNode
cache = PropertyGetNode
cacheNode = ArrayLengthPropertyGetNode
receiverCheck = ShapeCheckNode
arrayLengthRead = ArrayLengthReadNodeGen
--engine.TraceInlining
命令为每次编译打印访客语言内联决策
[engine] inline start Plan.execute |call diff 0.00 |Recursion Depth 0 |Explore/inline ratio 1.00 |IR Nodes 2704 |Frequency 1.00 |Truffle Callees 5 |Forced false |Depth 0
[engine] Inlined Plan.size |call diff -203.75 |Recursion Depth 0 |Explore/inline ratio NaN |IR Nodes 175 |Frequency 101.88 |Truffle Callees 1 |Forced false |Depth 1
[engine] Inlined OrderedCollection.size <split-e13c02e> |call diff -101.88 |Recursion Depth 0 |Explore/inline ratio NaN |IR Nodes 157 |Frequency 101.88 |Truffle Callees 0 |Forced false |Depth 2
[engine] Inlined Plan.constraintAt |call diff -201.75 |Recursion Depth 0 |Explore/inline ratio NaN |IR Nodes 206 |Frequency 100.88 |Truffle Callees 1 |Forced false |Depth 1
[engine] Inlined OrderedCollection.at |call diff -100.88 |Recursion Depth 0 |Explore/inline ratio NaN |IR Nodes 232 |Frequency 100.88 |Truffle Callees 0 |Forced false |Depth 2
[engine] Inlined ScaleConstraint.execute |call diff -0.00 |Recursion Depth 0 |Explore/inline ratio NaN |IR Nodes 855 |Frequency 0.00 |Truffle Callees 0 |Forced false |Depth 1
[engine] Inlined EqualityConstraint.execute |call diff -299.63 |Recursion Depth 0 |Explore/inline ratio NaN |IR Nodes 295 |Frequency 99.88 |Truffle Callees 2 |Forced false |Depth 1
[engine] Inlined BinaryConstraint.output <split-1e163df7> |call diff -99.88 |Recursion Depth 0 |Explore/inline ratio NaN |IR Nodes 259 |Frequency 99.88 |Truffle Callees 0 |Forced false |Depth 2
[engine] Inlined BinaryConstraint.input <split-2dfade22> |call diff -99.88 |Recursion Depth 0 |Explore/inline ratio NaN |IR Nodes 259 |Frequency 99.88 |Truffle Callees 0 |Forced false |Depth 2
[engine] Inlined EditConstraint.execute |call diff -1.00 |Recursion Depth 0 |Explore/inline ratio NaN |IR Nodes 22 |Frequency 1.00 |Truffle Callees 0 |Forced false |Depth 1
[engine] inline done Plan.execute |call diff 0.00 |Recursion Depth 0 |Explore/inline ratio 1.00 |IR Nodes 2704 |Frequency 1.00 |Truffle Callees 5 |Forced false |Depth 0
--engine.TraceSplitting
命令打印访客语言拆分决策
[engine] split 0-4310d43-1 Strength |ASTSize 6/ 6 |Calls/Thres 2/ 3 |CallsAndLoop/Thres 2/ 1000 |SourceSection /Users/christianhumer/graal/4dev/js-benchmarks/octane-deltablue.js~139:4062-4089
[engine] split 1-4b0d79fc-1 Strength |ASTSize 6/ 6 |Calls/Thres 2/ 3 |CallsAndLoop/Thres 2/ 1000 |SourceSection /Users/christianhumer/graal/4dev/js-benchmarks/octane-deltablue.js~140:4119-4150
--engine.TracePerformanceWarnings=(call|instanceof|store|all)
命令打印对性能可能不理想的代码。当部分评估无法内联虚拟运行时调用时,call
会启用警告。当部分评估无法将虚拟 instanceof 解析为确切类型时,instanceof
会启用警告。当存储位置参数不是部分评估常量时,store
会启用警告
[engine] perf warn ScaleConstraint.execute |Partial evaluation could not inline the virtual runtime call Virtual to HotSpotMethod<ConditionProfile.profile(boolean)> (167|MethodCallTarget).
Approximated stack trace for [167 | MethodCallTarget: at com.oracle.truffle.js.nodes.control.IfNode.execute(IfNode.java:158)
at com.oracle.truffle.js.nodes.binary.DualNode.execute(DualNode.java:125)
at com.oracle.truffle.js.nodes.function.FunctionBodyNode.execute(FunctionBodyNode.java:73)
at com.oracle.truffle.js.nodes.function.FunctionRootNode.executeInRealm(FunctionRootNode.java:147)
at com.oracle.truffle.js.runtime.JavaScriptRealmBoundaryRootNode.execute(JavaScriptRealmBoundaryRootNode.java:93)
at com.oracle.truffle.runtime.OptimizedCallTarget.executeRootNode(OptimizedCallTarget.java:503)
at com.oracle.truffle.runtime.OptimizedCallTarget.profiledPERoot(OptimizedCallTarget.java:480)
--engine.CompilationStatistics
命令在进程结束时打印编译统计信息
[engine] Truffle runtime statistics for engine 1
Compilations : 2763
Success : 2743
Temporary Bailouts : 17
jdk.graal.compiler.core.common.CancellationBailoutException: Compilation cancelled.: 16
jdk.graal.compiler.core.common.RetryableBailoutException: Assumption invalidated while compiling code: IsolatedObjectConstant[Object]: 1
Permanent Bailouts : 0
Failed : 0
Interrupted : 3
Invalidated : 84
Unknown Reason : 45
Profiled Argument Types : 22
validRootAssumption Split call node: 12
expression invalidatePropertyAssumption: 1
getTextPos invalidatePropertyAssumption: 1
X_OK invalidateAllPropertyAssumptions: 1
statements invalidatePropertyAssumption: 1
typed object location generalizing object type !class com.oracle.truffle.js.runtime.objects.Nullish => !class java.lang.Object Object@0[final=false][type=class com.oracle.truffle.js.runtime.objects.Nullish]: 1
Queues : 3410
Dequeues : 262
Split call node : 124
Target inlined into only caller: 87
validRootAssumption Split call node: 28
null : 17
Profiled Argument Types : 6
Splits : 5581
Compilation Accuracy : 0,969598
Queue Accuracy : 0,923167
Compilation Utilization : 2,685016
Remaining Compilation Queue : 409
Time to queue : count=3410, sum=12895957640, min= 0, average= 3781805,76, max=11311217, maxTarget=Math.floor
Time waiting in queue : count=2763, sum=1247424699, min= 40, average= 451474,74, max= 4762017, maxTarget=tryGetTypeFromEffectiveTypeNode
--------------------------- :
AST node statistics :
Truffle node count : count=2752, sum= 348085, min= 1, average= 126,48, max= 5404, maxTarget=checkSignatureDeclaration
Trivial : count=2752, sum= 124817, min= 0, average= 45,36, max= 2707, maxTarget=visitEachChild
Non Trivial : count=2752, sum= 223268, min= 1, average= 81,13, max= 3162, maxTarget=checkSignatureDeclaration
Monomorphic : count=2752, sum= 217660, min= 1, average= 79,09, max= 3042, maxTarget=checkSignatureDeclaration
Polymorphic : count=2752, sum= 4239, min= 0, average= 1,54, max= 71, maxTarget=checkSignatureDeclaration
Megamorphic : count=2752, sum= 1369, min= 0, average= 0,50, max= 49, maxTarget=checkSignatureDeclaration
Truffle call count : count=2752, sum= 7789, min= 0, average= 2,83, max= 110, maxTarget=forEachChild
Indirect : count=2752, sum= 29, min= 0, average= 0,01, max= 3, maxTarget=emitLeadingComments
Direct : count=2752, sum= 7760, min= 0, average= 2,82, max= 110, maxTarget=forEachChild
Dispatched : count=2752, sum= 6285, min= 0, average= 2,28, max= 110, maxTarget=forEachChild
Inlined : count=2752, sum= 1475, min= 0, average= 0,54, max= 60, maxTarget=parseList <split-8113>
---------- :
Cloned : count=2752, sum= 0, min= 0, average= 0,00, max= 0, maxTarget=Array.prototype.push
Not Cloned : count=2752, sum= 7747, min= 0, average= 2,82, max= 110, maxTarget=forEachChild
Truffle loops : count=2752, sum= 723, min= 0, average= 0,26, max= 12, maxTarget=pipelineEmitWithComments
--------------------------- :
Compilation Tier 1 :
Compilation Rate : 471075,74 bytes/second
Truffle Tier Rate : 1273853,94 bytes/second
Graal Tier Rate : 922948,70 bytes/second
Installation Rate : 5640071,94 bytes/second
Time for compilation (us) : count=2637, sum= 20538922, min= 727, average= 7788,75, max= 2154173, maxTarget=createPrinter
Truffle Tier (us) : count=2623, sum= 7595366, min= 255, average= 2895,68, max= 306107, maxTarget=createPrinter
Graal Tier (us) : count=2623, sum= 10483126, min= 378, average= 3996,62, max= 1665018, maxTarget=createPrinter
Code Installation (us) : count=2623, sum= 1715472, min= 39, average= 654,01, max= 183048, maxTarget=createPrinter
Graal node count :
After Truffle Tier : count=2627, sum= 1089218, min= 88, average= 414,62, max= 12999, maxTarget=forEachChild
After Graal Tier : count=2624, sum= 2190826, min= 127, average= 834,92, max= 63837, maxTarget=createPrinter
Graal compilation result :
Code size : count=2623, sum= 9675388, min= 492, average= 3688,67, max= 238448, maxTarget=createPrinter
Total frame size : count=2623, sum= 350080, min= 64, average= 133,47, max= 5328, maxTarget=createPrinter
Exception handlers : count=2623, sum= 9316, min= 1, average= 3,55, max= 125, maxTarget=forEachChild
Infopoints : count=2623, sum= 105619, min= 5, average= 40,27, max= 1837, maxTarget=forEachChild
CALL : count=2623, sum= 105619, min= 5, average= 40,27, max= 1837, maxTarget=forEachChild
Marks : count=2623, sum= 13115, min= 5, average= 5,00, max= 5, maxTarget=Array.prototype.push
Data references : count=2623, sum= 185670, min= 9, average= 70,79, max= 4153, maxTarget=createPrinter
--------------------------- :
Compilation Tier 2 :
Compilation Rate : 141825,62 bytes/second
Truffle Tier Rate : 397107,38 bytes/second
Graal Tier Rate : 243094,58 bytes/second
Installation Rate : 4122848,22 bytes/second
Time for compilation (us) : count= 123, sum= 11418247, min= 745, average= 92831,28, max= 456707, maxTarget=checkSignatureDeclaration
Truffle Tier (us) : count= 120, sum= 4077990, min= 288, average= 33983,25, max= 152158, maxTarget=WhileNode$WhileDoRepeatingNode@72d2eaac typescript-polybench.js:16129~ 'utf8ToBytes'<OSR>
Graal Tier (us) : count= 120, sum= 6661604, min= 310, average= 55513,37, max= 343028, maxTarget=checkSignatureDeclaration
Code Installation (us) : count= 120, sum= 392786, min= 42, average= 3273,22, max= 68576, maxTarget=JSArrayBufferView.@52bcd02
Graal node count :
After Truffle Tier : count= 125, sum= 331695, min= 13, average= 2653,56, max= 11989, maxTarget=checkSignatureDeclaration
After Graal Tier : count= 120, sum= 432639, min= 46, average= 3605,33, max= 19199, maxTarget=checkSignatureDeclaration
Graal compilation result :
Code size : count= 120, sum= 1619400, min= 192, average= 13495,00, max= 72744, maxTarget=checkSignatureDeclaration
Total frame size : count= 120, sum= 25728, min= 48, average= 214,40, max= 768, maxTarget=bindEach <split-2459>
Exception handlers : count= 120, sum= 1080, min= 0, average= 9,00, max= 84, maxTarget=checkSignatureDeclaration
Infopoints : count= 120, sum= 7236, min= 3, average= 60,30, max= 401, maxTarget=checkSignatureDeclaration
CALL : count= 120, sum= 7236, min= 3, average= 60,30, max= 401, maxTarget=checkSignatureDeclaration
Marks : count= 120, sum= 1116, min= 5, average= 9,30, max= 35, maxTarget=checkIndexConstraintForProperty
Data references : count= 120, sum= 25252, min= 4, average= 210,43, max= 1369, maxTarget=checkSignatureDeclaration
--engine.CompilationStatisticDetails
命令除了打印之前的编译统计信息外,还打印各个 Graal 节点的直方图信息
Graal nodes after Truffle tier :
FrameState : count= 168, sum= 35502, min= 1, average= 211.32, max= 2048, maxTarget=deltaBlue
FixedGuardNode : count= 168, sum= 18939, min= 0, average= 112.73, max= 1048, maxTarget=change
LoadFieldNode : count= 168, sum= 14432, min= 0, average= 85.90, max= 814, maxTarget=EditConstraint
...
Graal nodes after Graal tier :
BeginNode : count= 166, sum= 33333, min= 0, average= 200.80, max= 2110, maxTarget=change
FrameState : count= 166, sum= 30591, min= 0, average= 184.28, max= 2393, maxTarget=MeasureDefault
AMD64AddressNode : count= 166, sum= 20072, min= 0, average= 120.92, max= 1960, maxTarget=MeasureDefault
...
--engine.TraceMethodExpansion=truffleTier
命令打印所有扩展 Java 方法的树,并在每次编译后显示统计信息
[engine] Expansion tree for test after truffleTier:
Name Frequency | Count Size Cycles Ifs Loops Invokes Allocs | Self Count Size Cycles Ifs Loops Invokes Allocs | IRNode ASTNode Lang:File:Line:Chars
<root> 1.00 | 64 72 42 1 1 0 1 | 34 20 0 0 0 0 0 | -
OptimizedCallTarget.profiledPERoot(Object) 1.00 | 30 52 42 1 1 0 1 | 1 2 2 0 0 0 0 | 121
OptimizedCallTarget.injectArgumentsProfile(Object) 1.00 | 9 19 16 0 0 0 0 | 4 3 0 0 0 0 0 | 5
OptimizedCallTarget.unsafeCast(Object, Class, Z, Z, Z) 1.00 | 1 0 0 0 0 0 0 | 1 0 0 0 0 0 0 | 10
OptimizedCallTarget.castArgumentsImpl(Object, Class) 1.00 | 4 16 16 0 0 0 0 | 4 16 16 0 0 0 0 | 12
OptimizedCallTarget.executeRootNode(VirtualFrame) 1.00 | 20 31 24 1 1 0 1 | 0 0 0 0 0 0 0 |
JavaScriptRealmBoundaryRootNode.execute(VirtualFrame) 1.00 | 20 31 24 1 1 0 1 | 1 1 0 0 0 0 0 | 34 0 js:test.js:1:0-100
JavaScriptRealmBoundaryRootNode.getRealm() 1.00 | 1 1 0 0 0 0 0 | 0 0 0 0 0 0 0 | 0 js:test.js:1:0-100
JSContext.getRealm() 1.00 | 1 1 0 0 0 0 0 | 0 0 0 0 0 0 0 |
PolyglotReferences$AssumeSingleContext.get() 1.00 | 1 1 0 0 0 0 0 | 0 0 0 0 0 0 0 |
PolyglotReferences$WeakSingleContext.get() 1.00 | 1 1 0 0 0 0 0 | 1 1 0 0 0 0 0 | 37
FunctionRootNode.executeInRealm(VirtualFrame) 1.00 | 18 29 24 1 1 0 1 | 1 1 0 0 0 0 0 | 41 0 js:test.js:1:0-100
FunctionBodyNode.execute(VirtualFrame) 1.00 | 17 28 24 1 1 0 1 | 0 0 0 0 0 0 0 | 1 js:test.js:1:0-100
AbstractBlockNode.execute(VirtualFrame) 1.00 | 17 28 24 1 1 0 1 | 0 0 0 0 0 0 0 |
AbstractBlockNode.executeVoid(VirtualFrame, JavaScriptNode, I, I) 1.00 | 16 20 16 1 1 0 0 | 1 1 0 0 0 0 0 | 50 2 js:test.js:1:16-100
WhileNode.executeVoid(VirtualFrame) 1.00 | 15 19 16 1 1 0 0 | 0 0 0 0 0 0 0 | 7 js:test.js:3:35-84
OptimizedOSRLoopNode.execute(VirtualFrame) 101.00 | 15 19 16 1 1 0 0 | 4 3 2 0 1 0 0 | 46 8 js:test.js:3:35-84
RepeatingNode.executeRepeatingWithValue(VirtualFrame) 1.00 | 11 16 14 1 0 0 0 | 1 1 0 0 0 0 0 | 100 9 js:test.js:3:35-84
WhileNode$WhileDoRepeatingNode.executeRepeating(VirtualFrame) 101.00 | 10 15 14 1 0 0 0 | 3 2 1 1 0 0 0 | 63 9 js:test.js:3:35-84
AbstractRepeatingNode.executeCondition(VirtualFrame) 101.00 | 1 1 1 0 0 0 0 | 0 0 0 0 0 0 0 | 9 js:test.js:3:35-84
StatementNode.executeConditionAsBoolean(VirtualFrame, JavaScriptNode) 101.00 | 1 1 1 0 0 0 0 | 0 0 0 0 0 0 0 |
JSLessThanNodeGen.executeBoolean(VirtualFrame) 101.00 | 1 1 1 0 0 0 0 | 0 0 0 0 0 0 0 | 10 js:test.js:3:51-58
JSLessThanNodeGen.executeBoolean_int_int0(VirtualFrame, J) 101.00 | 1 1 1 0 0 0 0 | 0 0 0 0 0 0 0 |
JSLessThanNode.doInt(I, I) 101.00 | 1 1 1 0 0 0 0 | 1 1 1 0 0 0 0 | 59 10 js:test.js:3:51-58
AbstractRepeatingNode.executeBody(VirtualFrame) 101.00 | 6 12 12 0 0 0 0 | 0 0 0 0 0 0 0 | 9 js:test.js:3:35-84
AbstractBlockNode.executeVoid(VirtualFrame) 101.00 | 6 12 12 0 0 0 0 | 0 0 0 0 0 0 0 |
AbstractBlockNode.executeVoid(VirtualFrame, JavaScriptNode, I, I) 101.00 | 6 12 12 0 0 0 0 | 0 0 0 0 0 0 0 | 13 js:test.js:3:35-84
JSWriteCurrentFrameSlotNodeGen.executeVoid(VirtualFrame) 101.00 | 6 12 12 0 0 0 0 | 0 0 0 0 0 0 0 | 14 js:test.js:4:71-79
JSWriteCurrentFrameSlotNodeGen.executeInt(VirtualFrame) 101.00 | 3 6 6 0 0 0 0 | 0 0 0 0 0 0 0 | 14 js:test.js:4:71-79
JSAddNodeGen.executeInt(VirtualFrame) 101.00 | 3 6 6 0 0 0 0 | 0 0 0 0 0 0 0 |
JSAddNode.doInt(I, I) 101.00 | 3 6 6 0 0 0 0 | 0 0 0 0 0 0 0 |
Math.addExact(I, I) 100.00 | 3 6 6 0 0 0 0 | 3 6 6 0 0 0 0 | 75 15 js:test.js:4:71-74
LocalVarPostfixIncNodeGen.executeInt(VirtualFrame) 101.00 | 3 6 6 0 0 0 0 | 0 0 0 0 0 0 0 | 18 js:test.js:3:60-63
LocalVarPostfixIncNode.doInt(Frame) 101.00 | 3 6 6 0 0 0 0 | 0 0 0 0 0 0 0 | 18 js:test.js:3:60-63
LocalVarIncNode$IncOp.doInt(I) 101.00 | 3 6 6 0 0 0 0 | 0 0 0 0 0 0 0 |
Math.addExact(I, I) 100.00 | 3 6 6 0 0 0 0 | 3 6 6 0 0 0 0 | 85
AbstractBlockNode.executeGeneric(VirtualFrame, JavaScriptNode, I, I) 1.00 | 1 8 8 0 0 0 1 | 0 0 0 0 0 0 0 | 2 js:test.js:1:16-100
ReturnNode$TerminalPositionReturnNode.execute(VirtualFrame) 1.00 | 1 8 8 0 0 0 1 | 0 0 0 0 0 0 0 | 20 js:test.js:6:87-98
JSReadCurrentFrameSlotNodeGen.execute(VirtualFrame) 1.00 | 1 8 8 0 0 0 1 | 0 0 0 0 0 0 0 |
Integer.valueOf(I) 1.00 | 1 8 8 0 0 0 1 | 1 8 8 0 0 0 1 | 139 21 js:test.js:6:94-97
--engine.TraceNodeExpansion=truffleTier
命令打印所有扩展 Truffle 节点的树,并在每次编译后显示统计信息。此视图按节点 ID 对方法扩展树进行分组
[engine] Expansion tree for test after truffleTier:
Name Frequency | Count Size Cycles Ifs Loops Invokes Allocs | Self Count Size Cycles Ifs Loops Invokes Allocs | IRNode ASTNode Lang:File:Line:Chars
<call-root> 1.00 | 64 72 42 1 1 0 1 | 44 41 18 0 0 0 0 | 0
FunctionRootNode 1.00 | 20 31 24 1 1 0 1 | 3 3 0 0 0 0 0 | 34 0 js:test.js:1:0-100
FunctionBodyNode 1.00 | 17 28 24 1 1 0 1 | 0 0 0 0 0 0 0 | 1 js:test.js:1:0-100
ExprBlockNode 1.00 | 17 28 24 1 1 0 1 | 1 1 0 0 0 0 0 | 50 2 js:test.js:1:16-100
WhileNode 1.00 | 15 19 16 1 1 0 0 | 0 0 0 0 0 0 0 | 7 js:test.js:3:35-84
OptimizedOSRLoopNode$OptimizedDefaultOSRLoopNode 101.00 | 15 19 16 1 1 0 0 | 4 3 2 0 1 0 0 | 46 8 js:test.js:3:35-84
WhileNode$WhileDoRepeatingNode 101.00 | 11 16 14 1 0 0 0 | 4 3 1 1 0 0 0 | 100 9 js:test.js:3:35-84
JSLessThanNodeGen 101.00 | 1 1 1 0 0 0 0 | 1 1 1 0 0 0 0 | 59 10 js:test.js:3:51-58
VoidBlockNode 101.00 | 6 12 12 0 0 0 0 | 0 0 0 0 0 0 0 | 13 js:test.js:3:35-84
JSWriteCurrentFrameSlotNodeGen 101.00 | 6 12 12 0 0 0 0 | 0 0 0 0 0 0 0 | 14 js:test.js:4:71-79
JSAddNodeGen 100.00 | 3 6 6 0 0 0 0 | 3 6 6 0 0 0 0 | 75 15 js:test.js:4:71-74
LocalVarPostfixIncNodeGen 100.00 | 3 6 6 0 0 0 0 | 3 6 6 0 0 0 0 | 85 18 js:test.js:3:60-63
ReturnNode$TerminalPositionReturnNode 1.00 | 1 8 8 0 0 0 1 | 0 0 0 0 0 0 0 | 20 js:test.js:6:87-98
JSReadCurrentFrameSlotNodeGen 1.00 | 1 8 8 0 0 0 1 | 1 8 8 0 0 0 1 | 139 21 js:test.js:6:94-97
--engine.MethodExpansionStatistics=truffleTier
命令在运行结束时打印部分评估期间扩展 Java 方法的统计信息。这对于检测意外生成过多或某些 Graal 节点的代码很有用
[engine] Method expansion statistics after truffleTier:
Name Count IR Nodes (min avg max) Size (min avg max) Cycles (min avg max) Ifs Loops Invokes Allocs | Max IRNode ASTNode Unit:Lang:File:Line:Chars
<no-source-position> 1 212 (212 212.0 212) 117 (117 117.0 117) 0 (0 0.0 0) 0 0 0 0 | 0 mandelbrot
OptimizedOSRLoopNode.execute(VirtualFrame) 4 13 (0 3.3 5) 9 (0 2.3 3) 6 (0 1.5 2) 0 3 0 0 | 172 60 mandelbrot:js:mandelbrot.js:68:2589-2888
Math.addExact(I, I) 4 12 (3 3.0 3) 24 (6 6.0 6) 24 (6 6.0 6) 0 0 0 0 | 485 103 mandelbrot:js:mandelbrot.js:80:2874-2875
WhileNode$WhileDoRepeatingNode.executeRepeating(VirtualFrame) 4 9 (0 2.3 3) 6 (0 1.5 2) 3 (0 0.8 1) 3 0 0 0 | 88 17 mandelbrot:js:mandelbrot.js:57:2374-3431
JSTypes.intToDouble(I) 7 7 (1 1.0 1) 7 (1 1.0 1) 24 (0 3.4 8) 0 0 0 0 | 144 41 mandelbrot:js:mandelbrot.js:62:2478-2486
OptimizedCallTarget.castArgumentsImpl(Object, Class) 1 7 (7 7.0 7) 25 (25 25.0 25) 24 (24 24.0 24) 0 0 0 0 | 12 mandelbrot
JSWriteCurrentFrameSlotNodeGen.executeVoid(VirtualFrame) 6 6 (0 1.0 3) 4 (0 0.7 2) 2 (0 0.3 1) 2 0 0 0 | 563 46 mandelbrot:js:mandelbrot.js:64:2519-2544
AbstractBlockNode.executeVoid(VirtualFrame, JavaScriptNode, I, I) 8 6 (0 0.8 4) 6 (0 0.8 4) 0 (0 0.0 0) 0 0 0 0 | 177 39 mandelbrot:js:mandelbrot.js:61:2459-3416
Math.multiplyExact(I, I) 4 6 (1 1.5 2) 12 (2 3.0 4) 20 (4 5.0 6) 0 0 0 0 | 155 49 mandelbrot:js:mandelbrot.js:64:2529-2534
OptimizedCallTarget.injectArgumentsProfile(Object) 1 4 (4 4.0 4) 3 (3 3.0 3) 0 (0 0.0 0) 0 0 0 0 | 5 mandelbrot
JSMultiplyNode.doDouble(D, D) 4 4 (1 1.0 1) 4 (1 1.0 1) 8 (2 2.0 2) 0 0 0 0 | 280 75 mandelbrot:js:mandelbrot.js:70:2657-2663
IfNode.executeVoid(VirtualFrame) 3 3 (0 1.0 3) 2 (0 0.7 2) 1 (0 0.3 1) 1 0 0 0 | 606 126 mandelbrot:js:mandelbrot.js:93:3240-3397
Math.subtractExact(I, I) 1 3 (3 3.0 3) 6 (6 6.0 6) 6 (6 6.0 6) 0 0 0 0 | 589 129 mandelbrot:js:mandelbrot.js:93:3249-3257
JSSubtractNode.doDouble(D, D) 3 3 (1 1.0 1) 3 (1 1.0 1) 3 (1 1.0 1) 0 0 0 0 | 167 47 mandelbrot:js:mandelbrot.js:64:2528-2544
JSLessThanNode.doInt(I, I) 3 3 (1 1.0 1) 3 (1 1.0 1) 3 (1 1.0 1) 0 0 0 0 | 187 62 mandelbrot:js:mandelbrot.js:68:2596-2602
JSAddNode.doDouble(D, D) 3 3 (1 1.0 1) 3 (1 1.0 1) 3 (1 1.0 1) 0 0 0 0 | 263 67 mandelbrot:js:mandelbrot.js:69:2623-2639
JSDivideNode.doDouble(D, D) 2 2 (1 1.0 1) 2 (1 1.0 1) 64 (32 32.0 32) 0 0 0 0 | 165 48 mandelbrot:js:mandelbrot.js:64:2528-2540
JSBitwiseXorNode.doInteger(I, I) 2 2 (1 1.0 1) 2 (1 1.0 1) 2 (1 1.0 1) 0 0 0 0 | 575 119 mandelbrot:js:mandelbrot.js:90:3170-3173
JSEqualNode.doInt(I, I) 2 2 (1 1.0 1) 2 (1 1.0 1) 2 (1 1.0 1) 0 0 0 0 | 592 127 mandelbrot:js:mandelbrot.js:93:3244-3257
RepeatingNode.executeRepeatingWithValue(VirtualFrame) 4 1 (0 0.3 1) 1 (0 0.3 1) 0 (0 0.0 0) 0 0 0 0 | 499 61 mandelbrot:js:mandelbrot.js:68:2589-2888
FunctionRootNode.executeInRealm(VirtualFrame) 1 1 (1 1.0 1) 1 (1 1.0 1) 0 (0 0.0 0) 0 0 0 0 | 53 0 mandelbrot:js:mandelbrot.js:50:2279-3447
OptimizedCallTarget.profiledPERoot(Object) 1 1 (1 1.0 1) 2 (2 2.0 2) 2 (2 2.0 2) 0 0 0 0 | 737 mandelbrot
PolyglotReferences$WeakSingleContext.get() 1 1 (1 1.0 1) 1 (1 1.0 1) 0 (0 0.0 0) 0 0 0 0 | 41 mandelbrot
JSLeftShiftNode.doInteger(I, I) 1 1 (1 1.0 1) 1 (1 1.0 1) 1 (1 1.0 1) 0 0 0 0 | 619 134 mandelbrot:js:mandelbrot.js:94:3269-3277
Integer.intValue() 1 1 (1 1.0 1) 2 (2 2.0 2) 2 (2 2.0 2) 0 0 0 0 | 50 4 mandelbrot:js:1:0-0
JSSubtractNode.doInt(I, I) 2 1 (0 0.5 1) 1 (0 0.5 1) 1 (0 0.5 1) 0 0 0 0 | 940 136 mandelbrot:js:mandelbrot.js:94:3282-3295
JSLeftShiftConstantNode.doInteger(I) 1 1 (1 1.0 1) 1 (1 1.0 1) 1 (1 1.0 1) 0 0 0 0 | 527 107 mandelbrot:js:mandelbrot.js:83:2907-2922
JSSubtractNodeGen.executeDouble(VirtualFrame) 3 1 (0 0.3 1) 1 (0 0.3 1) 0 (0 0.0 0) 0 0 0 0 | 33 47 mandelbrot:js:mandelbrot.js:64:2528-2544
JSReadCurrentFrameSlotNodeGen.executeInt(VirtualFrame) 1 1 (1 1.0 1) 1 (1 1.0 1) 0 (0 0.0 0) 0 0 0 0 | 74 19 mandelbrot:js:mandelbrot.js:57:2381-2382
Integer.valueOf(I) 1 1 (1 1.0 1) 8 (8 8.0 8) 8 (8 8.0 8) 0 0 0 1 | 939 154 mandelbrot:js:mandelbrot.js:105:3442-3445
JSBitwiseOrNode.doInteger(I, I) 1 1 (1 1.0 1) 1 (1 1.0 1) 1 (1 1.0 1) 0 0 0 0 | 532 106 mandelbrot:js:mandelbrot.js:83:2907-2931
JSGreaterThanNode.doDouble(D, D) 1 1 (1 1.0 1) 1 (1 1.0 1) 2 (2 2.0 2) 0 0 0 0 | 461 93 mandelbrot:js:mandelbrot.js:76:2800-2815
OptimizedCallTarget.unsafeCast(Object, Class, Z, Z, Z) 1 1 (1 1.0 1) 0 (0 0.0 0) 0 (0 0.0 0) 0 0 0 0 | 10 mandelbrot
JavaScriptRealmBoundaryRootNode.execute(VirtualFrame) 1 1 (1 1.0 1) 1 (1 1.0 1) 0 (0 0.0 0) 0 0 0 0 | 38 mandelbrot
JSLeftShiftConstantNodeGen.executeInt(VirtualFrame) 1 1 (1 1.0 1) 1 (1 1.0 1) 0 (0 0.0 0) 0 0 0 0 | 36 107 mandelbrot:js:mandelbrot.js:83:2907-2922
DualNode.execute(VirtualFrame) 1 0 (0 0.0 0) 0 (0 0.0 0) 0 (0 0.0 0) 0 0 0 0 | 2 mandelbrot:js:mandelbrot.js:50:2279-3447
...
--engine.NodeExpansionStatistics=truffleTier
命令在运行结束时打印部分评估期间扩展 Truffle 节点的统计信息。这对于检测意外生成过多或某些 Graal 节点的代码很有用。它还显示了在编译期间观察到的各个特化组合
[engine] Node expansion statistics after truffleTier:
Name Count IR Nodes (min avg max) Size (min avg max) Cycles (min avg max) Ifs Loops Invokes Allocs | Max IRNode ASTNode Unit:Lang:File:Line:Chars
<call-root> 1 226 (226 226.0 226) 148 (148 148.0 148) 26 (26 26.0 26) 0 0 0 0 | 0 mandelbrot
OptimizedOSRLoopNode$OptimizedDefaultOSRLoopNode 4 13 (0 3.3 5) 9 (0 2.3 3) 6 (0 1.5 2) 0 3 0 0 | 172 60 mandelbrot:js:mandelbrot.js:68:2589-2888
JSAddConstantRightNumberNodeGen 4 12 (3 3.0 3) 24 (6 6.0 6) 24 (6 6.0 6) 0 0 0 0 | 485 103 mandelbrot:js:mandelbrot.js:80:2874-2875
[doInt(I)] 4 12 (3 3.0 3) 24 (6 6.0 6) 24 (6 6.0 6) 0 0 0 0 | 485 103 mandelbrot:js:mandelbrot.js:80:2874-2875
JSMultiplyNodeGen 6 11 (1 1.8 3) 17 (1 2.8 6) 28 (2 4.7 10) 0 0 0 0 | 155 49 mandelbrot:js:mandelbrot.js:64:2529-2534
[doIntALargerZero(I, I), doIntBLargerZero(I, I)] 2 6 (3 3.0 3) 12 (6 6.0 6) 20 (10 10.0 10) 0 0 0 0 | 155 49 mandelbrot:js:mandelbrot.js:64:2529-2534
[doDouble(D, D)] 4 5 (1 1.3 2) 5 (1 1.3 2) 8 (2 2.0 2) 0 0 0 0 | 275 75 mandelbrot:js:mandelbrot.js:70:2657-2663
WhileNode$WhileDoRepeatingNode 4 10 (0 2.5 4) 7 (0 1.8 3) 3 (0 0.8 1) 3 0 0 0 | 499 61 mandelbrot:js:mandelbrot.js:68:2589-2888
JSSubtractNodeGen 5 9 (1 1.8 3) 12 (1 2.4 6) 10 (1 2.0 6) 0 0 0 0 | 589 129 mandelbrot:js:mandelbrot.js:93:3249-3257
[doDouble(D, D)] 3 5 (1 1.7 2) 5 (1 1.7 2) 3 (1 1.0 1) 0 0 0 0 | 33 47 mandelbrot:js:mandelbrot.js:64:2528-2544
[doInt(I, I)] 2 4 (1 2.0 3) 7 (1 3.5 6) 7 (1 3.5 6) 0 0 0 0 | 589 129 mandelbrot:js:mandelbrot.js:93:3249-3257
JSWriteCurrentFrameSlotNodeGen 18 7 (0 0.4 3) 5 (0 0.3 2) 2 (0 0.1 1) 2 0 0 0 | 563 46 mandelbrot:js:mandelbrot.js:64:2519-2544
JSDivideNodeGen 2 5 (2 2.5 3) 5 (2 2.5 3) 88 (40 44.0 48) 0 0 0 0 | 158 48 mandelbrot:js:mandelbrot.js:64:2528-2540
[doDouble(D, D)] 2 5 (2 2.5 3) 5 (2 2.5 3) 88 (40 44.0 48) 0 0 0 0 | 158 48 mandelbrot:js:mandelbrot.js:64:2528-2540
VoidBlockNode 7 5 (0 0.7 4) 5 (0 0.7 4) 0 (0 0.0 0) 0 0 0 0 | 177 39 mandelbrot:js:mandelbrot.js:61:2459-3416
JSAddNodeGen 3 3 (1 1.0 1) 3 (1 1.0 1) 3 (1 1.0 1) 0 0 0 0 | 263 67 mandelbrot:js:mandelbrot.js:69:2623-2639
[doDouble(D, D)] 3 3 (1 1.0 1) 3 (1 1.0 1) 3 (1 1.0 1) 0 0 0 0 | 263 67 mandelbrot:js:mandelbrot.js:69:2623-2639
JSLessThanNodeGen 3 3 (1 1.0 1) 3 (1 1.0 1) 3 (1 1.0 1) 0 0 0 0 | 187 62 mandelbrot:js:mandelbrot.js:68:2596-2602
[doInt(I, I)] 3 3 (1 1.0 1) 3 (1 1.0 1) 3 (1 1.0 1) 0 0 0 0 | 187 62 mandelbrot:js:mandelbrot.js:68:2596-2602
IfNode 4 3 (0 0.8 3) 2 (0 0.5 2) 1 (0 0.3 1) 1 0 0 0 | 606 126 mandelbrot:js:mandelbrot.js:93:3240-3397
JSGreaterThanNodeGen 1 2 (2 2.0 2) 2 (2 2.0 2) 2 (2 2.0 2) 0 0 0 0 | 460 93 mandelbrot:js:mandelbrot.js:76:2800-2815
[doDouble(D, D)] 1 2 (2 2.0 2) 2 (2 2.0 2) 2 (2 2.0 2) 0 0 0 0 | 460 93 mandelbrot:js:mandelbrot.js:76:2800-2815
JSBitwiseXorNodeGen 2 2 (1 1.0 1) 2 (1 1.0 1) 2 (1 1.0 1) 0 0 0 0 | 575 119 mandelbrot:js:mandelbrot.js:90:3170-3173
[doInteger(I, I)] 2 2 (1 1.0 1) 2 (1 1.0 1) 2 (1 1.0 1) 0 0 0 0 | 575 119 mandelbrot:js:mandelbrot.js:90:3170-3173
JSLeftShiftConstantNodeGen 1 2 (2 2.0 2) 2 (2 2.0 2) 1 (1 1.0 1) 0 0 0 0 | 36 107 mandelbrot:js:mandelbrot.js:83:2907-2922
[doInteger(I)] 1 2 (2 2.0 2) 2 (2 2.0 2) 1 (1 1.0 1) 0 0 0 0 | 36 107 mandelbrot:js:mandelbrot.js:83:2907-2922
JSReadCurrentFrameSlotNodeGen 2 2 (1 1.0 1) 9 (1 4.5 8) 8 (0 4.0 8) 0 0 0 1 | 74 19 mandelbrot:js:mandelbrot.js:57:2381-2382
JSEqualNodeGen 2 2 (1 1.0 1) 2 (1 1.0 1) 2 (1 1.0 1) 0 0 0 0 | 592 127 mandelbrot:js:mandelbrot.js:93:3244-3257
[doInt(I, I)] 2 2 (1 1.0 1) 2 (1 1.0 1) 2 (1 1.0 1) 0 0 0 0 | 592 127 mandelbrot:js:mandelbrot.js:93:3244-3257
FunctionRootNode 1 2 (2 2.0 2) 2 (2 2.0 2) 0 (0 0.0 0) 0 0 0 0 | 53 0 mandelbrot:js:mandelbrot.js:50:2279-3447
ExprBlockNode 1 1 (1 1.0 1) 1 (1 1.0 1) 0 (0 0.0 0) 0 0 0 0 | 69 5 mandelbrot:js:mandelbrot.js:50:2305-3447
JSBitwiseOrNodeGen 1 1 (1 1.0 1) 1 (1 1.0 1) 1 (1 1.0 1) 0 0 0 0 | 532 106 mandelbrot:js:mandelbrot.js:83:2907-2931
[doInteger(I, I)] 1 1 (1 1.0 1) 1 (1 1.0 1) 1 (1 1.0 1) 0 0 0 0 | 532 106 mandelbrot:js:mandelbrot.js:83:2907-2931
AccessIndexedArgumentNode 1 1 (1 1.0 1) 2 (2 2.0 2) 2 (2 2.0 2) 0 0 0 0 | 50 4 mandelbrot:js:1:0-0
JSLeftShiftNodeGen 2 1 (0 0.5 1) 1 (0 0.5 1) 1 (0 0.5 1) 0 0 0 0 | 619 134 mandelbrot:js:mandelbrot.js:94:3269-3277
[doInteger(I, I)] 1 1 (1 1.0 1) 1 (1 1.0 1) 1 (1 1.0 1) 0 0 0 0 | 619 134 mandelbrot:js:mandelbrot.js:94:3269-3277
<unknown> 1 0 (0 0.0 0) 0 (0 0.0 0) 0 (0 0.0 0) 0 0 0 0 | 134 mandelbrot:js:mandelbrot.js:94:3269-3277
FunctionBodyNode 1 0 (0 0.0 0) 0 (0 0.0 0) 0 (0 0.0 0) 0 0 0 0 | 1 mandelbrot:js:mandelbrot.js:50:2279-3447
ReturnNode$TerminalPositionReturnNode 1 0 (0 0.0 0) 0 (0 0.0 0) 0 (0 0.0 0) 0 0 0 0 | 153 mandelbrot:js:mandelbrot.js:105:3435-3445
DualNode 1 0 (0 0.0 0) 0 (0 0.0 0) 0 (0 0.0 0) 0 0 0 0 | 2 mandelbrot:js:mandelbrot.js:50:2279-3447
WhileNode 4 0 (0 0.0 0) 0 (0 0.0 0) 0 (0 0.0 0) 0 0 0 0 | 15 mandelbrot:js:mandelbrot.js:57:2374-3431
DirectBreakTargetNode 2 0 (0 0.0 0) 0 (0 0.0 0) 0 (0 0.0 0) 0 0 0 0 | 14 mandelbrot:js:mandelbrot.js:50:2305-3447
--engine.InstrumentBoundaries
命令会在进程结束时打印从编译代码进行的运行时调用(@TruffleBoundary
)的相关信息。这些调用会导致对象逃逸(对进一步优化来说是黑盒),通常应该尽量减少。另请参阅分支插桩指南,了解更多关于插桩分支和边界的信息。
Execution profile (sorted by hotness)
=====================================
0: *******************************************************************************
1:
com.oracle.truffle.js.nodes.binary.JSAddNode.doStringInt(JSAddNode.java:177) [bci: 2]
[0] count = 22525269
com.oracle.truffle.js.builtins.ConstructorBuiltins$ConstructDateNode.constructDateZero(ConstructorBuiltins.java:837) [bci: 6]
[1] count = 69510
--engine.InstrumentBranches
命令会在进程结束时打印编译代码中分支使用情况的信息。
Execution profile (sorted by hotness)
=====================================
2: ***************
1: **************
5: *************
4: ************
3: *********
10: **
8: *
9: *
14: *
...
com.oracle.truffle.js.nodes.access.PropertyGetNode.getValueOrDefault(PropertyGetNode.java:301) [bci: 55]
[2] state = BOTH(if=36486564#, else=44603498#)
com.oracle.truffle.js.nodes.control.IfNode.execute(IfNode.java:158) [bci: 12]
[1] state = BOTH(if=72572593#, else=1305851#)
com.oracle.truffle.js.nodes.function.JSFunctionCallNode.executeCall(JSFunctionCallNode.java:233) [bci: 18]
[5] state = BOTH(if=38703322#, else=32550439#)
com.oracle.truffle.js.nodes.access.PropertyCacheNode$PrototypeShapeCheckNode.accept(PropertyCacheNode.java:364) [bci: 4]
[4] state = ELSE(if=0#, else=64094316#)
com.oracle.truffle.js.nodes.control.WhileNode$WhileDoRepeatingNode.executeRepeating(WhileNode.java:230) [bci: 5]
[3] state = BOTH(if=44392142#, else=7096299#)
...
--engine.SpecializationStatistics
命令会打印关于节点类及其 Truffle DSL 特化使用情况的详细直方图。有关如何使用它的教程,请参阅特化统计信息
。
注意:特化统计信息需要重新编译解释器。
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------
| Name Instances Executions Executions per instance
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------
| JSWriteCurrentFrameSlotNodeGen 8 (17%) 18 (12%) Min= 1 Avg= 2.25 Max= 5 MaxNode= test.js~5-7:76-128
| doBoolean <boolean> 1 (13%) 1 (6%) Min= 1 Avg= 1.00 Max= 1 MaxNode= test.js~4:52-71
| doInt <int> 1 (13%) 1 (6%) Min= 1 Avg= 1.00 Max= 1 MaxNode= test.js~5-7:76-128
| doSafeIntegerInt 0 (0%) 0 (0%) Min= 0 Avg= 0.00 Max= 0 MaxNode= -
| doSafeInteger 0 (0%) 0 (0%) Min= 0 Avg= 0.00 Max= 0 MaxNode= -
| doLong 0 (0%) 0 (0%) Min= 0 Avg= 0.00 Max= 0 MaxNode= -
| doDouble 0 (0%) 0 (0%) Min= 0 Avg= 0.00 Max= 0 MaxNode= -
| doObject 7 (88%) 16 (89%) Min= 1 Avg= 2.29 Max= 5 MaxNode= test.js~5-7:76-128
| <DynamicObjectBasic> 6 (86%) 12 (75%) Min= 1 Avg= 2.00 Max= 5 MaxNode= test.js~5-7:76-128
| <IteratorRecord> 1 (14%) 1 (6%) Min= 1 Avg= 1.00 Max= 1 MaxNode= test.js~1-8:16-130
| <String> 2 (29%) 2 (13%) Min= 1 Avg= 1.00 Max= 1 MaxNode= test.js~5-7:76-128
| <Integer> 1 (14%) 1 (6%) Min= 1 Avg= 1.00 Max= 1 MaxNode= test.js~6:105-123
| --------------------------------------------------------------------------------------------------------------------------------------------------------------------
| [doBoolean] 1 (13%) 1 (6%) Min= 1 Avg= 1.00 Max= 1 MaxNode= test.js~4:52-71
| [doInt, doObject] 1 (13%) 4 (22%) Min= 4 Avg= 4.00 Max= 4 MaxNode= test.js~5-7:76-128
| doInt 1 (100%) 1 (25%) Min= 1 Avg= 1.00 Max= 1 MaxNode= test.js~5-7:76-128
| doObject 1 (100%) 3 (75%) Min= 3 Avg= 3.00 Max= 3 MaxNode= test.js~5-7:76-128
| [doObject] 6 (75%) 13 (72%) Min= 1 Avg= 2.17 Max= 5 MaxNode= test.js~5-7:76-128
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------
| Name Instances Executions Executions per instance
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------
| JSReadCurrentFrameSlotNodeGen 8 (17%) 25 (17%) Min= 1 Avg= 3.13 Max= 5 MaxNode= test.js~5-7:76-128
| doBoolean 0 (0%) 0 (0%) Min= 0 Avg= 0.00 Max= 0 MaxNode= -
| doInt <no-args> 1 (13%) 1 (4%) Min= 1 Avg= 1.00 Max= 1 MaxNode= test.js~5:81-87
| doDouble 0 (0%) 0 (0%) Min= 0 Avg= 0.00 Max= 0 MaxNode= -
| doObject <no-args> 8 (100%) 24 (96%) Min= 1 Avg= 3.00 Max= 5 MaxNode= test.js~5-7:76-128
| doSafeInteger 0 (0%) 0 (0%) Min= 0 Avg= 0.00 Max= 0 MaxNode= -
| --------------------------------------------------------------------------------------------------------------------------------------------------------------------
| [doInt, doObject] 1 (13%) 4 (16%) Min= 4 Avg= 4.00 Max= 4 MaxNode= test.js~5:81-87
| doInt 1 (100%) 1 (25%) Min= 1 Avg= 1.00 Max= 1 MaxNode= test.js~5:81-87
| doObject 1 (100%) 3 (75%) Min= 3 Avg= 3.00 Max= 3 MaxNode= test.js~5:81-87
| [doObject] 7 (88%) 21 (84%) Min= 1 Avg= 3.00 Max= 5 MaxNode= test.js~5-7:76-128
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------
控制编译内容 #
为了充分利用前面的选项,请将编译限制在您感兴趣的方法上。
-
--engine.CompileOnly=foo
将编译限制为名称中包含foo
的方法。将其与返回值或采用参数结合使用,以避免代码被编译掉。 -
--engine.CompileImmediately
会在方法运行后立即编译它们。 -
--engine.BackgroundCompilation=false
同步编译,这可以简化操作。 -
--compiler.Inlining=false
禁用内联,这可以使代码更容易理解。 -
--engine.OSR=false
禁用栈上替换(例如,while
循环体的编译),这可以使代码更容易理解。 -
--engine.Compilation=false
完全关闭 Truffle 编译。
调试逆优化 #
有时编译后的代码会意外地逆优化(从 Truffle 编译代码返回到解释器)。以下是一些调查代码为何逆优化的方法。避免重复逆优化对于性能至关重要。
--engine.TraceCompilation
选项会显示带有 [engine] opt deopt
前缀的逆优化,这对于评估是否发生多次逆优化很有用。但是,它没有显示其他详细信息。
使用 FrameInstance#getFrame(READ_WRITE|MATERIALIZE)
从堆栈中实现帧会导致逆优化(但不失效)。可以使用 --engine.TraceDeoptimizeFrame
跟踪这些逆优化。
使用原生镜像时,需要使用 -H:+IncludeNodeSourcePositions
构建原生镜像,才能启用逆优化的堆栈跟踪。默认情况下禁用此功能以节省镜像大小。
在原生镜像上,--engine.TraceTransferToInterpreter
会为任何逆优化打印准确的堆栈跟踪,它实际上与 --vm.XX:+TraceDeoptimization --engine.NodeSourcePositions
相同。由于有堆栈跟踪,这通常是查找逆优化来源的最有效方法。
[Deoptimization initiated
name: String#[]
sp: 0x7ffd7b992710 ip: 0x7f26a8d8079f
reason: TransferToInterpreter action: InvalidateReprofile
debugId: 25 speculation: jdk.vm.ci.meta.SpeculationLog$NoSpeculationReason@13dbed9e
stack trace that triggered deoptimization:
at org.truffleruby.core.string.StringNodesFactory$StringSubstringPrimitiveNodeFactory$StringSubstringPrimitiveNodeGen.execute(StringNodesFactory.java:12760)
at org.truffleruby.core.string.StringNodes$GetIndexNode.substring(StringNodes.java:836)
at org.truffleruby.core.string.StringNodes$GetIndexNode.getIndex(StringNodes.java:650)
at org.truffleruby.core.string.StringNodesFactory$GetIndexNodeFactory$GetIndexNodeGen.execute(StringNodesFactory.java:1435)
at org.truffleruby.language.RubyCoreMethodRootNode.execute(RubyCoreMethodRootNode.java:53)
at com.oracle.truffle.runtime.OptimizedCallTarget.executeRootNode(OptimizedCallTarget.java:632)
at com.oracle.truffle.runtime.OptimizedCallTarget.profiledPERoot(OptimizedCallTarget.java:603)
[Deoptimization of frame
name: String#[]
sp: 0x7ffd7b992710 ip: 0x7f26a8d8079f
stack trace where execution continues:
at org.truffleruby.core.string.StringNodesFactory$StringSubstringPrimitiveNodeFactory$StringSubstringPrimitiveNodeGen.execute(StringNodesFactory.java:12760) bci 99 return address 0x4199a1d
at org.truffleruby.core.string.StringNodes$GetIndexNode.substring(StringNodes.java:836) bci 32 duringCall return address 0x41608e0
at org.truffleruby.core.string.StringNodes$GetIndexNode.getIndex(StringNodes.java:650) bci 25 duringCall return address 0x415f197
at org.truffleruby.core.string.StringNodesFactory$GetIndexNodeFactory$GetIndexNodeGen.execute(StringNodesFactory.java:1435) bci 109 duringCall return address 0x4182391
at org.truffleruby.language.RubyCoreMethodRootNode.execute(RubyCoreMethodRootNode.java:53) bci 14 duringCall return address 0x4239a29
at com.oracle.truffle.runtime.OptimizedCallTarget.executeRootNode(OptimizedCallTarget.java:632) bci 9 duringCall return address 0x3f1c4c9
at com.oracle.truffle.runtime.OptimizedCallTarget.profiledPERoot(OptimizedCallTarget.java:603) bci 37 duringCall return address 0x3f1d965
]
]
在 HotSpot 上,--engine.TraceTransferToInterpreter
仅通过 CompilerDirectives.transferToInterpreterAndInvalidate()
或 CompilerDirectives.transferToInterpreter()
打印显式逆优化的堆栈跟踪。如果逆优化是由其他原因引起的,则报告的位置可能不正确。在这种情况下,它会报告下一个 CompilerDirectives.transferToInterpreter*
调用的堆栈跟踪,即使它不是原因。
[engine] transferToInterpreter at
BinaryConstraint.output(../../../../4dev/js-benchmarks/octane-deltablue.js:416)
Constraint.satisfy(../../../../4dev/js-benchmarks/octane-deltablue.js:183)
Planner.incrementalAdd(../../../../4dev/js-benchmarks/octane-deltablue.js:597) <split-609bcfb6>
Constraint.addConstraint(../../../../4dev/js-benchmarks/octane-deltablue.js:165) <split-7d94beb9>
UnaryConstraint(../../../../4dev/js-benchmarks/octane-deltablue.js:219) <split-560348e6>
Function.prototype.call(<builtin>:1) <split-1df8b5b8>
EditConstraint(../../../../4dev/js-benchmarks/octane-deltablue.js:315) <split-23202fce>
...
com.oracle.truffle.api.CompilerDirectives.transferToInterpreterAndInvalidate(CompilerDirectives.java:90)
com.oracle.truffle.js.nodes.access.PropertyCacheNode.deoptimize(PropertyCacheNode.java:1269)
com.oracle.truffle.js.nodes.access.PropertyGetNode.getValueOrDefault(PropertyGetNode.java:305)
com.oracle.truffle.js.nodes.access.PropertyGetNode.getValueOrUndefined(PropertyGetNode.java:191)
com.oracle.truffle.js.nodes.access.PropertyNode.executeWithTarget(PropertyNode.java:153)
com.oracle.truffle.js.nodes.access.PropertyNode.execute(PropertyNode.java:140)
...
在 HotSpot 上,--vm.XX:+UnlockDiagnosticVMOptions --vm.XX:+DebugNonSafepoints --vm.XX:+TraceDeoptimization
会打印所有逆优化事件(但不打印堆栈跟踪),无论代码是由 Truffle 还是传统编译器编译的。TraceDeoptimization
选项可能需要使用 fastdebug
JDK。
Uncommon trap bci=9 pc=0x00000001097f2235, relative_pc=501, method=com.oracle.truffle.js.nodes.access.PropertyNode.executeInt(Ljava/lang/Object;Ljava/lang/Object;)I, debug_id=0
Uncommon trap occurred in com.oracle.truffle.runtime.OptimizedCallTarget::profiledPERoot compiler=JVMCI compile_id=2686 (JVMCI: installed code name=BinaryConstraint.output#2) (@0x00000001097f2235) thread=5891 reason=transfer_to_interpreter action=reinterpret unloaded_class_index=-1 debug_id=0
最后,在原生镜像上,--vm.XX:+TraceDeoptimizationDetails
会打印其他信息。
[Deoptimization initiated
name: BinaryConstraint.output
sp: 0x7ffee7324d90 ip: 0x1126c51a8
reason: TransferToInterpreter action: InvalidateReprofile
debugId: 3 speculation: jdk.vm.ci.meta.SpeculationLog$NoSpeculationReason@10f942aa0
[Deoptimization of frame
name: BinaryConstraint.output
sp: 0x7ffee7324d90 ip: 0x1126c51a8
stack trace where execution continues:
at com.oracle.truffle.runtime.OptimizedCallTarget.profiledPERoot(OptimizedCallTarget.java:475) bci 0 return address 0x10aab9e5e
com.oracle.truffle.runtime.OptimizedCallTarget.profiledPERoot(OptimizedCallTarget.java:475)
bci: 0 deoptMethodOffset: 35524160 deoptMethod: 0x10aab9e40 return address: 0x10aab9e5e offset: 0
slot 0 kind: Object value: com.oracle.svm.truffle.api.SubstrateOptimizedCallTarget@0x112cbbaa0 offset: 64
slot 1 kind: Object value: [Ljava.lang.Object;@0x1144a7db0 offset: 56
]
]
您可能会注意到在这些选项的输出中存在 debugId
或 debug_id
。仅当您还启用转储时,才会设置此 ID,例如,通过 --vm.Djdk.graal.Dump=Truffle:1
(见下文)。在这种情况下,调试 ID 将对应于 IGV 图中节点的 ID。首先,打开相关编译的第一阶段。可以在 IGV 的“在节点中搜索”搜索框中通过 id=NUMBER
搜索该 ID,然后选择“在节点搜索窗口中打开搜索节点 NUMBER”,然后单击“在以下阶段中搜索”按钮。
调试失效 #
当编译的 CallTarget 被丢弃时,就会发生失效。
最常见的原因是
- 显式
CompilerDirectives.transferToInterpreterAndInvalidate()
(内部失效) - 该 CallTarget 使用的某个
Assumption
已失效(外部失效)。使用--engine.TraceAssumptions
跟踪这些失效,以获取更多详细信息。
--engine.TraceCompilation
选项还会显示带有 [engine] opt inv.
前缀的 CallTarget 失效。
理想图可视化工具 #
理想图可视化工具 (IGV) 是一个用于理解 Truffle AST 和 Graal 编译器图的工具。
典型的用法是使用 --vm.Djdk.graal.Dump=Truffle:1 --vm.Djdk.graal.PrintGraph=Network
运行,这将向您显示 Truffle AST、访客语言调用图以及离开 Truffle 阶段时的 Graal 图。如果省略 -Djdk.graal.PrintGraph=Network
标志,则转储文件将放置在 graal_dumps
目录中,您应该在 IGV 中打开该目录。
使用 --vm.Djdk.graal.Dump=Truffle:2
在每个编译器阶段之间转储 Graal 图。
C1 可视化工具 #
C1 可视化工具是一个用于理解 GraalVM 的低级 IR (LIR)、寄存器分配和代码生成阶段的工具。它在此处可用。
典型的用法是 --vm.Djdk.graal.Dump=:3
。文件将放入 graal_dumps
目录中,您应该在 C1 可视化工具中打开该目录。
反汇编器 #
--vm.XX:+UnlockDiagnosticVMOptions --vm.XX:+PrintAssembly
命令组合会打印汇编代码。您需要在 graal/compiler
中使用 mx hsdis
安装 hsdis
,或者从此处手动将其安装到当前目录中。
典型的用法是 --vm.Djdk.graal.Dump --vm.Djdk.graal.PrintBackendCFG=true
。文件将放入 graal_dumps
目录中,您应该在 C1 可视化工具中打开该目录。
与 --vm.XX:TieredStopAtLevel=0
结合使用可以禁用运行时例程的编译,以便更容易找到您的访客语言方法。
注意:您也可以在 C1 可视化工具中查看汇编代码。
这些是优化或调试 Truffle 解释器的最常用和最强大的方法。