comparison doc/design/graal_compiler.tex @ 2576:c59db1f02893

doc: expanded framestate section
author Lukas Stadler <lukas.stadler@jku.at>
date Wed, 04 May 2011 12:58:17 +0200
parents 7aa67f5f3884
children ac2029d0898f
comparison
equal deleted inserted replaced
2571:ac868ecd3cfc 2576:c59db1f02893
23 {\small\textsf{\emph{#2}}} 23 {\small\textsf{\emph{#2}}}
24 \fbox{\bfseries\sffamily\scriptsize }}} 24 \fbox{\bfseries\sffamily\scriptsize }}}
25 25
26 \newcommand\TODO[1]{\mynote{TODO}{#1}} 26 \newcommand\TODO[1]{\mynote{TODO}{#1}}
27 \newcommand\cw[1]{\mynote{CW}{#1}} 27 \newcommand\cw[1]{\mynote{CW}{#1}}
28 \newcommand\ls[1]{\mynote{LS}{#1}}
28 29
29 30
30 31
31 \smartqed % flush right qed marks, e.g. at end of proof 32 \smartqed % flush right qed marks, e.g. at end of proof
32 33
186 \item Graphs are 187 \item Graphs are
187 \end{itemize} 188 \end{itemize}
188 189
189 \subsection{Frame States} 190 \subsection{Frame States}
190 Frame states capture the state of the program, in terms of the source representation (e.g., Java state: local variables, expressions, ...). 191 Frame states capture the state of the program, in terms of the source representation (e.g., Java state: local variables, expressions, ...).
191 Whenever a safepoint is reached or \cwi{why is that an ``or'', both is basically the same} a deoptimization is needed a valid frame state needs to be available. 192 Whenever a safepoint is reached or \cw{why is that an ``or'', both is basically the same} \ls{uncommon traps can be introduced at other points, e.g., at an if branch that isn't compiled} a deoptimization is needed a valid frame state needs to be available.
192 A frame state is valid as long as the program performs only actions that can safely be reexecuted (e.g., operations on local variables, etc.). 193 A frame state is valid as long as the program performs only actions that can safely be reexecuted (e.g., operations on local variables, loads, etc.).
193 Thus, frame states need only be generated for bytecodes that cannot be reexecuted: putfield, astore, invoke, monitorenter/exit, ... 194 Thus, frame states need only be generated for bytecodes that cannot be reexecuted:
194 195
195 Within the node graph a frame state is represented as a node that is fixed between the node that caused it to be generated (data dependency) and the node that invalidates it (control dependency). 196 \begin{itemize}
196 197 \item Array stores: {\tt IASTORE, LASTORE, FASTORE, \\DASTORE, AASTORE, BASTORE, CASTORE, SASTORE}
197 Deopmization nodes are not fixed to a certain frame state node, they can move around freely and will always use the correct frame state. 198 \item Field stores: {\tt PUTSTATIC, PUTFIELD}
198 At some point during the compilation, deoptimization nodes need to be fixed, which means that appropriate data and control dependencies will be inserted so that it can not move outside the scope of the associated frame state. 199 \item Method calls: {\tt INVOKEVIRTUAL, INVOKESPECIAL, \\INVOKESTATIC, INVOKEINTERFACE}
200 \item Memory allocations: {\tt NEW, NEWARRAY, ANEWARRAY, \\MULTIANEWARRAY}
201 \item Exception throw: {\tt ATHROW}
202 \item Synchronization: {\tt MONITORENTER, MONITOREXIT}
203 \end{itemize}
204
205 Within the node graph a frame state is represented as a node that is fixed to the node that caused it to be generated (control dependency).
206
207 \begin{digraphenv}{scale=0.5}{fs1}
208 nodesep="0.02";
209 \nodequadsplit{store1}{ArrayStore}
210 \nodetri{load1}{ArrayLoad}
211 \controllabel{store1:succ1}{load1}
212 \nodequadsplit{store2}{ArrayStore}
213 \control{load1}{store2}
214 end [shape=plaintext, label="...", width="2.0"]
215 store2:succ1:s -> end:n [color=red];
216 %
217 \nodeframestate{fs1}{FrameState}
218 \controllabel{store1:succ2}{fs1}
219 \nodeframestate{fs2}{FrameState}
220 \controllabel{store2:succ2}{fs2}
221 \end{digraphenv}
222
223 FrameStates also have data dependencies on the contents of the state: the local variables and the expression stack.
224
225 \subsection{Deoptimization and Uncommon Traps}
226 Uncommon trap nodes are not fixed to a certain frame state node, they can move around freely and will always use the correct frame state.
227 The node that is guarded by the deoptimization has a data dependency on the trap, and the trap in turn has a data dependency on the condition and on the most distant node that is postdominated by the guarded node.
228
229 \begin{digraphenv}{scale=0.5}{trap1}
230 nodesep="0.02";
231 start [shape=plaintext, label="..."]
232 \control{start}{if}
233 \nodesplit{if}{If}
234 \node{split1}{ControlSplit}
235 \controllabel{if:succ1}{split1}
236 nop [shape=plaintext, label="..."]
237 \control{split1}{nop}
238 %
239 \node{split2}{ControlSplit}
240 \controllabel{if:succ2}{split2}
241 \nodebi{load1}{FieldLoad}
242 \control{split2}{load1}
243 \nodebi{load2}{FieldLoad}
244 \control{load1}{load2}
245 \control{load2}{merge}
246 \node{merge}{Merge}
247 \control{nop}{merge}
248 nop2 [shape=plaintext, label="..."]
249 \control{merge}{nop2}
250 %
251 \nodeconst{o1}{obj1}
252 \nodeconst{o2}{obj2}
253 \datalabel{load1:in2}{o1}
254 \datalabel{load2:in2}{o2}
255 \nodetrap{trap}{Trap}
256 \node{cmpnull}{IsNull}
257 \data{cmpnull}{o2}
258 \datalabel{trap:in2}{cmpnull}
259 \datalabel{load2:in1}{trap}
260 \datalabel{trap:in1}{split2}
261 \end{digraphenv}
262
263 \emph{\small In this example, the second field load is guarded by an uncommon trap, because its receiver might be null (the receiver of the first load is assumed to be non-null).
264 The trap is anchored to the control split, because as soon as this node is executed the field load must be executed as well.
265 In the final scheduling the trap can be placed before or after the first load.}
266
267 Another type of uncommon trap is a guard, which is used to remove branches that have a very low execution frequency from the compiled code.
268
269 \begin{digraphenv}{scale=0.5}{trap2}
270 nodesep="0.02";
271 start [shape=plaintext, label="..."]
272 start2 [shape=plaintext, label=""]
273 start3 [shape=plaintext, label=""]
274 \control{start}{guard}
275 \node{guard}{Guard}
276 \nodebi{load1}{FieldLoad}
277 \control{guard}{load1}
278 \nodebi{load2}{FieldLoad}
279 \control{load1}{load2}
280 \control{load2}{nop}
281 nop [shape=plaintext, label="..."]
282 %
283 \nodetrap{trap}{Trap}
284 \datalabel{trap:in1}{start2}
285 \datalabel{trap:in2}{start3}
286 \data{guard}{trap}
287 \end{digraphenv}
288
289
290 \emph{\small In this example the If from the previous example was replaced by a guard and an uncommon trap.
291 The guard takes the place of the If in the control flow, and is connected to the trap node.
292 The uncommon trap is again anchored to the most distant node of which the If was a postdominator.}
293
294 At some point during the compilation, trap nodes need to be fixed, which means that appropriate data and control dependencies will be inserted so that they cannot move outside the scope of the associated frame state.
199 This will generate deoptimization-free zones that can be targeted by the most aggressive optimizations. 295 This will generate deoptimization-free zones that can be targeted by the most aggressive optimizations.
200 296 A simple algorithm for this removal of FrameStates would be to move all traps as far upwards as possible.
201 Frame states should be represented using a delta-encoding. 297
202 This will make them significantly smaller and will make inlining, etc. much easier. 298
203 In later compilation phases unnecessary frame states might be removed, using a mark-and-merge algorithm. 299 %Frame states should be represented using a delta-encoding.
204 300 %This will make them significantly smaller and will make inlining, etc. much easier.
205 301 %In later compilation phases unnecessary frame states might be removed, using a mark-and-merge algorithm.
206 302
207 \subsection{Graph Building} 303 \subsection{Graph Building}
208 \begin{itemize} 304 \begin{itemize}
209 \item The graph built by the initial parser (called \emph{GraphBuilder}) should be as close to the source representation (bytecode, ...) as possible. 305 \item The graph built by the initial parser (called \emph{GraphBuilder}) should be as close to the source representation (bytecode, ...) as possible.
210 \item All nodes should be able to immediately lower themselves to a machine-level representation. \cw{What is that? You mean every node has x86 specific code that spits out machine code? Hope you are joking...} This allows for easier compiler development, and also leads to a compiler that is very flexible in the amount of optimizations it performs (e.g. recompilation of methods with more aggressive optimizations). 306 \item All nodes should be able to immediately lower themselves to a machine-level representation. \cw{What is that? You mean every node has x86 specific code that spits out machine code? Hope you are joking...} This allows for easier compiler development, and also leads to a compiler that is very flexible in the amount of optimizations it performs (e.g. recompilation of methods with more aggressive optimizations).
220 \nodetri{node3}{phi} 316 \nodetri{node3}{phi}
221 \nodesplit{node4}{if} 317 \nodesplit{node4}{if}
222 \end{digraphenv} 318 \end{digraphenv}
223 319
224 \cw{That doesn't compile with my latex. What do I have to do to get it working?} 320 \cw{That doesn't compile with my latex. What do I have to do to get it working?}
321 \ls{graphviz needs to be installed, and pdflatex needs to be started with -shell-escape}
225 322
226 Red arrows always represents control dependencies, while black arrows represent data dependencies: 323 Red arrows always represents control dependencies, while black arrows represent data dependencies:
227 324
228 \begin{digraphenv}{scale=0.5}{layout1} 325 \begin{digraphenv}{scale=0.5}{layout1}
229 \node{a}{a} 326 \node{a}{a}