comparison doc/design/graal_compiler.tex @ 2679:07aa0a31fffb

Rewrote frame state to be not-so-Java-specific. Clarified and reduced the usage of the term "node".
author Thomas Wuerthinger <thomas@wuerthinger.net>
date Mon, 16 May 2011 14:29:12 +0200
parents b9b0a0aa7ee8
children c5739b99762a
comparison
equal deleted inserted replaced
2678:b9b0a0aa7ee8 2679:07aa0a31fffb
193 \end{figure} 193 \end{figure}
194 194
195 \section{Control Flow} 195 \section{Control Flow}
196 196
197 Control flow is managed in way where the predecessor node contains direct pointers to its successor nodes. 197 Control flow is managed in way where the predecessor node contains direct pointers to its successor nodes.
198 We reserve the term \textit{instruction} for nodes that are embedded in the control flow.
198 This is opposite to the approach taken in the server compiler, where control flow and data flow edges point in the same direction. 199 This is opposite to the approach taken in the server compiler, where control flow and data flow edges point in the same direction.
199 The advantage that we see in our approach is that there is no need for projection nodes in case of control flow splits. 200 The advantage that we see in our approach is that there is no need for projection nodes in case of control flow splits.
200 An \texttt{If} node can directly point to its true and false successors without any intermediate nodes. 201 An \texttt{If} instruction can directly point to its true and false successors without any intermediate nodes.
201 This makes the graph more compact and simplifies graph traversal. 202 This makes the graph more compact and simplifies graph traversal.
202 203
203 Listing \ref{lst:cfg2} shows an example Java program with an if statement where both paths do not contain any instruction with side effects. 204 Listing \ref{lst:cfg2} shows an example Java program with an if statement where both paths do not contain any instruction with side effects.
204 The \texttt{If} instruction can directly point its true and false successors to a \texttt{Merge} node. 205 The \texttt{If} instruction can directly point its true and false successors to a \texttt{Merge} instruction.
205 A \texttt{Phi} node that selects the appropriate value is appended to the \texttt{Merge} node. 206 A \texttt{Phi} node that selects the appropriate value is appended to the \texttt{Merge} instruction.
206 The \texttt{Return} instruction then has a data dependency on the \texttt{Phi} node. 207 The \texttt{Return} instruction then has a data dependency on the \texttt{Phi} node.
207 208
208 \begin{lstlisting}[label=lst:cfg2, caption=Control flow in the graph, captionpos=b] 209 \begin{lstlisting}[label=lst:cfg2, caption=Control flow in the graph, captionpos=b]
209 if (condition) { return 0; } 210 if (condition) { return 0; }
210 else { return 1; } 211 else { return 1; }
211 \end{lstlisting} 212 \end{lstlisting}
212
213 213
214 \begin{figure}[h] 214 \begin{figure}[h]
215 \centering 215 \centering
216 \begin{digraphenv}{scale=0.5}{cfg2} 216 \begin{digraphenv}{scale=0.5}{cfg2}
217 \textnode{entry}{Entry} 217 \textnode{entry}{Entry}
256 } catch(ExtendedException e) { ... } 256 } catch(ExtendedException e) { ... }
257 m3(); 257 m3();
258 throw exception; 258 throw exception;
259 } catch(Exception e) { ... } 259 } catch(Exception e) { ... }
260 \end{lstlisting} 260 \end{lstlisting}
261
262
263 261
264 \begin{figure}[h] 262 \begin{figure}[h]
265 \centering 263 \centering
266 \begin{digraphenv}{scale=0.5}{exc1} 264 \begin{digraphenv}{scale=0.5}{exc1}
267 \textnode{entry}{Entry} 265 \textnode{entry}{Entry}
293 291
294 \section{Loops} 292 \section{Loops}
295 \label{sec:loops} 293 \label{sec:loops}
296 Loops form a first-class construct in the IR that is expressed by specialized IR nodes during all optimization phases. 294 Loops form a first-class construct in the IR that is expressed by specialized IR nodes during all optimization phases.
297 We only compile methods with a control flow where every loop has a single entry point. 295 We only compile methods with a control flow where every loop has a single entry point.
298 This entry point is a \nodename{LoopBegin} node. 296 This entry point is a \nodename{LoopBegin} instruction.
299 This node is connected to a \nodename{LoopEnd} node that merges all control flow paths that do not exit the loop. 297 This instruction is connected to a \nodename{LoopEnd} instruction that merges all control flow paths that do not exit the loop.
300 The edge between the \nodename{LoopBegin} and the \nodename{LoopEnd} is the backedge of the loop. 298 The edge between the \nodename{LoopBegin} and the \nodename{LoopEnd} is the backedge of the loop.
301 It goes from the beginning to the end in order to make the graph acyclic. 299 It goes from the beginning to the end in order to make the graph acyclic.
302 An algorithm that traverses the control flow has to explicitely decide whether it wants to incorporate backedges (i.e., special case of the treatment of \nodename{LoopEnd}) or ignore them. 300 An algorithm that traverses the control flow has to explicitely decide whether it wants to incorporate backedges (i.e., special case of the treatment of \nodename{LoopEnd}) or ignore them.
303 Figure \ref{fig:loop1} shows a simple example with a loop with a single entry and two exits. 301 Figure \ref{fig:loop1} shows a simple example with a loop with a single entry and two exits.
304 302
324 \label{fig:loop1} 322 \label{fig:loop1}
325 \end{figure} 323 \end{figure}
326 324
327 \subsection{Loop Phis} 325 \subsection{Loop Phis}
328 Data flow in loops is modelled with special phi nodes at the beginning and the end of the loop. 326 Data flow in loops is modelled with special phi nodes at the beginning and the end of the loop.
329 The \nodename{LoopEnd} node merges every value that flows into the next loop iteration in associated \nodename{LoopEndPhi} nodes. 327 The \nodename{LoopEnd} instruction merges every value that flows into the next loop iteration in associated \nodename{LoopEndPhi} nodes.
330 A corresponding \nodename{LoopBeginPhi} node that is associated with the loop header has a control flow dependency on the \nodename{LoopEndPhi} node. 328 A corresponding \nodename{LoopBeginPhi} node that is associated with the loop header has a control flow dependency on the \nodename{LoopEndPhi} node.
331 Listing \ref{lst:loop} shows a simple counting loop that is used as an example in the rest of this section. 329 Listing \ref{lst:loop} shows a simple counting loop that is used as an example in the rest of this section.
332 Figure \ref{fig:loop2} shows how the loop is modelled immediately after building the graph. 330 Figure \ref{fig:loop2} shows how the loop is modelled immediately after building the graph.
333 331
334 \begin{lstlisting}[label=lst:loop, caption=Loop example, captionpos=b] 332 \begin{lstlisting}[label=lst:loop, caption=Loop example, captionpos=b]
335 for(int i=0; i<n; ++i) { } 333 for(int i=0; i<n; ++i) { }
336 \end{lstlisting} 334 \end{lstlisting}
337
338
339 335
340 \begin{figure}[h] 336 \begin{figure}[h]
341 \centering 337 \centering
342 \begin{digraphenv}{scale=0.5}{layout2} 338 \begin{digraphenv}{scale=0.5}{layout2}
343 \textnode{BeforeLoop}{Loop entry} 339 \textnode{BeforeLoop}{Loop entry}
472 \label{fig:loop5} 468 \label{fig:loop5}
473 \end{figure} 469 \end{figure}
474 470
475 471
476 \section{Frame States} 472 \section{Frame States}
477 A frame state captures the state of the program in terms of the Java bytecode specification (i.e., the values of the local variables, the operand stack, and the locked monitors). 473 A frame state captures the state of the program like it is seen in by an interpreter of the program.
478 Every deoptimization point needs a valid frame state. 474 The frame state contains the information that is local to the current activation and will therefore disappear during SSA-form constructions or other compiler optimizations.
479 A frame state stays valid as long as the subsequent instructions perform only actions that can safely be reexecuted. 475 For Java, the frame state is defined in terms of the Java bytecode specification (i.e., the values of the local variables, the operand stack, and the locked monitors).
480 Thus, frame states need only be generated for the states after bytecodes that cannot be reexecuted: 476 However, a frame state is not a concept specific to Java (e.g., the Crankshaft JavaScript engine uses frame states in their optimizing compiler to model the values of the AST interpreter).
477
478 Frame states are necessary to support the deoptimization of the program, which is the precondition for performing aggressive optimizations that use optimistic assumptions.
479 Therefore every point in the optimizing compiler that may revert execution back to the interpreter needs a valid frame state.
480 However, the point where the interpreter continues execution need not correspond exactly to the execution position of the compiled code, because many Java bytecode instructions can be safely reexecuted.
481 Thus, frame states need only be generated for the states after instructions that cannot be reexecuted, because they modify the state of the program.
482 Examples for such instructions are:
481 483
482 \begin{itemize} 484 \begin{itemize}
483 \item Array stores: {\tt IASTORE, LASTORE, FASTORE, \\DASTORE, AASTORE, BASTORE, CASTORE, SASTORE} 485 \item Array stores (in Java bytecodes {\tt IASTORE, LASTORE, FASTORE, \\DASTORE, AASTORE, BASTORE, CASTORE, SASTORE})
484 \item Field stores: {\tt PUTSTATIC, PUTFIELD} 486 \item Field stores (in Java bytecodes {\tt PUTSTATIC, PUTFIELD})
485 \item Method calls: {\tt INVOKEVIRTUAL, INVOKESPECIAL, \\INVOKESTATIC, INVOKEINTERFACE} 487 \item Method calls (in Java bytecodes {\tt INVOKEVIRTUAL, INVOKESPECIAL, \\INVOKESTATIC, INVOKEINTERFACE})
486 \item Synchronization: {\tt MONITORENTER, MONITOREXIT} 488 \item Synchronization (in Java bytecodes {\tt MONITORENTER, MONITOREXIT})
487 \end{itemize} 489 \end{itemize}
488 490
489 Within the graph a frame state is represented as a node that is attached to the node that caused it to be generated using a control dependency (see Figure~\ref{fig:fs1}). 491 Within the graph a frame state is represented as a node that is attached to the instruction that caused it to be generated using a control dependency (see Figure~\ref{fig:fs1}).
490 Frame states also have data dependencies on the contents of the state: the local variables and the expression stack. 492 Frame states also have data dependencies on the contents of the state: the local variables and the expression stack.
491 493
492 The frame state at the method beginning does not have to be explicitely in the graph, because it can always be reconstructed at a later stage. 494 The frame state at the method beginning does not have to be explicitely in the graph, because it can always be reconstructed at a later stage.
493 We save the frame state at control flow merges if there is at least one frame state on any control flow path after the immediate dominator. 495 We save the frame state at control flow merges if there is at least one frame state on any control flow path after the immediate dominator.
494 496
513 \label{fig:fs1} 515 \label{fig:fs1}
514 \end{figure} 516 \end{figure}
515 517
516 518
517 \subsection{Guards} 519 \subsection{Guards}
518 A guard node is a node that deoptimizes based on a conditional expression. 520 A guard is a node that deoptimizes based on a conditional expression.
519 Guard nodes are not attached to a certain frame state node, they can move around freely and will always use the correct frame state when the nodes are scheduled (i.e., the last emitted frame state). 521 Guards are not attached to a certain frame state, they can move around freely and will always use the correct frame state when the nodes are scheduled (i.e., the last emitted frame state).
520 The node that is guarded by the deoptimization has a data dependency on the guard, and the guard in turn has a data dependency on the condition and on the most distant node that is postdominated by the guarded node. 522 The node that is guarded by the deoptimization has a data dependency on the guard, and the guard in turn has a data dependency on the condition and on the most distant node that is postdominated by the guarded node.
521 523
522 In the example of Figure~\ref{fig:guard1}, the second load is guarded by a guard, because its receiver might be null (the receiver of the first load is assumed to be non-null). 524 In the example of Figure~\ref{fig:guard1}, the second load is guarded by a guard, because its receiver might be null (the receiver of the first load is assumed to be non-null).
523 The guard is anchored to the control split, because as soon as this node is executed, the second load must be executed as well. 525 The guard is anchored to the control split, because as soon as this node is executed, the second load must be executed as well.
524 In the final schedule, the guard can be placed before or after the first load. 526 In the final schedule, the guard can be placed before or after the first load.
555 \label{fig:guard1} 557 \label{fig:guard1}
556 \end{figure} 558 \end{figure}
557 559
558 A guard can be used to remove branches that have a very low execution frequency and replace them with a conditional jump to deoptimization. 560 A guard can be used to remove branches that have a very low execution frequency and replace them with a conditional jump to deoptimization.
559 561
560 In the example of Figure~\ref{fig:guard2}, an \texttt{If} node was replaced by an anchor and a guard. 562 In the example of Figure~\ref{fig:guard2}, an \texttt{If} instruction was replaced by an anchor and a guard.
561 The anchor takes the place of the \texttt{If} node in the control flow, and is connected to the guard node. 563 The anchor takes the place of the \texttt{If} instruction in the control flow, and is connected to the guard.
562 The guard is now anchored to the most distant node of which the \texttt{If} node was a postdominator. 564 The guard is now anchored to the most distant node of which the \texttt{If} instruction was a postdominator.
563 565
564 \begin{figure}[h] 566 \begin{figure}[h]
565 \centering 567 \centering
566 \begin{digraphenv}{scale=0.5}{guard2} 568 \begin{digraphenv}{scale=0.5}{guard2}
567 start [shape=plaintext, label="..."] 569 start [shape=plaintext, label="..."]
581 \end{digraphenv} 583 \end{digraphenv}
582 \caption{A guard that is fixed to the control flow using an anchor instruction.} 584 \caption{A guard that is fixed to the control flow using an anchor instruction.}
583 \label{fig:guard2} 585 \label{fig:guard2}
584 \end{figure} 586 \end{figure}
585 587
586 At some point during the compilation, guard nodes need to be fixed, which means that appropriate data and control dependencies will be inserted so that they cannot move outside the scope of the associated frame state. 588 At some point during the compilation, guards need to be fixed, which means that appropriate data and control dependencies will be inserted so that they cannot move outside the scope of the associated frame state.
587 This will generate deoptimization-free zones that can be targeted by the most aggressive optimizations. 589 This will generate deoptimization-free zones that can be targeted by the most aggressive optimizations.
588 A simple algorithm for this removal of frame states would be to move all guards as far upwards as possible. 590 A simple algorithm for this removal of frame states would be to move all guards as far upwards as possible.
589 Multiple guards with the same condition and anchor can be merged (see Figure~\ref{fig:guard3}). 591 Multiple guards with the same condition and anchor can be merged (see Figure~\ref{fig:guard3}).
590 592
591 \begin{figure}[h] 593 \begin{figure}[h]
618 \end{digraphenv} 620 \end{digraphenv}
619 \caption{Two loads using the same guard.} 621 \caption{Two loads using the same guard.}
620 \label{fig:guard3} 622 \label{fig:guard3}
621 \end{figure} 623 \end{figure}
622 624
623 Also, if two guards that are anchored to the true and false branch of the same \texttt{If} node have the same condition, they can be merged, so that the resulting guard is anchored at the most distant node of which the \texttt{If} node is a postdominator. 625 Also, if two guards that are anchored to the true and false branch of the same \texttt{If} instruction have the same condition, they can be merged, so that the resulting guard is anchored at the most distant node of which the \texttt{If} instruction is a postdominator.
624 626
625 \section{Conclusions} 627 \section{Conclusions}
626 \label{sec:conclusions} 628 \label{sec:conclusions}
627 This document sketched the strategy for the Graph compiler. 629 This document sketched the strategy for the Graph compiler.
628 We already reached M1 (as defined in Section~\ref{sec:mile}) and have the following plans for M2 to M4: 630 We already reached M1 (as defined in Section~\ref{sec:mile}) and have the following plans for M2 to M4: