graal/com.oracle.max.graal.doc.initial/graal_compiler.tex @ 2949:4db4e8cb6bd6

Updated design document (incorporated comments from Peter Kessler).
author Thomas Wuerthinger <thomas@wuerthinger.net>
date Sat, 11 Jun 2011 18:41:40 +0200
parents 5d4aa5672d3d
children
Currently, the Graal~VM is fully functional and runs benchmarks (SciMark, DaCapo) at a speed similar to that of the HotSpot client compiler.

\section{Goals}
The compiler effort aims at rewriting the high-level intermediate representation of C1X with two main goals:
\begin{description}
\item[Modularity:]
A modular design of the compiler should simplify the implementation of new languages, new back-ends, and new optimizations.
\item[Peak Performance:]
A more powerful intermediate representation should enable the implementation of aggressive optimizations that impact the peak performance of the resulting machine code.
In terms of startup performance, we aim to be comparable to the HotSpot server compiler configuration.
\end{description}

\section{Design}
For the implementation of the compiler, we rely on the following design decisions:
\begin{description}
\item[Generality:]
The compiler does not require Java as its input.
This is achieved by having a graph as the starting point of the compilation and not a Java bytecode array.
Building the graph from the Java bytecodes must happen before giving a method to the compiler.
This enables front-ends for different languages (e.g., Ruby or JavaScript) to provide their own graph.
The support for different languages is constrained by two conditions: we only support structured control flow, and the dynamic type system of the language must be expressible using the RiType class.
Also, there is no dependency on a specific back-end: the output of the compiler is a graph that can then be converted to a different representation in a final compiler phase.
\end{description}
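
To make this contract more concrete, the following is a minimal sketch of what a language front-end could provide. Apart from RiType, every name in it (\texttt{LanguageFrontEnd}, \texttt{Graph}, \texttt{buildGraph}) is an illustrative assumption, not part of the actual API.

\begin{lstlisting}[caption=Hypothetical sketch of a language front-end contract., captionpos=b]
// Sketch only: a front-end translates its source language into a
// compiler graph before handing a method to the compiler. "Graph"
// and "LanguageFrontEnd" are assumed names.
interface LanguageFrontEnd {
    // The returned graph may use structured control flow only, and
    // every dynamic type in it must be expressible as an RiType.
    Graph buildGraph(Object methodOrFunction);
}
\end{lstlisting}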

\section{Milestones}
\label{sec:mile}
\item \emph{successors} are all nodes that have to be after this node in control flow.
\item \emph{predecessors} are all nodes whose successors contain this node.
\end{itemize}
\item Only inputs and successors can be changed, and changes to them will update the usages and predecessors (see the sketch after this list).
\item Every node must be able to support cloning and serialization.
\item The edges of a node also define \textit{emitted-before} and \textit{emitted-after} relationships as shown in Figure~\ref{fig:directions}.
They mean that the machine code for one node is emitted before or after the machine code of another node.
\end{itemize}
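
The following is a minimal sketch, assuming a list-based edge representation, of how writes to inputs and successors can keep the usages and predecessors up to date; it illustrates the invariant only and is not the actual \texttt{Node} implementation (cloning and serialization are omitted).

\begin{lstlisting}[caption=Sketch of the node edge invariant., captionpos=b]
import java.util.ArrayList;
import java.util.List;

abstract class Node implements Cloneable {
    // Writable edges; concrete node classes pre-size these lists
    // to their arity.
    final List<Node> inputs = new ArrayList<Node>();
    final List<Node> successors = new ArrayList<Node>();
    // Reverse edges; never written directly.
    final List<Node> usages = new ArrayList<Node>();
    final List<Node> predecessors = new ArrayList<Node>();

    void setInput(int index, Node newInput) {
        Node old = inputs.set(index, newInput);
        if (old != null) { old.usages.remove(this); }
        if (newInput != null) { newInput.usages.add(this); }
    }

    void setSuccessor(int index, Node newSuccessor) {
        Node old = successors.set(index, newSuccessor);
        if (old != null) { old.predecessors.remove(this); }
        if (newSuccessor != null) { newSuccessor.predecessors.add(this); }
    }
}
\end{lstlisting}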

\begin{figure}[ht]
\centering
\begin{digraphenv}{scale=0.5}{graphdirections}
\data{node1}{inputs}
\control{node1}{successors}
\data{usages}{node1}
\control{predecessors}{node1}
\node{node2}{Node}
\textnode{before}{emitted-before}
\textnode{after}{emitted-after}
\data{node2}{before}
\control{node2}{after}
\data{after}{node2}
\control{before}{node2}
\end{digraphenv}
An example of this would be when escape analysis finds that a certain object escapes only because of one method call, and this call is not inlined because the inlining penalty was judged too high.
In this case, we can nevertheless choose to inline the method in order to increase the chances of proving that the object does not escape.
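As an illustration (the \texttt{Point} class and the \texttt{register} method are made up for this example), consider:

\begin{lstlisting}[caption=An object that escapes only through a single call., captionpos=b]
Point p = new Point(); // does not escape so far
p.x = 1;               // purely local accesses
register(p);           // the only place where p escapes
\end{lstlisting}

If \texttt{register} is inlined and the analysis then sees that the parameter is not stored anywhere, \texttt{p} no longer escapes and its allocation can, for example, be eliminated entirely.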

\section{Control Flow}

Control flow is managed such that each predecessor node contains direct pointers to its successor nodes.
This is the opposite of the approach taken in the server compiler, where control flow and data flow edges point in the same direction.
The advantage we see in our approach is that there is no need for projection nodes at control flow splits.
An \texttt{If} node can directly point to its true and false successors without any intermediate nodes.
This makes the graph more compact and simplifies graph traversal.
We distinguish between \textit{fixed nodes} that are directly embedded in the control flow and \textit{floating nodes} whose position in the control flow may vary.

Listing~\ref{lst:cfg2} shows an example Java program with an if statement where neither path contains a node with side effects.
Figure~\ref{fig:loopexits} shows the corresponding compiler graph.
The \texttt{If} node can directly point its true and false successors to a \texttt{Merge} node.
A \texttt{Phi} node that selects the appropriate value is appended to the \texttt{Merge} node.
The \texttt{Return} node then has a data dependency on the \texttt{Phi} node.

\begin{lstlisting}[label=lst:cfg2, caption=Control flow in the graph., captionpos=b]
if (condition) { return 0; }
else { return 1; }
\end{lstlisting}
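
For illustration, constructing this graph could be written as follows. This is a sketch against assumed node classes and setters that mirror the figure; the variables \texttt{condition}, \texttt{constant0}, and \texttt{constant1} are assumed to be previously created nodes, and none of this is the actual graph-building API.

\begin{lstlisting}[caption=Sketch of building the graph for the if statement., captionpos=b]
// Sketch only: assumed classes If, Merge, Phi, and Return.
If branch = new If(condition);     // fixed node: control flow split
Merge merge = new Merge();         // fixed node: control flow join
branch.setTrueSuccessor(merge);    // no projection nodes in between
branch.setFalseSuccessor(merge);
Phi result = new Phi(merge, constant0, constant1); // floating node
merge.setNext(new Return(result)); // Return depends on the Phi
\end{lstlisting}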
We do not throw runtime exceptions (e.g., \texttt{IndexOutOf\-BoundsException}, \texttt{Null\-Pointer\-Exception}, or \texttt{Out\-Of\-Memory\-Error}), but deoptimize instead.
This reduces the places in the compiled code where an exact bytecode location and debug information must be known.
Additionally, this greatly reduces the number of exception handler edges in the compiled code.
The main advantage of this technique, however, is that we are free to move around bounds checks, memory allocations, memory accesses with implicit null checks, etc.

There are only two kinds of nodes that need explicit exception edges, because they are the only nodes that can throw exceptions in compiled code: \texttt{Throw} nodes and \texttt{Invoke} nodes.
They are modelled as nodes with an additional control flow continuation that points to an \texttt{ExceptionDispatch} node.
The exception dispatch node decides, based on the type of the exception object, whether control should flow to the catch handler or to another exception dispatch.
If there is no catch handler in the currently compiled method, then control flows into the \texttt{Unwind} node, which handles the exception by forwarding it to the caller.
Listing~\ref{lst:exc1} shows an example Java program with nested try blocks, and Figure~\ref{fig:exc1} shows the corresponding compiler graph.

\begin{lstlisting}[label=lst:exc1, caption=Exception dispatch in the compiler graph., captionpos=b]
try { m1();
  try { m2();

\section{Loops}
\label{sec:loops}
Loops form a first-class construct in the IR and are expressed by specialized IR nodes during all optimization phases.
We only compile methods whose control flow is such that every loop has a single entry point.
This entry point is a \nodename{LoopBegin} node.
This node is connected to a \nodename{LoopEnd} node that merges all control flow paths that do not exit the loop.
The edge between the \nodename{LoopBegin} and the \nodename{LoopEnd} is the backedge of the loop.
It goes from the beginning to the end in order to make the graph acyclic.
An algorithm that traverses the control flow has to decide explicitly whether it wants to incorporate backedges (i.e., special-case the treatment of \nodename{LoopEnd}) or ignore them, as in the sketch below.
Figure~\ref{fig:loop1} shows a simple example of a loop with a single entry and two exits.
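A traversal that ignores backedges can, as a sketch, special-case \nodename{LoopEnd} as follows (assuming the \texttt{Node} edge lists from the sketch earlier and an assumed \texttt{LoopEnd} class):

\begin{lstlisting}[caption=Sketch of a control flow traversal that ignores backedges., captionpos=b]
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;

// Visits all nodes reachable over successor edges; the backedge is
// ignored by not following the successors of LoopEnd nodes.
static void visitForward(Node start) {
    Set<Node> visited = new HashSet<Node>();
    Deque<Node> worklist = new ArrayDeque<Node>();
    worklist.push(start);
    while (!worklist.isEmpty()) {
        Node node = worklist.pop();
        if (!visited.add(node)) {
            continue;
        }
        // ... process node here ...
        if (node instanceof LoopEnd) {
            continue; // ignore the backedge
        }
        for (Node successor : node.successors) {
            if (successor != null) {
                worklist.push(successor);
            }
        }
    }
}
\end{lstlisting}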

\centering
\begin{digraphenv}{scale=0.5}{layout1}
\textnode{BeforeLoop}{Loop entry}
\textnode{Exit1}{First loop exit}
\textnode{Exit2}{Second loop exit}
\node{LoopBegin}{LoopBegin}
\node{LoopEnd}{LoopEnd}
\nodesplit{If1}{If}
\nodesplit{If2}{If}
\data{LoopEnd}{LoopBegin}
\control{LoopBegin}{If1}
\controllabel{If1:succ1}{If2}
\controllabel{If2:succ1}{LoopBody}
\textnode{LoopBody}{Loop body}
\control{LoopBody}{LoopEnd}
\controllabel{BeforeLoop}{LoopBegin}
\controllabel{If1:succ2}{Exit1}
\controllabel{If2:succ2}{Exit2}
\end{digraphenv}
\caption{A simple loop with two exits.}
\label{fig:loop1}
\end{figure}

\subsection{Loop Phis}
Data flow in loops is modelled with special phi nodes at the beginning and the end of the loop.
The \nodename{LoopEnd} node merges every value that flows into the next loop iteration in associated \nodename{LoopEndPhi} nodes.
A corresponding \nodename{LoopBeginPhi} node that is associated with the loop header has a control flow dependency on the \nodename{LoopEndPhi} node.
Listing~\ref{lst:loop} shows a simple counting loop that is used as an example in the rest of this section.
Figure~\ref{fig:loop2} shows how the loop is modelled immediately after building the graph.

\begin{lstlisting}[label=lst:loop, caption=Loop example that counts from 0 to n-1., captionpos=b]
\textnode{BeforeLoop}{Loop entry}
\textnode{Exit}{Loop exit}
\textnode{n}{n}
\textnode{Constant0}{0}
\textnode{Constant1}{1}
\node{LoopBegin}{LoopBegin}
\node{LoopEnd}{LoopEnd}
\nodesplit{If1}{If}
\data{LoopEnd}{LoopBegin}
\control{LoopBegin}{If1}
\nodebi{Compare}{<}
\nodebi{LoopBeginPhi}{LoopBeginPhi}
\nodebi{Add}{+}
\datalabel{Add:in1}{LoopBeginPhi}
\datalabel{Add:in2}{Constant1}
\datalabel{LoopBeginPhi:in1}{LoopBegin}
\datalabel{LoopBeginPhi:in2}{Constant0}
\datalabel{Compare:in1}{LoopBeginPhi}
\datalabel{Compare:in2}{n}
\data{If1}{Compare}
\controllabel{If1:succ1}{LoopBody}
\textnode{LoopBody}{Loop body}
\control{LoopBody}{LoopEnd}
\controllabel{BeforeLoop}{LoopBegin}
\controllabel{If1:succ2}{Exit}
\end{digraphenv}
\caption{Graph for a loop counting from 0 to n-1.}
\label{fig:loop2}
\textnode{BeforeLoop}{Loop entry}
\textnode{Exit}{Loop exit}
\textnode{n}{n}
\textnode{Constant0}{0}
\textnode{Constant1}{1}
\node{LoopBegin}{LoopBegin}
\node{LoopEnd}{LoopEnd}
\nodesplit{If1}{If}
\data{LoopEnd}{LoopBegin}
\control{LoopBegin}{If1}
\nodebi{Compare}{<}
\nodetri{LoopCounter}{LoopCounter}
\datalabel{LoopCounter:in1}{LoopBegin}
\datalabeltext{LoopCounter:in2}{Constant0}{init}
\datalabeltext{LoopCounter:in3}{Constant1}{stride}
\datalabel{Compare:in1}{LoopCounter}
\datalabel{Compare:in2}{n}
\data{If1}{Compare}
\controllabel{If1:succ1}{LoopBody}
\textnode{LoopBody}{Loop body}
\control{LoopBody}{LoopEnd}
\controllabel{BeforeLoop}{LoopBegin}
\controllabel{If1:succ2}{Exit}
\end{digraphenv}
\caption{Graph after loop counter transformation.}
\label{fig:loop3}
\subsection{Bounded Loops}

If the total maximum number of iterations of a loop is fixed, then the loop is converted into a bounded loop.
The total number of iterations always denotes the number of full iterations of the loop, with control flowing from the loop begin to the loop end.
If the total number of iterations is reached, the loop is exited directly from the loop header.
The representation of the bounded loop in the graph should support reasoning about the loop, but it does not specify how the loop will later be converted to machine code.
In the example, we can infer from the loop exit with the comparison on the loop counter that the total number of iterations of the loop is limited to n.
Figure~\ref{fig:loop4} shows the compiler graph of the example loop after the bounded loop transformation.
If there are no other exits out of the loop, then the number of iterations specified as the input to the bounded loop node is also the exact number of loop iterations.
Loops with the same number of iterations can be merged into a single loop, as illustrated below.
We also want to support loop splitting in order to find simple loops that are candidates for vectorization.
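
At the source level, merging two loops with the same iteration count corresponds, for example, to the following transformation (the arrays \texttt{a} and \texttt{b} are made up for the illustration):

\begin{lstlisting}[caption=Source-level view of merging two loops with the same iteration count., captionpos=b]
// before: two loops with the same trip count n
for (int i = 0; i < n; i++) { a[i] = i; }
for (int j = 0; j < n; j++) { b[j] = a[j] * 2; }

// after: a single merged loop
for (int i = 0; i < n; i++) {
    a[i] = i;
    b[i] = a[i] * 2;
}
\end{lstlisting}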

\begin{figure}[ht]
\centering
\begin{digraphenv}{scale=0.5}{layout4}
\textnode{BeforeLoop}{Loop entry}
\textnode{Exit}{Loop exit}
\textnode{n}{n}
\textnode{Constant0}{0}
\textnode{Constant1}{1}
\nodesplit{LoopBegin}{BoundedLoopBegin}
\node{LoopEnd}{LoopEnd}
\data{LoopEnd}{LoopBegin}
\controllabel{LoopBegin:succ1}{LoopBody}
\textnode{LoopBody}{Loop body}
\control{LoopBody}{LoopEnd}
\controllabel{LoopBegin:succ2}{Exit}
\nodetri{LoopCounter}{LoopCounter}
\datalabel{LoopCounter:in1}{LoopBegin}
\datalabeltext{LoopCounter:in2}{Constant0}{init}
\datalabeltext{LoopCounter:in3}{Constant1}{stride}
\data{LoopBegin}{n}
\end{figure}

\subsection{Vectorization}

If we now have a bounded loop with no additional loop exits and no associated phi nodes (only associated loop counters), we can vectorize the loop.
We replace the loop header with a normal node that produces a vector of values from 0 to the number of loop iterations minus 1.
The loop counters are replaced with \texttt{VectorAdd} and \texttt{VectorMul} nodes.
Vectorization is only possible if every node of the loop can be replaced with a corresponding vector node.
Figure~\ref{fig:loop5} shows the compiler graph of the example loop after vectorization.
The vector nodes all work on an ordered list of integer values and are subject to canonicalization and global value numbering like any other node.
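The following sketch gives a conceptual Java view of these semantics. The vector nodes are IR constructs, so the code below only illustrates the values they produce; the composition of \texttt{VectorAdd} and \texttt{VectorMul} shown for a loop counter is our reading of the transformation, not a specification.

\begin{lstlisting}[caption=Conceptual view of the values produced by the vector nodes., captionpos=b]
// The new loop header node produces the index vector [0 .. n-1].
static int[] indexVector(int iterations) {
    int[] v = new int[iterations];
    for (int i = 0; i < iterations; i++) {
        v[i] = i;
    }
    return v;
}

// A loop counter with init and stride then corresponds to
// VectorAdd(VectorMul(indexVector, stride), init), element-wise.
static int[] counterVector(int[] index, int init, int stride) {
    int[] v = new int[index.length];
    for (int i = 0; i < index.length; i++) {
        v[i] = index[i] * stride + init;
    }
    return v;
}
\end{lstlisting}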

For Java, the frame state is defined in terms of the Java bytecode specification (i.e., the values of the local variables, the operand stack, and the locked monitors).
However, a frame state is not a concept specific to Java (e.g., the Crankshaft JavaScript engine uses frame states in its optimizing compiler to model the values of the AST interpreter).

Frame states are necessary to support the deoptimization of the program, which is the precondition for performing aggressive optimizations that use optimistic assumptions.
Therefore, every point at which execution may revert back to the interpreter needs a valid frame state.
However, the point where the interpreter continues execution need not correspond exactly to the execution position of the compiled code, because many Java bytecode instructions can be safely reexecuted.
Thus, frame states need only be generated for the states after instructions that cannot be reexecuted, because they modify the state of the program.
Examples of such instructions are listed below; a short illustration follows the list.

\begin{itemize}
\item Array stores (in Java bytecodes {\tt IASTORE, LASTORE, FASTORE, \\DASTORE, AASTORE, BASTORE, CASTORE, SASTORE})
\item Field stores (in Java bytecodes {\tt PUTSTATIC, PUTFIELD})
\item Method calls (in Java bytecodes {\tt INVOKEVIRTUAL, INVOKESPECIAL, \\INVOKESTATIC, INVOKEINTERFACE})
\item Synchronization (in Java bytecodes {\tt MONITORENTER, MONITOREXIT})
\end{itemize}
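
For illustration, in the following made-up method only the field store and the method call need an associated frame state; the pure arithmetic before and after them can safely be reexecuted in the interpreter (\texttt{Point} and \texttt{update} are hypothetical):

\begin{lstlisting}[caption=Illustration of where frame states are needed., captionpos=b]
int example(Point p, int a, int b) {
    int sum = a + b;  // reexecutable: no frame state needed
    p.x = sum;        // PUTFIELD: frame state after the store
    update(p);        // INVOKEVIRTUAL: frame state after the call
    return sum * 2;   // reexecutable again
}
\end{lstlisting}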

Within the graph, a frame state is represented as a node that is attached, by a control dependency, to the node that caused it to be generated (see Figure~\ref{fig:fs1}).
Frame states also have data dependencies on the contents of the state: the local variables and the expression stack.

The frame state at the method beginning does not have to be explicitly represented in the graph, because it can always be reconstructed at a later stage.
We save the frame state at control flow merges if there is at least one frame state on any control flow path between a node and its immediate dominator.

\label{fig:fs1}
\end{figure}


A deoptimization node needs a valid frame state that specifies the location and state where the interpreter should continue.
The algorithm for constructing frame states makes sure that every possible location in the graph has a well-defined frame state that can be used by a deoptimization node.
Therefore, there are no direct links between a deoptimization node and its frame state, which allows deoptimization nodes to move around freely.

\subsection{Partial Escape Analysis}

A partial escape analysis can help to further reduce the number of frame states.
A field or array store does not create a new frame state when the object that is modified has not had a chance to escape between its creation and the store.
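As an illustration (with a made-up \texttt{Point} class and \texttt{publish} method), the stores below need no new frame states because \texttt{p} cannot yet have escaped when they execute:

\begin{lstlisting}[caption=Stores to an object before it escapes., captionpos=b]
Point p = new Point(); // allocation
p.x = 1;               // p has not escaped: no new frame state
p.y = 2;               // still not escaped: no new frame state
publish(p);            // here p escapes; a later store to p
                       // would again require a frame state
\end{lstlisting}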

\subsection{Guards}
A guard is a node that deoptimizes based on a conditional expression.
Guards are not attached to a certain frame state; they can move around freely and will always use the correct frame state when the nodes are scheduled (i.e., the last emitted frame state).
The node that is guarded by the deoptimization has a data dependency on the guard, and the guard in turn has a data dependency on the condition.
A guard may only be executed if it is guaranteed that the guarded node is executed too (leaving exceptions aside).
Therefore, we use \texttt{Anchor} nodes after a control flow split and a data dependency from the guard to this anchor.
The anchor is the most distant node that is postdominated by the guarded node, and the guard can be scheduled anywhere between those two nodes.
This ensures maximum flexibility for the guard node and guarantees that we only deoptimize if the control flow would have reached the guarded node (without taking exceptions into account).

To illustrate the strengths of this approach, we show the graph for the Java code snippet in Listing~\ref{lst:guard1}.
The example looks artificial, but after method inlining such a pattern can easily occur in a normal Java program.
Figure~\ref{fig:guard0} shows the compiler graph for the example method after graph building.
Each field store is represented by a single node that implicitly incorporates the null check.

\begin{lstlisting}[label=lst:guard1, caption=Example method that demonstrates the strengths of modelling the guards explicitly., captionpos=b]
void init(Point p) {
  if (p != null) {
    p.x = 0;
\end{digraphenv}
\caption{Initial graph with the two field stores.}
\label{fig:guard0}
\end{figure}

Figure~\ref{fig:guard1} shows the example graph at a later compilation phase, when the field store nodes have been lowered to memory store nodes and explicitly modelled null check guards.
The guards are attached to anchor nodes that delimit their possible schedule.
The first guard must not be moved outside the \texttt{if} block; the second guard may be moved before the \texttt{If} node, because at this point it is already guaranteed that the second store is executed.

\begin{figure}[ht]
\centering
\begin{digraphenv}{scale=0.5}{guard1}
\textnode{entry}{Entry}
\node{cmpnull}{NonNull}
\textnode{p}{p}
\textnode{const0}{0}
\nodeguard{guard1}{Guard}
\nodeguard{guard2}{Guard}
\nodetrisplit{store1}{MemStore 16 (4 bytes)}
\nodetrisplit{store2}{MemStore 20 (4 bytes)}
\nodeframestate{fs1}{FrameState}
\nodeframestate{fs2}{FrameState}
\data{store1:in1}{p}
\data{store2:in1}{p}
\data{store1:in2}{const0}
\end{digraphenv}
\caption{A store guarded by a null check guard.}
\label{fig:guard1}
\end{figure}

The first guard can easily be removed, because it is guarded by an \texttt{If} node that checks the same condition.
Therefore, we can remove the guard and the anchor from the graph, which gives us the graph shown in Figure~\ref{fig:guard2}.

There is another optimization for guard nodes: if two guards that are anchored to the true and false branches of the same \texttt{If} node have the same condition, they can be merged, so that the resulting guard is anchored at the most distant node of which the \texttt{If} node is a postdominator.


\begin{figure}[ht]
\centering
\begin{digraphenv}{scale=0.5}{guard2}
\node{return}{Return}
\node{cmpnull}{NonNull}
\textnode{p}{p}
\textnode{const0}{0}
\nodeguard{guard2}{Guard}
\nodetrisplit{store1}{MemStore 16 (4 bytes)}
\nodetrisplit{store2}{MemStore 20 (4 bytes)}
\nodeframestate{fs1}{FrameState}
\nodeframestate{fs2}{FrameState}
\data{store1:in1}{p}
\data{store2:in1}{p}
\data{store1:in2}{const0}
\end{figure}

The remaining guard can now be moved above the \texttt{If} condition and be used to eliminate the need for the \texttt{If} node.
From this point on, however, the guard can no longer be moved below the first memory store.
We use a control dependency from the guard to the field store to express this constraint.
The link between the second store and the guard, as well as the control flow merge node, is no longer necessary.

\begin{figure}[ht]
\centering
\begin{digraphenv}{scale=0.5}{guard3}
\textnode{entry}{Entry}
\node{return}{Return}
\node{cmpnull}{NonNull}
\textnode{p}{p}
\textnode{const0}{0}
\nodeguard{guard2}{Guard}
\nodetrisplit{store1}{MemStore 16 (4 bytes)}
\nodetrisplit{store2}{MemStore 20 (4 bytes)}
\nodeframestate{fs1}{FrameState}
\nodeframestate{fs2}{FrameState}
\data{store1:in1}{p}
\data{store2:in1}{p}
\data{store1:in2}{const0}
\node{return}{Return}
\node{cmpnull}{NonNull}
\textnode{p}{p}
\textnode{const0}{0}
\nodeguard{guard2}{Guard}
\nodetrisplit{store1}{MemStore 16 (4 bytes)}
\nodetrisplit{store2}{MemStore 20 (4 bytes)}
\data{store1:in1}{p}
\data{store2:in1}{p}
\data{store1:in2}{const0}
\data{store2:in2}{const0}
\data{store2:in3}{guard2}
\node{return}{Return}
\node{cmpnull}{NonNull}
\textnode{p}{p}
\textnode{const0}{0}
\nodeguard{guard2}{Guard}
\nodetrisplit{store1}{MemStore 16 (8 bytes)}
\data{store1:in1}{p}
\data{store1:in2}{const0}
\data{guard2:in1}{anchor1}
\data{guard2:in2}{cmpnull}
\control{guard2}{store1}
\end{digraphenv}
\caption{After coalescing the two memory stores.}
\label{fig:guard5}
\end{figure}

A memory store that immediately follows a null check guard node on the same object can be combined into a store with an implicit null check (one that deoptimizes instead of throwing the exception).
Therefore, we can remove the guard again, and the anchor is no longer necessary either.
Figure~\ref{fig:guard6} shows the fully optimized graph that is generated for Listing~\ref{lst:guard1}.

\begin{figure}[ht]
\centering
\begin{digraphenv}{scale=0.5}{guard6}
\textnode{entry}{Entry}
\node{return}{Return}
\textnode{p}{p}
\textnode{const0}{0}
\nodetrisplit{store1}{DeoptimizingMemStore 16 (8 bytes)}
\data{store1:in1}{p}
\data{store1:in2}{const0}
\control{entry}{store1}
\controllabel{store1:succ1}{return}
\end{digraphenv}


\section{Conclusions}
\label{sec:conclusions}
This document sketched the strategy for the Graal compiler.
We have the following plans for M1 to M4:
\begin{description}
\item[M1:] May 15th, 2011
\item[M2:] June 30th, 2011
\item[M3:] August 15th, 2011
\item[M4:] September 30th, 2011
\end{description}
After we reach M4, we want to create a new project road map that further improves the Graal compiler with respect to its two main goals: modularity and peak performance.