Mercurial > hg > truffle

Binary file doc/design/graal_compiler.pdf has changed
--- a/doc/design/graal_compiler.tex	Wed May 04 16:36:55 2011 +0200
+++ b/doc/design/graal_compiler.tex	Wed May 04 17:02:11 2011 +0200
@@ -25,6 +25,7 @@

 \newcommand\TODO[1]{\mynote{TODO}{#1}}
 \newcommand\cw[1]{\mynote{CW}{#1}}
+\newcommand\ls[1]{\mynote{LS}{#1}}
 \newcommand\nodename[1]{\texttt{#1}}


@@ -329,6 +330,12 @@
 The most important aspect of a compiler is the data structure that holds information about an executable piece of code, called \emph{intermediate representation}~(IR).
 The IR used in the Graal Compiler (simply referred to as \emph{the compiler} in the rest of this document) was designed in such a way as to allow for extensive optimizations, easy traversal, compact storage and efficient processing.

+\subsubsection{The Graph Data Structure}
+\begin{itemize}
+    \item A graph deals out ids for new nodes and can be queried for the node corresponding to a given id.
+    \item Graphs can manage side data structures, which will be automatically invalidated and lazily recomputed whenever the graph changes. Examples for side data structures are dominator trees and temporary schedules. These side data structures will usually be understood by more than one optimization.
+\end{itemize}
+
 \subsubsection{The Node Data Structure}
 \begin{itemize}
     \item Each node is always associated with a graph.
@@ -352,42 +359,168 @@
     \item Nodes cannot be reassigned to another graph, they are cloned instead \cw{Why should there be the need for more than one graph when compiling a method?}
 \end{itemize}

-\subsubsection{The Graph Data Structure}
-\begin{itemize}
-    \item A graph deals out ids for new nodes and can be queried for the node corresponding to a given id.
-    \item Graphs can manage side data structures, which will be automatically invalidated and lazily recomputed whenever the graph changes. Examples for side data structures are dominator trees and temporary schedules. These side data structures will usually be understood by more than one optimization.
-    \item Graphs are
-\end{itemize}
-
 \subsection{Frame States}
 Frame states capture the state of the program, in terms of the source representation (e.g., Java state: local variables, expressions, ...).
-Whenever a safepoint is reached or \cw{why is that an ``or'', both is basically the same} a deoptimization is needed a valid frame state needs to be available.
-A frame state is valid as long as the program performs only actions that can safely be reexecuted (e.g., operations on local variables, etc.).
-Thus, frame states need only be generated for bytecodes that cannot be reexecuted: putfield, astore, invoke, monitorenter/exit, ...
-
-Within the node graph a frame state is represented as a node that is fixed between the node that caused it to be generated (data dependency) and the node that invalidates it (control dependency).
+Whenever a safepoint is reached or \cw{why is that an ``or'', both is basically the same} \ls{uncommon traps can be introduced at other points, e.g., at an if branch that isn't compiled} a deoptimization is needed a valid frame state needs to be available.
+A frame state is valid as long as the program performs only actions that can safely be reexecuted (e.g., operations on local variables, loads, etc.).
+Thus, frame states need only be generated for bytecodes that cannot be reexecuted:

-Deopmization nodes are not fixed to a certain frame state node, they can move around freely and will always use the correct frame state.
-At some point during the compilation, deoptimization nodes need to be fixed, which means that appropriate data and control dependencies will be inserted so that it can not move outside the scope of the associated frame state.
-This will generate deoptimization-free zones that can be targeted by the most aggressive optimizations.
+\begin{itemize}
+    \item Array stores: {\tt IASTORE, LASTORE, FASTORE, \\DASTORE, AASTORE, BASTORE, CASTORE, SASTORE}
+    \item Field stores: {\tt PUTSTATIC, PUTFIELD}
+    \item Method calls: {\tt INVOKEVIRTUAL, INVOKESPECIAL, \\INVOKESTATIC, INVOKEINTERFACE}
+    \item Memory allocations: {\tt NEW, NEWARRAY, ANEWARRAY, \\MULTIANEWARRAY}
+    \item Synchronization: {\tt MONITORENTER, MONITOREXIT}
+\end{itemize}

-Frame states should be represented using a delta-encoding.
-This will make them significantly smaller and will make inlining, etc. much easier.
-In later compilation phases unnecessary frame states might be removed, using a mark-and-merge algorithm.
+Within the node graph a frame state is represented as a node that is fixed to the node that caused it to be generated (control dependency).


+\begin{figure}[h]
+  \label{fig:fs1}
+  \centering
+\begin{digraphenv}{scale=0.5}{fs1}
+    \nodetrisplit{store1}{ArrayStore}
+    \nodebi{load1}{ArrayLoad}
+    \controllabel{store1:succ1}{load1}
+    \nodetrisplit{store2}{ArrayStore}
+    \control{load1}{store2}
+    end [shape=plaintext, label="...", width="2.0"]
+    store2:succ1:s -> end:n [color=red];
+    %
+    \nodeframestate{fs1}{FrameState}
+    \controllabel{store1:succ2}{fs1}
+    \nodeframestate{fs2}{FrameState}
+    \controllabel{store2:succ2}{fs2}
+\end{digraphenv}
+  \caption{Simple example using two frame states.}
+\end{figure}
+
+FrameStates also have data dependencies on the contents of the state: the local variables and the expression stack.
+
+\subsection{Deoptimization and Uncommon Traps}
+Uncommon trap nodes are not fixed to a certain frame state node, they can move around freely and will always use the correct frame state.
+The node that is guarded by the deoptimization has a data dependency on the trap, and the trap in turn has a data dependency on the condition and on the most distant node that is postdominated by the guarded node.
+
+\begin{figure}[h]
+  \label{fig:trap1}
+  \centering
+\begin{digraphenv}{scale=0.5}{trap1}
+    \nodesplit{if}{If}
+    \node{split1}{Split}
+    \controllabel{if:succ1}{split1}
+    nop [shape=plaintext, label="..."]
+    \control{split1}{nop}
+    %
+    \node{split2}{Split}
+    \controllabel{if:succ2}{split2}
+    \nodebi{load1}{MemLoad}
+    \control{split2}{load1}
+    \nodebi{load2}{MemLoad}
+    \control{load1}{load2}
+    \control{load2}{merge}
+    \node{merge}{Merge}
+    \control{nop}{merge}
+    %
+    \nodeconst{o1}{obj1}
+    \nodeconst{o2}{obj2}
+    \datalabel{load1:in2}{o1}
+    \datalabel{load2:in2}{o2}
+    \nodetrap{trap}{Trap}
+    \node{cmpnull}{IsNull}
+    \data{cmpnull}{o2}
+    \datalabel{trap:in2}{cmpnull}
+    \datalabel{load2:in1}{trap}
+    \datalabel{trap:in1}{split2}
+\end{digraphenv}
+  \caption{In this example, the second load is guarded by an uncommon trap, because its receiver might be null (the receiver of the load is assumed to be non-null).
+The trap is anchored to the control split, because as soon as this node is executed the second load must be executed as well.
+In the final scheduling the trap can be placed before or after the first load.}
+\end{figure}
+
+Another type of uncommon trap is a guard, which is used to remove branches that have a very low execution frequency from the compiled code.
+
+\begin{figure}[h]
+  \label{fig:trap2}
+  \centering
+\begin{digraphenv}{scale=0.5}{trap2}
+    start [shape=plaintext, label="..."]
+    start2 [shape=plaintext, label=""]
+    start3 [shape=plaintext, label=""]
+    \control{start}{guard}
+    \node{guard}{Guard}
+    \nodebi{load1}{MemLoad}
+    \control{guard}{load1}
+    \control{load1}{nop}
+    nop [shape=plaintext, label="..."]
+    %
+    \nodetrap{trap}{Trap}
+    \datalabel{trap:in1}{start2}
+    \datalabel{trap:in2}{start3}
+    \data{guard}{trap}
+\end{digraphenv}
+  \caption{In this example the If from the previous example was replaced by a guard and an uncommon trap.
+The guard takes the place of the If in the control flow, and is connected to the trap node.
+The uncommon trap is now anchored to the most distant node of which the If was a postdominator.}
+\end{figure}
+
+At some point during the compilation, trap nodes need to be fixed, which means that appropriate data and control dependencies will be inserted so that they cannot move outside the scope of the associated frame state.
+This will generate deoptimization-free zones that can be targeted by the most aggressive optimizations.
+A simple algorithm for this removal of FrameStates would be to move all traps as far upwards as possible.
+
+
+Multiple Traps with the same condition and anchor can be merged:
+
+\begin{figure}[h]
+  \label{fig:trap3}
+  \centering
+\begin{digraphenv}{scale=0.5}{trap3}
+    \nodesplit{if}{If}
+    \node{split1}{Split}
+    \controllabel{if:succ1}{split1}
+    nop [shape=plaintext, label="..."]
+    \control{split1}{nop}
+    %
+    \node{split2}{Split}
+    \controllabel{if:succ2}{split2}
+    \nodebi{load1}{MemLoad}
+    \control{split2}{load1}
+    \nodebi{load2}{MemLoad}
+    \control{load1}{load2}
+    \control{load2}{merge}
+    \node{merge}{Merge}
+    \control{nop}{merge}
+    %
+    \nodeconst{o}{obj}
+    \datalabel{load1:in2}{o}
+    \datalabel{load2:in2}{o}
+    \nodetrap{trap}{Trap}
+    \node{cmpnull}{IsNull}
+    \data{cmpnull}{o}
+    \datalabel{trap:in2}{cmpnull}
+    \datalabel{load2:in1}{trap}
+    \datalabel{load1:in1}{trap}
+    \datalabel{trap:in1}{split2}
+\end{digraphenv}
+  \caption{Two loads guarded by the same Trap.}
+\end{figure}
+
+Also, if two Traps that are anchored to the true and false branch of the same If have the same condition, they can be merged, so that the resulting Trap is anchored at the most distant node of which the If is a postdominator.
+
+%Frame states should be represented using a delta-encoding.
+%This will make them significantly smaller and will make inlining, etc. much easier.
+%In later compilation phases unnecessary frame states might be removed, using a mark-and-merge algorithm.

 \subsection{Graph Building}
 \begin{itemize}
     \item The graph built by the initial parser (called \emph{GraphBuilder}) should be as close to the source representation (bytecode, ...) as possible.
     \item All nodes should be able to immediately lower themselves to a machine-level representation. \cw{What is that?  You mean every node has x86 specific code that spits out machine code?  Hope you are joking...} This allows for easier compiler development, and also leads to a compiler that is very flexible in the amount of optimizations it performs (e.g. recompilation of methods with more aggressive optimizations).
-    \item
 \end{itemize}

 \subsection{Graphical Representation}
 The graphs in this document use the following node layout:

-\begin{digraphenv}{scale=0.5}{layout01}
+\begin{digraphenv}{scale=0.5}{graphrep1}
 \node{node1}{nop}
 \nodebi{node2}{+}
 \nodetri{node3}{phi}
@@ -395,10 +528,11 @@
 \end{digraphenv}

 \cw{That doesn't compile with my latex.  What do I have to do to get it working?}
+\ls{graphviz needs to be installed, and pdflatex needs to be started with -shell-escape}

 Red arrows always represents control dependencies, while black arrows represent data dependencies:

-\begin{digraphenv}{scale=0.5}{layout1}
+\begin{digraphenv}{scale=0.5}{graphrep2}
 \node{a}{a}
 \node{b}{b}
 \nodesplit{if}{if}
--- a/doc/design/graphdrawing.tex	Wed May 04 16:36:55 2011 +0200
+++ b/doc/design/graphdrawing.tex	Wed May 04 17:02:11 2011 +0200
@@ -21,7 +21,7 @@
   }
 }

-\NewEnviron{digraphenv}[2]{\digraph[#1]{#2}{  \BODY }}
+\NewEnviron{digraphenv}[2]{\digraph[#1]{#2}{ nodesep="0.1"; \BODY }}

 \newcommand{\control}[2]{#1:successors:s -> #2:predecessors:n [color=red];}
 \newcommand{\controllabel}[2]{#1:s -> #2:predecessors:n [color=red];}
@@ -42,7 +42,7 @@
 \newcommand{\nodeend}[0]{\genericnodestart{end} \portempty \portinput{inputs} \genericnodelabel{end}{lightyellow} <TD></TD></TR></TABLE></TD><TD></TD></TR></TABLE>>]}

 \newcommand{\nodeconst}[2]{#1 [margin="0.05, 0.05", shape=plaintext, label=< <TABLE BORDER="0" CELLSPACING="0"><TR><TD BORDER="1" BGCOLOR="white"><FONT POINT-SIZE="12">#2</FONT></TD></TR><TR><TD WIDTH="15" HEIGHT="5" PORT="usages" BGCOLOR="lightgrey"></TD></TR></TABLE>>]}
-\newcommand{\nodeframestate}[2]{#1 [margin="0.05, 0.05", shape=plaintext, label=< <TABLE BORDER="0" CELLSPACING="0"><TR><TD WIDTH="15" HEIGHT="5" PORT="predecessors" BGCOLOR="rosybrown1"></TD></TR><TR><TD BORDER="1" BGCOLOR="lightblue"><FONT POINT-SIZE="12">#2</FONT></TD></TR></TABLE>>]}
+\newcommand{\nodeframestate}[2]{#1 [margin="0.05, 0.05", shape=plaintext, label=< <TABLE BORDER="0" CELLSPACING="0"><TR><TD WIDTH="15" HEIGHT="5" PORT="predecessors" BGCOLOR="rosybrown1"></TD><TD WIDTH="15" HEIGHT="5"></TD><TD HEIGHT="5" PORT="inputs" BGCOLOR="lightgrey"></TD></TR><TR><TD BORDER="1" COLSPAN="3" BGCOLOR="lightblue"><FONT POINT-SIZE="12">#2</FONT></TD></TR></TABLE>>]}


 \newcommand{\node}[2]{\genericnodestart{#1} \portempty \portinput{inputs} \genericnodelabel{#2}{white} \portsuccessor{successors} \portempty \genericnodeend }
@@ -51,7 +51,9 @@
 \newcommand{\nodequad}[2]{\genericnodestart{#1} \portinput{in1} \portinput{in2} \portinput{in3} \portinput{in4} \genericnodelabel{#2}{white} \portsuccessor{successors} \portempty \portempty \portempty \genericnodeend }
 \newcommand{\nodesplit}[2]{\genericnodestart{#1} \portempty \portinput{inputs} \genericnodelabel{#2}{white} \portsuccessor{succ1} \portsuccessor{succ2} \genericnodeend }
 \newcommand{\nodequadsplit}[2]{\genericnodestart{#1} \portinput{in1} \portinput{in2} \portinput{in3} \portinput{in4} \genericnodelabel{#2}{white} \portsuccessor{succ1} \portsuccessor{succ2} \portempty \portempty \genericnodeend }
+\newcommand{\nodetrisplit}[2]{\genericnodestart{#1} \portinput{in1} \portinput{in2} \portinput{in3} \genericnodelabel{#2}{white} \portsuccessor{succ1} \portsuccessor{succ2} \portempty \genericnodeend }
 \newcommand{\nodesplittri}[2]{\genericnodestart{#1} \portempty \portinput{inputs} \genericnodelabel{#2}{white} \portsuccessor{succ1} \portsuccessor{succ2} \portsuccessor{succ3} \genericnodeend }
+
 \newcommand{\nodetrap}[2]{\cnodebi{#1}{#2}{rosybrown1}}

 \newcommand{\cnode}[3]{\genericnodestart{#1} \portempty \portinput{inputs} \genericnodelabel{#2}{#3} \portsuccessor{successors} \portempty \genericnodeend }
@@ -59,6 +61,9 @@
 \newcommand{\cnodetri}[3]{\genericnodestart{#1} \portinput{in1} \portinput{in2} \portinput{in3} \genericnodelabel{#2}{#3} \portsuccessor{successors} \portempty \portempty \genericnodeend }
 \newcommand{\cnodesplit}[3]{\genericnodestart{#1} \portempty \portinput{inputs} \genericnodelabel{#2}{#3} \portsuccessor{succ1} \portsuccessor{succ2} \genericnodeend }

+% this doesn't work:
+%\newenvironment{graphfigure}[2]{\begin{figure}[h] \label{fig:#1} \centering \begin{digraphenv}{scale=0.5}{#1}}{\end{digraphenv} \caption{#2} \end{figure}}
+
 %%%%%%%%%%%%%% example:

 % \begin{digraphenv}{scale=0.5}{MyGraph}