doc/design/graal_compiler.tex @ 2604:c9b17ac5c06b

Doc fixes.
author: Thomas Wuerthinger <thomas@wuerthinger.net>
date: Thu, 05 May 2011 17:03:43 +0200
\institute{\Sa Oracle, \Sc Johannes Kepler University, Linz}

\date{Created: \today}

\title{The Graal Compiler}
\subtitle{Design and Strategy \\ \textcolor{red}{work in progress (Oracle internal)}}

\maketitle

\abstract{
The Graal compiler (simply referred to as \emph{the compiler} in the rest of this document) aims at improving C1X, the Java port of the HotSpot client compiler, both in terms of modularity and peak performance.
This document contains information about the proposed design and strategy for developing the compiler.}

\section{Context}

In 2009, the Maxine team started creating C1X, a Java port of the HotSpot client compiler, and integrating it into the Maxine VM.
Part of this effort was the development of a clear and clean compiler-runtime interface that allows the separation of the compiler and the VM.
This compiler-runtime interface enables the use of one compiler for multiple VMs.
In June 2010, we started integrating C1X into the HotSpot VM and called the resulting system Graal~VM.
Currently, the Graal~VM is fully functional and runs benchmarks (SciMark, DaCapo) at a similar speed to the HotSpot client compiler.

\section{Goals}
The compiler effort aims at rewriting the high-level intermediate representation of C1X with two main goals:

After those four milestones, we see three different possible further development directions that can be followed in parallel:
\begin{itemize}
\item Removal of the XIR template mechanism and replacement with a snippet mechanism that works with the compiler graph.
\item Improvements for peak performance (loop optimizations, escape analysis, bounds check elimination, processing additional interpreter runtime feedback).
\item Implementation of a prototype front-end for a different language, e.g., JavaScript.
\end{itemize}

\section{Graph}

The \emph{intermediate representation}~(IR) of the compiler is designed as a directed graph.
The graph assigns ids to new nodes and can be queried for the node corresponding to a given id as well as for an unordered list of all nodes of the graph.
Graphs can manage side data structures, which are automatically invalidated and lazily recomputed whenever the graph changes. Examples of side data structures are dominator trees and temporary schedules. These side data structures will usually be understood by more than one optimization.
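
A minimal sketch in Java of this caching scheme (the class and member names are ours, not the actual compiler API): the graph bumps a version counter on every change, and a side data structure is recomputed only when its cached version is out of date.

\begin{verbatim}
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not the actual Graal API.
class Graph {
    private final List<Object> nodes = new ArrayList<>();
    private int version;                  // bumped on every graph change
    private Object dominatorTree;         // example side data structure
    private int dominatorTreeVersion = -1;

    int add(Object node) {                // the graph assigns ids to new nodes
        nodes.add(node);
        version++;                        // invalidates all side data structures
        return nodes.size() - 1;          // the id is the index in the list
    }

    Object node(int id) { return nodes.get(id); }

    Object dominatorTree() {              // lazily recomputed on demand
        if (dominatorTreeVersion != version) {
            dominatorTree = "recomputed dominator tree"; // placeholder
            dominatorTreeVersion = version;
        }
        return dominatorTree;
    }
}
\end{verbatim}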

The nodes of the graph have the following properties:
\begin{itemize}
\item Each node is always associated with a single graph and this association is immutable.
\item Each node has an immutable id that is unique within its associated graph.
\item Nodes can have a data dependency, which means that one node requires the result of another node as its input. The fact that the result of the first node needs to be computed before the second node can be executed introduces a partial order to the set of nodes.
\item Nodes can have a control flow dependency, which means that the execution of one node will be followed by the execution of another node. This includes conditional execution, memory access serialization, and other reasons, and again introduces a partial order to the set of nodes.
\item Nodes can only have data and control dependencies to nodes which belong to the same graph.
\item Control dependencies and data dependencies each represent a \emph{directed acyclic graph} (DAG) on the same set of nodes. This means that data dependencies always point upwards, and control dependencies always point downwards in a drawing of the graph. Situations that would normally incur cycles (like loops) are represented by special nodes (see Section~\ref{sec:loops}).
\item Ordering between nodes is specified only to the extent required to correctly express the semantics of a given program. This gives the compiler flexibility in scheduling a node and therefore wiggle room for optimizations. For algorithms that require a fixed ordering of nodes, a temporary schedule can always be generated.
\item Both data and control dependencies can be traversed in both directions, so that each node can be traversed in four directions (see Figure~\ref{fig:directions}):
\begin{itemize}
\item \emph{inputs} are all nodes that this node has data dependencies on.
\item \emph{usages} are all nodes whose inputs contain this node.
\item \emph{successors} are all nodes that have to be after this node in control flow.
\item \emph{predecessors} are all nodes whose successors contain this node.
\end{itemize}
\item Only inputs and successors can be changed, and changes to them will update the usages and predecessors (see the sketch after this list).
\item Every node must be able to support cloning and serialization.
\item Inlining should always be performed by embedding one graph into another graph.
\item Nodes cannot be reassigned to another graph; they are cloned instead.
\item The edges of a node also define \textit{happens-before} and \textit{happens-after} relationships, as shown in Figure~\ref{fig:directions}.
\end{itemize}
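
The following Java sketch (hypothetical names, assuming nothing beyond the properties listed above) shows how mutating only inputs and successors can keep usages and predecessors consistent as their inverses:

\begin{verbatim}
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not the actual Graal API.
class Node {
    final List<Node> inputs = new ArrayList<>();       // data dependencies
    final List<Node> successors = new ArrayList<>();   // control dependencies
    final List<Node> usages = new ArrayList<>();       // inverse of inputs
    final List<Node> predecessors = new ArrayList<>(); // inverse of successors

    void addInput(Node input) {           // only inputs are set directly ...
        inputs.add(input);
        input.usages.add(this);           // ... usages are updated automatically
    }

    void addSuccessor(Node successor) {   // only successors are set directly ...
        successors.add(successor);
        successor.predecessors.add(this); // ... predecessors follow automatically
    }
}
\end{verbatim}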

\begin{figure}[h]
\centering
\begin{digraphenv}{scale=0.5}{graphdirections}
\node{node1}{Node}
\textnode{inputs}{inputs}
\textnode{usages}{usages}
\textnode{successors}{successors}
\textnode{predecessors}{predecessors}
\data{node1}{inputs}
\control{node1}{successors}
\data{usages}{node1}
\control{predecessors}{node1}
\node{node2}{Node}
\textnode{before}{happens-before}
\textnode{after}{happens-after}
\data{node2}{before}
\control{node2}{after}
\data{after}{node2}
\control{before}{node2}
\end{digraphenv}
\caption{A node and its edges.}
\label{fig:directions}
\end{figure}

\subsection{Project Source Structure}
In order to support the goal of a modular compiler, the code will be divided into the following source code projects (as subprojects of \textbf{com.oracle.graal}).

\begin{description}
\item[graph] contains the abstract node implementation, the graph implementation, and all the associated tools and auxiliary classes.
\item[nodes] contains the implementation of known basic nodes (e.g., phi nodes, control flow nodes, \ldots).
Additional node classes should go into separate projects and be specializations of the known basic nodes.
\item[java] contains code for building graphs from Java bytecodes and Java-specific nodes.
\item[opt] contains optimizations such as global value numbering or conditional constant propagation.
\item[compiler] contains the compiler, including:
\begin{itemize}
\item Scheduling of the compilation phases.
\item Implementation of the \emph{compiler interface} (CI).
\item Implementation of the final compilation phase that produces the low-level representation.
\item Machine code creation, including debug info.
\end{itemize}
\end{description}

It goes from the beginning to the end in order to make the graph acyclic.
An algorithm that traverses the control flow has to explicitly decide whether it wants to incorporate backedges (i.e., special-case the treatment of \nodename{LoopEnd}) or ignore them.
Figure~\ref{fig:loop1} shows a simple example of a loop with a single entry and two exits.
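
A sketch of the second option, assuming the hypothetical \texttt{Node} class from the earlier sketch and a \texttt{LoopEnd} subclass: the traversal simply refuses to follow the successors of \nodename{LoopEnd} nodes, so it never walks a backedge.

\begin{verbatim}
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch: forward control flow traversal ignoring backedges.
class LoopEnd extends Node { }

class ControlFlowTraversal {
    static Set<Node> visitIgnoringBackedges(Node start) {
        Set<Node> visited = new HashSet<>();
        Deque<Node> worklist = new ArrayDeque<>();
        worklist.push(start);
        while (!worklist.isEmpty()) {
            Node n = worklist.pop();
            if (!visited.add(n)) {
                continue;                // already visited
            }
            if (n instanceof LoopEnd) {
                continue;                // do not follow the backedge
            }
            for (Node s : n.successors) {
                worklist.push(s);
            }
        }
        return visited;
    }
}
\end{verbatim}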

\begin{figure}[h]
\centering
\begin{digraphenv}{scale=0.5}{layout1}
\textnode{BeforeLoop}{Loop entry}
\textnode{Exit1}{First loop exit}
\textnode{Exit2}{Second loop exit}
\controllabel{BeforeLoop}{LoopBegin}
\controllabel{If1:succ2}{Exit1}
\controllabel{If2:succ2}{Exit2}
\end{digraphenv}
\caption{A simple loop with two exits.}
\label{fig:loop1}
\end{figure}

\subsection{Loop Phis}
Data flow in loops is modelled with special phi nodes at the beginning and the end of the loop.
The \nodename{LoopEnd} node merges every value that flows into the next loop iteration in associated \nodename{LoopEndPhi} nodes.
A corresponding \nodename{LoopBeginPhi} node that is associated with the loop header has a control flow dependency on the \nodename{LoopEndPhi} node.
Figure~\ref{fig:loop2} shows how a simple counting loop is modelled in the graph.
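
For concreteness, the Java source of a counting loop of the kind shown in Figure~\ref{fig:loop2} might look as follows (our example; the comments indicate which values we read the phi nodes as merging):

\begin{verbatim}
static void countingLoop(int n) {
    int i = 0;           // initial value entering the LoopBeginPhi
    while (i < n) {      // loop exit taken when i >= n
        // loop body
        i = i + 1;       // value flowing into the next iteration,
                         // merged by the LoopEndPhi
    }
}
\end{verbatim}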

\begin{figure}[h]
\centering
\begin{digraphenv}{scale=0.5}{layout2}
\textnode{BeforeLoop}{Loop entry}
\textnode{Exit}{Loop exit}
\textnode{n}{n}
\controllabel{If1:succ1}{LoopEnd}
\controllabel{BeforeLoop}{LoopBegin}
\controllabel{If1:succ2}{Exit}
\end{digraphenv}
\caption{Graph for a loop counting from 0 to n-1.}
\label{fig:loop2}
\end{figure}

\subsection{Loop Counters}
The compiler is capable of recognizing variables that are only increased within a loop.
A potential overflow of such a variable is guarded with a trap before the loop.
Figure~\ref{fig:loop3} shows the compiler graph of the example loop after the loop counter transformation.
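
As a sketch of the arithmetic behind this transformation (the name \textit{stride} follows the edge label in Figure~\ref{fig:loop4}; \textit{init} is our name for the counter's initial value), a recognized loop counter takes the value
\begin{displaymath}
c(i) = \mathit{init} + i \cdot \mathit{stride}, \qquad 0 \leq i < T,
\end{displaymath}
where $T$ is the number of completed iterations. The trap before the loop then has to ensure that $\mathit{init} + T \cdot \mathit{stride}$ cannot overflow the range of the counter's type.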

\begin{figure}[h]
\centering
\begin{digraphenv}{scale=0.5}{layout3}
\textnode{BeforeLoop}{Loop entry}
\textnode{Exit}{Loop exit}
\textnode{n}{n}
\controllabel{If1:succ1}{LoopEnd}
\controllabel{BeforeLoop}{LoopBegin}
\controllabel{If1:succ2}{Exit}
\end{digraphenv}
\caption{Graph after loop counter transformation.}
\label{fig:loop3}
\end{figure}

\subsection{Bounded Loops}

If the total maximum number of iterations of a loop is fixed, then the loop is converted into a bounded loop.
The total number of iterations always denotes the number of full iterations of the loop with the control flowing from the loop begin to the loop end.
If the total number of iterations is reached, the loop is exited directly from the loop header.
In the example, we can infer from the loop exit with the comparison on the loop counter that the total number of iterations of the loop is limited to n.
Figure~\ref{fig:loop4} shows the compiler graph of the example loop after the bounded loop transformation.

\begin{figure}[h]
\centering
\begin{digraphenv}{scale=0.5}{layout4}
\textnode{BeforeLoop}{Loop entry}
\textnode{Exit}{Loop exit}
\textnode{n}{n}
\datalabeltext{LoopCounter:in3}{Constant1}{stride}
\data{LoopBegin}{n}
\controllabel{BeforeLoop}{LoopBegin}
\end{digraphenv}
\caption{Graph after bounded loop transformation.}
\label{fig:loop4}
\end{figure}

\subsection{Vectorization}

If we now have a bounded loop with no additional loop exit and no associated phi nodes (only associated loop counters), we can vectorize the loop.
We replace the loop header with a normal instruction that produces a vector of values from 0 to the number of loop iterations minus 1.
The loop counters are replaced with \texttt{VectorAdd} and \texttt{VectorMul} nodes.
The vectorization is only possible if every node of the loop can be replaced with a corresponding vector node.
Figure~\ref{fig:loop5} shows the compiler graph of the example loop after vectorization.
The vector nodes all work on an ordered list of integer values and are subject to canonicalization and global value numbering like any other node.

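The element-wise semantics we read into these nodes can be sketched in Java as follows (hypothetical helper names; \texttt{vector} corresponds to the replaced loop header, \texttt{vectorAdd} and \texttt{vectorMul} to the nodes named above):

\begin{verbatim}
// Hypothetical sketch of the element-wise semantics of the vector nodes.
static int[] vector(int n) {              // Vector: the values 0 .. n-1
    int[] v = new int[n];
    for (int i = 0; i < n; i++) {
        v[i] = i;
    }
    return v;
}

static int[] vectorAdd(int[] v, int c) {  // VectorAdd: add c to every element
    int[] r = v.clone();
    for (int i = 0; i < r.length; i++) {
        r[i] += c;
    }
    return r;
}

static int[] vectorMul(int[] v, int c) {  // VectorMul: scale every element by c
    int[] r = v.clone();
    for (int i = 0; i < r.length; i++) {
        r[i] *= c;
    }
    return r;
}
\end{verbatim}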

\begin{figure}[h]
\centering
\begin{digraphenv}{scale=0.5}{layout5}
\textnode{Entry}{Entry}
\textnode{Exit}{Exit}
\textnode{n}{n}
\datalabel{VectorMul:in1}{VectorAdd}
\datalabel{VectorMul:in2}{Constant1}
\data{Vector}{n}
\end{digraphenv}
\caption{Graph after vectorization.}
\label{fig:loop5}
\end{figure}
365 366
366 367
367 \section{Frame States} 368 \section{Frame States}
368 A frame state captures the state of the program in terms of the Java bytecode specification (i.e., the values of the local variables, the operand stack, and the locked monitors). 369 A frame state captures the state of the program in terms of the Java bytecode specification (i.e., the values of the local variables, the operand stack, and the locked monitors).
369 Every deoptimization point needs a valid frame state. 370 Every deoptimization point needs a valid frame state.
370 A frame state is valid as long as the program performs only actions that can safely be reexecuted (e.g., operations on local variables, loads, etc.). 371 A frame state is valid as long as the program performs only actions that can safely be reexecuted.
371 Thus, frame states need only be generated for bytecodes that cannot be reexecuted: 372 Thus, frame states need only be generated for the states after bytecodes that cannot be reexecuted:
372 373
373 \begin{itemize} 374 \begin{itemize}
374 \item Array stores: {\tt IASTORE, LASTORE, FASTORE, \\DASTORE, AASTORE, BASTORE, CASTORE, SASTORE} 375 \item Array stores: {\tt IASTORE, LASTORE, FASTORE, \\DASTORE, AASTORE, BASTORE, CASTORE, SASTORE}
375 \item Field stores: {\tt PUTSTATIC, PUTFIELD} 376 \item Field stores: {\tt PUTSTATIC, PUTFIELD}
376 \item Method calls: {\tt INVOKEVIRTUAL, INVOKESPECIAL, \\INVOKESTATIC, INVOKEINTERFACE} 377 \item Method calls: {\tt INVOKEVIRTUAL, INVOKESPECIAL, \\INVOKESTATIC, INVOKEINTERFACE}
377 \item Synchronization: {\tt MONITORENTER, MONITOREXIT} 378 \item Synchronization: {\tt MONITORENTER, MONITOREXIT}
378 \end{itemize} 379 \end{itemize}
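
A minimal Java sketch of the data such a frame state carries (the field names are ours, not the actual implementation):

\begin{verbatim}
// Hypothetical sketch of the contents of a frame state.
class FrameState {
    final Object[] locals;   // values of the local variables
    final Object[] stack;    // values on the operand stack
    final Object[] locks;    // the locked monitors
    final int bci;           // bytecode index the state belongs to

    FrameState(Object[] locals, Object[] stack, Object[] locks, int bci) {
        this.locals = locals.clone();
        this.stack = stack.clone();
        this.locks = locks.clone();
        this.bci = bci;
    }
}
\end{verbatim}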

Within the graph a frame state is represented as a node that is fixed to the node that caused it to be generated using a control dependency (see Figure~\ref{fig:fs1}).
Frame states also have data dependencies on the contents of the state: the local variables and the expression stack.

The frame state at the beginning of a method does not have to be represented explicitly in the graph, because it can always be reconstructed at a later stage.
We save the frame state at control flow merges if there is at least one frame state on any control flow path since the immediate dominator.

\begin{figure}[h]
\centering
\begin{digraphenv}{scale=0.5}{fs1}
\nodetrisplit{store1}{ArrayStore}
\nodebi{load1}{ArrayLoad}
\controllabel{store1:succ1}{load1}
\controllabel{store1:succ2}{fs1}
\nodeframestate{fs2}{FrameState}
\controllabel{store2:succ2}{fs2}
\end{digraphenv}
\caption{Simple example using two frame states.}
\label{fig:fs1}
\end{figure}

\subsection{Traps}
A trap node is a node that deoptimizes based on a conditional expression.
Trap nodes are not fixed to a certain frame state node; they can move around freely and will always use the correct frame state when the nodes are scheduled (i.e., the last emitted frame state).
The node that is guarded by the deoptimization has a data dependency on the trap, and the trap in turn has a data dependency on the condition and on the most distant node that is postdominated by the guarded node.

In the example of Figure~\ref{fig:trap1}, the second load is guarded by a trap, because its receiver might be null (the receiver of the first load is assumed to be non-null).
The trap is anchored to the control split, because as soon as this node is executed, the second load must be executed as well.
In the final schedule, the trap can be placed before or after the first load.
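
A source-level situation matching Figure~\ref{fig:trap1} might look as follows (our example, with array loads standing in for the loads of the figure):

\begin{verbatim}
// Sketch: the receiver of the first load is known to be non-null,
// the receiver of the second load might be null.
static int example(int[] known, int[] maybeNull) {
    int x = known[0];        // first load: needs no null check trap
    // trap: deoptimize if maybeNull == null; in the final schedule
    // it may be placed before or after the first load
    int y = maybeNull[0];    // second load: guarded by the trap
    return x + y;
}
\end{verbatim}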

\begin{figure}[h]
\centering
\begin{digraphenv}{scale=0.5}{trap1}
\nodesplit{if}{If}
\node{split1}{Split}
\controllabel{if:succ1}{split1}
\datalabel{trap:in2}{cmpnull}
\datalabel{load2:in1}{trap}
\datalabel{trap:in1}{split2}
\end{digraphenv}
\caption{A load guarded by a null check trap.}
\label{fig:trap1}
\end{figure}

Another type of trap is a guard, which is used to remove branches that have a very low execution frequency and replace them with a conditional jump to deoptimization.

In the example of Figure~\ref{fig:trap2}, an \texttt{If} node was replaced by a guard and a trap.
The guard takes the place of the \texttt{If} node in the control flow, and is connected to the trap node.
The trap is now anchored to the most distant node of which the \texttt{If} node was a postdominator.
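
As a sketch of the source-level effect (our example; \texttt{deoptimize()} is a placeholder for the transfer back to the interpreter):

\begin{verbatim}
static int original(int x) {
    if (x < 0) {                  // branch with very low execution frequency
        return rareCase(x);
    }
    return x * 2;
}

// Conceptually, the compiled code after the transformation contains only
// the frequent path; the rare branch becomes a conditional jump to
// deoptimization.
static int withGuard(int x) {
    if (x < 0) {
        throw deoptimize();       // guard failed: leave the compiled code
    }
    return x * 2;
}

static int rareCase(int x) { return -x; }

static RuntimeException deoptimize() {   // placeholder for deoptimization
    return new RuntimeException("deoptimize");
}
\end{verbatim}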

\begin{figure}[h]
\centering
\begin{digraphenv}{scale=0.5}{trap2}
start [shape=plaintext, label="..."]
start2 [shape=plaintext, label=""]
start3 [shape=plaintext, label=""]
\datalabel{trap:in1}{start2}
\datalabel{trap:in2}{start3}
\data{guard}{trap}
\end{digraphenv}
\caption{A trap that is fixed to the control flow using a guard instruction.}
\label{fig:trap2}
\end{figure}

At some point during the compilation, trap nodes need to be fixed, which means that appropriate data and control dependencies will be inserted so that they cannot move outside the scope of the associated frame state.
This will generate deoptimization-free zones that can be targeted by the most aggressive optimizations.
A simple algorithm for this removal of frame states would be to move all traps as far upwards as possible.
Multiple traps with the same condition and anchor can be merged (see Figure~\ref{fig:trap3}).

\begin{figure}[h]
\centering
\begin{digraphenv}{scale=0.5}{trap3}
\nodesplit{if}{If}
\node{split1}{Split}
\controllabel{if:succ1}{split1}
\datalabel{load2:in1}{trap}
\datalabel{load1:in1}{trap}
\datalabel{trap:in1}{split2}
\end{digraphenv}
\caption{Two loads guarded by the same trap.}
\label{fig:trap3}
\end{figure}

Also, if two traps that are anchored to the true and false branches of the same \texttt{If} node have the same condition, they can be merged, so that the resulting trap is anchored at the most distant node of which the \texttt{If} node is a postdominator.

\section{Conclusions}