annotate doc/design/graal_compiler_design.tex @ 2517:8c6e31c62fba

added initial version of design docs, fixed .hgignore (regex, . -> \.)
author Lukas Stadler <lukas.stadler@jku.at>
date Wed, 27 Apr 2011 15:59:38 +0200
parents
children
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
rev   line source
2517
8c6e31c62fba added initial version of design docs, fixed .hgignore (regex, . -> \.)
Lukas Stadler <lukas.stadler@jku.at>
parents:
diff changeset
1 \section{Nodes and Graphs}
8c6e31c62fba added initial version of design docs, fixed .hgignore (regex, . -> \.)
Lukas Stadler <lukas.stadler@jku.at>
parents:
diff changeset
2 The most important aspect of a compiler is the data structure that holds information about an executable piece of code, called \emph{intermediate representation}~(IR).
8c6e31c62fba added initial version of design docs, fixed .hgignore (regex, . -> \.)
Lukas Stadler <lukas.stadler@jku.at>
parents:
diff changeset
3 The IR used in the Graal Compiler (simply refered to as \emph{the compiler} in the rest of this document) was designed in such a way as to allow for extensive optimizations, easy traversal, compact storage and efficient processing.
8c6e31c62fba added initial version of design docs, fixed .hgignore (regex, . -> \.)
Lukas Stadler <lukas.stadler@jku.at>
parents:
diff changeset
4
8c6e31c62fba added initial version of design docs, fixed .hgignore (regex, . -> \.)
Lukas Stadler <lukas.stadler@jku.at>
parents:
diff changeset
5 \subsection{The Node Data Structure}
8c6e31c62fba added initial version of design docs, fixed .hgignore (regex, . -> \.)
Lukas Stadler <lukas.stadler@jku.at>
parents:
diff changeset
6 \begin{itemize}
8c6e31c62fba added initial version of design docs, fixed .hgignore (regex, . -> \.)
Lukas Stadler <lukas.stadler@jku.at>
parents:
diff changeset
7 \item Each node is always associated with a graph.
8c6e31c62fba added initial version of design docs, fixed .hgignore (regex, . -> \.)
Lukas Stadler <lukas.stadler@jku.at>
parents:
diff changeset
8 \item Each node has an immutable id which is unique within its associated graph.
8c6e31c62fba added initial version of design docs, fixed .hgignore (regex, . -> \.)
Lukas Stadler <lukas.stadler@jku.at>
parents:
diff changeset
9 \item Nodes represent either operations on values or control flow operations.
8c6e31c62fba added initial version of design docs, fixed .hgignore (regex, . -> \.)
Lukas Stadler <lukas.stadler@jku.at>
parents:
diff changeset
10 \item Nodes can have a data dependency, which means that one node requires the result of some other node as its input. The fact that the result of the first node needs to be computed before the second node can be executed introduces a partial order to the set of nodes.
8c6e31c62fba added initial version of design docs, fixed .hgignore (regex, . -> \.)
Lukas Stadler <lukas.stadler@jku.at>
parents:
diff changeset
11 \item Nodes can have a control flow dependency, which means that the execution of one node depends on some other node. This includes conditional execution, memory access serialization and other reasons, and again introduces a partial order to the set of nodes.
8c6e31c62fba added initial version of design docs, fixed .hgignore (regex, . -> \.)
Lukas Stadler <lukas.stadler@jku.at>
parents:
diff changeset
12 \item Nodes can only have data and control dependencies to nodes which belong to the same graph.
8c6e31c62fba added initial version of design docs, fixed .hgignore (regex, . -> \.)
Lukas Stadler <lukas.stadler@jku.at>
parents:
diff changeset
13 \item Control dependencies and data dependencies each represent a \emph{directed acyclic graph} (DAG) on the same set of nodes. This means that data dependencies always point upwards, and control dependencies always point downwards. Situations that are normally incurr cycles (like loops) are represented by special nodes (like LoopEnd).
8c6e31c62fba added initial version of design docs, fixed .hgignore (regex, . -> \.)
Lukas Stadler <lukas.stadler@jku.at>
parents:
diff changeset
14 \item Ordering between nodes is specified only to the extent which is required to correctly express the semantics of a given program. Some compilers (notably the HotSpot client compiler) always maintain a complete order for all nodes (called \emph{scheduling}), which impedes advanced optimizations. For algorithms that require a fixed ordering of nodes, a temporary schedule can always be generated.
8c6e31c62fba added initial version of design docs, fixed .hgignore (regex, . -> \.)
Lukas Stadler <lukas.stadler@jku.at>
parents:
diff changeset
15 \item Both data and control dependencies can be traversed in both directions, so that each node can be traversed in four directions:
8c6e31c62fba added initial version of design docs, fixed .hgignore (regex, . -> \.)
Lukas Stadler <lukas.stadler@jku.at>
parents:
diff changeset
16 \begin{itemize}
8c6e31c62fba added initial version of design docs, fixed .hgignore (regex, . -> \.)
Lukas Stadler <lukas.stadler@jku.at>
parents:
diff changeset
17 \item \emph{inputs} are all nodes that this node has data dependencies on.
8c6e31c62fba added initial version of design docs, fixed .hgignore (regex, . -> \.)
Lukas Stadler <lukas.stadler@jku.at>
parents:
diff changeset
18 \item \emph{usages} are all nodes that have data dependencies on this node, this is regarded as the inverse of inputs.
8c6e31c62fba added initial version of design docs, fixed .hgignore (regex, . -> \.)
Lukas Stadler <lukas.stadler@jku.at>
parents:
diff changeset
19 \item \emph{successors} are all nodes that have a control dependency on this node.
8c6e31c62fba added initial version of design docs, fixed .hgignore (regex, . -> \.)
Lukas Stadler <lukas.stadler@jku.at>
parents:
diff changeset
20 \item \emph{predecessors} are all nodes that this node has control dependencies on, this is regarded as the inverse of successors.
8c6e31c62fba added initial version of design docs, fixed .hgignore (regex, . -> \.)
Lukas Stadler <lukas.stadler@jku.at>
parents:
diff changeset
21 \end{itemize}
8c6e31c62fba added initial version of design docs, fixed .hgignore (regex, . -> \.)
Lukas Stadler <lukas.stadler@jku.at>
parents:
diff changeset
22 \item Only inputs and successors can be changed, and changes to them will update the usages and predecessors.
8c6e31c62fba added initial version of design docs, fixed .hgignore (regex, . -> \.)
Lukas Stadler <lukas.stadler@jku.at>
parents:
diff changeset
23 \item The Node class needs to provide facilities for subclasses to perform actions upon cloning, dependency changes, etc.
8c6e31c62fba added initial version of design docs, fixed .hgignore (regex, . -> \.)
Lukas Stadler <lukas.stadler@jku.at>
parents:
diff changeset
24 \item Nodes cannot be reassigned to another graph, they are cloned instead
8c6e31c62fba added initial version of design docs, fixed .hgignore (regex, . -> \.)
Lukas Stadler <lukas.stadler@jku.at>
parents:
diff changeset
25 \end{itemize}
8c6e31c62fba added initial version of design docs, fixed .hgignore (regex, . -> \.)
Lukas Stadler <lukas.stadler@jku.at>
parents:
diff changeset
26
8c6e31c62fba added initial version of design docs, fixed .hgignore (regex, . -> \.)
Lukas Stadler <lukas.stadler@jku.at>
parents:
diff changeset
27 \subsection{The Graph Data Structure}
8c6e31c62fba added initial version of design docs, fixed .hgignore (regex, . -> \.)
Lukas Stadler <lukas.stadler@jku.at>
parents:
diff changeset
28 \begin{itemize}
8c6e31c62fba added initial version of design docs, fixed .hgignore (regex, . -> \.)
Lukas Stadler <lukas.stadler@jku.at>
parents:
diff changeset
29 \item A graph deals out ids for new nodes and can be queried for the node corresponding to a given id.
8c6e31c62fba added initial version of design docs, fixed .hgignore (regex, . -> \.)
Lukas Stadler <lukas.stadler@jku.at>
parents:
diff changeset
30 \item Graphs can manage side data structures, which will be automatically invalidated and lazily recomputed whenever the graph changes. Examples for side data structures are dominator trees and temporary schedules. These side data structures will usually be understood by more than one optimization.
8c6e31c62fba added initial version of design docs, fixed .hgignore (regex, . -> \.)
Lukas Stadler <lukas.stadler@jku.at>
parents:
diff changeset
31 \item Graphs are
8c6e31c62fba added initial version of design docs, fixed .hgignore (regex, . -> \.)
Lukas Stadler <lukas.stadler@jku.at>
parents:
diff changeset
32 \end{itemize}
8c6e31c62fba added initial version of design docs, fixed .hgignore (regex, . -> \.)
Lukas Stadler <lukas.stadler@jku.at>
parents:
diff changeset
33
8c6e31c62fba added initial version of design docs, fixed .hgignore (regex, . -> \.)
Lukas Stadler <lukas.stadler@jku.at>
parents:
diff changeset
34 \section{Frame States}
8c6e31c62fba added initial version of design docs, fixed .hgignore (regex, . -> \.)
Lukas Stadler <lukas.stadler@jku.at>
parents:
diff changeset
35 Frame states capture the state of the program, in terms of the source representation (e.g., Java state: local variables, expressions, ...).
8c6e31c62fba added initial version of design docs, fixed .hgignore (regex, . -> \.)
Lukas Stadler <lukas.stadler@jku.at>
parents:
diff changeset
36 Whenever a safepoint is reached or the a deoptimization is needed a valid frame state needs to be available.
8c6e31c62fba added initial version of design docs, fixed .hgignore (regex, . -> \.)
Lukas Stadler <lukas.stadler@jku.at>
parents:
diff changeset
37 A frame state is valid as long as the program performs only actions that can safely be reexecuted (e.g., operations on local variables, etc.).
8c6e31c62fba added initial version of design docs, fixed .hgignore (regex, . -> \.)
Lukas Stadler <lukas.stadler@jku.at>
parents:
diff changeset
38 Thus, frame states need only be generated for bytecodes that can not be reexecuted: putfield, astore, invoke, monitorenter/exit, ...
8c6e31c62fba added initial version of design docs, fixed .hgignore (regex, . -> \.)
Lukas Stadler <lukas.stadler@jku.at>
parents:
diff changeset
39
8c6e31c62fba added initial version of design docs, fixed .hgignore (regex, . -> \.)
Lukas Stadler <lukas.stadler@jku.at>
parents:
diff changeset
40 Within the node graph a frame state is represented as a node that is fixed between the node that caused it to be generated (data dependency) and the node that will invalidate it (control dependency).
8c6e31c62fba added initial version of design docs, fixed .hgignore (regex, . -> \.)
Lukas Stadler <lukas.stadler@jku.at>
parents:
diff changeset
41
8c6e31c62fba added initial version of design docs, fixed .hgignore (regex, . -> \.)
Lukas Stadler <lukas.stadler@jku.at>
parents:
diff changeset
42 Deopmization nodes are not fixed to a certain frame state node, they can move around freely and will always use the correct frame state.
8c6e31c62fba added initial version of design docs, fixed .hgignore (regex, . -> \.)
Lukas Stadler <lukas.stadler@jku.at>
parents:
diff changeset
43 At some point during the compilation deoptimization nodes need to be fixed, which means that appropriate data and control dependencies will be inserted so that it can not move outside the scope of the associated frame state.
8c6e31c62fba added initial version of design docs, fixed .hgignore (regex, . -> \.)
Lukas Stadler <lukas.stadler@jku.at>
parents:
diff changeset
44 This will generate deoptimization-free zones that can be targeted by the most aggressive optimizations.
8c6e31c62fba added initial version of design docs, fixed .hgignore (regex, . -> \.)
Lukas Stadler <lukas.stadler@jku.at>
parents:
diff changeset
45
8c6e31c62fba added initial version of design docs, fixed .hgignore (regex, . -> \.)
Lukas Stadler <lukas.stadler@jku.at>
parents:
diff changeset
46 Frame states should be represented using a delta-encoding.
8c6e31c62fba added initial version of design docs, fixed .hgignore (regex, . -> \.)
Lukas Stadler <lukas.stadler@jku.at>
parents:
diff changeset
47 This will make them significantly smaller and will make inlining, etc. much easier.
8c6e31c62fba added initial version of design docs, fixed .hgignore (regex, . -> \.)
Lukas Stadler <lukas.stadler@jku.at>
parents:
diff changeset
48 In later compilation phases unnecessary frame states might be removed, using a mark-and-merge algorithm.
8c6e31c62fba added initial version of design docs, fixed .hgignore (regex, . -> \.)
Lukas Stadler <lukas.stadler@jku.at>
parents:
diff changeset
49
8c6e31c62fba added initial version of design docs, fixed .hgignore (regex, . -> \.)
Lukas Stadler <lukas.stadler@jku.at>
parents:
diff changeset
50
8c6e31c62fba added initial version of design docs, fixed .hgignore (regex, . -> \.)
Lukas Stadler <lukas.stadler@jku.at>
parents:
diff changeset
51
8c6e31c62fba added initial version of design docs, fixed .hgignore (regex, . -> \.)
Lukas Stadler <lukas.stadler@jku.at>
parents:
diff changeset
52 \section{Graph Building}
8c6e31c62fba added initial version of design docs, fixed .hgignore (regex, . -> \.)
Lukas Stadler <lukas.stadler@jku.at>
parents:
diff changeset
53 \begin{itemize}
8c6e31c62fba added initial version of design docs, fixed .hgignore (regex, . -> \.)
Lukas Stadler <lukas.stadler@jku.at>
parents:
diff changeset
54 \item The graph built by the initial parser (called \emph{GraphBuilder}) should be as close to the source representation (bytecode, ...) as possible.
8c6e31c62fba added initial version of design docs, fixed .hgignore (regex, . -> \.)
Lukas Stadler <lukas.stadler@jku.at>
parents:
diff changeset
55 \item All nodes should be able to immediately lower themselves to a machine-level representation. This allows for easier compiler development, and also leads to a compiler that is very flexible in the amount of optimizations it performs (e.g. recompilation of methods with more aggressive optimizations).
8c6e31c62fba added initial version of design docs, fixed .hgignore (regex, . -> \.)
Lukas Stadler <lukas.stadler@jku.at>
parents:
diff changeset
56 \item
8c6e31c62fba added initial version of design docs, fixed .hgignore (regex, . -> \.)
Lukas Stadler <lukas.stadler@jku.at>
parents:
diff changeset
57 \end{itemize}