view graal/GraalCompiler/bin/com/sun/c1x/doc/LoopPeeling.txt @ 2509:16b9a8b5ad39

Renamings Runtime=>GraalRuntime and Compiler=>GraalCompiler
author Thomas Wuerthinger <thomas@wuerthinger.net>
date Wed, 27 Apr 2011 11:50:44 +0200
parents graal/Compiler/bin/com/sun/c1x/doc/LoopPeeling.txt@9ec15d6914ca
children
line wrap: on
line source

Remaining Issues in the Loop Peeling optimization (September/18/09)
===================================================================

Known Problems:
Currently loop peeling is not working when the loop has exception
blocks. The algorithm has to be updated to handle these cases.

Limitations:
The algorithm performs loop peeling only on innermost loops. However,
this is not a limitation. If one wants to peel the outermost loop(s),
after peeling the innermost loop, an additional pass is necessary
to update the blocks of the outermost loop, since after peeling
the innermost loop, newer blocks are added to the CFG.

Current status as of 09/18/2009
After running using hir configuration, with optimization level 3, the
following test cases produce wrong results:
(4 fails)
233: jtt/except/Catch_Loop01.java: (4) failed with unexpected com.sun.c1x.ci.CiBailout (expected -170)
234: jtt/except/Catch_Loop02.java: (4) failed with unexpected com.sun.c1x.ci.CiBailout (expected -170)
245: jtt/except/Catch_StackOverflowError_03.java: (0) failed with unexpected com.sun.c1x.ci.CiBailout (expected 0)
272: jtt/hotpath/HP_array04.java: (80) failed with unexpected java.lang.AssertionError (expected 15645)

All of them are related the known problem aforementioned.
All the tests have the innermost loops peeled, and produce right results.

Future Work:
1- Add more loop tests to jtt. New tests should run the loop 0, 1, 2 or more iterations.

2- Peel loops with exception blocks. 

3- Currently, all innermost loops are peeled. Should we add logic to filter out some loops from loop peeling?

4- Improve the way instructions are cloned. Now the algorithm visits the blocks in BFS order, which
might produce errors depending on the order the blocks are visited.

5- After inserting new phi instructions at exit blocks, we need to iterate over the remaining CFG to update instructions
that may use the new phi. The way it's done now might me inefficient if the block has more than one exit node, since
we can iterate over the same block more than once. This needs to be improved.

5- For performance reasons, improve the way loops are represented, for example, to use a bitmap to represent the loop blocks.