So I was adding support for handling next/redo/last exceptions to PCT's loops the other day, and I hit a weird bug. Here's the patch.

The problem is approximately this:

$I0 = defined, stuff_to_loop_over
unless $I0, for_loop_end
push_eh for_loop_next
...
for_loop_end:
pop_eh

I'm adding the error handler there after a conditional jump, but popping the error handler off after the target of that jump, but I didn't notice this at the time. That's going to at least cause bugs at runtime, but this also caused the PIR compiler to hang. I beat my head against it for a while, posted a bug about it, and went to sleep.

The next day, I used valgrind's callgrind tool to find where it was spending it's time. The output is:

--------------------------------------------------------------------------------
            Ir  file:function
--------------------------------------------------------------------------------
13,455,028,108  /home/sweeks/src/parrot/compilers/imcc/sets.c:set_add [/home/sweeks/src/parrot/blib/lib/libparrot.so.0.7.1]
11,209,537,812  /home/sweeks/src/parrot/compilers/imcc/cfg.c:compute_dominance_frontiers [/home/sweeks/src/parrot/blib/lib/libparrot.so.0.7.1]
   135,234,114  /home/sweeks/src/parrot/compilers/imcc/imclexer.c:yylex [/home/sweeks/src/parrot/blib/lib/libparrot.so.0.7.1]
   128,962,606  /home/sweeks/src/parrot/compilers/imcc/sets.c:set_contains [/home/sweeks/src/parrot/blib/lib/libparrot.so.0.7.1]
    88,825,729  /home/sweeks/src/parrot/compilers/imcc/imcparser.c:yyparse [/home/sweeks/src/parrot/blib/lib/libparrot.so.0.7.1]
    70,349,560  /home/sweeks/src/parrot/compilers/imcc/instructions.c:instruction_reads [/home/sweeks/src/parrot/blib/lib/libparrot.so.0.7.1]
    65,814,257  /home/sweeks/src/parrot/compilers/imcc/cfg.c:compute_dominators [/home/sweeks/src/parrot/blib/lib/libparrot.so.0.7.1]
    56,483,518  /home/sweeks/src/parrot/compilers/imcc/instructions.c:instruction_writes [/home/sweeks/src/parrot/blib/lib/libparrot.so.0.7.1]
    48,013,979  ???:_int_malloc [/lib64/libc-2.8.so]
    46,419,350  /home/sweeks/src/parrot/compilers/imcc/pbc.c:constant_folding [/home/sweeks/src/parrot/blib/lib/libparrot.so.0.7.1]
    33,201,086  /home/sweeks/src/parrot/compilers/imcc/sets.c:set_intersec_inplace [/home/sweeks/src/parrot/blib/lib/libparrot.so.0.7.1]
    29,624,696  /home/sweeks/src/parrot/compilers/imcc/symreg.c:hash_str [/home/sweeks/src/parrot/blib/lib/libparrot.so.0.7.1]
    20,719,320  ???:calloc [/lib64/ld-2.8.so]
    18,291,953  /home/sweeks/src/parrot/compilers/imcc/cfg.c:bb_check_set_addr [/home/sweeks/src/parrot/blib/lib/libparrot.so.0.7.1]
    17,404,418  /home/sweeks/src/parrot/compilers/imcc/reg_alloc.c:compute_one_du_chain [/home/sweeks/src/parrot/blib/lib/libparrot.so.0.7.1]
    13,809,278  /home/sweeks/src/parrot/compilers/imcc/imcc.l:yylex
    12,337,246  /home/sweeks/src/parrot/src/ops/core_ops.c:hash_str [/home/sweeks/src/parrot/blib/lib/libparrot.so.0.7.1]
    12,162,840  ???:memcpy [/lib64/ld-2.8.so]
    11,450,969  ???:strcmp'2 [/lib64/ld-2.8.so]
So it's in IMCC, the PIR compiler, specifically in compute_dominance_frontiers... which is what? So let's find out.

The POD for that function says:

=item C<void compute_dominance_frontiers>

Algorithm to find dominance frontiers described in paper "A Simple, Fast
Dominance Algorithm", Cooper et al. (2001)
A quick check with google turns up the PDF I linked.

Skimming over that and the surrounding code that calls compute_dominance_frontier reveals that it's related to register allocation. I read it another couple of times, think I understand the general ideas, and go to sleep.

The next day, I chat with pmichaud about it, and he manages to trim down a simplifed test case, and also notices the actual bug in my patch, which allows me to commit that and pass a half-dozen more tests. He passes the minimal test off to chromatic, who debugs it for a bit and finds that an inner loop in compute_dominance_frontiers is alternating between 8 and 9.

He suggests that I find a way to dump out the register and basic blocks information, and look for the edge identification between blocks. I only have a vague idea of what this means then, so I ask for more information. Here's chromatic's explanation:

<@chromatic> I can give you a quick overview.
<@chromatic> The alligator divides every compilation unit into basic blocks.
<@chromatic> A block is a single block of code between branch points.
<@chromatic> Every entrance and exit point demarcates a block.
<@chromatic> If you arrange the blocks in terms of a control flow graph, you can evaluate all of the possible ways you can reach a point within the block.
<@chromatic> You also keep track of which registers you use within a block.
<@chromatic> If all of the paths to a block go through another block, the latter dominates the former.
<@chromatic> All of this is to say, if you have a named register used in the first block of a unit and the final block of the unit can branch back to the first unit, you have to keep the physical register untouched.
<@chromatic> If there's a block beyond which you can never branch back, you can reuse physical registers unique to that branch and never used later.
<@chromatic> I've been unclear about name/physical register mapping, but I think you get the picture.
<@chromatic> It's the mapping of name to register that really matters.

So I start browsing that code again, but fall asleep before doing anything. Do you see the pattern here yet? ;)

The next day, I go look around in there and I find a couple of debug functions I can use to dump relevant information. Here's what I find:

Dumping the CFG:
-------------------------------
0 (3)	 -> 8 1 2 		 <- 
1 (3)	 -> 2 		 <- 0 
2 (3)	 -> 3 9 		 <- 1 0 
3 (2)	 -> 4 		 <- 2 
4 (2)	 -> 5 		 <- 3 
5 (3)	 -> 6 9 		 <- 8 7 4 
6 (3)	 -> 7 		 <- 5 
7 (3)	 -> 5 		 <- 6 
8 (2)	 -> 5 		 <- 9 0 
9 (3)	 -> 8 		 <- 5 2 
This is:
block number (loop depth) -> blocks that this block can branch to    <- blocks that can branch to this block
We also have:
Dumping the Dominators Tree:
-------------------------------
 0 <- ( 0)  0
 1 <- ( 0)  0  1  8  9
 2 <- ( 0)  0  2  8  9
 3 <- ( 2)  0  2  3  8  9
 4 <- ( 3)  0  2  3  4  8  9
 5 <- ( 0)  0  5  8  9
 6 <- ( 5)  0  5  6  8  9
 7 <- ( 6)  0  5  6  7  8  9
 8 <- ( 9)  0  8  9
 9 <- ( 8)  0  8  9
This is:
block number <- (immediate dominator) full list of dominators
The loop that was spinning was:
while (runner >= 0 && runner != unit->idoms[b]) {
    /* add b to runner's dominance frontier set */
    set_add(unit->dominance_frontiers[runner], b);

    /* runner = idoms[runner] */
    if (runner == 0)
        runner = -1;
    else
        runner = unit->idoms[runner];
}

Where 'runner' is the node we're currently evaluating, unit is the overall compilation unit, and idoms[] is the list of immediate dominators.

Adding some debugging printfs, I discover that the problem is when the algorithm follows block five, it spins with the runner alternating between 8 and 9. If you look back up at the dominators tree, you'll see that the immediate dominator of block 8 is block 9, and the idom of block 9 is block 8. Infinite loop.

After re-reading the algorithm and paper a few times to verify that I undersand what's going on, I add a conditional to break out of the statement if we're trying to add a block to the dominance frontiers list of a block that we've already added the current block to.

while (runner >= 0 && runner != unit->idoms[b]) {
    if (set_contains(unit->dominance_frontiers[runner], b))
        /* we've already gone down this path once before */
        runner = 0;
    else
        /* add b to runner's dominance frontier set */
        set_add(unit->dominance_frontiers[runner], b);

    /* runner = idoms[runner] */
    if (runner == 0)
        runner = -1;
    else
        runner = unit->idoms[runner];
}

Everything seemed to work fine, so after getting a review from particle, I committed.