Why does Lancet show unconnected graphs for some control flow graphs?

By admin, 27 April, 2018

When Lancet displays the control flow graph of a function, the graph is sometimes not connected.
There are two possible reasons for this:

The most obvious reason is that the function really does contain basic blocks with no incoming edges. Running unreachable code elimination should make these blocks disappear.
The control flow graph contains an interprocedural edge that is not a CALL edge. An example of this is given in the figure below. For a regular function call, we draw a red arrow from the call site to the called function (yellow box below) and a blue arrow from the called function (yellow box) to the return site. Here the control flow is not so clean: when function _IO_vfprintf_5 is entered via basic block 0x804f915 and then returns, control returns to one of the basic blocks following the call site of the depicted function and NOT to a call site of _IO_vfprintf_5. This is modelled with a compensating edge (dark blue below) that connects the return block of _IO_vfprintf_5 with the return block of the depicted function.
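The two cases above can be told apart with a small check. The following is only a sketch in Python and is not taken from Diablo or Lancet: the edge representation (src, dst, kind), the kind labels "intra", "escaping" and "compensating", and both function names are assumptions made for illustration. It lists the blocks that have no incoming edge at all (first reason) and the groups of blocks that stay connected when only intraprocedural edges are followed (second reason).

```python
from collections import defaultdict

def blocks_without_predecessors(blocks, edges, entry):
    """Blocks other than the entry that are the target of no edge of any kind."""
    has_pred = {dst for _src, dst, _kind in edges}
    return sorted(b for b in blocks if b != entry and b not in has_pred)

def weak_components(blocks, edges, kinds=("intra",)):
    """Group blocks into weakly connected components, following only
    edges whose kind is listed in `kinds` (edge direction is ignored)."""
    neighbours = defaultdict(set)
    for src, dst, kind in edges:
        if kind in kinds:
            neighbours[src].add(dst)
            neighbours[dst].add(src)
    seen, components = set(), []
    for start in blocks:
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:
            b = stack.pop()
            if b in comp:
                continue
            comp.add(b)
            stack.extend(neighbours[b] - comp)
        seen |= comp
        components.append(sorted(comp))
    return components

# Hypothetical function: "dead" has no predecessor, and "ret" is only
# reached through a compensating edge coming from another function.
blocks = ["entry", "a", "ret", "dead"]
edges = [
    ("entry", "a", "intra"),
    ("dead", "a", "intra"),
    ("a", "other_block", "escaping"),        # jumps into another function
    ("other_ret", "ret", "compensating"),    # comes back from that function
]

orphans = blocks_without_predecessors(blocks, edges, "entry")
components = weak_components(blocks, edges)
print(orphans)
print(components)
```

In this made-up example, "dead" is reported by the first check, while "ret" forms its own component in the second check even though it does have an incoming (compensating) edge.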

Escaping edges (i.e. edges that jump or fall through to another function) occur quite often in library code. In Diablo, we cannot easily determine to which function a basic block actually belongs, so we use an algorithm to delimit the boundaries of functions. This can introduce some extra escaping edges. To make matters worse, a second algorithm turns every function found by the previous step into a 'single entry' function, in which only one basic block has forward incoming interprocedural edges. This algorithm introduces even more escaping edges. It also splits some functions into several parts. Those parts get the name of the function they originally belonged to, suffixed with a number (as with _IO_vfprintf_5 in this case).
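To see why splitting a function produces more escaping edges, consider the following sketch. It is not Diablo's algorithm: the owner map, the edge kinds, and the function and block names (f, f_1, g, h, b1 to b3) are all hypothetical. It only applies the definition above: an escaping edge is a non-call edge whose source and target belong to different functions.

```python
def escaping_edges(edges, owner):
    """Return the non-call edges whose two endpoints belong to different functions."""
    return [(src, dst, kind) for src, dst, kind in edges
            if kind != "call" and owner[src] != owner[dst]]

# Hypothetical function f with blocks b1, b2, b3. It is called from g,
# and function h jumps straight into b3, so f has two entry blocks.
edges = [
    ("g_site", "b1", "call"),
    ("b1", "b2", "jump"),
    ("b2", "b3", "fallthrough"),
    ("h_blk", "b3", "jump"),
]
owner_before = {"g_site": "g", "h_blk": "h", "b1": "f", "b2": "f", "b3": "f"}

# Making f 'single entry' by moving b3 into a separate part named f_1.
owner_after = dict(owner_before, b3="f_1")

before = escaping_edges(edges, owner_before)
after = escaping_edges(edges, owner_after)
print(len(before), len(after))
```

Before the split only the jump from h escapes. After the split, the fall-through from b2 to b3 also crosses a function boundary, so it becomes an escaping edge as well.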