Why do I need to patch my toolchain/use a different toolchain?

Submitted by admin on Fri, 04/27/2018 - 12:16

Reconstructing an accurate control flow graph (CFG) from machine code is, in general, an undecidable problem, even at link time. For a quick overview of the difficulties involved, see the FAQ item "How do you reconstruct the control flow graph from the object files at link time?".

To make CFG construction more reliable, we apply some patches to the toolchain:

  1. To let Diablo easily distinguish instructions from data
    embedded in the code section, we emit markers that identify
    such data.
  2. The GNU assembler uses a technique called symbol relaxing to
    speed up the linking process. Unfortunately, this technique
    discards much of the relocation information that Diablo needs
    (but a regular linker does not). Our patch disables this
    relaxing.
  3. With a small patch to the GCC specs file (which unfortunately
    introduces a dependency on Perl), we make the compiler emit
    markers for inline assembly code. These markers help Diablo in
    judging whether or not its optimizations may rely on calling
    conventions for a particular piece of code.
  4. We also patch ld/libbfd to turn off string table compaction.
    There is no fundamental need to do this; we just haven't yet
    found the time to support this feature in Diablo.
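
To illustrate why the data markers from item 1 are useful, here is a minimal Python sketch of how a tool could turn such markers into code and data ranges. It is not Diablo's implementation. The marker names ($a, $t, $d) are an assumption borrowed from the ARM ELF mapping-symbol convention, and the function names classify_ranges and is_data are hypothetical.

```python
# Assumed marker names, following the ARM ELF mapping-symbol convention:
# $a = ARM code, $t = Thumb code, $d = data. The names actually emitted
# by the patched toolchain may differ.
CODE_MARKERS = {"$a", "$t"}
DATA_MARKERS = {"$d"}


def classify_ranges(markers, section_end):
    """Convert (offset, marker) pairs into (start, end, kind) ranges.

    Each marker applies from its own offset up to the next marker,
    or up to the end of the section for the last one.
    """
    ordered = sorted(markers)
    ranges = []
    for i, (start, name) in enumerate(ordered):
        if name not in CODE_MARKERS and name not in DATA_MARKERS:
            raise ValueError(f"unknown marker {name!r} at offset {start:#x}")
        end = ordered[i + 1][0] if i + 1 < len(ordered) else section_end
        if end == start:
            continue  # an empty range carries no bytes to classify
        kind = "data" if name in DATA_MARKERS else "code"
        ranges.append((start, end, kind))
    return ranges


def is_data(ranges, offset):
    """Return True if offset lies inside a range marked as data."""
    for start, end, kind in ranges:
        if start <= offset < end:
            return kind == "data"
    return False


# Hypothetical code section: 0x20 bytes of code, an 8-byte literal
# pool, then more code up to offset 0x40.
markers = [(0x00, "$a"), (0x20, "$d"), (0x28, "$a")]
ranges = classify_ranges(markers, section_end=0x40)
```

With these ranges, a disassembler can skip the bytes between 0x20 and 0x28 instead of decoding them as instructions. Without the markers, it would have to guess where the embedded data starts and ends.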