Tuesday, May 3, 2011

MSVC "enable minimal rebuild" : triumph and tragedy

The Enable Minimal Rebuild (/Gm) feature of the Microsoft C++ compiler (MSVC) is great. When enabled, the compiler records dependency information in an idb file, uses it to determine whether a significant code change has occurred, and recompiles only the source files that the change affects. For example, if you only change a comment, the compiler will happily print out "Skipping... (no relevant changes detected)". This is certainly a triumph!

So what's the tragedy? All the documentation states that minimal rebuild cannot be combined with parallel builds -- namely the /MP switch, which causes cl.exe to spawn itself multiple times. Apparently we must choose between single files being quick to compile and many files being quick to compile. However, you can have both, as I will demonstrate below.

To explain further, the bulk of the documentation is geared towards users of the Visual Studio project systems. Unfortunately, the Visual Studio project systems (*.vcproj and *.vcxproj) support parallel C++ builds only through /MP, because neither vcproj nor MSBuild vcxproj supports file-level dependencies.
The /MP switch is of course nonsense in the context of a real dependency-based build system like make, bjam, SCons, or QRBuild. A build system that provides a global solution for parallelism works at the finest granularity and utilizes all processors effectively. Local solutions like /MP achieve much worse performance as the problem scale increases (for example, when building many MSBuild projects with project-level parallelism enabled). "Tuning C++ build parallelism in VS2010" demonstrates this very well in the sub-section "Too much of a good thing", where the author shows each MSBuild project launching NumCores cl.exe processes, which overwhelms the machine.

The root of the problem is that by default, cl.exe creates a single vcx0.pdb and vcx0.idb shared across all source files (where x is the MSVC version -- for example, the VS2008 compiler produces vc90.pdb and vc90.idb). In order to arbitrate across multiple cl.exe processes in a parallel build, MSVC launches a server process called mspdbsrv.exe to synchronize access to the common pdb file.
Notice that there is no equivalent msidbsrv.exe for synchronizing access to a common idb file. I suspect this is the core reason why /Gm is unsupported for parallel builds.

What can we do to get parallelism and minimal rebuilds? The trick is to compile each source file independently, and force each file to generate a separate pdb and idb file. Then there is no issue of parallel compilations stepping on each other's idb files.
cl.exe accepts the /Fd switch to select the pdb name per compilation. The pdb name also implies the idb name, so this is sufficient to solve the problem.
QRBuild successfully employs this trick.
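
As a rough sketch of the trick (not QRBuild's actual implementation), the Python below builds one cl.exe command line per source file, each with its own /Fd pdb name, and can run them concurrently. The function names, the "obj" output directory, and the choice of /Zi are assumptions made for illustration; it also assumes every source file has a unique base name, since two files with the same stem would end up sharing a pdb again.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import PureWindowsPath


def cl_command(source, out_dir="obj"):
    """Build a cl.exe command line that gives `source` its own pdb.

    Hypothetical helper: `out_dir` and the exact flag set are assumptions.
    """
    src = PureWindowsPath(source)
    out = PureWindowsPath(out_dir)
    return [
        "cl.exe",
        "/nologo",
        "/c",   # compile only, do not link
        "/Zi",  # debug info goes to a pdb; /Gm needs /Zi or /ZI
        "/Gm",  # enable minimal rebuild
        "/Fo" + str(out / (src.stem + ".obj")),
        # Per-file pdb name; the idb name is derived from it.
        "/Fd" + str(out / (src.stem + ".pdb")),
        str(src),
    ]


def compile_all(sources, jobs=4):
    """Run one cl.exe per source file, `jobs` at a time.

    Returns the exit codes in the same order as `sources`.
    """
    commands = [cl_command(s) for s in sources]
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        return list(pool.map(lambda c: subprocess.run(c).returncode, commands))
```

Note that /MP never appears here: the parallelism comes from launching several independent cl.exe processes, which is what a file-level build system would do anyway.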

Caveats:
Apparently your mileage may vary with regard to compilation speed with /Gm:
  1. Slowness when compiling boost.
  2. Possibility of slower link times, as this FAQ implies.

1 comment:

  1. VS 2005 does not want to compile with different pdb files, when precompiled headers are used.
    error C2858: command-line option 'program database name 'aaa' inconsistent with precompiled header, which used 'bbb'
