FlasCC and Link Time Optimization

The FlasCC toolchain is closely modeled after a typical native C/C++ development toolchain. It contains all of the tools you would expect from such a toolchain, including a compiler (with preprocessor), an assembler, a linker, an nm symbol lister, an ar archiver, etc.

As with many such toolchains, FlasCC uses the preprocess, compile, assemble, link model. Each is an individual step, although the compiler driver may perform multiple steps with a single invocation.

In the following example, the compiler driver is used to go from a simple .c file to a final executable in a single command:

/* everyone's favorite C program: hello.c */
#include <stdio.h>  /* declares printf */

int main()
{
  printf("Hello, world!\n");
  return 0;
}


# preprocess, compile, assemble, link, resulting in final executable
> flascc/sdk/usr/bin/gcc hello.c -o hello.exe

Under the hood, the compiler driver goes through four distinct stages.

Stage 1: Preprocessing

The preprocessing stage works the same way with FlasCC as with a conventional native toolchain. It processes #include directives, expands macros, etc.

# preprocess hello.c resulting in preprocessed C in hello.i
> flascc/sdk/usr/bin/gcc -E hello.c -o hello.i

Stage 2: Compilation

The compilation stage compiles the preprocessed C code into “assembly code.” In a native toolchain, this would result in x86, arm, etc. assembly code. For FlasCC, “assembly code” is ActionScript 3 (AS3) code.

# compile hello.i resulting in “assembly code” in hello.s
> flascc/sdk/usr/bin/gcc -S hello.i -o hello.s

Stage 3: Assembly

The assembly stage “assembles” the ActionScript 3 code generated in the previous stage into “object code.” In actuality, it does a full compile of the AS3 code using the ASC2 compiler, resulting in an ActionScript Bytecode (ABC) file. Some additional work is done to compile the single file into multiple AS3 “scripts” to allow lazy initialization of FlasCC modules at runtime. Although the resulting file is actually ABC, the nm and ar utilities have been extended to understand it.

# assemble hello.s resulting in object code in hello.o
> flascc/sdk/usr/bin/gcc -c hello.s -o hello.o

The nm tool can be used to list symbols of an ABC “object file.”

# list C symbols in hello.o
> flascc/sdk/usr/bin/nm hello.o
00000000 T _main
U _puts
00000000 W abort
00000000 W memcpy
00000000 W memmove
00000000 W memset

The ar tool can be used to build an archive containing ABC “object files.”

# create an archive containing hello.o
> flascc/sdk/usr/bin/ar cr hello.a hello.o

# list the C symbols in the hello.a archive
> flascc/sdk/usr/bin/nm hello.a

hello.o:
00000000 T _main
U _puts
00000000 W abort
00000000 W memcpy
00000000 W memmove
00000000 W memset

Stage 4: Link

Finally, we can link hello.o into a standalone executable.

# link hello.o into an executable
> flascc/sdk/usr/bin/gcc hello.o -o hello.exe

# execute the exe
> ./hello.exe
Hello, world!

This blog post describes why you might want a standalone executable built using FlasCC. But in most cases, you would want to build a SWF instead.

# link hello.o into a SWF
> flascc/sdk/usr/bin/gcc hello.o -emit-swf -o hello.swf

Link Time Optimization (LTO)

FlasCC is built on the LLVM Compiler Infrastructure and is therefore able to take advantage of LLVM’s powerful link time optimization features. LTO can, in many cases, result in substantially faster and smaller SWFs. LTO is enabled on a file-by-file basis by passing the -flto flag to the compiler driver for each source file that is to participate.

Using the -flto flag (or the -O4 optimization level, which implies -flto) when compiling source into object code results in an LLVM bitcode file instead of an ABC file.

# compile hello.c into LLVM bitcode
> flascc/sdk/usr/bin/gcc hello.c -c -flto -o hello.o
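
Since both kinds of build produce a file named hello.o, it can be handy to check which kind a given file is. The following is a minimal sketch, not a tool shipped with FlasCC: the helper name is_llvm_bitcode is made up for illustration, and it assumes the object file is written as raw LLVM bitcode, which begins with the four magic bytes 'B', 'C', 0xC0, 0xDE. A file in LLVM's bitcode wrapper format would not be recognized.

```shell
#!/bin/sh
# Hypothetical helper (not part of the FlasCC SDK): succeed if the file
# starts with the raw LLVM bitcode magic number 'B' 'C' 0xC0 0xDE.
is_llvm_bitcode() {
  # Dump the first four bytes as hex and strip the whitespace od adds.
  magic="$(od -An -tx1 -N4 "$1" | tr -d '[:space:]')"
  [ "$magic" = "4243c0de" ]
}

# Example: classify an object file named on the command line, if any.
if [ -n "${1:-}" ]; then
  if is_llvm_bitcode "$1"; then
    echo "$1: LLVM bitcode"
  else
    echo "$1: not raw LLVM bitcode"
  fi
fi
```

Saved under a name of your choosing and run with hello.o as its argument, the script should report LLVM bitcode for an object compiled with -flto, provided the raw-bitcode assumption holds.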

The relevant tools can consume both ABC and LLVM bitcode files.

# nm still works
> flascc/sdk/usr/bin/nm hello.o
00000000 T _main
U _puts
00000000 W abort
00000000 W memcpy
00000000 W memmove
00000000 W memset

# ar still works
> flascc/sdk/usr/bin/ar cr hello.a hello.o

# contents of archives still accessible
> flascc/sdk/usr/bin/nm hello.a
hello.o:
00000000 T _main
U _puts
00000000 W abort
00000000 W memcpy
00000000 W memmove
00000000 W memset

# build an LTO-ed version of hello.exe
> flascc/sdk/usr/bin/gcc -flto hello.o -o hello.exe

FlasCC comes with both ABC and LLVM bitcode flavored standard libraries. So a fully LTO build (-flto or -O4 passed to the compiler driver for each source file and for the link step) will do link time optimization of the standard C and C++ libraries as well as object files generated from user code.

Differences Between LTO and non-LTO Builds in FlasCC

One big difference between LTO and non-LTO builds in FlasCC is where AS3 generation and compilation happens. In a non-LTO build, AS3 is generated and compiled to ABC as each source file is compiled. Subsequently, the generated ABCs are combined in the link step to create a SWF or executable. Compared to compiling AS3, combining ABCs is a relatively inexpensive operation. So compiling each source file is relatively expensive, but linking is relatively cheap. Changing a single source file in a project means recompiling that source file and relinking.

In an LTO build, each source file is compiled to LLVM bitcode. In the link step, the LLVM bitcode files generated from source are linked together into a single aggregate bitcode file. That bitcode file can then be optimized. Since bitcode from various source files has been combined into this single unit, optimizations can be applied across functions and data from multiple source files.
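
To make the cross-file point concrete, here is a small sketch of a two-file project. The file names util.c and main.c, the square function, and the scratch directory are all invented for illustration. When main.c is compiled on its own, the compiler sees only the declaration of square(); once both bitcode files are merged at link time, the optimizer can see the definition too and may, for example, inline the call. Whether it actually does so is up to the optimizer. The script writes the two files and prints the build commands rather than running them.

```shell
#!/bin/sh
# Write a hypothetical two-file project into a scratch directory.
demo_dir="$(mktemp -d)"

# util.c: defines a small function that is a natural inlining candidate.
cat > "$demo_dir/util.c" <<'EOF'
int square(int x)
{
  return x * x;
}
EOF

# main.c: sees only the declaration of square() when compiled by itself.
cat > "$demo_dir/main.c" <<'EOF'
#include <stdio.h>

int square(int x);  /* defined in util.c */

int main(void)
{
  printf("%d\n", square(7));
  return 0;
}
EOF

# Commands for an LTO build of the two files (printed, not executed).
gcc_path="flascc/sdk/usr/bin/gcc"
echo "$gcc_path -flto -c $demo_dir/util.c -o $demo_dir/util.o"
echo "$gcc_path -flto -c $demo_dir/main.c -o $demo_dir/main.o"
echo "$gcc_path -flto $demo_dir/util.o $demo_dir/main.o -o $demo_dir/demo.exe"
```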

After optimization, the bitcode is used to generate a single AS3 file, which is then compiled with ASC2. This results in compile-time characteristics that are the opposite of non-LTO builds: compiling source is cheap and linking is expensive. It also means that, because the generated AS3 depends on the aggregate LLVM bitcode file, and the aggregate file depends on every LLVM bitcode file derived from source (as well as bitcode for the C/C++ standard libraries in a fully LTO build), all AS3 must be regenerated and recompiled every time any source file in a project is changed!

So, as a general rule, it makes sense to use LTO for release-type builds, because the longer compile time buys smaller and faster SWF files. For debug-type builds, on the other hand, it makes sense to avoid LTO: the faster compile times make it easier to iterate on the code.
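
This rule of thumb can be captured in a small build script. What follows is a minimal sketch, not part of the FlasCC SDK: the function name build_commands, the debug and release mode names, and the FLASCC_GCC variable are assumptions made for illustration. It uses only the flags shown in this post (-c, -flto, -emit-swf, -o), assumes that -flto and -emit-swf can be combined in the link step, and only prints the commands so they can be inspected first.

```shell
#!/bin/sh
# Hypothetical helper: print the FlasCC commands for a debug (non-LTO)
# or release (LTO) build. Nothing is executed; commands are only echoed.

# Path to the compiler driver as used in this post (override as needed).
FLASCC_GCC="${FLASCC_GCC:-flascc/sdk/usr/bin/gcc}"

# usage: build_commands <debug|release> <output.swf> <source.c>...
build_commands() {
  mode="$1"; out="$2"; shift 2

  case "$mode" in
    release) lto="-flto " ;;  # LTO: cheap compiles, expensive link
    debug)   lto="" ;;        # non-LTO: AS3/ABC generated per source file
    *) echo "unknown mode: $mode" >&2; return 1 ;;
  esac

  objects=""
  for src in "$@"; do
    obj="${src%.c}.o"
    echo "$FLASCC_GCC ${lto}-c $src -o $obj"
    objects="$objects $obj"
  done

  # In a release build the link step is passed -flto as well.
  echo "$FLASCC_GCC ${lto}-emit-swf$objects -o $out"
}

build_commands release game.swf main.c util.c
```

Piping the output to sh would run the build. Switching the first argument to debug drops -flto from every command, which gives the faster edit-compile-link cycle described above.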

Build Type   Compile Performance   SWF Performance   SWF File Size
LTO          Slower                Faster            Smaller
Non-LTO      Faster                Slower            Larger

A bit more on this topic can be found in the reference guide in the Optimizing SWFs and SWCs subsection of the Compiling With GCC section.