Type Inference for ActionScript

A distinguishing feature of ActionScript is that its type system combines static and dynamic typing: an idea that is popularly known as “gradual typing.” Gradual typing is a fancy name for a rather simple, familiar idea. The idea is to facilitate the gradual evolution of “scripts” to “programs,” where scripts describe dynamically typed, volatile code written for prototyping and testing, and programs describe statically typed, mature code written for scaling and maintaining. In other words, gradually typed languages encourage the mixing of statically-typed and dynamically-typed code fragments, with dynamic checks and casts enforcing the soundness of typing at the boundaries between those fragments. The key promise of a gradually typed language is that the statically typed fragments continue to benefit from performance optimizations and other guarantees typically enjoyed by programs in statically typed languages, while still being able to interact with the dynamically typed fragments. Unfortunately, while ActionScript goes part of the way towards gradual typing bliss, it falls short of keeping any such promise.

For example, consider the following code:

function f():Number { ... }
var x = f(); // var x:* = f()
var y:Number = x++;

Today, this code runs as follows (well, almost: there are some tricks played in the JIT, but we will revisit their limitations in another post). The result of f(), typed Number, is passed into the dynamically typed variable x. (The missing type for x is interpreted as the dynamic type, *: something that we will revisit below.) Since the content of x must match its (dynamic) type, the result is “boxed” at run time: the static type Number is converted to a dynamic type tag, the tag is attached to the result, and the result is (literally) packed in a box to store into x. Next, x is passed into a variable y with static type Number, and is incremented. At this point, the dynamic type tag attached to the content of x is checked to be Number, the boxed result is “unboxed,” the result is stored into y, and the result is incremented and boxed again to store it back into x.

Horrifying, isn’t it?

And now, imagine what so many unnecessary checks and casts do to loops.

To be fair, dynamic types have their uses. No matter how expressive your type system is (think polymorphic types, logical types, algebraic types, dependent types, …) you can never express all invariants of your program with types and still hope to check them statically: laws of physics will haunt you (the “Turing halting problem,” anyway). Dynamic types allow you to get around the limitations of the type system, whatever they may be.

However, dynamic types should be the fallback, not the default! In particular, it is an unnecessary and costly mistake to interpret missing types as dynamic types. Often, it is (relatively) easy for the compiler to infer that a missing type is in fact a static type. For example, the code above can be rewritten automatically to the following code, involving no dynamic checks and casts:

function f():Number { ... }
var x:Number = f(); // var x:* = f()
var y:Number = x++;

This code runs much faster!

(Of course, “real-world” examples can be far more complicated than the above example, but the point remains.)

Bottom line, it is not the job of the programmer to tell the compiler whether some code fragment can be typed statically or dynamically. At best, it is an unnecessary hassle; at worst, it is an impediment to optimizations.

So, what can we do to fix this situation?

Last summer, we put our heads together to think hard, and by mid-summer, we had come up with a revolutionary type inference algorithm for ActionScript. (This research will be published in this year’s POPL; check out the paper!) The algorithm relies on a modular analysis of data flows through source code, where the data may (of course) include functions and objects. The distinguishing feature of the algorithm is that it is provably backward-compatible: existing programs can be aggressively optimized by type inference without the slightest fear of breaking their run-time behaviors! (To guarantee this property, the algorithm relies on several technical novelties, including “kind-based closure computation” and “type-directed escape analysis”: the details can be found in the paper.) Furthermore, the algorithm is provably efficient, which means it can not only be implemented in the ActionScript compiler, but also made available in the ActionScript IDE (Flash Builder)!

Over Christmas, I implemented a prototype of our type inference algorithm in Falcon, the next-generation compiler for ActionScript, and ran our dynamically typed benchmarks through the pipeline. Our algorithm could automatically recover static types for those benchmarks, giving us around 2x-5x speedups without even touching the VM! This is, of course, only the beginning: byte code optimizations, language improvements, and a lot more will continue to bump up performance in the coming year, and I will discuss some of those ideas in later posts. It also makes sense to investigate adapting and integrating our algorithm in the JIT, so that opportunities for performance optimizations are found not only in recompiled source code, but also in existing byte code.

These are exciting times.