Posts tagged "Type Inference"

The V8 Myth: Why JavaScript is not a Worthy Competitor

Disclaimer: The following opinion is personal; it does not in any way reflect the opinion of Adobe.

UPDATE (01/18/2012): There have been some misunderstandings around the purpose of this post. The purpose is not to undermine JavaScript; it is to simply point out why ActionScript can do better given the compilation/execution model it inherently has.

UPDATE (01/18/2012): Some benchmark numbers posted below. More to come.

If you found yourself gaping in shocking disbelief at the premise in the title of this post, congratulations! You are the target audience for this post.

JavaScript programs are untyped, (relatively) small programs that are shipped/loaded as source code, and then compiled and run on the fly. In contrast, ActionScript programs are typed, (relatively) large programs that are compiled to byte code, shipped/loaded as byte code, and then run on the fly.

There is something inherently wrong in an argument that is based on the claim that JavaScript can do all its heavy-lifting work post-load time, and still do as good as, if not better than, a language that has the opportunity to do much of that heavy-lifting work pre-load time. What is wrong in such an argument is the reverence of, the dependence on, and the submission to, magic. Unfortunately, sooner or later, we all get to hear that Santa does not exist: the question is, can we grow up sooner?

Dynamic analysis is a great complement to static analysis: unfortunately, it is not a replacement. An ActionScript program that has been optimized to death by an AOT compiler can, almost trivially, beat a JavaScript program that is optimized on the fly by the JIT compiler. The only way out would be to let the JIT compiler work till death, but that is not an option! Checkmate.

So yes, you will hear about all the great things in V8 (JavaScript VM), including type inference. The fact is, there is no way a JIT compiler can afford to do the kind of modular analysis that an algorithm implemented in an AOT compiler can. To prove this unbelievably obnoxious claim, I compared the performance of the untyped JavaScript benchmarks in the Tamarin (ActionScript VM) performance suite, by passing them through, on the one hand, V8, and on the other hand, the type inference algorithm we recently prototyped in Falcon (ActionScript compiler), followed by Tamarin.

The ActionScript toolchain beat the JavaScript toolchain by around 3x.

(Note that a side-to-side comparison was never possible before we implemented type inference in Falcon; effectively, we regenerated the fully typed versions of the untyped JavaScript benchmarks automatically, and let the performance benefits enjoyed by typed ActionScript programs carry over to untyped ActionScript programs.)

So, let us stop worrying about JavaScript, and aim higher. As we focus on gaming, ActionScript programs will require significantly better optimizations for performance. ActionScript has just the right DNA mix for success, and it will become the 21st century language it could always be.

Sample numbers for Tamarin’s test/performance/jsbench benchmarks (running time in seconds for maximum datasizes in each case): 

  • SOR: 23.1 (V8), 4.6 (Falcon + type inference + AVM)
  • LUFact: 138.7 (V8),  23.3 (Falcon + type inference + AVM)
  • HeapSort: 10.0 (V8),  14.7 (Falcon + type inference + AVM)
  • FFT: 75.6 (V8),  31.1 (Falcon + type inference + AVM)
  • Crypt: 25.1 (V8),  7.4 (Falcon + type inference + AVM)

 

Type Inference for ActionScript

A distinguishing feature of ActionScript is that its type system combines static and dynamic typing: an idea that is popularly known as “gradual typing.” Gradual typing is a fancy name for a rather simple, familiar idea. The idea is to facilitate the gradual evolution of “scripts” to “programs,” where scripts describe dynamically typed, volatile code written for prototyping and testing, and programs describe statically typed, mature code written for scaling and maintaining. In other words, gradually typed languages encourage the mixing of statically-typed and dynamically-typed code fragments, with dynamic checks and casts enforcing the soundness of typing at the boundaries between those fragments. The key promise of a gradually typed language is that the statically typed fragments continue to benefit from performance optimizations and other guarantees typically enjoyed by programs in statically typed languages, while still being able to interact with the dynamically typed fragments. Unfortunately, while ActionScript goes part of the way towards gradual typing bliss, it falls short of keeping any such promise.

For example, consider the following code:

function f():Number { ... }
var x = f(); // var x:* = f()
var y:Number = x++;

Today, this code runs as follows (well, almost: there are some tricks played in the JIT, but we will revisit their limitations in another post). The result of f(), typed Number, is passed into the dynamically typed variable x. (The missing type for x is interpreted as the dynamic type, *: something that we will revisit below.) Since the content of x must match its (dynamic) type, the result is “boxed” at run time: the static type Number is converted to a dynamic type tag, the tag is attached to the result, and the result is (literally) packed in a box to store into x. Next, x is passed into a variable y with static type Number, and is incremented. At this point, the dynamic type tag attached to the content of x is checked to be Number, the boxed result is “unboxed,” the result is stored into y, and the result is incremented and boxed again to store it back into x.

Horrifying, isn’t it?

And now, imagine what so many unnecessary checks and casts do to loops.

To be fair, dynamic types have their uses. No matter how expressive your type system is (think polymorphic types, logical types, algebraic types, dependent types, …) you can never express all invariants of your program with types and still hope to check them statically: laws of physics will haunt you (the “Turing halting problem,” anyway). Dynamic types allow you to get around the limitations of the type system, whatever they may be.

However, dynamic types should be the fallback, not the default! In particular, it is an unnecessary and costly mistake to interpret missing types as dynamic types. Often, it is (relatively) easy for the compiler to infer that a missing type is in fact a static type. For example, the code above can be rewritten automatically to the following code, involving no dynamic checks and casts:

function f():Number { ... }
var x:Number = f(); // var x:* = f()
var y:Number = x++;

This code runs much faster!

(Of course, “real-world” examples can be far more complicated than the above example, but the point remains.)

Bottom line, it is not the job of the programmer to tell the compiler whether some code fragment can be typed statically or dynamically. At best, it is an unnecessary hassle; at worst, it is an impediment to optimizations.

So, what can we do to fix this situation?

Last summer, we put our heads together to think hard, and by mid-summer, we had come up with a revolutionary type inference algorithm for ActionScript. (This research will be published in this year’s POPL; check out the paper!) The algorithm relies on a modular analysis of data flows through source code, where the data may (of course) include functions and objects. The distinguishing feature of the algorithm is that it is provably backward-compatible: existing programs can be aggressively optimized by type inference without the slightest fear of breaking their run-time behaviors! (To guarantee this property, the algorithm relies on several technical novelties, including “kind-based closure computation” and “type-directed escape analysis”: the details can be found in the paper.) Furthermore, the algorithm is provably efficient, which means it can not only be implemented in the ActionScript compiler, but also made available in the ActionScript IDE (Flash Builder)!

Over Christmas, I implemented a prototype of our type inference algorithm in Falcon, the next-generation compiler for ActionScript, and ran our dynamically typed benchmarks through the pipeline. Our algorithm could automatically recover static types for those benchmarks, giving us around 2x-5x speedups without even touching the VM! This is, of course, only the beginning: byte code optimizations, language improvements, and a lot more will continue to bump up performance in the coming year, and I will discuss some of those ideas in later posts. It also makes sense to investigate adapting and integrating our algorithm in the JIT, so that opportunities for performance optimizations are found not only in recompiled source code, but also in existing byte code.

These are exciting times.