Archive for January, 2012

PLPV

This is the first of a series of posts based on experiences at POPL (Principles of Programming Languages) 2012, in Philadelphia, PA.

PLPV (Programming Languages meet Program Verification) is a workshop co-located with POPL, that focuses on ways to reduce the burden of program verification by exploiting properties of the underlying programming language (e.g., types, contracts).

Alan Jeffrey (Bell Labs) gave the first talk of the morning, digging into an intriguing connection between linear temporal logic (LTL) and functional reactive programming (FRP) via the Curry-Howard isomorphism. The basic idea is that because signals in FRP are time-indexed values, their types should be time-indexed as well: but time-indexed types can be read as LTL formulas, following the “types as propositions” slogan. One interesting observation was that the “until” modality in LTL does not have a natural programming interpretation, but its dual, the “constrains” modality, does. (Recall that, for any formulae A, B: A “constrains” B holds at time S whenever, for all times T >= S, if A holds in the interval [S,T) then B holds at T.) In particular, whereas the “constrains” modality forms the basis of compositional reasoning about rely/guarantee properties of concurrent systems, it is reflected in programming as (decoupled) causal functions on signals: functions whose outputs at time T depend on the history of inputs up to (but not including) T. Alan used these insights to show that FRP primitives such as switching and looping can be encoded in a fairly direct manner via LTL types. In the ensuing Q/A session, it was suggested that “until” may be interpreted in programming as “progress,” and that decidability results for LTL could be leveraged for type checking FRP. Overall, super exciting stuff: this will require some further meditation to internalize and apply!
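In symbols (my notation; the talk may have written it differently), reading LTL formulas as time-indexed predicates:

$(A \rhd B)(S) \;\triangleq\; \forall T \ge S.\; \bigl(\forall U \in [S,T).\; A(U)\bigr) \Rightarrow B(T)$

whereas the familiar “until” is existential:

$(A \,\mathcal{U}\, B)(S) \;\triangleq\; \exists T \ge S.\; B(T) \wedge \forall U \in [S,T).\; A(U)$

Read programmatically, an inhabitant of A “constrains” B is exactly a (decoupled) causal function: to produce a B at time T, it may consult the history of A on [S,T), but never A at T itself.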

Over a coffee break, I had a nice conversation with Chris Hawblitzel (MSR Redmond) about our ongoing work on ActionScript, and he was quite happy about the potential for impact in this work. We shared our experiences working with product teams at Adobe and Microsoft.

Next, Benjamin Pierce (UPenn) gave a keynote on CRASH/SAFE, a new collaborative project on building a computer with information-flow safety built in at all levels: hardware, operating system, programming language. In particular, Benjamin went over some interesting design tidbits that showcased what can be conceived and achieved when all legacy constraints are removed: the power and challenges of applying the state-of-the-art with a clean slate.

Over lunch, I was pleasantly surprised to run into my undergrad classmate Akash Lal (who is now at MSR India and works on concurrency), and I had a long conversation about language design with Ravi Chugh (who has recently joined Mozilla and works on type systems). We were joined by Pat Rondon (UCSD, the principal driver of the very cool “liquid types” line of work) and Jeff Vaughn (UCLA, who has been working on the now much larger Android security project that we started at UMD).

I skipped the session immediately after lunch to catch up on emails (and some sleep, which, for some reason, always eludes me the first night of a hotel stay).

The last session of the day featured another cool talk on “dependent interoperability”: the basic idea being that it should be possible to mix programs in simply-typed and dependently-typed sublanguages by mediating value-passing at the boundaries, via “boundary terms” that translate values up and down as required, and whose types can be derived by looking at the corresponding constructors for the data types of those values. For example, a library for dependently typed lists would require passing lengths around, and it should be possible to have a simply-typed client use this library by managing lengths on the side. Overall, this seems to mirror hybrid / gradual typing, but is sufficiently intriguing to deserve a deeper look.
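To make the flavor concrete, here is a loose, hypothetical sketch in ActionScript terms (all names are mine; ActionScript has no dependent types, so the length index that the library would track statically becomes a run-time check inside the boundary term):

// Illustrative stand-in for the library's length-indexed list:
// the length travels alongside the data rather than in the type.
class SizedList {
    public var items:Vector.<Number>;
    public var size:int;
    public function SizedList(items:Vector.<Number>, size:int) {
        this.items = items;
        this.size = size;
    }
}

// Boundary term, simply-typed side to "dependently typed" side: it checks,
// at run time, the invariant that the dependent type would enforce statically.
function toSized(xs:Vector.<Number>, claimedSize:int):SizedList {
    if (xs.length != claimedSize)
        throw new ArgumentError("boundary violation: length mismatch");
    return new SizedList(xs, claimedSize);
}

// The boundary term in the other direction simply forgets the index.
function fromSized(s:SizedList):Vector.<Number> {
    return s.items;
}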

POPL begins tomorrow! Eagerly looking forward to a lot of learning throughout the rest of the week. (Our type inference paper will be presented on Friday.)

The V8 Myth: Why JavaScript is not a Worthy Competitor

Disclaimer: The following opinion is personal; it does not in any way reflect the opinion of Adobe.

UPDATE (01/18/2012): There have been some misunderstandings around the purpose of this post. The purpose is not to undermine JavaScript; it is to simply point out why ActionScript can do better given the compilation/execution model it inherently has.

UPDATE (01/18/2012): Some benchmark numbers posted below. More to come.

If you found yourself gaping in shocked disbelief at the premise in the title of this post, congratulations! You are the target audience for this post.

JavaScript programs are untyped, (relatively) small programs that are shipped/loaded as source code, and then compiled and run on the fly. In contrast, ActionScript programs are typed, (relatively) large programs that are compiled to byte code, shipped/loaded as byte code, and then run on the fly.

There is something inherently wrong with an argument that is based on the claim that JavaScript can do all its heavy-lifting work post-load time, and still do as well as, if not better than, a language that has the opportunity to do much of that heavy-lifting work pre-load time. What is wrong with such an argument is the reverence of, the dependence on, and the submission to, magic. Unfortunately, sooner or later, we all get to hear that Santa does not exist: the question is, can we grow up sooner?

Dynamic analysis is a great complement to static analysis: unfortunately, it is not a replacement. An ActionScript program that has been optimized to death by an AOT compiler can, almost trivially, beat a JavaScript program that is optimized on the fly by the JIT compiler. The only way out would be to let the JIT compiler work till death, but that is not an option! Checkmate.

So yes, you will hear about all the great things in V8 (the JavaScript VM), including type inference. The fact is, there is no way a JIT compiler can afford to do the kind of modular analysis that an algorithm implemented in an AOT compiler can. To prove this unbelievably obnoxious claim, I compared the performance of the untyped JavaScript benchmarks in the Tamarin (ActionScript VM) performance suite by passing them, on the one hand, through V8, and on the other hand, through the type inference algorithm we recently prototyped in Falcon (the ActionScript compiler), followed by Tamarin.

The ActionScript toolchain beat the JavaScript toolchain by around 3x.

(Note that a side-by-side comparison was never possible before we implemented type inference in Falcon; effectively, we generated fully typed versions of the untyped JavaScript benchmarks automatically, letting the performance benefits enjoyed by typed ActionScript programs carry over to untyped ActionScript programs.)

So, let us stop worrying about JavaScript, and aim higher. As we focus on gaming, ActionScript programs will require significantly better performance optimizations. ActionScript has just the right DNA mix for success, and it will become the 21st-century language it was always meant to be.

Sample numbers for Tamarin’s test/performance/jsbench benchmarks (running times in seconds, at the maximum data size in each case):

  • SOR: 23.1 (V8), 4.6 (Falcon + type inference + AVM)
  • LUFact: 138.7 (V8), 23.3 (Falcon + type inference + AVM)
  • HeapSort: 10.0 (V8), 14.7 (Falcon + type inference + AVM)
  • FFT: 75.6 (V8), 31.1 (Falcon + type inference + AVM)
  • Crypt: 25.1 (V8), 7.4 (Falcon + type inference + AVM)

 

Type Inference for ActionScript

A distinguishing feature of ActionScript is that its type system combines static and dynamic typing: an idea that is popularly known as “gradual typing.” Gradual typing is a fancy name for a rather simple, familiar idea. The idea is to facilitate the gradual evolution of “scripts” to “programs,” where scripts describe dynamically typed, volatile code written for prototyping and testing, and programs describe statically typed, mature code written for scaling and maintaining. In other words, gradually typed languages encourage the mixing of statically-typed and dynamically-typed code fragments, with dynamic checks and casts enforcing the soundness of typing at the boundaries between those fragments. The key promise of a gradually typed language is that the statically typed fragments continue to benefit from performance optimizations and other guarantees typically enjoyed by programs in statically typed languages, while still being able to interact with the dynamically typed fragments. Unfortunately, while ActionScript goes part of the way towards gradual typing bliss, it falls short of keeping any such promise.
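A minimal ActionScript sketch of such a boundary (the class is made up for illustration):

class Greeter {
    public function greet():String { return "hello"; }
}

var g:* = new Greeter(); // dynamically typed fragment: g may hold anything
var a:Greeter = g;       // boundary: run-time check of g's type tag; succeeds here
g = 42;                  // fine, since g is dynamically typed
var b:Greeter = g;       // boundary: the check fails, throwing a TypeError at run time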

For example, consider the following code:

function f():Number { ... }
var x = f(); // interpreted as var x:* = f()
var y:Number = x++;

Today, this code runs as follows (well, almost: there are some tricks played in the JIT, but we will revisit their limitations in another post). The result of f(), typed Number, is passed into the dynamically typed variable x. (The missing type for x is interpreted as the dynamic type, *: something that we will revisit below.) Since the content of x must match its (dynamic) type, the result is “boxed” at run time: the static type Number is converted to a dynamic type tag, the tag is attached to the result, and the result is (literally) packed in a box to store into x. Next, the content of x is assigned to a variable y with static type Number, and x is incremented. At this point, the dynamic type tag attached to the content of x is checked to be Number, the boxed result is “unboxed,” the result is stored into y, and the result is incremented and boxed again to store it back into x.

Horrifying, isn’t it?

And now, imagine what so many unnecessary checks and casts do to loops.
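For instance, in a made-up micro-loop like the following, every single iteration pays the tax:

var sum = 0; // interpreted as var sum:*, so sum lives in a box
for (var i:int = 0; i < 1000000; i++) {
    sum += i; // each iteration: check sum's tag, unbox, add, re-box into sum
}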

To be fair, dynamic types have their uses. No matter how expressive your type system is (think polymorphic types, logical types, algebraic types, dependent types, …) you can never express all invariants of your program with types and still hope to check them statically: laws of physics will haunt you (the halting problem, anyway). Dynamic types allow you to get around the limitations of the type system, whatever they may be.

However, dynamic types should be the fallback, not the default! In particular, it is an unnecessary and costly mistake to interpret missing types as dynamic types. Often, it is (relatively) easy for the compiler to infer that a missing type is in fact a static type. For example, the code above can be rewritten automatically to the following code, involving no dynamic checks and casts:

function f():Number { ... }
var x:Number = f(); // was: var x = f(), i.e., var x:* = f()
var y:Number = x++;

This code runs much faster!

(Of course, “real-world” examples can be far more complicated than the above example, but the point remains.)

Bottom line, it is not the job of the programmer to tell the compiler whether some code fragment can be typed statically or dynamically. At best, it is an unnecessary hassle; at worst, it is an impediment to optimizations.

So, what can we do to fix this situation?

Last summer, we put our heads together to think hard, and by mid-summer, we had come up with a revolutionary type inference algorithm for ActionScript. (This research will be published in this year’s POPL; check out the paper!) The algorithm relies on a modular analysis of data flows through source code, where the data may (of course) include functions and objects. The distinguishing feature of the algorithm is that it is provably backward-compatible: existing programs can be aggressively optimized by type inference without the slightest fear of breaking their run-time behaviors! (To guarantee this property, the algorithm relies on several technical novelties, including “kind-based closure computation” and “type-directed escape analysis”: the details can be found in the paper.) Furthermore, the algorithm is provably efficient, which means it can not only be implemented in the ActionScript compiler, but also made available in the ActionScript IDE (Flash Builder)!
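As a tiny illustration (my example, not one from the paper) of why the analysis must follow data flows through functions and objects, consider:

function makeAdder(k:Number) { // return type left off by the programmer
    return function(x:Number) { return x + k; };
}
var add5 = makeAdder(5);   // inference must track the closure that flows out of makeAdder
var z:Number = add5(37);   // ...so that this call site can be typed statically, not via *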

Over Christmas, I implemented a prototype of our type inference algorithm in Falcon, the next-generation compiler for ActionScript, and ran our dynamically typed benchmarks through the pipeline. Our algorithm could automatically recover static types for those benchmarks, giving us around 2x-5x speedups without even touching the VM! This is, of course, only the beginning: byte code optimizations, language improvements, and a lot more will continue to bump up performance in the coming year, and I will discuss some of those ideas in later posts. It also makes sense to investigate adapting and integrating our algorithm in the JIT, so that opportunities for performance optimizations are found not only in recompiled source code, but also in existing byte code.

These are exciting times.