Introduction to the Performance Tests in WebKit

In this post I would like to give a short overview of WebKit’s performance and memory testing framework. Along with a bunch of WebKit geeks, I have been involved in its development for a while, mostly by contributing to the memory-measurement side.

If I were to summarize the evolution of performance tests in WebKit, I’d start with the early days, when we had JavaScript performance tests only for JavaScriptCore (the SunSpider suite) and a separate set of tests for V8 (the V8 suite). We also had a ton of layout tests, which could give performance feedback but had purposes other than performance testing. You might say at this point: “Hey, stop! There are benchmark sites and suites all around online!” I totally agree, but we needed something more web-engine specific: something for testing WebKit itself (including the JavaScript engine and the web engine) instead of the browser that is built on top of WebKit.

Let’s distinguish between online benchmarks for browser-level performance (e.g. Chromium performance testing) and engine-level performance tests. Engine-level performance tests do not test the browser or the underlying platform’s graphical performance; they test one part of a specific component of the web engine (e.g. the parse time of the HTML parser, or the runtime of the layout system for floating containers). Browser-level performance tests usually do complex things that exercise many components of the web engine at once; their goal is usually to reproduce some kind of real browsing behavior in order to measure the browser’s response time. This post is only about engine-level testing.

Engine-level performance tests have been in WebKit trunk for more than a year now. We improved the system a lot in the last year (e.g. we added support for memory measurements in WebKit Bug #78984), so I think it is time to blog about it to a wider audience. Although work on the system is still in progress, it’s already capable of demonstrating improvements and of catching both performance and memory regressions.

In the first part of this entry, I want to give you a short introduction to the system and a short description of how you can use the testing infrastructure. In the second part I intend to talk about the continuous performance-measuring system and its online visualizer, the performance-test website. If you follow along carefully, I’m sure you’ll pick up some information about our super-exciting future plans!

What is a performance test in WebKit?


Our performance tests measure the run-time performance and the memory usage of WebKit. In every test case we start the measurement before the concrete test (the function, the animation, the page loading) starts and stop it when it finishes. Although measuring performance and memory sounds pretty straightforward, it can be deceptively difficult. For example, what do we mean by run-time performance? The right answer depends on the test itself. The animation performance tests produce frames-per-second values (fps). The tests that measure the runtime of different JavaScript functions (DOM manipulation, page loading, etc.) produce either milliseconds (ms) or runs per second (runs/s). If we really want to understand the meaning of a specific set of performance measurements, we need to look deeper at the actual test cases, but that’s not the goal of this post. C’mon, this is just an introduction!

The memory-consumption tests produce two values. The first is the general heap usage of WebKit (memory allocated through the FastMalloc interface) and the second is the heap usage of JavaScript (memory allocated by the actual JavaScript engine). We record all of our raw memory results in bytes, but as you will see later on the result pages, we display them in kilobytes.

Both the performance and the memory results are produced via the JavaScript engine. All of our performance tests are JavaScript-based except the Web Page Replay tests (you can read about them on the related page in WebKit trac). We experimented with other approaches, like C++ and Python, but the JavaScript one was the most portable across the different WebKit ports, so we stayed with it.
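To make this more concrete, here is a minimal sketch of what a JavaScript-based performance test can look like. It assumes the PerfTestRunner helper that the tests under PerformanceTests load from resources/runner.js; the benchmark body itself (toggling an attribute on a detached element) is purely hypothetical:

<!DOCTYPE html>
<html>
<body>
<script src="../resources/runner.js"></script>
<script>
// Hypothetical benchmark body: repeatedly set and remove an attribute
// on a detached element.
var element = document.createElement("div");

PerfTestRunner.measureRunsPerSecond({
    description: "Hypothetical example: sets and removes an attribute.",
    run: function () {
        for (var i = 0; i < 1000; ++i) {
            element.setAttribute("data-index", i);
            element.removeAttribute("data-index");
        }
    }
});
</script>
</body>
</html>

The runner takes care of the warm-up, the repetitions, and the statistics, and reports the result in runs/s, much like the sample output later in this post.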

How to run performance tests in WebKit?


It’s time to do some actual experiments! We have a script for running performance tests, run-perf-tests, located in the trunk/Tools/Scripts directory. I assume you have a WebKit build (if you don’t, there is documentation on how to set one up), so to run all performance tests (located under trunk/PerformanceTests) you only need to run that script (different ports require additional parameters; for the details check out the --platform parameter). Because running all the tests can take a long time, you can restrict the set of tests run by passing some directories or a list of tests as parameters to run-perf-tests, as shown below.
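For example, the invocations might look like this (a sketch; assuming a release build of the default port, run from the root of a WebKit checkout):

Tools/Scripts/run-perf-tests                              # run every test
Tools/Scripts/run-perf-tests Bindings                     # run one directory
Tools/Scripts/run-perf-tests Bindings/set-attribute.html  # run a single test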

The script produces pretty straightforward output:


Running Bindings/set-attribute.html (18 of 115)
DESCRIPTION: This benchmark covers 'setAttribute' in Dromaeo/dom-attr.html and
             other DOM methods that return an undefined.
RESULT Bindings: set-attribute= 670.696533834 runs/s
median= 670.967741935 runs/s, stdev= 4.72565174943 runs/s, min= 663.265306122 runs/s, max= 677.966101695 runs/s
RESULT Bindings: set-attribute: JSHeap= 57804.8 bytes
median= 57832.0 bytes, stdev= 1283.97383151 bytes, min= 54112.0 bytes, max= 59296.0 bytes
RESULT Bindings: set-attribute: Malloc= 1574148.0 bytes
median= 1572772.0 bytes, stdev= 3346.56713349 bytes, min= 1568992.0 bytes, max= 1584504.0 bytes
Finished: 16.537399 s


The output contains everything we are interested in: the test names, the descriptions (where they exist), and the performance and memory results. By default, after a warm-up run, we run each test 20 times (in DumpRenderTree / WebKitTestRunner) to provide a stable result.

After the script has finished, in addition to the console output we get a nice HTML-based table with all of our results. The table shows the average value and its deviation for each test. If a result is not stable enough, we also get a clear warning sign.

[Screenshot: the results table]

It’s possible to switch between the Time and the Memory views and to reorder the results. If we are interested in a specific test, we can select it to see more details, as in the screenshot below.

[Screenshot: detailed view of a single test in the results table]

How to compare performance results?


Let’s see a simple workflow: we have a clean WebKit repository with a build. We run all the performance tests, and we want to compare the results against our modified WebKit. Since all performance results are stored by default in the WebKitBuild/Release/PerformanceTestsResults.json file, comparing is very simple: you just apply your patch to the repository, rebuild WebKit, and run the performance tests again. After the rerun, the results HTML page contains a new column, showing the new and old results in the same table. You can repeat the measurement as many times as you wish; each run adds a new column to your results. Additionally, you can compare different repositories or specify another file for storing the results. You can find further details by running run-perf-tests --help.
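As a rough sketch of that workflow (the patch name and test directory here are only placeholders):

# 1. Baseline run on the clean build; results are stored in
#    WebKitBuild/Release/PerformanceTestsResults.json by default.
Tools/Scripts/run-perf-tests Bindings

# 2. Apply your patch and rebuild WebKit.
git apply my-change.patch
Tools/Scripts/build-webkit --release

# 3. Run the tests again; the generated results page now shows the
#    new numbers in an extra column next to the baseline.
Tools/Scripts/run-perf-tests Bindings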

[Screenshot: results table comparing two runs]

This way you can easily test the effect of your changes. Furthermore, the results table will point out your improvements and regressions in a simple and clear way.

Continuous performance and memory results online


Each platform has its own performance bot (on build.webkit.org) that provides continuous performance measurements, and the test results are submitted to http://webkit-perf.appspot.com/.

[Screenshot: the Perf-O-Matic front page]

I think the most useful feature of the Perf-O-Matic system is the Custom Charts section. With Custom Charts we can make custom queries to check out, compare, and investigate individual test results produced on the performance bots in the past. This way we can verify improvements or catch regressions at the level of individual tests. See below for an example: a test that evaluates CSS property setters and getters by setting every possible CSS property to a pre-defined value and reading the properties back through JavaScript.
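In rough terms, the body of such a test could look like the hypothetical sketch below (the real test in the tree may be structured differently; PerfTestRunner is again the runner.js helper):

var style = document.createElement("div").style;

PerfTestRunner.measureRunsPerSecond({
    description: "Hypothetical sketch: set every enumerable CSS property " +
                 "to a fixed value and read each one back via JavaScript.",
    run: function () {
        for (var property in style) {
            // Skip non-string members such as methods and length.
            if (typeof style[property] !== "string")
                continue;
            style[property] = "initial"; // setter: a pre-defined value
            var value = style[property]; // getter through JavaScript
        }
    }
});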

[Screenshot: a custom chart for an individual test in Perf-O-Matic]

You can find several useful pieces of information in the chart above: the name of the test, the port that was tested, the difference from the previous measurement, the SVN revision, and the date of the measurement. All of this is presented on a nice zoomable chart. Take some time to play with the charts; I’m pretty sure you will find them useful too. We are planning to update this system to a newer version soon, so more useful features are coming!

Looking into the future of performance measurements


After we further stabilize the continuous performance-testing system (WebKit Bug #77037) and can identify a well-defined set of tests with low deviation and meaningful results (WebKit Bug #105003), we are planning to add performance-testing Early Warning System support to Bugzilla. The system will automatically report possible performance and memory regressions for uploaded patches in a Bugzilla comment, so it will work just like the build/layout Early Warning Systems. It will help a lot to improve the quality of our development work on WebKit. Sounds cool, doesn’t it?

I hope you enjoyed this little introduction to the WebKit Performance Test system and its online visualizer, and that you will try some of these tools.
