Today I was reminded of one of my neatest Node.js hacks. A few years ago, in the process of optimizing how Rackspace Cloud Monitoring compiles user-supplied alarms (a javascript-like DSL used to implement thresholds) we discovered that we were spending a significant amount of CPU time in a widely used Javascript implemetation of sprintf. This was back in the dark ages of Node.js, before util.format landed.
The CPU time spent in sprintf wasn’t enough to be a problem: even compiling a few hundred thousand alarms is pretty fast, as compared to reading them out of a database, serializing the compiled alarms to XML, and loading them into Esper. Nonetheless, in a bout of “not invented here” and with a spirit of adventure in my heart, I did the obvious thing, and took a weekend to write a faster sprintf.
“Standard” Sprintf
The standard implementation of sprintf takes a format string, followed by any number of positional arguments intended to be injected into the resulting string. It operates by parsing the format string using a series of regular expressions, to generate a parse tree consisting of alternate constant strings and formating placeholders.
For example, consider:
sprintf('The %s ran around the tree', 'dog');
The generated parse tree looks something like:
['The ', '%s', ' ran around the tree']
Then the tree is is iterated, and positional (or named) arguments injected to generate an array that can be joined into the appropriate result:
return ['The ', 'dog', ' ran around the tree'].join('');
As an optimization, the parse tree is cached for each format string, so that repeated calls to sprintf for a given format string need only repeat the actual argument injection.
Getting Wild
TLDR; the code
So how can this be further optimized? We know a few things about V8:
- V8 is very good at concatenating strings.
- V8 is very good at just-in-time compiling “hot” functions.
- At least as of Crankshaft (the latest version of V8 I’ve used in any
seriousness), V8 was unable to optimize code that treated
arguments
in unusual ways such as iterating it, or mixing its use with named arguments.
I was able to take advantage of these properties by generating a function which applied the format string through a single-line string concatenation, instead of instead of generating a parse tree. Taking the example above, I generate a string such as:
var fnBody = "return 'The ' + arguments[1] + ' jumped over the tree';";
Then compiling that string into a function on the fly:
return Function(fnBody);
By caching the resulting Function
object, I was able to cause V8’s JIT to optimize calls to sprintf into little more than a dictionary lookup, a function call and a string concatenation.
Security
An obvious risk of this strategy is that an attacker might find a way to cause us to generate arbitrary javascript.
This can be mitigated by never passing user-supplied input as a format string. In fact, because the cache doesn’t implement any expiration, you should probably only ever pass literal format strings or you’ll end up with a memory leak. This seems to be true of node-sprintf as well, so I don’t consider it a serious limitation, just something to be aware of.
Performance
At the time, we saw marked (if not especially necessary) speedups in alarm compilation performance, but I don’t have the bencharks on-hand. Instead, on a modern-ish version of Node.js (v0.10.17) running on my Macbook Pro I tested:
- My “fast” sprintf
- Node’s util.format
- The widely used sprintf module
The test was:
for (var i = 0; i < 10000000; i++) {
sprintf_fn('The %s jumped over a tree', i);
}
The results:
Implementation | Time |
---|---|
fast sprintf | 1504ms |
util.format | 14761ms |
standard sprintf | 22964ms |
The improved sprintf lacks a lot of the functionality of the other implementations, so the comparison isn’t entirely fair. Nonetheless, with a speedup of about 10x over util.format and 15x over sprintf (at least for this benchmark), I think its safe to declare this hack a success.