Dealing concurrently with long running / blocking tasks in node.js

Today I spent nearly half of my day digging through solutions for the criticism that Ted Dziuba raised in his article “Node.js is Cancer” from Oct. 1st 2011. There he states that it’s far easier than expected for newbies to write code that blocks node’s single-threaded event loop. At first I thought there might be a simple workaround, but as it turns out Ted isn’t so wrong at all. There are solutions, though. Ted’s criticism is based upon a fairly naive example (c’mon, it’s just a stand-in for a far more complex situation). He computes Fibonacci numbers like so:

function fibonacci(n) {
   if (n < 2)
      return 1;
   else
      return fibonacci(n-2) + fibonacci(n-1);
}

If you invoke this piece of code for n>40 you’ll notice that node.js takes quite some time to compute the result (mainly because the recursive computation doesn’t cache intermediate results). That’s not a shortcoming of JavaScript or V8: if you write similar code in PHP or Python you’ll also end up waiting for the result (here’s a blog article doing exactly that). The difference is this: while PHP on FastCGI or Java on an application server use a pool of threads that can utilize all the CPU resources you have, node.js relies on a single-threaded execution model (the so-called “event loop”). That means a request that invokes the code above blocks your node server completely, since the event loop waits until the function returns. To work around this issue (let’s call it a feature) you usually pass a callback function that is invoked once the lengthy operation has finished and that then continues writing to the output stream. The main execution loop is then not held up at all.
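
To make the blocking effect tangible, here is a minimal sketch (not taken from Ted’s article). It schedules a 0 ms timer, runs the synchronous fibonacci right afterwards and measures how late the timer actually fires. The helper name measureTimerDelay is made up for this illustration.

```javascript
'use strict';

function fibonacci(n) {
  if (n < 2) return 1;
  return fibonacci(n - 2) + fibonacci(n - 1);
}

// Hypothetical helper: reports how long a 0 ms timer had to wait
// because a synchronous computation occupied the event loop.
function measureTimerDelay(n, done) {
  var scheduledAt = Date.now();

  // Asked to fire "immediately", but it can only run once the loop is free.
  setTimeout(function () {
    done(Date.now() - scheduledAt, computeMs);
  }, 0);

  var startedAt = Date.now();
  fibonacci(n); // blocks the single thread until it returns
  var computeMs = Date.now() - startedAt;
}

measureTimerDelay(30, function (delayMs, computeMs) {
  console.log('computation took ' + computeMs + ' ms, timer fired after ' + delayMs + ' ms');
});
```

The timer can never fire before the computation has finished, and an incoming HTTP request would wait in exactly the same way.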

First idea (not working)

So my first idea (which turned out not to work, so read on if you’re looking for solutions) was to use one of the asynchronous libraries like flow, async or step, and then I stumbled upon Q. I read a little about the concept of future return values, the so-called promises. This concept wraps your long-running code in a function that returns a promise object for the future value. This object comes with certain methods that replace the callback principle. Here’s the small code change I made with it:

var http = require ('http')
    ,Q = require ('q');

var server = http.createServer(function (req, res) {
	res.writeHead(200, {'Content-Type': 'text/plain'});
	var promise = Q.call(fibonacci, null, 80);
	promise.then(function(fbRes) {
		res.end("Result:" + fbRes);
	});
}).listen(1337, "127.0.0.1");

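The Q library defers the call and returns the promise. We bind an anonymous callback using its then method. But as a simple test with ab shows, this still blocks other requests going to that node.js instance: the promise only postpones the call, and fibonacci still runs synchronously on the event loop’s single thread once it starts.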
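
The effect can be reproduced without an HTTP server. The following sketch uses the Promise that is built into current JavaScript engines instead of Q (that swap is my assumption, as is the helper name runDemo) and records the order in which things happen:

```javascript
'use strict';

function fibonacci(n) {
  if (n < 2) return 1;
  return fibonacci(n - 2) + fibonacci(n - 1);
}

// Hypothetical helper: records the order of events and passes it to done.
function runDemo(done) {
  var events = [];

  // Stands in for "another request" that wants a turn on the event loop.
  setTimeout(function () {
    events.push('timer');
    done(events);
  }, 0);

  // Wrapping the call in a promise defers it, but it still runs
  // synchronously on the same thread once it starts.
  Promise.resolve(25).then(function (n) {
    events.push('fibonacci start');
    fibonacci(n);
    events.push('fibonacci done');
  });
}

runDemo(function (events) {
  console.log(events.join(' -> '));
});
```

The timer callback only gets its turn after the promise callback has run to completion, which is the same thing ab shows for concurrent requests.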

Second idea – use child processes or the cluster module

Obviously the team around node.js is aware of this “issue”. Read the About article in the node.js master documentation carefully. At the end it says:

But what about multiple-processor concurrency? Aren’t threads necessary to scale programs to multi-core computers? You can start new processes via child_process.fork() these other processes will be scheduled in parallel. For load balancing incoming connections across multiple processes use the cluster module

So I had a look at those. As it turns out, node’s cluster module is marked as highly experimental, so I’d rather not use it at all. The child_process module doesn’t look exactly easy to understand either, so I started Googling again. One of the first hits led me to Sitepen, who offer a rather promising multi-node module. Unfortunately these guys relied on node’s cluster module, so this module simply fails to execute on node.js > 0.6.

Another company finally found a way, but this time it seems to be commercial (although there’s an open source license available): the Fabric Engine. It compiles JavaScript code to the native environment so it can be used on the server as well as on the client. That sounds interesting for gaming or highly scientific applications, but it goes a little too far when all we want is to solve our Fibonacci cancer theorem.

So maybe we can find something in node’s own toolbox? There is an interesting StackOverflow article that goes the remote VM way, which basically spawns another V8 (+10M overhead) that executes the code in another (sandboxed) process. That sounds good, but it’s not too easy to hand the return value from the spawned process back to the handling process. The idea pointed out in the article is to let our fibonacci function write to its output stream and let our master process read from it in a non-blocking manner. When Mr Fibonacci finally writes his result to stdout, we can hand the value (which arrives as a String) over to the still open response object and finish it.
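
A rough sketch of that stdout idea, using only node’s child_process module. This is my own reading of the approach, not the code from the StackOverflow article, and the helper name fibonacciInChild is made up:

```javascript
'use strict';

var execFile = require('child_process').execFile;

function fibonacci(n) {
  if (n < 2) return 1;
  return fibonacci(n - 2) + fibonacci(n - 1);
}

// Hypothetical helper: runs fibonacci(n) in a second node process and
// reads the result back from that process's stdout.
function fibonacciInChild(n, callback) {
  // The child receives the function source plus one statement that prints the result.
  var script = fibonacci.toString() +
    '; process.stdout.write(String(fibonacci(' + Number(n) + ')));';

  // execFile returns immediately; the callback fires once the child has exited.
  execFile(process.execPath, ['-e', script], function (err, stdout) {
    if (err) return callback(err);
    callback(null, parseInt(stdout, 10)); // the value arrives as a string
  });
}

fibonacciInChild(30, function (err, result) {
  if (err) throw err;
  console.log('fibonacci(30) = ' + result);
});
```

Inside an HTTP handler the callback would call res.end() on the still open response. The price is one extra process per computation, so for real use you would want to limit how many run at once.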

Finally: solutions

The good news is: there are solutions. The bad news is: they still don’t come easy. One article you’ll most likely stumble upon when googling for concurrent JavaScript is the one from Bruno, which first explains how the heavily discussed Fibers module tries to address the concurrency problem. The shortcoming of Fibers is that it’s delivered as a C++ module, which makes assumptions about the underlying OS. Maybe a good solution for homogeneous environments but… you know there are still Windows boxes in the wild. In the end Bruno starts talking about the rather new Threads a GoGo library, which natively uses threading inside the V8 engine (and therefore relies on a newer release of node.js). One important thing to notice (which Bruno points out, too) is that every thread spawned through this approach runs in its own environment, so you won’t have many chances to share state between them. Here’s a piece of code that runs asynchronously:

var TAGG = require('threads_a_gogo');

// our CPU intensive function
function fibo(n) {
  return n > 1 ? fibo(n - 1) + fibo(n - 2) : 1;
}

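// create a worker thread and load fibo into its own (separate) global context
var t = TAGG.create();
t.eval(fibo);

// evaluate the call inside the thread; the result comes back as a string
t.eval("fibo(30)", function(err, result) {
  if (err) throw err;
  console.log("fibo(30)=" + result);
});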

Bleeding edge approach: web workers

For some months now the WHATWG has been finalizing the definition of web workers, which are meant to act as a threading model for client-side code but of course can run on the server side, too. Here’s a pointer to a library which uses that approach. The main advantage is the standardized way of handling messages and events. In TAGG the long-running function returns its result as a string value which we can then use. The web worker approach makes communication between the thread compartments easier by introducing a messaging protocol.
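
To give an impression of the messaging style, here is a sketch that assumes a Node.js version shipping the built-in worker_threads module, which follows the web worker model. It is not the library linked above, and the helper name fibonacciInWorker is made up:

```javascript
'use strict';

var Worker = require('worker_threads').Worker;

function fibonacci(n) {
  if (n < 2) return 1;
  return fibonacci(n - 2) + fibonacci(n - 1);
}

// Hypothetical helper: computes fibonacci(n) in a worker thread and
// receives the result as a message.
function fibonacciInWorker(n, callback) {
  // The worker's source is passed as a string to keep the example in one file.
  var source =
    'var wt = require("worker_threads");\n' +
    fibonacci.toString() + '\n' +
    'wt.parentPort.postMessage(fibonacci(wt.workerData));';

  var worker = new Worker(source, { eval: true, workerData: n });
  worker.once('message', function (result) { callback(null, result); });
  worker.once('error', callback);
}

fibonacciInWorker(30, function (err, result) {
  if (err) throw err;
  console.log('fibonacci(30) = ' + result);
});
```

Unlike the TAGG example, the result arrives as a number, because messages are copied as structured values rather than serialized to strings.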

The nextTick solution

Last but not least, our friend Prash has refactored the original fibonacci function to run in slices, which lets it hook into the event loop’s ticks. That way it can return CPU resources to the main execution loop. I haven’t tested his solution myself yet, but I think it works well without adding any native code or weird library approaches. But let’s face it: the resulting code is a bloated mess, and the async callback inside the recursive computation mixes up concerns. After all, we only want to compute Fibonacci(40). So IMHO this just serves as an example of how you can decompose computation-intensive tasks into asynchronous slices if you know what you’re doing. Prash’s solution is also a good example of how things have to be thought about differently in node and JavaScript: you have to think in functions, instances and callbacks rather than in templated interfaces, classes and listeners.
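
The following is my own minimal sketch of the slicing idea, not Prash’s code. It uses setImmediate instead of process.nextTick, because in current node.js versions nextTick callbacks run before pending I/O and would not give other requests a turn. CALLS_PER_SLICE and fibonacciSliced are names made up for this example.

```javascript
'use strict';

// Hypothetical tuning knob: recursive calls to run before yielding to the event loop.
var CALLS_PER_SLICE = 200;

function fibonacciSliced(n, done) {
  var calls = 0;

  // Continuation-passing version of the recursive function.
  function fib(k, cb) {
    if (k < 2) return cb(1);

    function step() {
      fib(k - 2, function (a) {
        fib(k - 1, function (b) {
          cb(a + b);
        });
      });
    }

    calls += 1;
    if (calls % CALLS_PER_SLICE === 0) {
      setImmediate(step); // end this slice, let the event loop handle other work
    } else {
      step();             // stay within the current slice
    }
  }

  fib(n, done);
}

fibonacciSliced(20, function (result) {
  console.log('fibonacci(20) = ' + result);
});
```

Even this small version shows the mixing of concerns: half of the function deals with scheduling rather than with Fibonacci numbers.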

Conclusion

After all that research I have to admit: node.js really has its shortcomings when you want to execute CPU-intensive computations. It definitely isn’t ready yet for unobtrusive use of multiprocessor architectures, even though its child process, VM and cluster features provide built-in solutions. The most promising solutions are Threads a GoGo and web workers. Fibers, on the other hand, seem like quite an overkill to me and fully depend on a Unix-like OS.

One thing that I definitely want to make clear: this shortcoming does not affect computation-intensive database operations. Most of the libraries that I found and that I’m using rely on the non-blocking asynchronous callback mechanisms that make node.js so interesting for highly loaded environments. That means: even if your unindexed MySQL query needs 20 seconds to return, node.js will be able to serve other requests in the meantime, because the MySQL “driver” hands control back to the node event loop as soon as you have fired your query. It simply calls back the open response handler as soon as a result set is available. I also want to point out that node.js encourages you to shift your development paradigm to the client side. You will usually deliver only a basic HTML5 site layout to the client. Afterwards you issue AJAX queries to your node.js backend and render everything on the client side again. Node.js mainly serves as a handling layer for incoming events and therefore is not 100% comparable to other execution environments. A Java application server doesn’t impose the aforementioned threading problems, but it will get stuck on the Fibonacci computation, too, as soon as its thread pool is exhausted.
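
To illustrate the database point above, here is a small simulation. slowQuery is a hypothetical stand-in for a database driver (it just waits on a timer, much like real non-blocking I/O waits outside the event loop), and the interval plays the role of other requests being served in the meantime.

```javascript
'use strict';

// Hypothetical stand-in for a non-blocking database driver:
// the "query" takes delayMs but does not occupy the event loop while waiting.
function slowQuery(delayMs, callback) {
  setTimeout(function () {
    callback(null, ['row 1', 'row 2']);
  }, delayMs);
}

// Counts how many other "requests" get served while the query is pending.
function serveWhileQuerying(delayMs, done) {
  var served = 0;
  var ticker = setInterval(function () { served += 1; }, 5);

  slowQuery(delayMs, function (err, rows) {
    clearInterval(ticker);
    done(err, rows, served);
  });
}

serveWhileQuerying(200, function (err, rows, served) {
  if (err) throw err;
  console.log(rows.length + ' rows returned, ' + served + ' other callbacks ran in the meantime');
});
```

Replace the timer inside slowQuery with a synchronous fibonacci call and the counter stays at zero, which is the whole difference between waiting for I/O and computing on the event loop.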

So don’t use node.js the same way you’re using your favorite web technology. Otherwise you will most certainly run into unexpected problems. And no, don’t use node.js for your scientific calculations. If you do and fail at it, at least don’t call it cancer.

3 thoughts on “Dealing concurrently with long running / blocking tasks in node.js”

  1. Thanks for this article Stefan, it’s quite instructive.

    A better solution would be to get threads through the threadpool instead of “Threads a gogo” which use “classic” thread…
