Node.js


22
Nov 11

Performance benchmarking Socket.io 0.8.7, 0.7.11 and 0.6.17 and Node's native TCP

I’ve been working with Socket.io quite a bit recently. It’s a great library. However, after upgrading to 0.8.x, I ran into problems with increased CPU usage. Since performance is very important for high traffic pubsub implementations, I decided to investigate this further – and try to quantify the performance impact of upgrading to a newer version of Socket.io.

I wrote a benchmarking suite (siobench). The benchmark is rather simple. Clients connect one at a time, and a new client is only allowed to connect when the previous one is connected. When the server has used up 5000 milliseconds of CPU time, the benchmark is stopped. Every second, every connected client sends a single message which is echoed back by the server (more details).

This workload is geared towards a situation where Socket.io is used to notify people of things as part of a larger application: e.g. most of the load is assumed to be idling connections rather than real-time messaging like in, say, a multiplayer game.

The “end of test” condition is 5000 ms of CPU time, because this seemed to be an easy way to give all implementations the same amount of time. CPU usage % is not accurate, since it is dependent on how much CPU time the process gets over a particular amount of wallclock time. In the graphs, the CPU usage % is calculated over a 100 ms interval, while usertime and systime are the actual numbers reported at that particular time.
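To make the difference between CPU time and CPU usage % concrete, here is a minimal sketch of how both could be derived from periodic samples. It assumes a modern Node.js where process.cpuUsage() is available (it did not exist in Node 0.4, and siobench itself may well measure this differently); cpuMillis, cpuPercent and budgetExhausted are hypothetical helper names.

```javascript
// Sketch only: these helper names are hypothetical, not part of siobench.
// process.cpuUsage() returns { user, system } in microseconds.

// Total CPU time (user + system) in milliseconds for one sample.
function cpuMillis(usage) {
  return (usage.user + usage.system) / 1000;
}

// CPU usage % over one sampling interval: CPU time consumed between
// two samples divided by the wallclock time between them.
function cpuPercent(prev, curr, intervalMs) {
  return 100 * (cpuMillis(curr) - cpuMillis(prev)) / intervalMs;
}

// "End of test" condition: stop once the CPU time budget is used up.
function budgetExhausted(usage, budgetMs) {
  return cpuMillis(usage) >= budgetMs;
}

var sample = process.cpuUsage();
console.log('CPU time so far: ' + cpuMillis(sample) + ' ms');
console.log('Budget of 5000 ms used up? ' + budgetExhausted(sample, 5000));
```

The percentage depends on the chosen interval and on how the process was scheduled during it, which is why the accumulated CPU time is the more stable stop condition.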

Summary

Node (0.4.12) using tcp ~ 8000 connections on a single core
socket.io 0.6.17 using websockets ~ 2300 connections on a single core
socket.io 0.7.11 using websockets ~ 1800 connections on a single core
socket.io 0.8.7 using websockets ~ 1900 connections on a single core

Remember, this is just one server on one core, with 5000 ms of CPU time on that core. The rest of the cores are used to generate sufficient load. The full graphs are at the end of the post.

Note that the absolute numbers are mostly unimportant. I ran this on a 15″ Macbook Pro running Arch with the 3.1.04 Linux kernel in Virtualbox with 4096 MB of RAM, an SSD and four cores (Intel(R) Core(TM) i7-2635QM CPU @ 2.00GHz GenuineIntel GNU/Linux). You can get numbers that are more representative of your system by getting siobench and running it:

Usage: node siobench.js [env]
A tool for benchmarking your Socket.io server.

Available environments:
	0.6.17
	0.6.17_poll
	0.7.11
	0.8.7
	0.8.7_poll
	tcp

You can also write your own benchmarks under ./bench, by writing a new server.js (example #1, #2) and a new client.js (example #1, #2). Each benchmark has its own set of npm dependencies installed, so that one can run benchmarks against many versions of socket.io.

Some notes on performance

The relative performance is more interesting.

First, the node TCP speed represents the highest achievable performance on this benchmark, since it only uses the built-in TCP implementation. Compared to this, Socket.io has about 1/3 of the performance (~2300 vs ~8000 connections) when using WebSockets.

Second, it appears that 0.8.7 is about 20% slower than 0.6.17 on this benchmark. If I remember correctly, Socket.io 0.7 switched to a new protocol, and there are clearly some performance improvements over 0.7.11 in 0.8.7 (+100 connections in this bench); it’s just that the overall performance is still worse in this benchmark than in the old 0.6.17 branch.
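The relative figures can be recomputed from the approximate connection counts reported in this post. This is plain arithmetic on those rounded numbers; the results object and the relative helper are made up for illustration.

```javascript
// Approximate connection counts as reported in this post.
var results = { tcp: 8000, 'sio-0.6.17': 2300, 'sio-0.7.11': 1800, 'sio-0.8.7': 1900 };

// Ratio of two results, rounded to two decimals.
function relative(a, b) {
  return Math.round(100 * a / b) / 100;
}

console.log('0.6.17 vs TCP:   ' + relative(results['sio-0.6.17'], results.tcp));
console.log('0.8.7 vs 0.6.17: ' + relative(results['sio-0.8.7'], results['sio-0.6.17']));
console.log('0.8.7 vs 0.7.11: ' + relative(results['sio-0.8.7'], results['sio-0.7.11']));
```

With these inputs, 0.6.17 reaches roughly 29% of raw TCP, and 0.8.7 reaches roughly 83% of 0.6.17, i.e. a drop of about 17%.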

Working towards higher-performance

As this is just a simple benchmark, I don’t really have solutions – only some suggestions.

1) A CI build that includes benchmarks and community contributed test cases

First, I’d love to see a CI build for Socket.io that would include performance benchmarks and community contributed test cases.

However, currently setting up a CI build for Socket.io is difficult because the bundled test suite only works on OSX. It would be a lot easier to contribute if the tests worked on other platforms.

I am hoping that as Engine.io gets going, the test suite will be fixed so that it can be run on other platforms. Otherwise, contributing improvements will be tricky/impossible since there is no way to tell whether the code works.

2) More realistic performance test scenarios

The current test scenario is rather limited in that it mostly tests performance in terms of establishing connections (without terminating them). I’d love to hear more realistic scenario suggestions, particularly from people who have run into memory usage issues.

siobench is only a starting point: it’s way better than just looking at htop and wondering whether performance was better in the last version or not. There are still specific questions that should be formulated as replicable tests.

3) A polling transport that works on Node.js

I did write tests for the xhr-polling transport for Socket.io as well. These showed much worse performance, around:

  • ~ 550 connections on Socket.io 0.6.17 (vs ~2300 using WS)
  • ~ 450 connections on Socket.io 0.8.7 (vs ~ 1900 using WS)

However, the xhr-polling transport is severely broken in that it stops connecting after 4-5 connections on Node v0.4.12. So I had to force each load-generating client to only make four connections and then spawn a new load-generating process to work around the problem. I wouldn’t vouch for the accuracy of the test with xhr-polling until the xhr-polling transport is fixed on Node when using socket.io-client (it’s been broken for the last three releases, though).

4) Comparative benchmarks

Hopefully, this will help with performance testing new releases of Socket.io and other Comet libraries. Since the plan is that Engine.io will allow people to work with a lower level than Socket.io, there might be new performance oriented versions, and it would be useful to see benchmarks for those. Re: the other Node.js pubsub frameworks: I can’t benchmark Faye, because it does not provide the right API out of the box, and Juggernaut uses Socket.io internally.

I’m going to use siobench for internal testing to ensure that the pubsub implementation I am working on (built over Socket.io) will not have performance regressions.

The full graphs are below. Please leave comments and suggestions for improvements – I am hoping that the developer community around Socket.io can help in improving the performance going forward, kind of like what Mozilla did with “arewefastyet.com”.

Socket.io 0.6.17 – Websockets – CPU usage and time

 

Socket.io 0.6.17 – Websockets – resident set size


 

Socket.io 0.7.11 – Websockets – CPU usage and time

Socket.io 0.7.11 – Websockets – resident set size

Socket.io 0.8.7 – Websockets – CPU usage and time

Socket.io 0.8.7 – Websockets – resident set size

Node 0.4.12 – TCP – CPU usage and time

Node 0.4.12 – TCP – resident set size

 

 


18
Feb 11

Cascading file loading in Node.js

One of my favorite features of Kohana 3 is its cascading filesystem, so I decided to implement it for Node.js. A cascading filesystem is an elegant solution to a common problem: how to provide a mechanism for loading modules and reusing code?

The following image from Kohana 3’s docs shows an example:

Benefits

The key benefits are:

  1. Consistency. All your application files, including views, controllers, models and other data such as translation messages are loaded using one, easy-to-understand mechanism.
  2. Easy reuse. Without a cascading file system, you’ll have to copy and move files around if you want to use someone else’s libraries or modules. With a cascading file system, you just place the module in your application, and enable cascading for that directory.
  3. Transparent extensibility. What if you want to override one part of a module (say, a view) but don’t want to modify your copy of the module (e.g. so that you can update without manually merging changes)? A cascading filesystem allows you to selectively replace files in 3rd party code simply by providing your own version of the file.

The code

Load order and file name resolution

The load order for my implementation is:

  1. Application path –  files under ./application/ are always checked first.
  2. Module paths – set modules(['./modules/my-module']) to enable module loading. Files from modules are loaded in the order the modules were added.
  3. System path – files under ./system/ are loaded if no alternative exists.

Assumptions about file and class names

Files are assumed to be lowercase. Underscores in class names are replaced by slashes (so Controller_User becomes ./application/classes/controller/user.js).

Performance impact

Requests are cached, so that additional calls to find_file() do not cause additional stat() calls in the filesystem. This is insignificant anyway, since Node.js servers are persistent so the cascading search is only done once per server instance for each file (not once per request).
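The name resolution, load order and caching described above can be sketched roughly as follows. This is not the actual cfs.js code: classToRelativePath and createFinder are hypothetical names, the search order is passed in explicitly, and fs.existsSync (available in current Node.js) stands in for the stat() calls mentioned above.

```javascript
var fs = require('fs');
var path = require('path');

// Hypothetical helper: 'Controller_User' -> 'controller/user'
function classToRelativePath(className) {
  return className.toLowerCase().split('_').join('/');
}

// Hypothetical factory: searchPaths is the cascade in priority order,
// e.g. ['./application', './modules/my-module', './system'].
// exists is injectable so the lookup can be exercised without real files.
function createFinder(searchPaths, exists) {
  exists = exists || fs.existsSync;
  var cache = {};
  return function findFile(dir, file, ext) {
    var key = dir + '/' + file + (ext || '.js');
    if (!cache.hasOwnProperty(key)) {
      cache[key] = null; // remember misses too
      for (var i = 0; i < searchPaths.length; i++) {
        var candidate = path.join(searchPaths[i], key);
        if (exists(candidate)) {
          cache[key] = candidate; // first match in the cascade wins
          break;
        }
      }
    }
    return cache[key];
  };
}

var find = createFinder(['./application', './modules/my-module', './system']);
console.log(find('classes', classToRelativePath('Controller_User')));
```

Because results are cached per dir/file/ext combination, repeated lookups for the same file never touch the filesystem again; the trade-off is that files added while the server is running are not picked up until it restarts.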

Loading 3rd party code

The loaded files do not need to be “compatible” in any way other than layout in the file system. For example, while Cfs.factory('some_other_lib') loads the file from ./application/classes/some/other/lib.js, that file does not actually need to contain a class named some_other_lib; just that it returns something via module.exports.

Methods

The methods are:

  • Cfs.modules(['./modules/path-to-module']) – set the modules directories to search.
  • Cfs.find_file(dir, file, ext) – Search each path under dir (e.g. ‘classes’, ‘views’) for file (filename) with the extension (ext, default is “.js”).
  • Cfs.factory(class_name) – Return a new instance of the given class after loading the corresponding file from the cascading file system. Note that classes should be in the classes subdirectory.
  • Cfs.load(class_name) – Return whatever require(file-which-contains-the-class) returns. Useful for extending classes, see below for an example

Example usage:

var Cfs = require('./cfs.js');
var fs = require('fs');

// test class loading:
// e.g. check ./application/classes/test.js
// ./modules/modulename/classes/test.js
// ./system/classes/test.js
var t = Cfs.factory('test');
t.run();

// test view loading
// e.g ./application/views/user/index.html
// ./modules/modulename/views/user/index.html
// ./system/views/user/index.html
fs.readFile(Cfs.find_file('views', 'user/index', '.html'), function (err, data) {
  if (err) throw err;
  // data is a Buffer; convert it before printing
  console.log(data.toString());
});

To set modules:

// set only once, before calling any other functions!
Cfs.modules([
         "./modules/testmodule/",
         "./modules/testmodule2/",
         ]);

Extending classes:

// test extending class (see code in /application/classes/controller/extend.js
// to see how extension is achieved)
// e.g. ./application/classes/controller/extend.js
// ./modules/modulename/classes/controller/extend.js
// ./system/classes/controller/extend.js
var t3 = Cfs.factory('Controller_Extend');
t3.run();
t3.run_parent();

Note that if you put cfs.js in ~/node_modules/cfs.js, you don’t need to specify the path to it… see Modules in node.js docs.

// in extend.js:
var Controller_Extend = function () {
}
// extend the class
var util = require('util'), Cfs = require('../../../../cfs.js');
util.inherits(Controller_Extend, Cfs.load('Controller_Base'));
 
Controller_Extend.prototype.run = function() {
   console.log("Controller_Extend from testmodule2.");
};
Controller_Extend.prototype.run_parent = function() {
   // run the parent function
   // call it with the current instance so that "this" works in the parent
   Controller_Extend.super_.prototype.run.call(this);
};
 
module.exports = Controller_Extend;

3
Feb 11

Javascript, node.js and for loops

What does this code print out? Assume that console.log logs to the console.

Experiment #1: For loop

console.log('For loop');
for(var i = 0; i < 5; i++) {
 console.log(i);
}

0, 1, 2, 3, 4 - easy, right? What about this code?

Experiment #2: setTimeout

console.log('setTimeout');
for(var i = 0; i < 5; i++) {
  setTimeout(function() {console.log('st:'+i)}, 0);
}

The result is 5, 5, 5, 5, 5. What about this?

Experiment #3: Callback function

function wrap(callback) {
  callback();
}
 
console.log('Simple wrap');
for(var i = 0; i < 5; i++) {
  wrap(function() {console.log(i)});
}

0, 1, 2, 3, 4 — right? (Yup.) And this?

Experiment #4: While loop emulating sleep

function sleep(callback) {
  var now = new Date().getTime();
  while(new Date().getTime() < now + 1000) {
   // do nothing
  }
  callback();
}
 
console.log('Sleep');
for(var i = 0; i < 5; i++) {
  sleep(function() {console.log(i)});
}

0, 1, 2, 3, 4. And this?

Experiment #5: Node.js process.nextTick

console.log('nextTick');
for(var i = 0; i < 5; i++) {
 process.nextTick(function() {console.log('nt:'+i)});
}

Well… it’s 5, 5, 5, 5, 5.

Experiment #6: Delayed calls

var data = [];
for (var i = 0; i < 5; i++) {
  data[i] = function foo() {
    console.log(i);
  };
}
data[0](); data[1](); data[2](); data[3](); data[4]();

Again, 5, 5, 5, 5, 5.

Ok, I’m confused. Why does this happen?

Looking at experiments #1 to #6, you can see a pattern emerge: delayed calls, whether they are via setTimeout(), the Node.js-specific process.nextTick() or a simple array of functions, all print the unexpected result “5”.

Fundamentally, the only thing that matters is at what time the function code is executed. setTimeout() and process.nextTick() ensure that the function is only executed at some later stage. Similarly, assigning functions into an array explicitly like in Experiment #6 means that the code within the function is only executed after the loop has been completed.

There are three things you need to remember about Javascript:

  1. Variable scope is based on the nesting of functions. In other words, the position of the function in the source always determines what variables can be accessed; nested functions can access their parent’s variables, non-nested functions can only access the topmost, global variables.
  2. Functions can create new scopes; the default behavior is to access previous scope.
  3. Some functions have the side-effect of being event-driven and executed later, rather than immediately. You can emulate this yourself by storing but not executing functions, see Experiment #6.

What we would expect, based on experience in other languages, is that in the for loop, calling the function would result in a call-by-value (since we are passing a primitive – an integer) and that function calls would run using a copy of that value at the time when the part of the code was “passed over” (e.g. when the surrounding code was executed). That’s not what happens:

A nested function does not get a copy of the value of the variable: it gets a live reference to the variable itself and can access it at a much later stage. So while the reference to i is valid in experiments #2, #5 and #6, the functions read the value of i at the time of their execution. That is on a later turn of the event loop in #2 and #5, and right after the loop in #6; in every case the loop has already finished, which is why they get the value 5.
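The “live reference” behavior can be seen even without a loop or any delayed execution. In this small sketch (makeReader is a made-up name), the inner function is created while the variable holds one value, but reports whatever the variable holds when it is finally called:

```javascript
// A closure sees the variable itself, not a snapshot of its value.
function makeReader() {
  var value = 'initial';
  var read = function() { return value; };  // created while value is 'initial'
  value = 'changed later';                  // reassigned after read was created
  return read;
}

var read = makeReader();
console.log(read()); // prints "changed later"
```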

Functions can create new scopes but they do not have to. The default behavior allows us to refer back to the previous scope (all the way up to the global scope); this is why code executing at a later stage can still access i. Because no variable i exists in the current scope, the i from the parent scope is used; because the parent has already executed, the value of i is 5.

Hence, we can fix the problem by explicitly establishing a new scope every time the loop is executed; then referring back to that new inner scope later.  The only way to do this is to use an (anonymous) function plus explicitly defining a variable in that scope. There are two ways to do this:

Option 1) We can allow the value of i to “leak” from the previous scope, but explicitly establish a new variable j in the new scope to hold that value for future execution of nested functions:

Experiment #7: Closure with new scope establishing a new variable

console.log('new scope nexttick with value binding in new func scope');
for(var i = 0; i < 5; i++) {
 (function() {
  var j = i;
  process.nextTick(function() {console.log('nexttick-new-scope-new-bind:'+j)});
 })();
}

Resulting in 0, 1, 2, 3, 4. Accessing j returns the value of i at the time when the closure was executed – and as you can see, we are immediately executing the function by appending ();

We need to have that wrapping function, because only functions establish new scope. In fact, we are establishing five new scopes when the loop is run, each iteration creating a scope with its own, separate variable j with a different value (0, 1, 2, 3, 4); each accessible from the inner closure at the time the code in it is run. Without the wrapping closure the reference to j in the innermost closure would end up having the same scope as i; it would then have the value of i at the time of the execution; which would be 5.

Option 2) Or we can pass the value to the new scope as a parameter:

Experiment #8: setTimeout in closure with new scope

console.log('new scope');
for(var i = 0; i < 5; i++) {
 (function(i) {
  setTimeout(function() {console.log('st2:'+i)}, 0);
 })(i);
}

Resulting in 0, 1, 2, 3, 4.

Now you should remember one more rule to understand the second solution:

  • Functions can be passed as data; they are only evaluated when explicitly evaluated (e.g. by appending () or by using function.call or function.apply).

So when we have (function(param) { ... })(param), we are calling the function immediately, and parameters always establish a new variable/identifier in the function scope. That allows us to use the i from the new scope in our delayed function call, since it is bound to the parameter, not to the parent scope.

This also means that this does NOT work (process.nextTick is interchangeable with setTimeout):

Experiment #9: Closure with new scope containing callback triggered on process.nextTick

console.log('new scope nexttick');
for(var i = 0; i < 5; i++) {
 (function() {
  process.nextTick(function() {console.log('nexttick-new-scope:'+i)});
 })();
}

5, 5, 5, 5, 5 – since i still refers to the old scope. Compare that with experiment #7, where while the inner code is the same, we actually establish a new variable in the wrapping closure’s scope, which is then referred to by the inner code.
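The parameter-passing fix from Experiment #8 does not have to be an anonymous, immediately executed function. As a sketch, the same effect can be had with an ordinary named helper (makePrinter is a hypothetical name): each call creates a new scope with its own parameter, and the returned function refers to that parameter rather than to i.

```javascript
// Hypothetical helper: each call creates a new scope with its own n.
function makePrinter(n) {
  return function() { return 'value:' + n; };
}

var printers = [];
for (var i = 0; i < 5; i++) {
  printers.push(makePrinter(i)); // i is copied into the parameter n right now
}

// Run the stored functions only after the loop has finished (i is now 5).
var output = printers.map(function(print) { return print(); });
console.log(output.join(', ')); // value:0, value:1, value:2, value:3, value:4
```

This is the stored-functions setup from Experiment #6, except that the functions no longer look up i in the parent scope.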

Conclusion

I should note that this has nothing to do with synchronicity or asynchronicity; it is simply the way in which scope resolution works for Javascript when code execution is delayed in some manner while referring to variables defined in the parent scope of the nested code.

In Javascript, all functions store “a hierarchical chain of all parent variable objects, which are above the current function context; the chain is saved to the function at its creation”. Because the scope chain is stored at creation, it is static and the relative nesting of functions precisely determines variable scope. When scope resolution occurs during code execution, the value for a particular identifier such as i is searched from:

  1. first from the parameters given to the function (a.k.a. the activation object)
  2. and then from the statically stored chain of scopes (stored as the function’s internal property on creation) from top (e.g. parent) to bottom (e.g. global scope).

Javascript will keep the full set of variables of each of the statically stored chains accessible even after their execution has completed, storing them in what is called a variable object. Since code that executes later will receive the value in the variable object at that later time, variables referring to the parent scope of nested code end up having “unexpected” results unless we create a new scope when the parent is run, copy the value from the parent to a variable in that new scope and refer to the variable in the new scope.

For a much more detailed explanation, please read Dmitry Soshnikov’s account of ECMA-262, in particular the chapters about Scope chains and Evaluation strategies. His explanations of the details are the best I’ve seen anywhere!


2
Feb 11

Essential Node.js patterns and snippets

In this post, I take a look at the different patterns that you need to know when using Node.js. These came from my own coding and from a look at the code behind Tim Caswell’s flow control libraries. I think it is necessary to know how these basic patterns are implemented even if you use a library.

1. Objects and classes

1.1 Class pattern

// Constructor
var Class = function(value1, value2) {
  this.value1 = value1;
}
// properties and methods
Class.prototype = {
  value1: "default_value",
  method: function(argument) {
    this.value2 = argument + 100;
  }
};
// node.js module export
module.exports = Class;
// constructor call
var object = new Class("Hello", "2");

If the class is long, then instead of doing a single Class.prototype = {…} assignment, it may be split into multiple Class.prototype.method = function () {..} assignments.

Reminder: Assign all your properties some value in your constructor. Otherwise, while the resulting object can access the property defined in the prototype, the prototype value is shared among all instances. So in order for your “instance” to actually have its own copies, you have to explicitly initialize the variables in the constructor, or they will act like static variables in non-prototype-based OOP. It’s a stupid mistake, don’t make it.
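A small sketch of the mistake, using made-up class names Shared and Owned. The sharing is most visible with mutable values such as arrays or objects, since every instance mutates the one value stored on the prototype:

```javascript
var Shared = function() {};
Shared.prototype = { items: [] };  // one array, shared via the prototype

var Owned = function() {
  this.items = [];                 // a fresh array per instance
};

var s1 = new Shared(), s2 = new Shared();
s1.items.push('x');
console.log(s2.items.length);      // 1: s2 sees the item pushed via s1

var o1 = new Owned(), o2 = new Owned();
o1.items.push('x');
console.log(o2.items.length);      // 0: o2 is unaffected
```

Assigning this.items = ... on an instance creates an own property that shadows the prototype value, which is exactly what initializing in the constructor does.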

1.2 Accessing global values from objects

// constructor
var Class = function(global, value2) {
  this.global = global;
}
// access using this.global in class methods

1.3 Factory pattern

// Constructor
var Class = function(value1, value2) { ... }
// Factory
Class.factory = function(value1) { return new Class(value1, "aaa"); };
// properties and methods
Class.prototype = { ... };

1.4 Sharing state between modules

var Common = {
  util: require('util'),
  fs:   require('fs'),
  path: require('path')
};
 
module.exports = Common;
 
// in other modules
var Common = require('./common.js');

1.5 Singleton class (added Feb 2011)

var Singleton = (function() {
   var private_variable = 'value';
   function private_function() {
      ...
   }
   function public_function() {
      ...
   }
  return {
      public_function: public_function
  };
})();

2. Parsing requests

2.1 Parsing GET

// modules used below
var url = require('url'), querystring = require('querystring');
// parse URL
var url_parts = url.parse(req.url);
// parse query
var raw = querystring.parse(url_parts.query);
// some juggling e.g. for data from jQuery ajax() calls.
var data = raw ? raw : {};
data = raw.data ? JSON.parse(raw.data) : data;

2.2 Parsing POST

if (req.method == 'POST') {
   var fullBody = '';
   req.on('data', function(chunk) {
      // append the current chunk of data to the fullBody variable
      fullBody += chunk.toString();
   });
   req.on('end', function() {
      // parse the received body data
      var decodedBody = querystring.parse(fullBody);
      console.log(decodedBody);
   });
}

3. Concurrency

3.1 Waiting for async stuff to complete before continuing

E.g. when you need to have all the results from the database before you do something.

var wait = function(callbacks, done) {
   var counter = callbacks.length;
   var next = function() {
      if(--counter == 0) {
         done();
      }
   };
   for(var i = 0; i < callbacks.length; i++) {
      callbacks[i](next);
   }
}

Example usage (if you prefer, imagine that these are three database calls and that you are storing the results in some higher-scope variable in each of them and then using that result in function d):

var a = function (next) {
   setTimeout( function() {
      console.log("Done A");
      next();
   }, 3000);
  };
 
var b = function (next) {
   setTimeout( function() {
      console.log("Done B");
      next();
   }, 2000);
  };
 
var c = function (next) {
   setTimeout( function() {
      console.log("Done C");
      next();
   }, 1000);
  };
 
var d = function () {
   console.log("All done!");
  };
 
wait([a, b, c], d );

Similar libraries include: Tim Caswell’s Step and Will Conant’s Flow.exec(). This code is simpler so it doesn’t use this to pass the function next(); but rather passes it explicitly. Also it needs an array, instead of accepting an arbitrary number of function arguments. The library functions do better error handling and have more features, so you might want to use them / look at them to improve the code.
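As a sketch of what “better error handling” could look like while staying close to wait(), the variant below passes results and errors through next(). The name waitAll and the next(err, result) / done(err, results) convention are assumptions for illustration, not the API of Step or Flow.

```javascript
// Hypothetical variant of wait(): each task calls next(err, result);
// done(err, results) is called exactly once.
var waitAll = function(tasks, done) {
  var remaining = tasks.length;
  var results = [];
  var finished = false;
  if (remaining === 0) { return done(null, results); }
  tasks.forEach(function(task, index) {
    task(function next(err, result) {
      if (finished) { return; }          // ignore calls after an error
      if (err) { finished = true; return done(err); }
      results[index] = result;           // keep results in task order
      if (--remaining === 0) { finished = true; done(null, results); }
    });
  });
};

waitAll([
  function(next) { setTimeout(function() { next(null, 'A'); }, 30); },
  function(next) { setTimeout(function() { next(null, 'B'); }, 10); }
], function(err, results) {
  console.log(err || results.join(','));
});
```

Results are stored by task index, so they come back in the order the tasks were given even though the second task finishes first.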

3.2 Limiting concurrency

E.g. reading a gazillion files but just running 30 reads at a time not to exhaust the available file handles. You have a list of operations to do, you want to do them all but can’t start/don’t want to have more than max_concurrency number of the operations running simultaneously.

I call this the Pile, but there probably is a better name for it. Put your stuff in the pile, and then run it all, finally calling done() when everything is done. The main difference from simple completion counters like wait() above is that this code limits concurrent execution, which is necessary in some cases (e.g. reading files).

var Pile = function() {
   this.pile = [];
   this.concurrency = 0;
   this.done = null;
   this.max_concurrency = 10;
}
Pile.prototype = {
  add: function(callback) {
   this.pile.push(callback);
  },
  run: function(done, max_concurrency) {
      this.done = done || this.done;
      this.max_concurrency = max_concurrency || this.max_concurrency;
      var that = this;
      var next = function() {
         that.concurrency--;
         // finished only when nothing is queued and nothing is still running
         if (that.pile.length == 0 && that.concurrency == 0) {
            that.done();
         } else {
            that.run();
         }
      };
      while(this.concurrency < this.max_concurrency && this.pile.length > 0) {
         this.concurrency++;
         var callback = this.pile.shift();
         callback(next);
      }
   }
};

Example usage (add 20 functions, then run them at a concurrency of 5 at a time). Again, imagine that setTimeout is an async I/O call.

Note: you have to call next() when you’re done.

var pilex = new Pile();
 
var counter = 0;
 
for(var i = 0; i < 20; i++) {
   pilex.add( function test(next) {
      var now = new Date().getTime();
      setTimeout( function() {
         counter++;
         console.log(counter +" Hello world");
         next();
      }, 5000);
     }
   );
}
pilex.run(function() {console.log("Done "+counter);}, 5);


3.3 Pooling and reusing expensive, persistent resources

I recommend using node-pool, since the management code is rather involved if you want to timeout/renew objects in the pool.

3.4 Running arbitrary workflows when dependencies are matched

If you can split your overall task into several independent async workflows, then Conductor seems like a nice solution since it does dependency resolving for you.

4. More good basic node.js patterns/snippets?

Leave a comment, write a Gist, write a blog post or send me a link to your repository + explain what it is and when/why it should be used. I want your code, will acknowledge your stuff and will keep periodically updating this page since I want to use it for my own reference/reminder. Thanks!



				

1
Feb 11

Understanding the node.js event loop

The first basic thesis of node.js is that I/O is expensive:



So the largest waste with current programming technologies comes from waiting for I/O to complete. There are several ways in which one can deal with the performance impact (from Sam Rushing):

  • synchronous: you handle one request at a time, each in turn. Pros: simple. Cons: any one request can hold up all the other requests.
  • fork a new process: you start a new process to handle each request. Pros: easy. Cons: does not scale well, hundreds of connections means hundreds of processes. fork() is the Unix programmer’s hammer. Because it’s available, every problem looks like a nail. It’s usually overkill.
  • threads: start a new thread to handle each request. Pros: easy, and kinder to the kernel than using fork, since threads usually have much less overhead. Cons: your machine may not have threads, and threaded programming can get very complicated very fast, with worries about controlling access to shared resources.

The second basic thesis is that thread-per-connection is memory-expensive: [e.g. that graph everyone shows about Apache sucking up memory compared to Nginx]

Apache is multithreaded: it spawns a thread per request (or process, it depends on the conf). You can see how that overhead eats up memory as the number of concurrent connections increases and more threads are needed to serve multiple simultaneous clients. Nginx and Node.js are not multithreaded, because threads and processes carry a heavy memory cost. They are single-threaded, but event-based. This eliminates the overhead created by thousands of threads/processes by handling many connections in a single thread.

Node.js keeps a single thread for your code…

It really is a single thread running: you can’t do any parallel code execution; doing a “sleep” for example will block the server for one second:

var now = new Date().getTime();
while(new Date().getTime() < now + 1000) {
   // do nothing
}

So while that code is running, node.js will not respond to any other requests from clients, since it only has one thread for executing your code. Similarly, if you had some CPU-intensive code, say, for resizing images, that would still block all other requests.

…however, everything runs in parallel except your code

There is no way of making code run in parallel within a single request. However, all I/O is evented and asynchronous, so the following won’t block the server:

c.query(
   'SELECT SLEEP(20);',
   function (err, results, fields) {
     if (err) {
       throw err;
     }
     res.writeHead(200, {'Content-Type': 'text/html'});
     res.end('<html><head><title>Hello</title></head><body><h1>Return from async DB query</h1></body></html>');
     c.end();
    }
);

If you do that in one request, other requests can be processed just fine while the database is running its sleep.

Why is this good? When do we go from sync to async/parallel execution?

Having synchronous execution is good, because it simplifies writing code (compared to threads, where concurrency issues have a tendency to result in WTFs).

In node.js, you aren’t supposed to worry about what happens in the backend: just use callbacks when you are doing I/O; and you are guaranteed that your code is never interrupted and that doing I/O will not block other requests without having to incur the costs of thread/process per request (e.g. memory overhead in Apache).

Having asynchronous I/O is good, because I/O is more expensive than most code and we should be doing something better than just waiting for I/O.

An event loop is “an entity that handles and processes external events and converts them into callback invocations”. So I/O calls are the points at which Node.js can switch from one request to another. At an I/O call, your code saves the callback and returns control to the node.js runtime environment. The callback will be called later when the data actually is available.
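The “save the callback and return control” step can be observed directly. The sketch below uses fs.stat on process.execPath (the path of the running node binary) simply because that file is guaranteed to exist; the order array is only there to record what runs when.

```javascript
var fs = require('fs');
var order = [];

order.push('before stat');
// fs.stat is asynchronous: the callback is only saved here, not run.
fs.stat(process.execPath, function(err, stats) {
  order.push('stat callback');
  console.log(order.join(' -> '));
});
order.push('after stat');
// The current code always runs to completion before any callback fires,
// so this prints: before stat -> after stat -> stat callback
```

Even if the filesystem answered instantly, the callback could not run before 'after stat' is recorded, because the event loop only invokes callbacks once the currently running code has returned.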

Of course, on the backend, there are threads and processes for DB access and process execution. However, these are not explicitly exposed to your code, so you don’t need to worry about them beyond knowing that I/O interactions, e.g. with the database or with other processes, will be asynchronous from the perspective of each request, since the results from those threads are returned via the event loop to your code. Compared to the Apache model, there are far fewer threads and much less thread overhead, since threads aren’t needed for each connection; just when you absolutely positively must have something else running in parallel, and even then the management is handled by Node.js.

Other than I/O calls, Node.js expects that all requests return quickly; e.g. CPU-intensive work should be split off to another process with which you can interact as with events, or by using an abstraction like WebWorkers. This (obviously) means that you can’t parallelize your code without another thread in the background with which you interact via events. Basically, all objects which emit events (e.g. are instances of EventEmitter) support asynchronous evented interaction and you can interact with blocking code in this manner e.g. using files, sockets or child processes all of which are EventEmitters in Node.js. Multicore can be done using this approach; see also: node-http-proxy.
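As a rough sketch of splitting CPU-intensive work off to another process, the snippet below runs a made-up busy loop in a child node process and collects its output through events. It uses child_process.spawn with process.execPath and the -e flag so that the example is self-contained; the job string is purely illustrative.

```javascript
var spawn = require('child_process').spawn;

// Hypothetical CPU-heavy job, run in a separate node process.
var job = 'var s = 0; for (var i = 0; i < 1e7; i++) { s += i; } console.log(s);';
var child = spawn(process.execPath, ['-e', job]);

var output = '';
// The child and its stdout are EventEmitters: results arrive as events.
child.stdout.on('data', function(chunk) { output += chunk.toString(); });
child.on('error', function(err) { console.log('could not start child: ' + err.message); });
child.on('close', function(code) {
  console.log('child exited with code ' + code + ', result: ' + output.trim());
});

console.log('parent is free to handle other events meanwhile');
```

The parent never blocks on the loop: it only reacts to the 'data' and 'close' events, so other requests keep being served while the child does the work.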

Internal implementation

Internally, node.js relies on libev to provide the event loop, which is supplemented by libeio which uses pooled threads to provide asynchronous I/O. To learn even more, have a look at the libev documentation.

So how do we do async in Node.js?

Tim Caswell describes the patterns in his excellent presentation:

  • First-class functions. I.e. we pass around functions as data, shuffle them around and execute them when needed.
  • Function composition. Also known as having anonymous functions or closures that are executed after something happens in the evented I/O.
  • Callback counters. For evented callbacks, you cannot guarantee that I/O events are generated in any particular order. So if you need multiple queries to complete, usually you just keep count of any parallel I/O operations, and check that all the necessary operations have completed when you absolutely must wait for the result; e.g. by counting the number of returned DB queries in the event callback and only going further when you have all the data. The queries will run in parallel provided that the I/O library supports this (e.g. via connection pooling).
  • Event loops. As mentioned earlier, you can wrap blocking code into an evented abstraction, e.g. by running a child process and returning data as it is processed.
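The callback counter pattern from the list above can be sketched in a few lines. Both parallel and fakeQuery are hypothetical helpers written for this example; fakeQuery stands in for a real database query by answering after a timer fires.

```javascript
// Runs every task at once and calls done when all of them have reported back.
function parallel(tasks, done) {
  const results = [];
  let pending = tasks.length;   // the callback counter
  let failed = false;
  if (pending === 0) return done(null, results);

  tasks.forEach(function (task, index) {
    task(function (err, value) {
      if (failed) return;       // an earlier task already reported an error
      if (err) {
        failed = true;
        return done(err);
      }
      results[index] = value;   // store by index: completion order is not guaranteed
      pending -= 1;
      if (pending === 0) done(null, results);
    });
  });
}

// Hypothetical stand-in for a DB query: yields `value` after `delayMs` milliseconds.
function fakeQuery(value, delayMs) {
  return function (callback) {
    setTimeout(function () { callback(null, value); }, delayMs);
  };
}

parallel([fakeQuery('users', 30), fakeQuery('posts', 10)], function (err, results) {
  if (err) throw err;
  console.log(results);
});
```

The 'posts' query finishes first here, but because results are stored by index the final array still matches the order in which the tasks were passed in.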

It really is that simple!


30
Dec 10

Learning node.js: my experiences and helpful resources

Here are my notes from learning node.js over the winter vacation. You can think of this post as my answer to the question: what resources did you find helpful in learning node.js? There is a lot of node.js stuff out there, but few posts on what is noteworthy, popular or tricky - I am sick of seeing posts explaining what node.js is and showing me the same five lines of code.

Structuring code

I recommend reading the series on control flow in node.js from Tim Caswell:

Scope and this are different/tricky in JS, read this and this.

Update Feb 2011:

Update Jan 2011:

Inheritance patterns in Javascript (using a modern style that optimizes well when using the Google Closure compiler).

You’ll probably run into problems with assigning values if you use the class pattern. The problem is that properties added to the prototype are shared among instances. You’ll have to explicitly initialize any per-instance variables in the constructor; otherwise each instance keeps accessing the prototype property, which then acts as if it were static. Obvious in retrospect, but hindsight makes fools of us all.
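Here is a small sketch of the pitfall, using made-up constructors SharedList and OwnList. The sharing is most visible with mutable values such as arrays, which is the case shown here.

```javascript
// Class pattern with a mutable value on the prototype: every instance shares it.
function SharedList() {}
SharedList.prototype.items = [];
SharedList.prototype.add = function (item) { this.items.push(item); };

// Same pattern, but the per-instance state is created in the constructor.
function OwnList() {
  this.items = [];
}
OwnList.prototype.add = function (item) { this.items.push(item); };

const sharedA = new SharedList();
const sharedB = new SharedList();
sharedA.add('x');
console.log(sharedB.items.length);   // 1: sharedB sees the item added through sharedA

const ownA = new OwnList();
const ownB = new OwnList();
ownA.add('x');
console.log(ownB.items.length);      // 0: each instance has its own array
```

Methods belong on the prototype precisely because they should be shared; state usually should not be.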

Modules

Have a look at the node.js modules page on github for a comprehensive list of modules. Install npm to install packages. Or don’t, just drop the repos in a subdirectory and use require, since require in node.js is flexible.

Debugging

Simply printing out stuff can be done using sys.puts(string) (first “var sys = require('sys')”).

Since you are on V8, you can use JSON.stringify(object) natively. console.log() also works; it automatically prettyprints objects, but only one level deep (no nested objects).

For those rare occasions where JSON.stringify(object) fails (e.g. due to circular references) you can use require('util').inspect(object) as an alternative.
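A minimal sketch of that fallback follows. The helper name describe and the sample objects are invented for this example; JSON.stringify and util.inspect are the standard calls.

```javascript
const util = require('util');

// Two objects that point at each other: parent -> child -> parent.
const parent = { name: 'parent', children: [] };
const child = { name: 'child', parent: parent };
parent.children.push(child);

// Tries JSON first and falls back to util.inspect when the structure is circular.
function describe(object) {
  try {
    return JSON.stringify(object);
  } catch (err) {
    // JSON.stringify throws a TypeError on circular structures;
    // util.inspect marks the cycle instead of failing.
    return util.inspect(object, { depth: 4 });
  }
}

console.log(describe({ a: 1 }));
console.log(describe(parent));
```

The exact text util.inspect prints for the cycle differs between Node versions, so treat the second line of output as debugging aid only, not as something to parse.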

Setting up the server

Update: Here is how I got Node.js up and running on Centos 5 with nginx + monit.

First, you’ll want to decide how you want to build your applications: will you use node.js for the whole stack, or will you mix it with more familiar scripting languages? If you want to mix two technology stacks, I strongly recommend NOT trying to combine node.js with other web app stacks using subdomains, since you’ll have no end of trouble due to the same origin policy enforcement that is built into all browsers. You can either solve that problem for all browsers:

  • IE6/7 has no mechanism,
  • IE8/9 has XDomainRequest,
  • Mozilla/Chrome/Opera have XMLHttpRequest Level 2 (note that level 2 is needed for cross-domain support).

Or you can solve the problem on the server:

  1. by only using node.js for all of your stack, running a bare node.js server
  2. by running a mixed stack using Nginx or Apache to proxy node.js requests

I’m not quite ready to start from scratch, given the productivity that my non-node.js stack gives; I took the mixed route with Apache, which is adequate for trying things out. If you want to scale a bit more, you’ll want to set up Nginx.

You can proxy requests from Apache to Node.js (second example), or do a rewrite. You’ll want to use a subdirectory for node.js requests so that you don’t have to deal with the same origin policy. Here is what I did:

<VirtualHost *:80>
    ServerAdmin webmaster@localhost
    DocumentRoot /path/to/www/
    ServerName example.com
    ProxyPass /node/ http://127.0.0.1:8001/ retry=0 timeout=120
    <Proxy *>
        Allow from all
    </Proxy>
</VirtualHost>

The “retry=0” parameter prevents Apache from waiting 60 seconds if a node.js response fails (e.g. due to server restart).

Comet with node.js

Have a look at socket.io (github page) for streaming. In my testing, I couldn’t get it to work with IE9 (sending stuff) and Chrome seemed to keep dropping connections. Firefox worked solidly, and so did Chrome after I set it to reconnect, but I couldn’t figure out why IE9 was not working (it could receive messages, but sending them did not seem to work). I am sure that socket.io will get there; reliable real-time transmission in all major browsers just isn’t a neatly solved problem yet. To be fair, my testing was with cross-domain requests.

If you use socket.io, you will probably want to implement your own abstractions over it, since you’ll want to do something with channels. There is the broadcast(message, list_of_excluded_user_ids) method, but you probably want to have more fine-grained control which can be done via additional abstractions.
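As a rough idea of what such an abstraction could look like, here is a hypothetical channel registry. It is not part of socket.io; it only assumes that each connected client is an object with a send(message) method, and the fakeClient helper stands in for real connections.

```javascript
// Hypothetical channel registry; clients are any objects with a send(message) method.
function Channels() {
  this.subscribers = Object.create(null);   // channel name -> array of clients
}

Channels.prototype.subscribe = function (channel, client) {
  const list = this.subscribers[channel] || (this.subscribers[channel] = []);
  if (list.indexOf(client) === -1) list.push(client);
};

Channels.prototype.unsubscribe = function (channel, client) {
  const list = this.subscribers[channel] || [];
  const index = list.indexOf(client);
  if (index !== -1) list.splice(index, 1);
};

// Sends the message to every subscriber of the channel; returns how many got it.
Channels.prototype.publish = function (channel, message) {
  const list = this.subscribers[channel] || [];
  list.forEach(function (client) { client.send(message); });
  return list.length;
};

// Stand-in clients that just record what they were sent.
function fakeClient() {
  return { received: [], send: function (message) { this.received.push(message); } };
}

const channels = new Channels();
const alice = fakeClient();
const bob = fakeClient();
channels.subscribe('news', alice);
channels.subscribe('news', bob);
channels.unsubscribe('news', bob);
console.log(channels.publish('news', 'hello'));   // 1: only alice is still subscribed
```

In a real application you would also call unsubscribe for every channel when a client disconnects, otherwise the registry keeps references to dead connections.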

There are also two other promising comet projects: Push-it and Faye. Push-it (github, stackoverflow) is built on top of socket.io. Faye (homepage, github) also looks interesting, but I spotted this to-do on the repo: “Detect failed WebSocket connection and fall back to polling transports”. In other words, it doesn’t seem to work in non-WebSocket browsers (IE6/IE7), something I would like to have (in fact right now, Dec 2010, WebSockets is disabled in newer FF 4 builds and Opera 11 due to security concerns).

Connecting to MySQL

The top three MySQL bindings are felixge’s node-mysql, Sannis’ node-mysql-libmysqlclient and sidorares’ nodejs-mysql-native.

node-mysql and nodejs-mysql-native are pure node.js clients, while node-mysql-libmysqlclient uses libmysqlclient.

Sannis publishes benchmarks of the prominent node.js MySQL bindings which point to node-mysql-libmysqlclient being the fastest. Looking at the GitHub stats for the different projects, felixge’s implementation is most popular (most followed and forked). Both Sannis and felixge are actively committing (Sannis has daily commits, felixge approximately weekly). sidorares’ last commit is from August (3 months ago; checked Dec 2010) so the project seems to be less active. Regarding performance, felixge notes (prior to developing node-mysql):

“Performance should be a secondary concern here. Show me a realistic MySql scenario where you perform 170k queries / sec against a single database server. Inserting 13k records / sec doesn’t sound like a good use case for MySql either. The mysql driver is pretty unlikely to become a bottleneck in the real world. Anyway, I think there is plenty of room for improvement.”

So basically it comes down to whether you prefer to have an all-node.js solution or a slightly faster libmysqlclient-dependent solution.

I went with node-mysql, which works really nicely. What I am a bit unsure of is how I can make sure that the code I write using the library performs well. I’d love to see a better explanation of what goes on inside node-mysql when I do a connect or a query…

Once you get that done, you’ll want to look into connection pooling. Have a look at node-pool.

Or, you might opt to just have one shared persistent connection to MySQL. Apparently, this is what Felix’s company does or did for quite a while.

Deploying node.js with monitoring

Performance/maturity discussion

Amir Salihefendic from Plurk writes about the performance of node.js and notes that while they were able to serve millions of real customer notifications during an 8-month period, they decided to go with Java and Netty due to current performance problems. Amir attributes the limitations they ran into to the V8 engine, which makes some assumptions that make sense in a browser but are problematic on a server (Igor Sysoev, the author of nginx, summarizes this as “V8 will work well in any program, provided that the program is called Chrome”; via Google Translate). On Hacker News, Amir concludes:

“I think the ultimate perfomance is found in pure java.nio/C/C++ solutions, but I think having a bit slower perfomance and higher abstraction is better since it makes it much easier to maintain and debug the system. My general impression of java.nio is that it’s very low-level and a generally hard to code against and it’s the main reason why we didn’t choose it.

This said, I think node.js offers great usability while perfomance is pretty good. So if I was developing a new comet solution I would give node a go – you can always rewrite to something more low level once you begin to hit limits. IMO going after java.nio directly is a premature optimization and most projects won’t hit limits with node.js.”

I figure this is a fair conclusion.

Testing and debugging

Frameworks

Express.js seems to be the most popular web application framework for node.js. Geddy is another (popular?) alternative. Connect provides middleware for framework development, so it is more bare-bones — but since node.js app development is still in the early stages, I’ve seen it used in many repositories. Express is built on Connect, for example. Check out this review of node.js frameworks from May 2010. Since I opted not to write everything in node.js, I’ll have to get back to you on the experiences with these.

Tutorial series

Other interesting links