Technology


22
Nov 11

Performance benchmarking Socket.io 0.8.7, 0.7.11 and 0.6.17 and Node’s native TCP

I’ve been working with Socket.io quite a bit recently. It’s a great library. However, after upgrading to 0.8.x, I ran into problems with increased CPU usage. Since performance is very important for high traffic pubsub implementations, I decided to investigate this further – and try to quantify the performance impact of upgrading to a newer version of Socket.io.

I wrote a benchmarking suite (siobench). The benchmark is rather simple. Clients connect one at a time, and a new client is only allowed to connect when the previous one is connected. When the server has used up 5000 milliseconds of CPU time, the benchmark is stopped. Every second, every connected client sends a single message which is echoed back by the server (more details).

This workload is geared towards a situation where Socket.io is used to notify people of things as part of a larger application, i.e. most of the load is assumed to be idling connections rather than real-time messaging like in, say, a multiplayer game.

The “end of test” condition is 5000 ms of CPU time, because this seemed to be an easy way to give all implementations the same amount of time. CPU usage % is not accurate, since it is dependent on how much CPU time the process gets over a particular amount of wallclock time. In the graphs, the CPU usage % is calculated over a 100ms interval, while usertime and systime are the actual numbers reported at that particular time.
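To make the two measurements concrete, here is a minimal sketch of a CPU time budget check and a CPU usage % calculation. It is not taken from siobench: the function names are made up for illustration, and process.cpuUsage() is an API from modern Node.js that did not exist in Node 0.4.12, so the real suite must have read CPU time some other way.

```javascript
// Sketch only: process.cpuUsage() is a modern Node.js API (assumption:
// you are running a recent Node, not the 0.4.12 used in this post).

const CPU_BUDGET_MS = 5000; // the "end of test" condition described above

// Total CPU time (user + system) consumed by this process, in milliseconds.
function cpuTimeMs() {
  const { user, system } = process.cpuUsage(); // both values are microseconds
  return (user + system) / 1000;
}

// CPU usage % over one sampling interval: CPU time consumed during the
// interval divided by the wallclock time that elapsed.
function cpuPercent(cpuDeltaMs, wallDeltaMs) {
  return (cpuDeltaMs / wallDeltaMs) * 100;
}

// True once the server has used up its CPU time budget.
function budgetExhausted(usedMs, budgetMs = CPU_BUDGET_MS) {
  return usedMs >= budgetMs;
}
```

With a 100ms sampling interval, a process that used 50 ms of CPU time during the interval shows up as cpuPercent(50, 100), i.e. 50%. That number depends on how the scheduler treated the process, which is why the budget is expressed in CPU time instead.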

Summary

Node (0.4.12) using tcp ~ 8000 connections on a single core
socket.io 0.6.17 using websockets ~ 2300 connections on a single core
socket.io 0.7.11 using websockets ~ 1800 connections on a single core
socket.io 0.8.7 using websockets ~ 1900 connections on a single core

Remember, this is just one server on one core, with 5000 ms of CPU time on that core. The rest of the cores are used to generate sufficient load. The full graphs are at the end of the post.

Note that the absolute numbers are mostly unimportant – I ran this on a 15″ Macbook Pro running Arch with the 3.1.04 Linux kernel in Virtualbox with 4096 MB of RAM, an SSD and four cores (Intel(R) Core(TM) i7-2635QM CPU @ 2.00GHz GenuineIntel GNU/Linux). You can get numbers that are more representative of your system by getting siobench and running it:

Usage: node siobench.js [env]
A tool for benchmarking your Socket.io server.

Available environments:
	0.6.17
	0.6.17_poll
	0.7.11
	0.8.7
	0.8.7_poll
	tcp

You can also write your own benchmarks under ./bench, by writing a new server.js (example #1, #2) and a new client.js (example #1, #2). Each benchmark has its own set of npm dependencies installed, so that one can run benchmarks against many versions of socket.io.

Some notes on performance

The relative performance is more interesting.

First, the node TCP speed represents the highest achievable performance on this benchmark, since it only uses the built-in TCP implementation. Compared to this, Socket.io has about 1/3 of the performance (~ 2300 vs ~ 8000 connections) when using WebSockets.

Second, it appears that 0.8.7 is about 20% slower than 0.6.17 on this benchmark. If I remember correctly, Socket.io 0.7 switched to a new protocol, and there are clearly some performance improvements over 0.7.11 in 0.8.7 (+100 connections in this bench); it’s just that the overall performance is still worse in this benchmark than in the old 0.6.17 branch.
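The arithmetic behind these comparisons is simple enough to check. A small sketch, using the approximate connection counts from the summary (the helper names are mine, not part of siobench):

```javascript
// Approximate connection counts from the summary above.
const results = { tcp: 8000, '0.6.17': 2300, '0.7.11': 1800, '0.8.7': 1900 };

// Connections reached by `candidate` as a whole percentage of `baseline`.
function percentOf(candidate, baseline) {
  return Math.round((results[candidate] / results[baseline]) * 100);
}

// How many percent fewer connections `candidate` reached than `baseline`.
function percentSlower(candidate, baseline) {
  return 100 - percentOf(candidate, baseline);
}

console.log(percentOf('0.6.17', 'tcp'));        // 29, i.e. roughly 1/3
console.log(percentSlower('0.8.7', '0.6.17'));  // 17, i.e. "about 20%"
console.log(percentSlower('0.7.11', '0.6.17')); // 22
```

Since the inputs are rounded to the nearest hundred connections, the percentages are only good to a few points either way.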

Working towards higher performance

As this is just a simple benchmark, I don’t really have solutions – only some suggestions.

1) A CI build that includes benchmarks and community contributed test cases

First, I’d love to see a CI build for Socket.io that would include performance benchmarks and community contributed test cases.

However, currently setting up a CI build for Socket.io is difficult because the bundled test suite only works on OSX. It would be a lot easier to contribute if the tests worked on other platforms.

I am hoping that as Engine.io gets going, the test suite will be fixed so that it can be run on other platforms. Otherwise, contributing improvements will be tricky/impossible since there is no way to tell whether the code works.

2) More realistic performance test scenarios

The current test scenario is rather limited in that it mostly tests performance in terms of establishing connections (without terminating them). I’d love to hear more realistic scenario suggestions, particularly from people who have run into memory usage issues.

siobench is only a starting point: it’s way better than just looking at htop and wondering whether performance was better in the last version or not. There are still specific questions that should be formulated as replicable tests.

3) A polling transport that works on Node.js

I did write tests for the xhr-polling transport for Socket.io as well. These showed much worse performance, around:

  • ~ 550 connections on Socket.io 0.6.17 (vs ~2300 using WS)
  • ~ 450 connections on Socket.io 0.8.7 (vs ~ 1900 using WS)

However, the xhr-polling is severely broken in that it stops connecting after 4-5 connections on Node v0.4.12. So I had to force each load generating client to only make four connections and then spawn a new load generating process to work around the problem. I wouldn’t vouch for the accuracy of the test with xhr-polling until the xhr-polling transport is fixed on Node when using socket.io-client (it’s been broken for the last three releases, though).

4) Comparative benchmarks

Hopefully, this will help with performance testing new releases of Socket.io and other Comet libraries. Since the plan is that Engine.io will allow people to work with a lower level than Socket.io, there might be new performance oriented versions, and it would be useful to see benchmarks for those. Re: the other Node.js pubsub frameworks: I can’t benchmark Faye, because it does not provide the right API out of the box, and Juggernaut uses Socket.io internally.

I’m going to use siobench for internal testing to ensure that the pubsub implementation I am working on (built over Socket.io) will not have performance regressions.

The full graphs are below. Please leave comments and suggestions for improvements – I am hoping that the developer community around Socket.io can help in improving the performance going forward, kind of like what Mozilla did with “arewefastyet.com“.

Socket.io 0.6.17 – Websockets – CPU usage and time

 

Socket.io 0.6.17 – Websockets – resident set size


 

Socket.io 0.7.11 – Websockets – CPU usage and time

Socket.io 0.7.11 – Websockets – resident set size

Socket.io 0.8.7 – Websockets – CPU usage and time

Socket.io 0.8.7 – Websockets – resident set size

Node 0.4.12 – TCP – CPU usage and time

Node 0.4.12 – TCP – resident set size

 

 


10
Nov 11

My Arch Linux setup

This is mostly just a reminder for myself – but I always learn new things when I read how other people set up their system. Leave a comment if you have a tip – that’s how I learned about wicd-gtk :) . Oh, and install my window manager (tiling, written in C++ and node.js, configurable using Javascript).

First steps

  • do the basic arch setup first (or VMware, or Virtualbox)

Update the system (and install/setup sudo)

dhcpcd eth0 #if you did not add "interface=eth0" in rc.conf during setup
pacman -Syu

Fixes:

http://www.archlinux.org/news/initscripts-update-manual-intervention-required/

rm /etc/profile.d/locale.sh

http://www.archlinux.org/news/filesystem-upgrade-manual-intervention-required/

pacman -S filesystem --force
pacman -S sudo
vim /etc/sudoers # add yourself to sudoers
sudo vim /etc/pacman.conf # set SigLevel = Never TrustAll
sudo shutdown -r now

Install X11:

pacman -S xorg-server xorg-xinit xorg-utils xorg-server-utils xterm

If virtualized in VirtualBox, make copy-paste work first:

pacman -S virtualbox-archlinux-additions

Details

Create a new user

pacman -S zsh
useradd -m -g users -G audio,lp,optical,storage,video,games,power,scanner -s /bin/zsh USERNAME
su USERNAME
passwd

Add x11

pacman -S xorg-server xorg-xinit xorg-utils xorg-server-utils

Copy ssh keys over from old machine
Useful packages

pacman -S base-devel sudo python2 git libev mlocate mercurial nitrogen \
sakura wicd-gtk pcmanfm gnome-icon-theme htop unzip \
openssl chromium flashplugin bash-completion xterm \
epdfview mysql ruby tilda tmux wget redis xcursor-vanilla-dmz \
xarchiver gzip bzip2 zip unrar p7zip \
meld ttf-ubuntu-font-family mpg123 alsa-utils libxslt
  • base-devel and python2 for compiling node
  • libev for nwm
  • mlocate for locate command
  • nitrogen is better than feh for multiple screens
  • sakura is a nice terminal
  • wicd-gtk is a simple wifi network gui
  • pcmanfm gnome-icon-theme are for pcmanfm, a Nautilus alternative

Remember to visudo and remove the password requirement from wheel. And add dbus and wicd to /etc/rc.conf just like pacman tells you to.
Configuring X11

Add ~/.Xresources:

Xcursor.theme: vanilla-dmz
Xcursor.size:  16       !  32, 48 or 64 may also be good values

Configuring git

git config --global color.ui true

Configuring sakura

I want to use ctrl + Page_Up / Page_Down to switch tabs, so edit ~/.config/sakura/sakura.conf:

switch_tab_accelerator=4 # since GDK_CONTROL_MASK is 1 << 2, e.g. 4.
prev_tab_key=Page_Down
next_tab_key=Page_Up

Some basic niceties: whatprovides, service and chkconfig

pacman -S pkgtools
  • pkgtools provides the pkgfile tool. It works like yum whatprovides (i.e. it allows you to search for a particular command or dependency in all the pacman packages)
  • “sudo pkgfile -u” to update the db
  • “pkgfile zipinfo” to search for zipinfo

Arch doesn’t have service and chkconfig, but we can make the learning curve a bit less steep by adding some functions to .bashrc:

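function service() {
  # usage: service <daemon> <action>, e.g. service sshd restart
  sudo "/etc/rc.d/$1" "$2"
}

# print the DAEMONS line from /etc/rc.conf, plus a reminder of what the alias does
alias chkconfig='grep DAEMONS /etc/rc.conf && echo "cat + grep /etc/rc.conf"'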

This makes service a shortcut for the scripts in /etc/rc.d/, and chkconfig prints out the enabled services from /etc/rc.conf. While we’re editing .bashrc, might as well add:

PS1="[\W]\$ " # my preferred bash prompt (e.g. only the current dirname).
ulimit -s 16400 # higher stack limit
# ssh-agent
SSH_ENV="$HOME/.ssh/environment"
function start_agent {
     echo "Initialising new SSH agent..."
     /usr/bin/ssh-agent | sed 's/^echo/#echo/' > "${SSH_ENV}"
     echo succeeded
     chmod 600 "${SSH_ENV}"
     . "${SSH_ENV}" > /dev/null
     /usr/bin/ssh-add;
}
# Source SSH settings, if applicable
if [ -f "${SSH_ENV}" ]; then
     . "${SSH_ENV}" > /dev/null
     #ps ${SSH_AGENT_PID} doesn't work under cygwin
     ps -ef | grep ${SSH_AGENT_PID} | grep ssh-agent$ > /dev/null || {
         start_agent;
     }
else
     start_agent;
fi

Installing Node and NPM

You can just do:

pacman -S nodejs

If you’re OK with that version, which seems to track the Node releases pretty well.

Arch uses python3 as python. You need to change python to python2 (thanks Rob Searles!)

# node.js fix for arch (use python2) 
mkdir /tmp/bin
ln -s /usr/bin/python2 /tmp/bin/python
export PATH=/tmp/bin:$PATH

You can then do a regular node install:

git clone git://github.com/joyent/node.git
cd node
git checkout v0.4.12
./configure
make
sudo make install

Remember to install npm as well:

curl http://npmjs.org/install.sh | sudo sh

Installing my window manager and personal config

git clone git://github.com/mixu/nwm.git
cd nwm
node-waf clean || true && node-waf configure build
sudo npm link # add a global npm symlink to this repository - so nwm-user can find it (man npm link)
cd ..
git clone git://github.com/mixu/nwm-user.git
cd nwm-user
npm link nwm # now make a symlink to the nwm installation

Add it to ~/.xinitrc (change paths!!):

exec /usr/local/bin/node ~/mnt/nwm-user/nwm-user.js 2>~/nwm.err.log 1>~/nwm.log

And while we’re at it, let’s add some other stuff:

VBoxClient-all &
export PATH=/tmp/bin:$PATH # for node-waf, too lazy to work on a better solution
xset +fp /usr/share/fonts/local
xset fp rehash

Run “startx” to start X11 with nwm.

Installing my mp3 player

First, we need to configure alsa (included by default):

pacman -S mpg123 alsa-utils

Run:

alsamixer

and turn on Master and PCM channels (by pressing m) as they are muted by default.

sudo alsactl store

Then continue:

git clone git://github.com/mixu/nplay.git

Run nplay with

node nplay.js

TODO: fix directory in source code and change backend from mpg321 to mpg123.

Switching the keyboard language in X11

I sometimes need to write emails in Finnish, so here is how to switch the layout:

setxkbmap -layout fi # to revert: setxkbmap -layout us

Install yaourt

Install dependencies

yaourt libpng12 gtk2-theme-dust

 

Install Sublime Text 2

Sublime Text needs libpng12, which you have to install from AUR:

wget http://aur.archlinux.org/packages/li/libpng12/libpng12.tar.gz
tar -xzvf libpng12.tar.gz
cd libpng12
makepkg
pacman -U ./libpng12-1.2.46-2-x86_64.pkg.tar.xz

Then download and run Sublime Text 2.

Also, you might want to get http://aur.archlinux.org/packages/gt/gtk2-theme-dust/gtk2-theme-dust.tar.gz.

Configuring Sublime Text 2

locate Packages # returns ~/.config/sublime-text-2/Packages
cd ~/.config/sublime-text-2/Packages
git clone https://github.com/buymeasoda/soda-theme/ "Theme - Soda"
cd User
wget http://blog.mixu.net/files/2010/05/my_themes.zip # Install my themes
unzip my_themes
rm my_themes.zip

Base File settings

{
  // FONTS and COLORS
  "color_scheme": "Packages/User/Mixu Espresso.tmTheme",
  "font_size": 11,
  "tab_size": 2,
  // WHITESPACE
    // Set to true to insert spaces when tab is pressed
    "translate_tabs_to_spaces": true,
    "trim_automatic_white_space": true,
    "trim_trailing_white_space_on_save": true,
    // Set to false to disable detection of tabs vs. spaces on load
    "detect_indentation": false,
  "shift_tab_unindent": true,
  // Set to false to disable highlighting any line with a caret
  "highlight_line": true,
  // Set to "none" to turn off drawing white space, "selection" to draw only the
  // white space within the selection, and "all" to draw all white space
  "draw_white_space": "selection",
  // Set to true to ensure the last line of the file ends in a newline
  // character when saving
  "ensure_newline_at_eof_on_save": true,
  "fold_buttons": false
}

Global Settings

{
  "theme": "Soda Light.sublime-theme"
}

Other tweaks

I put these in my usual “startup” command, nwm-setup.sh and run it manually:

xrandr --output VBOX0 --auto --left-of VBOX1 # Virtualbox displays
chromium & # start chromium
export PS1="[\W]\$ "
xsetroot -cursor_name left_ptr # Set pointer
nitrogen --restore & # Restore desktop background using nitrogen

Other dependencies

pacman -S redis mysql ruby libxslt

Setting up ree:

cd
bash < <(curl -s https://rvm.beginrescueend.com/install/rvm)
echo '[[ -s "$HOME/.rvm/scripts/rvm" ]] && . "$HOME/.rvm/scripts/rvm" # Load RVM function' >> ~/.bashrc
source .bashrc
rvm install ree-1.8.7-2011.03

Installing REE will fail. You need to run the installer from /home/m/.rvm/src/ree-1.8.7-2011.03/installer manually:

./installer -no-tcmalloc

Then continue on:

rvm ree-1.8.7-2011.03 --default

13
Aug 11

Nginx, Websockets, SSL and Socket.IO deployment

I’ve spent some time recently figuring out the options for deploying Websockets with SSL and load balancing – and more specifically, Socket.IO – while allowing for dual stacks (e.g. Node.js and another dev platform). Since there seems to be very little concrete guidance on this topic, here are my notes – I’d love to hear from you on your implementation (leave a comment, or write about it and link back).

The goal here is to:

  1. Expose Socket.io and your main application from a single port — avoiding cross-domain communication
  2. Support HTTPS for both connections — enabling secure messaging
  3. Support the Websockets and Flashsockets transports from Socket.io — for performance
  4. Perform load balancing for both the backends somewhere — for performance

Socket.io’s various transports

Socket.io supports multiple different transports:

  • WebSockets — which are essentially long lived HTTP 1.1 requests, which after a handshake upgrade to the Websockets protocol
  • Flash sockets — which are plain TCP sockets with optional SSL support (but Flash seems to use some older SSL encryption method)
  • various kinds of polling — which work over long lived HTTP 1.0 requests

Starting point: Nginx and Websockets

Nginx is generally the first recommendation for Node.js deployments. It’s a high-performance server and even includes support for proxying requests via the HttpProxyModule.

However (and this should be made much more obvious to people starting with Socket.io), the problem is that while Nginx can talk HTTP/1.1 to the client (browser), it talks HTTP/1.0 to the server. Nginx’s default HttpProxyModule does not support HTTP/1.1, which is needed for Websockets.

Websockets 76 requires support for HTTP/1.1 as the handshake mechanism is not compatible with HTTP/1.0. What this means is that if Nginx is used to reverse proxy a Websockets server (like Socket.io), then the WS connections will fail. So no Websockets for you if you’re behind Nginx.
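To illustrate what the backend is looking for, here is a rough sketch of the check a Websockets server makes before attempting the handshake. It is not Socket.io’s actual code; the function name is hypothetical, and the request object is only assumed to be shaped like Node’s http.IncomingMessage (a httpVersion string and lowercased header names).

```javascript
// Returns true when a request could be upgraded to Websockets.
// `req` is assumed to look like Node's http.IncomingMessage:
// `httpVersion` is a string such as '1.1', header names are lowercased.
function isWebSocketUpgrade(req) {
  const upgrade = (req.headers.upgrade || '').toLowerCase();
  const connection = (req.headers.connection || '').toLowerCase();
  return req.httpVersion === '1.1' &&
    upgrade === 'websocket' &&
    connection.split(/\s*,\s*/).includes('upgrade');
}

// A browser talking directly to the server:
console.log(isWebSocketUpgrade({
  httpVersion: '1.1',
  headers: { upgrade: 'WebSocket', connection: 'Upgrade' }
})); // true

// The same request after being re-issued by an HTTP/1.0 proxy
// (assumption: the proxy does not forward the upgrade headers):
console.log(isWebSocketUpgrade({
  httpVersion: '1.0',
  headers: { connection: 'close' }
})); // false
```

Once the request arrives as HTTP/1.0, there is nothing left for the server to upgrade, which is why the connection fails.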

There is a workaround, but I don’t see the benefit: use a TCP proxy (there is a custom module for this by Weibin Yao, see here). However, you cannot run another service on the same port (e.g. your main app and Socket.io on port 80) as the TCP proxy does not support routing based on the URL (e.g. /socket.io/ to Socket.io and the rest to the main app), only simple load balancing.

So the benefit gained from doing this is quite marginal: sure, you can use Nginx for load balancing, but you will still be working with alternative ports for your main app and Socket.io.

Alternatives to Nginx

Since you can’t use Nginx and support Websockets, you’ll need to deal with two separate problems:

  1. How to terminate SSL connections and
  2. How to route HTTP traffic to the right backend based on the URL / load balance

If you want to run two services on the same port, then you will have to terminate SSL connections before doing anything else. There are several alternatives for SSL termination:

  • Stunnel. Supports multiple SSL certificates per process, does simple SSL termination to another port.
  • Stud. Only supports one SSL certificate per invocation, does simple SSL termination to another port.
  • Pound. An SSL-termination-capable reverse proxy and load balancer.
  • Node’s https. Can be made to do anything, but you’ll have to write it yourself.

If you choose Stunnel or Stud, then you need a load balancer as well if you plan on having more than one Node instance in the backend.

HAProxy is not generally compatible with Websockets, but Socket.IO contains code which works around this issue and allows you to use HAProxy. This means that the alternatives are:

  • Stunnel for SSL termination + HAProxy for routing/load balancing
  • Stud for SSL termination + HAProxy for routing/load balancing
  • Pound (SSL and routing/load balancing)

I haven’t looked into Pound more – mainly as I could not find info on its TCP reverse proxying capabilities (see the section on Flash sockets below), but it seems to work for these guys.

Setting up Stunnel

The Stunnel part is quite simple:

cert = /path/to/certfile.pem
; Service-level configuration
[https]
accept  = 443
connect = 8443

If you only have one Node instance, you can skip setting up HAProxy, since you don’t need load balancing.

Setting up HAProxy

Would you like Flash Sockets with that?

Note that we need TCP mode in order to support Flash sockets, which do not speak HTTP.

Flash sockets are just plain and simple TCP sockets, which will start by sending the following payload: ‘<policy-file-request/>\0’. They expect to receive a Flash cross domain policy as a response.
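For reference, the exchange can be sketched in a few lines of Node. This is not what Socket.io ships; the function name is made up, and the wide-open policy below is a placeholder that you would want to tighten in a real deployment.

```javascript
// The payload a Flash socket sends first, terminated by a null byte.
const POLICY_REQUEST = '<policy-file-request/>\0';

// Hypothetical permissive policy; restrict `domain` and `to-ports` for real use.
const POLICY_RESPONSE =
  '<?xml version="1.0"?>' +
  '<cross-domain-policy>' +
  '<allow-access-from domain="*" to-ports="*"/>' +
  '</cross-domain-policy>\0';

// Returns the policy when the first chunk is a policy request, otherwise null.
function policyFor(firstChunk) {
  const text = firstChunk.toString('utf8');
  return text.startsWith(POLICY_REQUEST) ? POLICY_RESPONSE : null;
}

// Usage sketch (not started here; port 843 needs root privileges):
// require('net').createServer((socket) => {
//   socket.once('data', (chunk) => socket.end(policyFor(chunk) || ''));
// }).listen(843);
```

The important detail is that neither side speaks HTTP here, so an HTTP-only proxy in front of this has nothing to parse.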

Since Flash sockets don’t use HTTP, we need a load balancer which is capable of detecting the protocol of the request, and of forwarding non-HTTP requests to Socket.io.

HAProxy can do that, as it has two different modes of operation:

  • HTTP mode – which allows you to specify the backend based on the URI
  • TCP mode – which can be used to load balance non-HTTP transports.

Main frontend

We accept connections on two ports: 80 (HTTP) and 8443 (Stunnel-terminated HTTPS connections).

By default, everything goes to the backend app at port 3000. Some HTTP paths are selectively routed to socket.io

TCP mode is needed so that Flash socket connections can be passed through, and all non HTTP connections are sent to the TCP mode socket.io backend.

# Main frontend
frontend app
  bind 0.0.0.0:80
  bind 0.0.0.0:8443
  # Mode is TCP
  mode tcp
  # allow for many connections, with long timeout
  maxconn 200000
  timeout client 86400000

  # default to webapp backend
  default_backend webapp

  # two URLs need to go to the node pubsub backend
  acl is_socket_io path_beg /node
  acl is_socket_io path_beg /socket.io
  use_backend socket_io if is_socket_io

  tcp-request inspect-delay 500ms
  tcp-request content accept if HTTP
  use_backend sio_tcp if !HTTP
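
The decision this frontend makes can be written out as a small function. This is only a simplified model of the rules above, not how HAProxy implements HTTP detection: the method list and the function name are my own assumptions, while the backend names match the config.

```javascript
// Simplified stand-in for HAProxy's "is this HTTP?" content inspection.
const HTTP_METHODS = ['GET', 'POST', 'PUT', 'DELETE', 'HEAD', 'OPTIONS'];

// Picks a backend name from the first bytes received on a connection.
function chooseBackend(firstBytes) {
  const requestLine = firstBytes.toString('latin1').split('\r\n')[0];
  const [method, path] = requestLine.split(' ');

  // Anything that does not look like an HTTP request line (for example a
  // Flash policy request) goes to the TCP mode Socket.io backend.
  if (!HTTP_METHODS.includes(method) || path === undefined) return 'sio_tcp';

  // HTTP requests for the two pubsub paths go to Socket.io in HTTP mode.
  if (path.startsWith('/node') || path.startsWith('/socket.io')) return 'socket_io';

  // Everything else is the main application.
  return 'webapp';
}

console.log(chooseBackend(Buffer.from('GET /socket.io/1/ HTTP/1.1\r\n'))); // socket_io
console.log(chooseBackend(Buffer.from('GET /index.html HTTP/1.1\r\n')));   // webapp
console.log(chooseBackend(Buffer.from('<policy-file-request/>\0')));       // sio_tcp
```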

Port 843: Flash policy

Flash policy should be made available on 843.

# Flash policy frontend
frontend flashpolicy 0.0.0.0:843
   mode tcp
   default_backend sio_tcp

Default backend

This is just for your main application.

backend webapp
   mode http
   option httplog
   option httpclose
   server nginx1s localhost:3000 check

Socket.io backend

Here, we have a bunch of settings in order to allow Websockets connections through HAProxy.

backend socket_io
  mode http
  option httplog
  # long timeout
  timeout server 86400000
  # check frequently to allow restarting
  # the node backend
  timeout check 1s
  # add X-Forwarded-For
  option forwardfor
  # Do not use httpclose (= client and server
  # connections get closed), since it will close
  # Websockets connections
  no option httpclose
  # Use "option http-server-close" to preserve
  # client persistent connections while handling
  # every incoming request individually, dispatching
  # them one after another to servers, in HTTP close mode
  option http-server-close
  option forceclose
  # just one node server at :8000
  server node1 localhost:8000 maxconn 2000 check

Socket.io backend in TCP mode

This is the same server as above, but accessed in TCP mode.

backend sio_tcp
  mode tcp
  server node2 localhost:8000 maxconn 2000 check

Conclusion

The configs above allow you to serve Websockets, Flash and polling from a single port.

However, I am dissatisfied with the complexity of this configuration. In particular, Flash sockets’ TCP requirements are rather painful since they require protocol detection in order to work from a single port.

The alternative is of course to run Socket.io on a different port than your main app. This would mean that you configure HAProxy to just do TCP mode load balancing at that port, with SSL termination in front of HAProxy.

If you do that, you might want to configure a fallback from Nginx at port 80 to Socket.io for those clients who are behind draconian corporate firewalls which disallow ports other than 80 and 443. The fallback will only support long polling and I don’t think Socket.io itself supports automatically switching ports during transport negotiation, but you can detect a failure in Socket.io and re-initialize manually with a different port and polling-only transport.
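
The manual re-initialization could look roughly like the sketch below. Everything Socket.io-specific here is an assumption to verify against the client version you use: connect is a hypothetical factory wrapping the client’s connect call, and the 'connect' and 'connect_failed' event names and the option objects are placeholders.

```javascript
// `connect(options)` is a hypothetical factory that returns an object with
// an `on(event, handler)` method and emits 'connect' or 'connect_failed'.
function connectWithFallback(connect, primary, fallback, onReady) {
  const socket = connect(primary);
  socket.on('connect', () => onReady(socket, primary));
  socket.on('connect_failed', () => {
    // The primary port is unreachable: retry once with the fallback options,
    // e.g. port 80 and a polling-only transport list.
    const retry = connect(fallback);
    retry.on('connect', () => onReady(retry, fallback));
  });
  return socket;
}

// Usage sketch with made-up option objects:
// connectWithFallback(myConnect,
//   { port: 8000 },
//   { port: 80, transports: ['xhr-polling'] },
//   (socket, used) => console.log('connected via port', used.port));
```

Keeping the factory as a parameter means the retry logic can be exercised without a browser or a running server.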

Do you have a better way? How do you deploy Socket.io? Let me know in the comments below.

 


6
Aug 11

Collaborative git reference

Here is a basic reference for collaborative git commands:

Checkout a remote branch

# list branches first
git branch -a
# * master
# remotes/origin/branch_name
git checkout -b local_name remotes/origin/branch_name

Merging

# switch to master
git checkout master
# merge with experimental branch
git merge experimental

Create a tag

# list tags
git tag
# add a tag
git tag tagname
# push to remote
git push origin master --tags

 

 


11
Jun 11

dwm tips on Fedora

I’ve been testing out Fedora 15’s Gnome 3 and Ubuntu’s Unity, and didn’t like either of them. They both take up too much precious screen space just to show a fancy UI, and requiring hardware acceleration is a pain for low end netbooks and virtual machines.

So I decided to move to an alternative window manager. DWM (dynamic window manager, http://dwm.suckless.org/) is an extremely lightweight tiling window manager written in C which saves screen space and works pretty well as long as you don’t need to connect to wireless networks.

I’ve been pretty happy with it. The main drawback is that connecting to wireless networks is a pain in the ass as there are no proper GUI tools to do this. Check out these tips to get started with dwm.

0. Installing dwm on Fedora, keyboard shortcuts

To install DWM, run yum install dwm. You can then choose to use dwm or Gnome or Kde in the login screen.

The default keyboard shortcuts are listed at man dwm or at http://man.suckless.org/dwm/1/dwm.

1. Customizing dwm

Customizing dwm can be done by making changes to config.h and recompiling the window manager.

Fedora has a really nice package called dwm-user, which automates this process! Here is the package description:

dwm-start is a helper script for running and reconfiguring dwm if necessary. It’s the preferred way of starting dwm in Fedora.
Running dwm-start starts Fedora build by default. If you wish to customize your configuration, put the dwm config header file to $HOME/.dwm/config.h and adjust it according to your needs. Every time the user configuration file has changed, dwm-start will rebuild the user dwm binary prior to its execution.

All you need to do is:

sudo yum install dwm-user
mkdir ~/.dwm
cp /usr/src/dwm-user-5.8.2-6.fc14/config.def.h ~/.dwm/config.h

I.e. install via yum, then make a ~/.dwm folder, then copy the config.h file and edit it. When you restart, you can choose dwm-user as your window manager, which uses your custom version of dwm. For example, I remapped Meta (Cmd/Windows key) + h and meta + l to meta + pg up / pg down and meta + shift + q to meta + shift + end since I’m currently running Fedora on an OSX host.

You will probably make changes to the keyboard shortcuts. To find the keymap:

sudo updatedb
locate keysymdef.h

Keysymdef.h lists the names of the keys in X11.

2. Tip: Guake is just as awesome on dwm

By default, dwm launches xterm. I prefer to use guake, since that allows me to get a tabbed terminal window on any workspace when I need it. Just launch guake& to run it in the background.

UPDATE: I moved to F15 (deciding simply to ignore Gnome 3) and noticed that guake has problems starting. To fix those:

sudo yum install xfce4-notifyd

Basically you need a notify daemon to allow Guake to print that pretty message “Guake is running”, and xfce4-notifyd provides an alternative notifications daemon.

3. Launch netbeans and other Java programs with font smoothing and GTK look and feel

You need to specify a couple of extra switches to get the GTK look and feel in Java programs, for example:

/home/username/netbeans-7.0/bin/netbeans -J-Dswing.aatext=true -J-Dawt.useSystemAAFontSettings=on --laf com.sun.java.swing.plaf.gtk.GTKLookAndFeel

4. Launch nautilus without the desktop

nautilus --no-desktop

5. Use dwm with a dual screen setup

If dwm starts with mirroring output to your secondary screen, then you need to run xrandr to get the names of your screens, e.g. VBOX0 and VBOX1.

Then configure the screen layout:

xrandr --output VBOX1 --auto --right-of VBOX0

dwm will now let you have your own workspaces for each screen.

6. Change your desktop background

Use feh to change your desktop background:

feh --bg-tile /path/to/background/image

7. xterm config

For a usable xterm, create the following ~/.Xresources and then run xrdb -merge ~/.Xresources:

xterm*faceName:           monospace:pixelsize=14
xterm*saveLines:          9999
xterm*scrollBar:          false
xterm*background:  #000000
xterm*foreground:  #dfdfdf
xterm*color0:      #000000
xterm*color1:      #9e1828
xterm*color2:      #aece92
xterm*color3:      #968a38
xterm*color4:      #414171
xterm*color5:      #963c59
xterm*color6:      #418179
xterm*color7:      #bebebe
xterm*color8:      #666666
xterm*color9:      #cf6171
xterm*color10:     #c5f779
xterm*color11:     #fff796
xterm*color12:     #4186be
xterm*color13:     #cf9ebe
xterm*color14:     #71bebe
xterm*color15:     #ffffff

8. Add a clock using xsetroot

You can do something like this in a bash script to show the time in dwm on the top right corner.

while true; do
   xsetroot -name "$( date +"%F %R" )"
   sleep 1m    # Update time every minute
done

9. Connect to wifi

This is rather painful. The instructions here were collected from the mailing list, and I did get them to work, but I’m too lazy to write a full tutorial on this right now.

Basically, you need to scan, then do different things depending on whether the wifi uses WEP or WPA for authentication.

Start by running:

iwlist scan

9.1 WEP wifi

#!/bin/sh
# wep: connect to a WEP wifi network; usage: wep ESSID
# /home/pmarin/wep is a plain file with two columns: essid key
key="$(grep "$1" /home/pmarin/wep | cut -d' ' -f2)"
sudo ifconfig wlan0 up
sudo iwconfig wlan0 essid "$1"
sudo iwconfig wlan0 key "s:$key"
sudo dhclient wlan0

9.2 WPA wifi

#!/bin/sh
# wpa: connect to a WPA wifi network; usage: wpa ESSID
# /home/pmarin/wpa is similar to /etc/wpa_supplicant.conf
sudo ifconfig wlan0 up
sudo iwconfig wlan0 essid "$1"
sudo wpa_supplicant -iwlan0 -c/home/pmarin/wpa -B
sudo dhclient wlan0

To create the wpa file:

wpa_passphrase your_ssid_of_network your_network_password
Create the file:
ctrl_interface=/var/run/wpa_supplicant
#ap_scan=2

network={
       ssid="your_ssid"
       scan_ssid=1
       proto=WPA RSN
       key_mgmt=WPA-PSK
       pairwise=CCMP TKIP
       group=CCMP TKIP
       psk=your_psk_from_wpa_passphrase
}
sudo wpa_supplicant -Bw -Dwext -i eth0 -c/etc/wpa_supplicant.conf

9.3 Wifi troubleshooting:

1) CHECK THAT YOU DON’T have the NetworkManager service or wpa_supplicant running already!!!

You can run wpa_supplicant with the -dd flag for detailed debug output.

2) If you don’t manage to connect to the access point, try uncommenting line 2 (#ap_scan=2) in /etc/wpa_supplicant.conf.

3) If that doesn’t help, try changing its value to 0 or 1.

4) If you have trouble authenticating, try removing the “RSN” and/or “CCMP” strings from /etc/wpa_supplicant.conf.

Sources for Wifi stuff:

http://ubuntuforums.org/showthread.php?t=263136

http://www.mail-archive.com/dwm@suckless.org/msg06800.html


24
Feb 11

Quick tip: Fix Flash audio stutter on Fedora 14 (64bit)

On my FC14 machine, I had a problem with Flash (64bit) audio playback: the sound on sites other than Youtube would stutter terribly. It appears that this is a systematic problem; but luckily there is a fix!

Check out Ahmed Abdo’s post on Flash audio stutter for the details. Works perfectly for me!

Details: https://bugzilla.redhat.com/show_bug.cgi?id=638477

The bug is triggered by a change in glibc. Who proposed the fix? Linus Torvalds. So I guess the following isn’t quite true?

I love the pragmatism on his part:

So in the kernel we have a pretty strict “no regressions” rule, and that if people depend on interfaces we exported having side effects that weren’t intentional, we try to fix things so that they still work unless there is a major reason not to.

So I’m disappointed glibc just closes this as NOTABUG. There’s no real reason to do the copy backwards that I can see, so doing it that way is just stupid.

But whatever. You can do a LD_PRELOAD trick to get a sane memcpy(), and it does indeed fix the sound for me.
[...]
The fact that the glibc people don’t do that, and that this hasn’t been elevated despite clearly being a big usability problem (normal users SHOULD NOT HAVE TO google bugzillas and play with LD_PRELOAD to have a working system), is just sad.

Overall, as an end user, the conversation around this bug and its persistence makes me sad. I know it’s selfish not to care about the technical superiority of a solution or about who is to blame here – but I’d just like to have my smooth Flash playback…


18
Feb 11

HMVC-style cascading file loading in Node.js

One of my favorite features of Kohana 3 is its cascading filesystem – so I decided to implement it for Node.js. A cascading filesystem is an elegant solution to a common problem: how to provide a mechanism for loading modules and reusing code?

The following image from Kohana 3’s docs shows an example:

Benefits

The key benefits are:

  1. Consistency. All your application files, including views, controllers, models and other data such as translation messages are loaded using one, easy-to-understand mechanism.
  2. Easy reuse. Without a cascading file system, you’ll have to copy and move files around if you want to use someone else’s libraries or modules. With a cascading file system, you just place the module in your application, and enable cascading for that directory.
  3. Transparent extensibility. What if you want to override one part of a module (say, a view) but don’t want to modify your copy of the module (e.g. so that you can update without manually merging changes). A cascading filesystem allows you to selectively replace files in 3rd party code simply by providing your own version of the file.

The code

Load order and file name resolution

The load order for my implementation is:

  1. Application path –  files under ./application/ are always checked first.
  2. Module paths – set modules(['./modules/my-module']) to enable module loading. Files from modules are loaded in the order they are added.
  3. System path – files under ./system/ are loaded if no alternative exists.

Assumptions about file and class names

Files are assumed to be lowercase. Underscores in class names are replaced by slashes (so Controller_User becomes ./application/classes/controller/user.js).
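As a rough sketch of that naming rule (the helper name classNameToPath is made up for illustration and is not taken from hmvc.js):

```javascript
// Hypothetical helper: map a class name to a file path under a base
// directory such as ./application, following the rule described above.
function classNameToPath(baseDir, className) {
  // lowercase the name, then turn each underscore into a directory separator
  var relative = className.toLowerCase().split('_').join('/');
  return baseDir + '/classes/' + relative + '.js';
}

console.log(classNameToPath('./application', 'Controller_User'));
// prints ./application/classes/controller/user.js
```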

Performance impact

Requests are cached, so that additional calls to find_file() do not cause additional stat() calls in the filesystem. This is insignificant anyway, since Node.js servers are persistent so the cascading search is only done once per server instance for each file (not once per request).
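To make the load order and the caching concrete, here is a minimal sketch of how such a lookup could work. It is not the actual hmvc.js code: createFinder, findFile and the injectable exists argument are names I made up for the example.

```javascript
var fs = require('fs');

// Hypothetical sketch of a cascading, cached lookup; not the actual hmvc.js code.
// searchPaths is ordered: application first, then modules, then system.
// exists is injectable (an assumption of this sketch) so it can be tried without real files.
function createFinder(searchPaths, exists) {
  exists = exists || fs.existsSync;
  var cache = {};
  return function findFile(dir, file, ext) {
    var key = dir + '/' + file + (ext || '.js');
    if (Object.prototype.hasOwnProperty.call(cache, key)) {
      return cache[key]; // repeat lookups never touch the filesystem
    }
    var found = null;
    for (var i = 0; i < searchPaths.length && found === null; i++) {
      var candidate = searchPaths[i] + '/' + key;
      if (exists(candidate)) {
        found = candidate; // first match in the cascade wins
      }
    }
    cache[key] = found;
    return found;
  };
}

// Pretend only the system copy of the view exists, and count the checks.
var checks = 0;
var find = createFinder(['./application', './modules/my-module', './system'], function (p) {
  checks++;
  return p === './system/views/user/index.html';
});
console.log(find('views', 'user/index', '.html')); // ./system/views/user/index.html
find('views', 'user/index', '.html');              // served from the cache
console.log(checks);                               // 3
```

The second call does not increase the counter, which is the whole point of the cache: the cascade is walked once per file, not once per request.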

Loading 3rd party code

The loaded files do not need to be “compatible” in any way other than layout in the file system. For example, while Hmvc.factory('some_other_lib') loads the file from ./application/classes/some/other/lib.js, that file does not actually need to contain a class named some_other_lib; it only needs to return something via module.exports.

Methods

The methods are:

  • Hmvc.modules(['./modules/path-to-module']) – set the modules directories to search.
  • Hmvc.find_file(dir, file, ext) – Search each path under dir (e.g. ‘classes’, ‘views’) for file (filename) with the extension (ext, default is “.js”).
  • Hmvc.factory(class_name) – Return a new instance of the given class after loading the corresponding file from the cascading file system. Note that classes should be in the classes subdirectory.
  • Hmvc.load(class_name) – Return whatever require(file-which-contains-the-class) returns. Useful for extending classes, see below for an example.

Example usage:

var fs = require('fs');
var Hmvc = require('./hmvc.js');

// test class loading:
// e.g. check ./application/classes/test.js
// ./modules/modulename/classes/test.js
// ./system/classes/test.js
var t = Hmvc.factory('test');
t.run();

// test view loading
// e.g ./application/views/user/index.html
// ./modules/modulename/views/user/index.html
// ./system/views/user/index.html
fs.readFile(Hmvc.find_file('views', 'user/index', '.html'), function (err, data) {
  if (err) throw err;
  console.log(data.toString());
});

To set modules:

// set only once, before calling any other functions!
Hmvc.modules([
         "./modules/testmodule/",
         "./modules/testmodule2/",
         ]);

Extending classes:

// test extending class (see code in /application/classes/controller/extend.js
// to see how extension is achieved)
// e.g. ./application/classes/controller/extend.js
// ./modules/modulename/classes/controller/extend.js
// ./system/classes/controller/extend.js
var t3 = Hmvc.factory('Controller_Extend');
t3.run();
t3.run_parent();

Note that if you put hmvc.js in ~/node_modules/hmvc.js, you don’t need to specify the path to hmvc.js… see Modules in node.js docs.

// in extend.js:
var Controller_Extend = function () {
}
// extend the class
var util = require('util'), Hmvc = require('../../../../hmvc.js');
util.inherits(Controller_Extend, Hmvc.load('Controller_Base'));

Controller_Extend.prototype.run = function() {
   console.log("Controller_Extend from testmodule2.");
};
Controller_Extend.prototype.run_parent = function() {
   // run the parent function
   // call with the current instance so that "this" works inside the parent method
   Controller_Extend.super_.prototype.run.call(this);
};

module.exports = Controller_Extend;

3
Feb 11

Javascript, node.js and for loops

What does this code print out? Assume that console.log logs to the console.

Experiment #1: For loop

console.log('For loop');
for(var i = 0; i < 5; i++) {
 console.log(i);
}

0, 1, 2, 3, 4 - easy, right? What about this code?

Experiment #2: setTimeout

console.log('setTimeout');
for(var i = 0; i < 5; i++) {
  setTimeout(function() {console.log('st:'+i)}, 0);
}

The result is 5, 5, 5, 5, 5. What about this?

Experiment #3: Callback function

function wrap(callback) {
  callback();
}

console.log('Simple wrap');
for(var i = 0; i < 5; i++) {
  wrap(function() {console.log(i)});
}

0, 1, 2, 3, 4 — right? (Yup.) And this?

Experiment #4: While loop emulating sleep

function sleep(callback) {
  var now = new Date().getTime();
  while(new Date().getTime() < now + 1000) {
   // do nothing
  }
  callback();
}

console.log('Sleep');
for(var i = 0; i < 5; i++) {
  sleep(function() {console.log(i)});
}

0, 1, 2, 3, 4. And this?

Experiment #5: Node.js process.nextTick

console.log('nextTick');
for(var i = 0; i < 5; i++) {
 process.nextTick(function() {console.log('nt:'+i)});
}

Well… it’s 5, 5, 5, 5, 5.

Experiment #6: Delayed calls

var data = [];
for (var i = 0; i < 5; i++) {
  data[i] = function foo() {
    console.log(i);
  };
}
data[0](); data[1](); data[2](); data[3](); data[4]();

Again, 5, 5, 5, 5, 5.

Ok, I’m confused. Why does this happen?

Looking at experiments #1 to #6, you can see a pattern emerge: delayed calls, whether they go through setTimeout(), the Node.js-specific process.nextTick() or a simple array of functions, all print the unexpected result “5”.

Fundamentally, the only thing that matters is at what time the function code is executed. setTimeout() and process.nextTick() ensure that the function is only executed at some later stage. Similarly, assigning functions into an array explicitly like in Experiment #6 means that the code within the function is only executed after the loop has been completed.

There are three things you need to remember about Javascript:

  1. Variable scope is based on the nesting of functions. In other words, the position of the function in the source always determines what variables can be accessed; nested functions can access their parent’s variables, non-nested functions can only access the topmost, global variables.
  2. Functions can create new scopes; the default behavior is to access previous scope.
  3. Some functions have the side-effect of being event-driven and executed later, rather than immediately. You can emulate this yourself by storing but not executing functions, see Experiment #6.

What we would expect, based on experience in other languages, is that in the for loop, calling the function would result in a call-by-value (since we are passing a primitive – an integer) and that function calls would run using a copy of that value at the time when the part of the code was “passed over” (e.g. when the surrounding code was executed). That’s not what happens:

A nested function does not get a copy of the value of the variable; it gets a live reference to the variable itself and can access it at a much later stage. So while the reference to i is valid in experiments 2, 5 and 6, the functions read the value of i at the time of their execution, which is after the loop has run (on a later turn of the event loop in experiments 2 and 5, and in the explicit calls after the loop in experiment 6). That is why they get the value 5.
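The “live reference” behaviour is easy to see without any timers at all. A tiny sketch (makeReader is just a name picked for the example):

```javascript
function makeReader() {
  var value = 1;
  // the nested function does not copy `value`; it keeps access to the variable
  var read = function () { return value; };
  value = 2; // changed after read() was created, before it is called
  return read;
}

var read = makeReader();
console.log(read()); // prints 2
```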

Functions can create new scopes but they do not have to. The default behavior allows us to refer back to the previous scope (all the way up to the global scope); this is why code executing at a later stage can still access i. Because no variable i exists in the current scope, the i from the parent scope is used; because the parent has already executed, the value of i is 5.

Hence, we can fix the problem by explicitly establishing a new scope every time the loop is executed; then referring back to that new inner scope later. The only way to do this is to use an (anonymous) function plus explicitly defining a variable in that scope. There are two ways to do this:

Option 1) We can allow the value of i to “leak” from the previous scope, but explicitly establish a new variable j in the new scope to hold that value for future execution of nested functions:

Experiment #7: Closure with new scope establishing a new variable

console.log('new scope nexttick with value binding in new func scope');
for(var i = 0; i < 5; i++) {
 (function() {
  var j = i;
  process.nextTick(function() {console.log('nexttick-new-scope-new-bind:'+j)});
 })();
}

Resulting in 0, 1, 2, 3, 4. Accessing j returns the value of i at the time when the closure was executed – and as you can see, we are immediately executing the function by appending ();

We need to have that wrapping function, because only functions establish new scope. In fact, we are establishing five new scopes when the loop is run, each iteration creating a scope with its own, separate variable j with a different value (0, 1, 2, 3, 4); each accessible from the inner closure at the time the code in it is run. Without the wrapping closure the reference to j in the innermost closure would end up having the same scope as i; it would then have the value of i at the time of the execution; which would be 5.

Option 2) Or we can pass the value to the new scope as a parameter:

Experiment #8: setTimeout in closure with new scope

console.log('new scope');
for(var i = 0; i < 5; i++) {
 (function(i) {
  setTimeout(function() {console.log('st2:'+i)}, 0);
 })(i);
}

Resulting in 0, 1, 2, 3, 4.

Now you should remember one more rule to understand the second solution:

  • Functions can be passed as data; they are only evaluated when explicitly evaluated (e.g. by appending () or by using function.call or function.apply).

So when we have (function(param) { ... })(param), we are calling the function immediately, and parameters always establish a new variable/identifier in the function scope; that allows us to use the i from the new scope in our delayed function call, since it is bound to the parameter, not to the parent scope.

This also means that this does NOT work (process.nextTick is interchangeable with setTimeout):

Experiment #9: Closure with new scope containing callback triggered on process.nextTick

console.log('new scope nexttick');
for(var i = 0; i < 5; i++) {
 (function() {
  process.nextTick(function() {console.log('nexttick-new-scope:'+i)});
 })();
}

5, 5, 5, 5, 5 – since i still refers to the old scope. Compare that with experiment #7, where while the inner code is the same, we actually establish a new variable in the wrapping closure’s scope, which is then referred to by the inner code.
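The parameter-passing fix from Experiment #8 does not have to be an inline anonymous function. As a sketch, the same effect can be had with a named helper (makeCallback is a made-up name), which some people find easier to read:

```javascript
// Hypothetical helper: every call creates a new scope with its own `n`
function makeCallback(n, results) {
  return function () { results.push(n); };
}

var results = [];
var callbacks = [];
for (var i = 0; i < 5; i++) {
  callbacks.push(makeCallback(i, results));
}
// run the stored functions only after the loop has finished, as in Experiment #6
for (var k = 0; k < callbacks.length; k++) {
  callbacks[k]();
}
console.log(results.join(', ')); // prints 0, 1, 2, 3, 4
```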

Conclusion

I should note that this has nothing to do with synchronicity or asynchronicity; it is simply the way in which scope resolution works for Javascript when code execution is delayed in some manner while referring to variables defined in the parent scope of the nested code.

In Javascript, all functions store “a hierarchical chain of all parent variable objects, which are above the current function context; the chain is saved to the function at its creation”. Because the scope chain is stored at creation, it is static and the relative nesting of functions precisely determines variable scope. When scope resolution occurs during code execution, the value for a particular identifier such as i is searched from:

  1. first from the parameters given to the function (a.k.a. the activation object)
  2. and then from the statically stored chain of scopes (stored as the function’s internal property on creation) from top (e.g. parent) to bottom (e.g. global scope).

Javascript will keep the full set of variables of each of the statically stored chains accessible even after their execution has completed, storing them in what is called a variable object. Since code that executes later will receive the value in the variable object at that later time, variables referring to the parent scope of nested code end up having “unexpected” results unless we create a new scope when the parent is run, copy the value from the parent to a variable in that new scope and refer to the variable in the new scope.
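The lookup order can be checked with a few lines. This is only an illustration of the order described above; the variable and function names are arbitrary:

```javascript
var label = 'outermost';

function outer() {
  var label = 'outer';
  // `label` is a parameter here, so it is found in the function's own scope first
  function withParam(label) { return label; }
  // no local `label`, so the lookup continues in outer's scope
  function withoutLocal() { return label; }
  return [withParam('param'), withoutLocal()];
}

// no enclosing function defines `label`, so the outermost one is used
function topLevel() { return label; }

console.log(outer().join(', ')); // prints param, outer
console.log(topLevel());         // prints outermost
```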

For a much more detailed explanation, please read Dmitry Soshnikov’s account of ECMA-262, in particular the chapters on Scope chains and Evaluation strategies. His explanations are the best I’ve seen anywhere!


2
Feb 11

Essential Node.js patterns and snippets

In this post, I take a look at the different patterns that you need to know when using Node.js. These came from my own coding and from a look at the code behind Tim Caswell’s flow control libraries. I think it is necessary to know how these basic patterns are implemented even if you use a library.

1. Objects and classes

1.1 Class pattern

// Constructor
var Class = function(value1, value2) {
  this.value1 = value1;
}
// properties and methods
Class.prototype = {
  value1: "default_value",
  method: function(argument) {
    this.value2 = argument + 100;
  }
};
// node.js module export
module.exports = Class;
// constructor call
var object = new Class("Hello", "2");

If the class is long, then instead of doing a single Class.prototype = {…} assignment, it may be split into multiple Class.prototype.method = function () {..} assignments.

Reminder: Assign all your properties some value in your constructor. Otherwise, while the resulting object can access the property defined in the prototype, the prototype value is shared among all instances. So in order for your “instance” to actually own its own copies, you have to explicitly initialize the variables in the constructor, or they will act like static variables in non-prototype-based OOP. It’s a stupid mistake, don’t make it.
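A short sketch of the mistake, with made-up class names. It bites mostly with mutable values such as arrays and objects, because every instance ends up modifying the same one:

```javascript
var Shared = function () {};
Shared.prototype = { items: [] };             // one array, living on the prototype

var Owned = function () { this.items = []; }; // a fresh array for every instance

var a = new Shared(), b = new Shared();
a.items.push('x');
console.log(b.items.length); // prints 1: b sees the change made through a

var c = new Owned(), d = new Owned();
c.items.push('x');
console.log(d.items.length); // prints 0
```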

1.2 Accessing global values from objects

// constructor
var Class = function(global, value2) {
  this.global = global;
}
// access using this.global in class methods

1.3 Factory pattern

// Constructor
var Class = function(value1, value2) { ... }
// Factory
Class.factory = function(value1) { return new Class(value1, "aaa"); };
// properties and methods
Class.prototype = { ... };

1.4 Sharing state between modules

var Common = {
  util: require('util'),
  fs:   require('fs'),
  path: require('path')
};

module.exports = Common;

// in other modules
var Common = require('./common.js');

1.5 Singleton class (added Feb 2011)

var Singleton = (function() {
   var private_variable = 'value';
   function private_function() {
      ...
   }
   function public_function() {
      ...
   }
  return {
      public_function: public_function
  };
})();

2. Parsing requests

2.1 Parsing GET

var url = require('url'), querystring = require('querystring');
// parse URL
var url_parts = url.parse(req.url);
// parse query
var raw = querystring.parse(url_parts.query);
// some juggling e.g. for data from jQuery ajax() calls.
var data = raw ? raw : {};
data = raw.data ? JSON.parse(raw.data) : data;

2.2 Parsing POST

if (req.method == 'POST') {
   var fullBody = '';
   req.on('data', function(chunk) {
      // append the current chunk of data to the fullBody variable
      fullBody += chunk.toString();
   });
   req.on('end', function() {
      // parse the received body data
      var decodedBody = querystring.parse(fullBody);
      console.log(decodedBody);
   });
}

3. Concurrency

3.1 Waiting for async stuff to complete before continuing

E.g. when you need to have all the results from the database before you do something.

var wait = function(callbacks, done) {
   var counter = callbacks.length;
   var next = function() {
      if(--counter == 0) {
         done();
      }
   };
   for(var i = 0; i < callbacks.length; i++) {
      callbacks[i](next);
   }
}

Example usage (if you prefer, imagine that these are three database calls and that you are storing the results in some higher-scope variable in each of them and then using that result in function d):

var a = function (next) {
   setTimeout( function() {
      console.log("Done A");
      next();
   }, 3000);
  };

var b = function (next) {
   setTimeout( function() {
      console.log("Done B");
      next();
   }, 2000);
  };

var c = function (next) {
   setTimeout( function() {
      console.log("Done C");
      next();
   }, 1000);
  };

var d = function () {
   console.log("All done!");
  };

wait([a, b, c], d );

Similar libraries include: Tim Caswell’s Step and Will Conant’s Flow.exec(). This code is simpler so it doesn’t use this to pass the function next(); but rather passes it explicitly. Also it needs an array, instead of accepting an arbitrary number of function arguments. The library functions do better error handling and have more features, so you might want to use them / look at them to improve the code.
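If you would rather not store results in a higher-scope variable, the same counter idea can carry them along. The following is only a sketch under my own assumptions (waitAll is a made-up name, and each task is assumed to call next(result) exactly once); it has no error handling:

```javascript
// Hypothetical variant of wait(): each task calls next(result), and done()
// receives the results in the same order as the tasks array.
var waitAll = function (tasks, done) {
  var results = [];
  var remaining = tasks.length;
  if (remaining === 0) { return done(results); }
  tasks.forEach(function (task, index) {
    task(function next(result) {
      results[index] = result; // slot by position, not by completion order
      if (--remaining === 0) { done(results); }
    });
  });
};

waitAll([
  function (next) { setTimeout(function () { next('A'); }, 30); },
  function (next) { setTimeout(function () { next('B'); }, 10); }
], function (results) {
  console.log(results.join(', ')); // prints A, B even though B finished first
});
```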

3.2 Limiting concurrency

E.g. reading a gazillion files but just running 30 reads at a time not to exhaust the available file handles. You have a list of operations to do, you want to do them all but can’t start/don’t want to have more than max_concurrency number of the operations running simultaneously.

I call this the Pile, but there probably is a better name for it. Put your stuff in the pile, run it all, and finally call done() when everything is done. The main difference from simple completion counters like wait() above is that this code limits concurrent execution, which is necessary in some cases (e.g. reading files).

var Pile = function() {
   this.pile = [];
   this.concurrency = 0;
   this.done = null;
   this.max_concurrency = 10;
}
Pile.prototype = {
  add: function(callback) {
   this.pile.push(callback);
  },
  run: function(done, max_concurrency) {
      this.done = done || this.done;
      this.max_concurrency = max_concurrency || this.max_concurrency;
      var that = this;
      var next = function() {
         that.concurrency--;
         // finished only when nothing is queued and nothing is still running,
         // so done() cannot fire early if tasks complete out of order
         if (that.pile.length == 0 && that.concurrency == 0) {
            that.done();
         } else {
            that.run();
         }
      };
      while(this.concurrency < this.max_concurrency && this.pile.length > 0) {
         this.concurrency++;
         var callback = this.pile.shift();
         callback(next);
      }
   }
};

Example usage (add 20 functions, then run them with a concurrency of 5 at a time). Again, imagine that setTimeout is an async I/O call.

Note: you have to call next() when you’re done.

var pilex = new Pile();

var counter = 0;

for(var i = 0; i < 20; i++) {
   pilex.add( function test(next) {
      var now = new Date().getTime();
      setTimeout( function() {
         counter++;
         console.log(counter +" Hello world");
         next();
      }, 5000);
     }
   );
}
pilex.run(function() {console.log("Done "+counter);}, 5);


3.3 Pooling and reusing expensive, persistent resources

I recommend using node-pool, since the management code is rather involved if you want to timeout/renew objects in the pool.

3.4 Running arbitrary workflows when dependencies are matched

If you can split your overall task into several independent async workflows, then Conductor seems like a nice solution since it does dependency resolving for you.

4. More good basic node.js patterns/snippets?

Leave a comment, write a Gist, write a blog post or send me a link to your repository + explain what it is and when/why it should be used. I want your code, will acknowledge your stuff and will keep periodically updating this page since I want to use it for my own reference/reminder. Thanks!




1
Feb 11

Understanding the node.js event loop

The first basic thesis of node.js is that I/O is expensive:



So the largest waste with current programming technologies comes from waiting for I/O to complete. There are several ways in which one can deal with the performance impact (from Sam Rushing):

  • synchronous: you handle one request at a time, each in turn. pros: simple cons: any one request can hold up all the other requests
  • fork a new process: you start a new process to handle each request. pros: easy cons: does not scale well, hundreds of connections means hundreds of processes. fork() is the Unix programmer’s hammer. Because it’s available, every problem looks like a nail. It’s usually overkill
  • threads: start a new thread to handle each request. pros: easy, and kinder to the kernel than using fork, since threads usually have much less overhead cons: your machine may not have threads, and threaded programming can get very complicated very fast, with worries about controlling access to shared resources.

The second basic thesis is that thread-per-connection is memory-expensive: [e.g. that graph everyone shows about Apache sucking up memory compared to Nginx]

Apache is multithreaded: it spawns a thread per request (or process, it depends on the conf). You can see how that overhead eats up memory as the number of concurrent connections increases and more threads are needed to serve multiple simultaneous clients. Nginx and Node.js are not multithreaded, because threads and processes carry a heavy memory cost. They are single-threaded, but event-based. This eliminates the overhead created by thousands of threads/processes by handling many connections in a single thread.

Node.js keeps a single thread for your code…

It really is a single thread running: you can’t do any parallel code execution; doing a “sleep” for example will block the server for one second:

var now = new Date().getTime();
while(new Date().getTime() < now + 1000) {
   // do nothing
}

So while that code is running, node.js will not respond to any other requests from clients, since it only has one thread for executing your code. Or if you had some CPU-intensive code, say, for resizing images, that would also block all other requests.
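You can watch the blocking happen with a timer. In this sketch (the variable names and the 10 ms / 200 ms figures are arbitrary), a callback is requested for 10 ms from now, but it cannot run until the busy loop lets go of the thread:

```javascript
var scheduledAt = new Date().getTime();
var observedDelay = null;

// ask for a callback in 10 ms...
setTimeout(function () {
  observedDelay = new Date().getTime() - scheduledAt;
  console.log('timer fired after ' + observedDelay + ' ms');
}, 10);

// ...then hog the single thread for 200 ms
var now = new Date().getTime();
while (new Date().getTime() < now + 200) {
  // do nothing
}
```

The reported delay is at least 200 ms, never 10: the timer callback is just another piece of your code waiting for its turn.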

…however, everything runs in parallel except your code

There is no way of making code run in parallel within a single request. However, all I/O is evented and asynchronous, so the following won’t block the server:

c.query(
   'SELECT SLEEP(20);',
   function (err, results, fields) {
     if (err) {
       throw err;
     }
     res.writeHead(200, {'Content-Type': 'text/html'});
     res.end('<html><head><title>Hello</title></head><body><h1>Return from async DB query</h1></body></html>');
     c.end();
    }
);

If you do that in one request, other requests can be processed just fine while the database is running its sleep.

Why is this good? When do we go from sync to async/parallel execution?

Having synchronous execution is good, because it simplifies writing code (compared to threads, where concurrency issues have a tendency to result in WTFs).

In node.js, you aren’t supposed to worry about what happens in the backend: just use callbacks when you are doing I/O; and you are guaranteed that your code is never interrupted and that doing I/O will not block other requests without having to incur the costs of thread/process per request (e.g. memory overhead in Apache).

Having asynchronous I/O is good, because I/O is more expensive than most code and we should be doing something better than just waiting for I/O.

An event loop is “an entity that handles and processes external events and converts them into callback invocations”. So I/O calls are the points at which Node.js can switch from one request to another. At an I/O call, your code saves the callback and returns control to the node.js runtime environment. The callback will be called later when the data actually is available.
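A small sketch of that hand-off, using a directory listing as the I/O call (any asynchronous call would do; the order array is only there to record what ran when):

```javascript
var fs = require('fs');
var order = [];

order.push('before I/O call');
// the callback is saved and readdir returns immediately
fs.readdir('.', function (err, files) {
  order.push('callback');
  console.log(order.join(' -> '));
  // prints before I/O call -> after I/O call -> callback
});
order.push('after I/O call');
```

The callback always runs last, because it can only be invoked once the current code has finished and control is back in the event loop.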

Of course, on the backend, there are threads and processes for DB access and process execution. However, these are not explicitly exposed to your code, so you don’t need to worry about them beyond knowing that I/O interactions, e.g. with the database or with other processes, will be asynchronous from the perspective of each request, since the results from those threads are returned via the event loop to your code. Compared to the Apache model, there are far fewer threads and much less thread overhead, since threads aren’t needed for each connection; just when you absolutely positively must have something else running in parallel, and even then the management is handled by Node.js.

Other than I/O calls, Node.js expects that all requests return quickly; e.g. CPU-intensive work should be split off to another process with which you can interact as with events, or by using an abstraction like WebWorkers. This (obviously) means that you can’t parallelize your code without another thread in the background with which you interact via events. Basically, all objects which emit events (e.g. are instances of EventEmitter) support asynchronous evented interaction and you can interact with blocking code in this manner e.g. using files, sockets or child processes all of which are EventEmitters in Node.js. Multicore can be done using this approach; see also: node-http-proxy.
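As a sketch of splitting CPU-heavy work off to another process, the following runs a snippet in a second node process and gets the result back through events. runInChild is a made-up helper, and passing the work as a script string via node's -e flag is just a shortcut for the example; in a real application you would more likely run a separate worker file.

```javascript
var spawn = require('child_process').spawn;

// Hypothetical helper: run a CPU-heavy snippet in a separate node process
// and hand its output back through events.
function runInChild(script, callback) {
  var child = spawn(process.execPath, ['-e', script]);
  var output = '';
  var finished = false;
  function finish(err) {
    if (finished) { return; } // make sure the callback fires only once
    finished = true;
    callback(err, output.trim());
  }
  child.stdout.on('data', function (chunk) {
    output += chunk.toString(); // data arrives as events while the child works
  });
  child.on('error', finish);
  child.on('close', function (code) {
    finish(code === 0 ? null : new Error('child exited with code ' + code));
  });
}

// the summing loop runs in the child; this process stays free to serve requests
runInChild('var s = 0; for (var i = 0; i < 1e7; i++) { s += i; } console.log(s);',
  function (err, result) {
    if (err) { throw err; }
    console.log('child result: ' + result);
  });
```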

Internal implementation

Internally, node.js relies on libev to provide the event loop, which is supplemented by libeio which uses pooled threads to provide asynchronous I/O. To learn even more, have a look at the libev documentation.

So how do we do async in Node.js?

Tim Caswell describes the patterns in his excellent presentation:

  • First-class functions. E.g. we pass around functions as data, shuffle them around and execute them when needed.
  • Function composition. Also known as having anonymous functions or closures that are executed after something happens in the evented I/O.
  • Callback counters. For evented callbacks, you cannot guarantee that I/O events are generated in any particular order. So if you need multiple queries to complete, usually you just keep count of any parallel I/O operations, and check that all the necessary operations have completed when you absolutely must wait for the result; e.g. by counting the number of returned DB queries in the event callback and only going further when you have all the data. The queries will run in parallel provided that the I/O library supports this (e.g. via connection pooling).
  • Event loops. As mentioned earlier, you can wrap blocking code into an evented abstraction e.g. by running a child process and returning data as it is processed.

It really is that simple!