Javascript, node.js and for loops

What does this code print out? Assume that console.log logs to the console.

Experiment #1: For loop

console.log('For loop');
for(var i = 0; i < 5; i++) {
  console.log(i);
}

0, 1, 2, 3, 4 - easy, right? What about this code?

Experiment #2: setTimeout

console.log('setTimeout');
for(var i = 0; i < 5; i++) {
  setTimeout(function() {console.log('st:'+i)}, 0);
}

The result is 5, 5, 5, 5, 5. What about this?

Experiment #3: Callback function

function wrap(callback) {
  callback();
}

console.log('Simple wrap');
for(var i = 0; i < 5; i++) {
  wrap(function() {console.log(i)});
}

0, 1, 2, 3, 4 -- right? (Yup.) And this?

Experiment #4: While loop emulating sleep

function sleep(callback) {
  var now = new Date().getTime();
  while(new Date().getTime() < now + 1000) {
    // do nothing
  }
  callback();
}

console.log('Sleep');
for(var i = 0; i < 5; i++) {
  sleep(function() {console.log(i)});
}

0, 1, 2, 3, 4. And this?

Experiment #5: Node.js process.nextTick

console.log('nextTick');
for(var i = 0; i < 5; i++) {
  process.nextTick(function() {console.log('nt:'+i)});
}

Well... it's 5, 5, 5, 5, 5.

Experiment #6: Delayed calls

var data = [];
for (var i = 0; i < 5; i++) {
  data[i] = function foo() {
    alert(i);
  };
}
data[0](); data[1](); data[2](); data[3](); data[4]();

Again, 5, 5, 5, 5, 5.

Ok, I'm confused. Why does this happen?

Looking at experiments #1 to #6, you can see a pattern emerge: delayed calls, whether they are made via setTimeout(), the Node.js-specific process.nextTick() or a simple array of functions, all print the unexpected result "5".

Fundamentally, the only thing that matters is at what time the function code is executed. setTimeout() and process.nextTick() ensure that the function is only executed at some later stage. Similarly, assigning functions into an array explicitly like in Experiment #6 means that the code within the function is only executed after the loop has been completed.
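To make the timing point concrete, here is a minimal sketch that runs the same function once during the loop and once after it. The names `immediate`, `stored`, `delayed` and `readI` are hypothetical and chosen only for this illustration; the loop runs three times instead of five to keep the output short.

```javascript
// Hypothetical names, for illustration only.
var immediate = [];
var stored = [];

for (var i = 0; i < 3; i++) {
  var readI = function () { return i; };
  immediate.push(readI()); // executed now, while the loop is still running
  stored.push(readI);      // kept for later, not executed yet
}

// The loop has finished, so i is already 3 when these calls run.
var delayed = stored.map(function (fn) { return fn(); });

console.log(immediate); // [ 0, 1, 2 ]
console.log(delayed);   // [ 3, 3, 3 ]
```

The function body is identical in both cases; only the moment of execution differs.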

There are three things you need to remember about Javascript:

Variable scope is based on the nesting of functions. In other words, the position of the function in the source always determines what variables can be accessed; nested functions can access their parent's variables, non-nested functions can only access the topmost, global variables.
Functions can create new scopes; the default behavior is to access previous scope.
Some functions have the side-effect of being event-driven and executed later, rather than immediately. You can emulate this yourself by storing but not executing functions, see Experiment #6.
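The first rule can be sketched as follows. The function and variable names (`topLevel`, `outer`, `inner`, `sibling`, `outerVar`) are assumptions made up for this example, not part of the experiments.

```javascript
var topLevel = 'top';

function outer() {
  var outerVar = 'outer';
  function inner() {
    // inner is nested in outer, so it can see both outerVar and topLevel
    return topLevel + '/' + outerVar;
  }
  return inner();
}

function sibling() {
  // sibling is not nested in outer, so outerVar is not visible from here
  return typeof outerVar;
}

console.log(outer());   // top/outer
console.log(sibling()); // undefined
```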

What we would expect, based on experience in other languages, is that in the for loop, calling the function would result in a call-by-value (since we are passing a primitive - an integer) and that function calls would run using a copy of that value at the time when that part of the code was "passed over" (i.e. when the surrounding code was executed). That's not what happens:

A nested function does not get a copy of the value of the variable -- it gets a live reference to the variable itself and can access it at a much later stage. So while the reference to i is valid in experiments 2, 5, and 6, the functions read the value of i at the time of their execution. That is after the loop has run (on a later turn of the event loop in experiments 2 and 5, and in the explicit calls after the loop in experiment 6), which is why they get the value 5.
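A small sketch of the "live reference" behaviour, independent of any loop. `makeReader`, `read` and `bump` are hypothetical names used only here.

```javascript
function makeReader() {
  var value = 1;
  return {
    // Both functions refer to the same variable, not to a copy of it.
    read: function () { return value; },
    bump: function () { value += 1; }
  };
}

var reader = makeReader();
var before = reader.read(); // reads the variable while it holds 1
reader.bump();              // changes the variable after read was created
var after = reader.read();  // the same function now sees the new value

console.log(before, after); // 1 2
```

If `read` had captured a copy of `value`, the second call would still return 1.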

Functions can create new scopes but they do not have to. The default behavior allows us to refer back to the previous scope (all the way up to the global scope); this is why code executing at a later stage can still access i. Because no variable i exists in the current scope, the i from the parent scope is used; because the parent has already executed, the value of i is 5.

Hence, we can fix the problem by explicitly establishing a new scope every time the loop is executed, then referring back to that new inner scope later. For variables declared with var, the only way to do this is to use an (anonymous) function and explicitly define a variable in that scope. There are two ways to do this:

Option 1) We can allow the value of i to "leak" from the previous scope, but explicitly establish a new variable j in the new scope to hold that value for future execution of nested functions:

Experiment #7: Closure with new scope establishing a new variable

console.log('new scope nexttick with value binding in new func scope');
for(var i = 0; i < 5; i++) {
  (function() {
    var j = i;
    process.nextTick(function() {console.log('nexttick-new-scope-new-bind:'+j)});
  })();
}

Resulting in 0, 1, 2, 3, 4. Accessing j returns the value of i at the time when the wrapping closure was executed - and as you can see, we are executing that function immediately by appending ().

We need that wrapping function because, for variables declared with var, only functions establish a new scope. In fact, we are establishing five new scopes when the loop is run, each iteration creating a scope with its own, separate variable j with a different value (0, 1, 2, 3, 4), each accessible from the inner closure at the time the code in it is run. Without the wrapping closure, the j referred to in the innermost closure would live in the same scope as i; it would then have the value of i at the time of execution, which would be 5.
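One way to check that each iteration really gets its own j is to modify one of them and see that the others are unaffected. This is a sketch only; `counters` is a hypothetical name, and the functions are stored in an array instead of being scheduled so the result can be read synchronously.

```javascript
var counters = [];

for (var i = 0; i < 3; i++) {
  (function () {
    var j = i; // a separate j for every call of the wrapping function
    counters.push(function () { j += 10; return j; });
  })();
}

console.log(counters[0]()); // 10
console.log(counters[0]()); // 20 (the first j was changed by the previous call)
console.log(counters[1]()); // 11 (the second j was not touched)
console.log(counters[2]()); // 12
```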

Option 2) Or we can pass the value to the new scope as a parameter:

Experiment #8: setTimeout in closure with new scope

console.log('new scope');
for(var i = 0; i < 5; i++) {
  (function(i) {
    setTimeout(function() {console.log('st2:'+i)}, 0);
  })(i);
}

Resulting in 0, 1, 2, 3, 4.

Now you should remember one more rule to understand the second solution:

Functions can be passed as data; they are only evaluated when explicitly evaluated (e.g. by appending () or by using function.call or function.apply).
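As a short illustration of this rule, the sketch below stores a function without running it and then invokes it in the three ways mentioned. The function `describe` and the variable names are hypothetical.

```javascript
function describe(prefix, value) {
  return prefix + ':' + value;
}

var storedFn = describe;    // passed around as data; nothing runs here
var kind = typeof storedFn; // 'function'

var direct = storedFn('a', 1);                 // evaluated by appending ()
var viaCall = describe.call(null, 'b', 2);     // evaluated via call
var viaApply = describe.apply(null, ['c', 3]); // evaluated via apply

console.log(kind, direct, viaCall, viaApply); // function a:1 b:2 c:3
```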

So when we have (function(param) { ... })(param), we are calling the function immediately, and parameters always establish a new variable/identifier in the function scope. That allows us to use the i from the new scope in our delayed function call, since it is bound to the parameter and not to the parent scope.
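The same parameter binding also works with a named function instead of an immediately invoked anonymous one. This is a sketch of that variation, not something from the experiments above; `makeReporter`, `reporters` and `reported` are hypothetical names, and the functions are collected in an array so the result can be inspected without waiting for a timer.

```javascript
function makeReporter(i) {
  // The parameter i is a new variable that belongs to this particular call.
  return function () { return 'value:' + i; };
}

var reporters = [];
for (var i = 0; i < 5; i++) {
  reporters.push(makeReporter(i)); // the current value of the loop's i is passed in
}

// Called after the loop has finished, when the loop's own i is already 5.
var reported = reporters.map(function (fn) { return fn(); });

console.log(reported);
// [ 'value:0', 'value:1', 'value:2', 'value:3', 'value:4' ]
```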

This also means that this does NOT work (process.nextTick is interchangeable with setTimeout):

Experiment #9: Closure with new scope containing callback triggered on process.nextTick

console.log('new scope nexttick');
for(var i = 0; i < 5; i++) {
  (function() {
    process.nextTick(function() {console.log('nexttick-new-scope:'+i)});
  })();
}

5, 5, 5, 5, 5 - since i still refers to the old scope. Compare that with experiment #7, where the structure is the same but we actually establish a new variable in the wrapping closure's scope, which is then referred to by the inner code.
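For comparison, and assuming a runtime that supports ES2015 (which the experiments above do not rely on), a loop variable declared with let gets a fresh binding on every iteration, so no wrapping function is needed. The names `letCallbacks` and `letResults` are hypothetical, and the callbacks are stored rather than scheduled so the outcome can be read synchronously.

```javascript
var letCallbacks = [];

for (let k = 0; k < 5; k++) {
  // Each iteration has its own k, so each function sees a different variable.
  letCallbacks.push(function () { return k; });
}

// Called after the loop has completed.
var letResults = letCallbacks.map(function (fn) { return fn(); });

console.log(letResults); // [ 0, 1, 2, 3, 4 ]
```

The scope rules described above are unchanged; let simply creates the per-iteration scope that experiments #7 and #8 build by hand.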

Conclusion

I should note that this has nothing to do with synchronicity or asynchronicity; it is simply the way in which scope resolution works for Javascript when code execution is delayed in some manner while referring to variables defined in the parent scope of the nested code.

Comments

Gabriel Farrell: Great post, with a clear explanation I'm sure I'll refer to whenever people (myself included) get tripped up by this feature of JS.

One question: is it really true to say "this has nothing do to with synchronicity or asynchronicity"? Could this happen without asynchronous execution?

Mikito Takada: Good question. I think it is valid to say that it has nothing to do with asynchronicity as such because this is a feature of Javascript's scope handling and evaluation strategy - what Dimitry calls "Call by sharing" (e.g. scope can refer to variables which have been influenced by previous evaluations and primitives are not passed by copying but rather via "sharing").

For instance, if variables were passed "by copying" (when a structure is passed by value to a function it is completely copied), then this problem would not exist even if callbacks could be stored for later execution, since every function call in the for loop would get its own value (the value of the variable at call time, not at the time it would be run). So you could have a Javascript-like language that would avoid this problem - however, in most cases we want things to work exactly like Javascript does them; it's just the for loops where we would like the value of the variable to be copied at call time.

Having this kind of scope resolution/evaluation strategy makes implementing asynchronous operations easy... but I would say that it is not essential to writing asynchronous code, it just makes writing asynchronous code a lot easier.

So the technical root cause is the scope resolution/evaluation strategy. Asynchrony refers to the ability of events to occur independently of the main program flow; i.e. it is about interacting with event sources like file I/O, network I/O or user actions. Here, the fact that the code can be executed later is not the reason why the problem occurs since given alternative scope resolution/evaluation strategies this problem would not occur.

To answer your question: "Could this happen without asynchronous execution?" the answer is no, but if you ask "Could asynchronous execution be possible without this happening?" the answer is yes.

It's all a bit academic, but I think it's best to understand why the problem occurs rather than just be satisfied with saying that it is a side-effect of asynchronous execution...

Chris Jacob: THANK YOU! I was going insane trying to figure out why I kept getting the last iteration value ... I went with Experiment #7.

I tried Experiment #8 but modified as: (function(j) { ... test here without setTimeout ... }(i); It didn't work - execution wasn't in "order". Expect I need a delayed function call like setTimeout to wrap it as you said. (FYI: my output was contained inside a Facebook API call - so network delay may also be the cause of the sequencing issue).

Chris Jacob: Hi Mikito,

I have referenced your article for this StackOverflow issue:

"Saving FB.api call's response to an array with a for loop with Javascript." http://stackoverflow.com/questions/5971124/saving-fb-api-calls-response-to-an-array-with-a-for-loop-with-javascript/6195695

My answer is here: http://stackoverflow.com/questions/5971124/saving-fb-api-calls-response-to-an-array-with-a-for-loop-with-javascript/6195695#6195695

Could you please accept my answer if it's correct... or comment if I got it wrong. Cheers! ^_^

Josh: Another strategy you might mention is from underscore -- http://documentcloud.github.com/underscore/#bind

for(var i = 0; i < 5; i++) {
  process.nextTick(_.bind(function(index) {
    console.log(index);
  }, this, i));
}

This is used to both bind the 'this' parameter, and to optionally pass additional parameters (technique called currying).

Mikito Takada: Yeah, that works too, though you don't really need underscore.js to make this work. If you look at the code in underscore.js, it returns a new function which means that a new scope is established.

// Create a function bound to a given object (assigning this, and arguments,
// optionally). Binding with arguments is also known as curry.
// Delegates to ECMAScript 5's native Function.bind if available.
// We check for func.bind first, to fail fast when func is undefined.
_.bind = function(func, obj) {
  if (func.bind === nativeBind && nativeBind) return nativeBind.apply(func, slice.call(arguments, 1));
  var args = slice.call(arguments, 2);
  return function() {
    return func.apply(obj, args.concat(slice.call(arguments)));
  };
};

Josh: I agree, you don't need underscore here. To keep the code more vanilla, I'd commonly use your 'Experiment #8' approach. The benefit of _.bind or Function.bind is that you also get to control what 'this' refers to, which is a common bummer in implementing the class pattern in js. Of course, these libraries are just returning a function that calls .apply(this) with some variant of argument slicing. Since I'm usually depending on underscore, it's all there in a pretty package :: shrug ::

Happy coding (and good article)!

John Quaresma: Thanks for posting this.

It's an interesting way to frame the lexical closure problem, which is a very common javascript test given to potential job candidates. I share Mikito's concern that the overall emphasis on execution order rather than scoping and closures may mislead some naive readers. Specifically, the reason this 'problem' exists is that all inner functions will refer to the address and not the value of the variables that exist in their containing scope / closure. The only way to break this is to create a new closure, and by extension a new variable scope / address space. You've definitely addressed that here, but since it's not framed as being the main point of the post, it may be lost on some.

Andrea Giammarchi: and suddenly, you discover this problem does not exist with timers:

for(var i = 0; i < 5; ++i) {
  setTimeout(function(i){console.log(i)}, 1, i);
}
// 0, 1, 2, 3, 4

setTimeout and setInterval accept, everywhere but IE, extra arguments that are passed to the function with the value bound at definition time.

It looks like whoever created nextTick in node didn't know enough about JavaScript timers.

It is even in the specs at W3C: http://www.w3.org/TR/html5/timers.html

Ash Clarke: Thanks Mikito, this was just what I needed!