ES6 generators and async/await

The motivation for this blog post was my initial ignorance of the upcoming ES6 and its generator functions. Around Christmas 2015 I was looking for a new job, and one of the assigned test jobs required generator functions, the co library, two different databases (one SQL and the other NoSQL) and node.js. I failed; it was too many new things to learn at once. But it planted the idea to really learn it and prove that I’ve mastered it. Hence this blog post 🙂 But there are so many new things coming to JavaScript land that I still feel pretty much like a beginner 🙂

What really confused me was the yield expression and the injection of variables. So, now that I understand it, I want to explain it 🙂

I have also created a git repository with sample code. Feel free to fork it and experiment.

So what is a generator function and why use it?

In short: it is a function whose execution can be paused at a yield expression. When the invoked function yields, it hands control back to its caller, so a long-running job can be split into steps that do not block the thread. It can also act as a kind of “dependency injection” (more on this later). The other advantage is that it lets you write asynchronous code which looks synchronous. That avoids callback hell, nested callbacks and so on. The code is much cleaner and easier to read and reason about.

Under the hood

The generator iterator is implemented as a state machine, provided either by the JS runtime or, if the runtime does not support generators, by Regenerator. Regenerator is a transpiler from ES6 to today’s JavaScript, so you can write future code today. If you are from the C# world, you might know that async/await is implemented in IL in pretty much the same way as Regenerator does it. It also uses a state machine, and it is explained in depth in this Pluralsight course.

Explaining how state machines work is beyond the scope of this article (and I do not know enough about them yet).

Pausing a generator

As I’ve already said, a generator is a pausable function. But it cannot be paused from the outside; a generator pauses itself when it reaches a yield expression. And once it is paused, it cannot resume itself (obviously); it needs to be resumed from the outside. More importantly, resuming a generator is more than a way to control execution flow. You can pass in a value, which becomes the result of the yield expression on which the generator was paused (more on that later; it sounds complicated, but don’t worry).

Mental Acrobatics

When I first saw some code using generators, it was something like this. My head was exploding from this craziness. I had to read it many times to actually understand it.

// Mental acrobatic (via David Walsh's blog)
function *foo(x) {
    var y = 2 * (yield (x + 1));
    var z = yield (y / 3);
    return (x + y + z);
}

var it = foo( 5 );

// note: not sending anything into `next()` here
console.log( it.next() );       // { value:6, done:false }
console.log( it.next( 12 ) );   // { value:8, done:false }
console.log( it.next( 13 ) );   // { value:42, done:true }

Basic example

"use strict";

function* myGenerator() {
    yield 1;
    yield 2;
    yield 3;
}

// create instance of an iterator
const it = myGenerator();

console.log(it.next()); // outputs {value: 1, done: false}
console.log(it.next()); // outputs {value: 2, done: false}
console.log(it.next()); // outputs {value: 3, done: false}
console.log(it.next()); // outputs {value: undefined, done: true}

As you can see, first you have to create an iterator instance. The iterator is probably yet another new concept to learn. It is the concept behind many ES6 built-in features, for example the …spread operator, the for…of loop, Map and Set. An iterator is any object which implements a .next() method returning an object with value and done properties. When you call next(), it returns the next value and moves the pointer forward. It then waits for another call of next(), so it can return another value.
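
To make the iterator idea concrete, here is a minimal hand-written iterator without any generator involved. The countdown() helper is a made-up name for illustration only. The language only requires the next() method returning { value, done }, plus a Symbol.iterator method if you want for…of and spread to accept the object.

```javascript
"use strict";

// Hypothetical helper, for illustration only: a hand-written iterator
// counting down from `start` to 1.
function countdown(start) {
    let current = start;
    return {
        // The iterator protocol: next() returns { value, done }
        next() {
            if (current > 0) {
                return { value: current--, done: false };
            }
            return { value: undefined, done: true };
        },
        // The iterable protocol: lets for...of and spread consume this object
        [Symbol.iterator]() {
            return this;
        }
    };
}

const manual = countdown(3);
console.log(manual.next()); // { value: 3, done: false }
console.log(manual.next()); // { value: 2, done: false }

// for...of and spread call next() for you until done is true
const viaSpread = [...countdown(3)];
console.log(viaSpread); // [ 3, 2, 1 ]
```

A generator function gives you exactly this kind of object, only without the manual bookkeeping.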

But calling next() manually is tedious, especially when you do not know in advance how many times you will have to call next() until you reach the end. You probably want some helper function which will automate it for you. I am using 2 helper functions for this, run() which is used for running generators only and spawn() which handles promises. It will all be explained, don’t worry.

Side note: When you instantiate the generator iterator, you do not use the new keyword. Explanation is here. You invoke it as a normal function.

Running the examples in the browser

The easiest way to run the examples is to copy-paste the code into Regenerator and click the Run button 🙂 Or you can download the code samples repository, run npm install and then play with it.

What does .next() do?

Calling a generator function does not execute its body immediately; an iterator object for the function is returned instead. When the iterator’s next() method is called, the generator function’s body is executed until the first yield expression, which specifies the value to be returned from the iterator or, with yield*, delegates to another generator function. The next() method returns an object with a value property containing the yielded value and a done property which indicates whether the generator has yielded its last value as a boolean. Calling the next() method with an argument will resume the generator function execution, replacing the yield expression where execution was paused with the argument from next().

(Quotation from MDN)

So, the number of next() calls needed to run a generator to completion is always one more than the number of yields it passes through. The first next() is the “extraneous one” which starts the iterator.

Before, there were two functions: next and send. (From MDN – section “Legacy generator objects”)
* next cannot receive any arguments, and is used to resume and iterate a generator object.
* send can receive one argument, which will be treated as the result of the last yielded expression. send cannot be used with a newborn generator; in other words, we can only call send after we have called next one or more times.

Later, the send() function was eliminated and replaced by an argument to next(). I hope that makes it a bit clearer. 
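
Here is a small sketch of that rule (the echo() generator is made up for this illustration). It has two yields, so it takes three next() calls to finish, and whatever you pass to the very first next() is dropped, because no yield is waiting for a value yet.

```javascript
"use strict";

function* echo() {
    const first = yield "ready";   // receives the argument of the 2nd next()
    const second = yield first;    // receives the argument of the 3rd next()
    return second;
}

const echoIt = echo();

// The first next() only starts the generator. No yield is paused yet,
// so the argument 'ignored' has nowhere to go and is dropped.
const r1 = echoIt.next("ignored");
const r2 = echoIt.next("a");
const r3 = echoIt.next("b");

console.log(r1); // { value: 'ready', done: false }
console.log(r2); // { value: 'a', done: false }
console.log(r3); // { value: 'b', done: true }
```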

What does .throw() do?

This lesser-known method can be used (surprise) to throw errors into the iterator object. When you throw an error into the iterator, the generator resumes at the yield where it was paused, and the error is thrown there. In other words: the error occurs at the exact place where the generator was paused.

function *foo() {
    try {
        var x = yield 3;
        console.log( "x: " + x ); // may never get here!
    }
    catch (err) {
        console.log( "Error was caught: " + err );
    }
}

var it = foo();
var res = it.next(); // { value:3, done:false }

// instead of resuming normally with another `next(..)` call,
// let's throw a wrench (an error) into the gears:
it.throw( "Oops!" ); // Error was caught: Oops!
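
Two details are easy to miss, so here is a small sketch of them (the resilient() and fragile() generators are made up for this illustration). First, throw() returns a { value, done } object just like next() does, so a generator that catches the error can keep yielding. Second, if the generator has no try…catch around the paused yield, the error comes straight back out of the throw() call.

```javascript
"use strict";

function* resilient() {
    let recovered;
    try {
        yield "working";
    } catch (err) {
        // The error surfaces here, at the paused yield
        recovered = yield "recovered from: " + err.message;
    }
    return recovered;
}

const resIt = resilient();
const started = resIt.next();                      // { value: 'working', done: false }
const afterThrow = resIt.throw(new Error("Oops")); // { value: 'recovered from: Oops', done: false }
const finished = resIt.next("all good");           // { value: 'all good', done: true }

// A generator without try...catch does not swallow the error:
// it propagates back out of the throw() call.
function* fragile() {
    yield 1;
}

const fragileIt = fragile();
fragileIt.next();
let escaped = null;
try {
    fragileIt.throw(new Error("Not handled"));
} catch (err) {
    escaped = err.message; // 'Not handled'
}

console.log(started, afterThrow, finished, escaped);
```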

Yield and yield*

The yield keyword is in fact an expression, which means that its value is computed at runtime. You can think of yield as “pause execution and wait for a new value”. You can assign yield to a variable, and when the generator is resumed, the variable will have that new value.

"use strict";

// Example of resuming yielded generator and passing in a new variable

function *generator() {
    console.log('I have started');
    yield; // we yield nothing, so the returned value is undefined

    var a = yield "some value";
    console.log('a is now ', a); // 'a' will contain value passed into generator when it was resumed
}

(function(){
    var iterator = generator();
    var firstValue = iterator.next(); 
    // outputs {done: false, value: undefined}
    console.log('firstValue', firstValue); 

    var secondValue = iterator.next();
    // outputs {done: false, value: "some value"}
    console.log('secondValue', secondValue); 

    // resumes the generator, 'a' will now get the value passed to next()
    // and the generator outputs it to console
    var thirdValue = iterator.next('bbb');
    // outputs {done: true, value: undefined}
    console.log('thirdValue', thirdValue);
})();

The difference between yield and yield*

yield* means yield to another iterator. It effectively means that the current iterator pauses its own iteration and another iterator runs instead. The caller cannot notice the difference. It is analogous to a function which calls another function, which may call yet another function. The same works with generators, but you have to use the yield* keyword. The calling generator can receive a value from the called generator via its return statement (not via yield!). In my view this is the main practical use of the return keyword inside of a generator. See the example.

"use strict";

var spawn = require('./04-spawn-function');

// Example of yielding to another generator

function* caller() {
    yield 1;
    yield 2;

    // no need to manually call .next() etc
    const a = yield* callee();
    console.log(`variable a from callee() is ${a}`);

    yield 5;
}

function* callee() {
    yield 3;
    yield 4;
    return 'A from Callee';
}

// Each generator instance is also an iterator, so we can run for...of loop
for (var v of caller()){
    console.log(v);
}

// Outputs this into console.log
// 1
// 2
// 3
// 4
// variable a from callee() is A from Callee
// 5


spawn(caller);
// Outputs this into console.log
// variable a from callee() is A from Callee
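
Two more things worth knowing about yield*, sketched below with made-up inner() and outer() generators: it can delegate to any iterable (a plain array works too), and the return value of the inner generator is only visible to yield*. A for…of loop or the spread operator consuming the inner generator directly drops it.

```javascript
"use strict";

function* inner() {
    yield "b";
    return "inner result"; // visible to yield*, invisible to for...of and spread
}

function* outer() {
    yield "a";
    const result = yield* inner(); // delegate to another generator
    yield result;
    yield* [1, 2];                 // delegate to a plain array
}

const seen = [...outer()];
console.log(seen); // [ 'a', 'b', 'inner result', 1, 2 ]

// Consumed directly, the return value of inner() is dropped
const innerOnly = [...inner()];
console.log(innerOnly); // [ 'b' ]
```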

Error handling

One of the advantages of using generators with async code is that it simplifies error handling. You can use a plain old try…catch statement and it will work. Try to do that with nested callbacks 🙂

// Error is thrown inside the generator
// but caught outside of it
function *errorFoo(){
    throw new Error('error was thrown inside the generator');
}
const it2 = errorFoo();
try {
    it2.next();
}
catch(err) {
    console.log('Something went wrong inside our generator', err);
}

More examples can be found in my repository.

Making use of it all

As I’ve already said, the big benefit of using generators is that it allows you to write asynchronous code which looks synchronous. Something like this.

// The generator is wrapped inside run() function
run(function*(){
  const articleId = '1234a';
  
  try {
    const article = yield loadArticle(articleId);
    // After article has been loaded, load comments
    // Look ma, no callbacks
    const comments = yield loadComments(articleId);
    // When comments are loaded, load article's author
    // again no callbacks 
    const author = yield loadAuthor(article.authorId);
    
    // in real life you would probably run independent requests in parallel, lol
  }
  catch (err) {
    console.log('error loading', err);
    alert('Error loading article');
  }
});

Whole code is here

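You may have noticed the run() function wrapping our generator. That is important. Its job is to run the generator function until all yields are exhausted. It expects every yielded value to be a function which accepts a callback(err, value). The full source code is on Gitlab.

"use strict";

// Runs a generator whose yields produce "continuables":
// functions which take a single node-style callback(err, value).
function run(/*iterator fn*/ generatorFn) {
    var iterator = generatorFn(); // [1] create the generator iterator
    next();                       // [2] start it

    function next(err, value) {   // [3] also the callback handed to each continuable
        // On error, throw it into the generator at the paused yield;
        // otherwise resume the generator with the received value.
        var continuable = err ? iterator.throw(err) : iterator.next(value);

        if (continuable.done) return; // [4] the generator has finished

        var callbackFn = continuable.value; // [5] the yielded continuable
        callbackFn(next); // it calls next(err, value) when its work is done
    }
}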
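
To see the runner in action, here is a self-contained sketch. It contains a compact version of run() in which the result of iterator.throw() is processed the same way as the result of iterator.next(). The loadArticle() and loadAuthor() loaders are hypothetical stand-ins which fake asynchronous work with setTimeout. Each returns a function taking a callback(err, value), which is the shape this kind of runner expects from every yield.

```javascript
"use strict";

// Compact callback-driven runner
function run(generatorFn) {
    const iterator = generatorFn();
    next();

    function next(err, value) {
        const continuable = err ? iterator.throw(err) : iterator.next(value);
        if (continuable.done) return;
        continuable.value(next);
    }
}

// Hypothetical loaders. Each returns a function which takes
// a node-style callback(err, value) and calls it later.
function loadArticle(id) {
    return callback => setTimeout(() => callback(null, { id, authorId: "u1" }), 0);
}

function loadAuthor(authorId) {
    return callback => setTimeout(() => callback(new Error("no author " + authorId)), 0);
}

const log = [];

run(function* () {
    try {
        const article = yield loadArticle("1234a");
        log.push("article " + article.id);
        yield loadAuthor(article.authorId); // fails, the error is thrown at this yield
        log.push("never reached");
    } catch (err) {
        log.push("caught: " + err.message);
    }
});

// Once both timers have fired, log contains:
// [ 'article 1234a', 'caught: no author u1' ]
```

Note how the failing loader ends up in the ordinary catch block, even though the error happened in a later tick.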

Injection of variables

You can think of the yield expression as a two-way communication channel. Generator yields value out, and waits for another value supplied via next. See the example.

"use strict";

function* generator(){
    // run until first yield
    const x = 1;
    const y = yield x;

    // when iterator is resumed, then 'y' will have value supplied as argument to next()
    console.log(`y is now ${y}`);

    // yield out value of x + y 
    // and wait for new value to be injected into variable z
    const z = yield (x + y);
    console.log(`z is ${z}`);

    yield (x + y + z);
}

(function(){
    const iterator = generator();
    //
    // 1) Start the iteration
    //
    // run until first yield
    console.log('iterator started');
    var status = iterator.next();

    // outputs {value: 1, done: false}
    console.log('status is', status);

    //
    // 2nd step
    //
    // send in value for 'y'
    console.log('sending in value of y = 10');
    status = iterator.next(10);

    // outputs {value: 11, done: false}
    console.log('status is', status); 

    //
    // 3rd step
    //
    // send in value for 'z'
    // this will console.log the message 'z is 100'
    console.log('sending in value of z = 100');
    status = iterator.next(100);

    // outputs {value: 111, done: false}
    console.log('status is', status);

    //
    // 4th step
    //
    // we are now at the 'yield (x + y + z)'
    // there is no assignment to variable,
    // so the injected value will be simply ignored  
    status = iterator.next(1000);

    // we are done now
    // outputs {value: undefined, done: true}
    console.log('status is', status); 

})();

Async/Await

The upcoming ES7 has a proposal for async/await (which is likely to be accepted), which is like a generator function wrapped inside the run() function. The async function is actually nice syntactic sugar – see the source code here. The async keyword will replace the wrapping run() function, and await will replace waiting for yields and variable injection. So the code will be much simpler and easier to reason about. Nice.

The code using async/await can look something like this

(async() => {
  try {
    // url to randomly generated json
    const url = 'http://www.json-generator.com/api/json/get/cozIoacyaa?indent=2';

    // fetch is a new API, better than XMLHttpRequest
    // it returns a promise, so it can be awaited
    var response = await fetch(url);
    var data = await response.json();
    console.log('data arrived', data);
  } catch (err) {
    console.log("Booo", err);
  }
})();
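
To make the relationship between the two more tangible, here is a rough sketch of a promise-driven runner next to the same logic written with async/await. The name spawnSketch() and the double() helper are made up for this illustration. It is not the spawn() helper from the repository, and it is not how a JavaScript engine actually implements async functions. It only shows that yield plus a small runner behaves much like await.

```javascript
"use strict";

// A minimal promise-driven runner, sketched for illustration only
function spawnSketch(generatorFn) {
    return new Promise((resolve, reject) => {
        const iterator = generatorFn();

        function step(method, arg) {
            let result;
            try {
                result = iterator[method](arg); // resume via next() or throw()
            } catch (err) {
                return reject(err); // error was not caught inside the generator
            }
            if (result.done) return resolve(result.value);
            // Wait for the yielded promise, then resume with its outcome
            Promise.resolve(result.value).then(
                value => step("next", value),
                err => step("throw", err)
            );
        }

        step("next");
    });
}

const double = n => Promise.resolve(n * 2); // hypothetical async operation

// Generator version: yield plays the role of await
const viaGenerator = spawnSketch(function* () {
    const a = yield double(2);
    const b = yield double(a);
    return a + b;
});

// async/await version of the same logic
const viaAsync = (async () => {
    const a = await double(2);
    const b = await double(a);
    return a + b;
})();

Promise.all([viaGenerator, viaAsync]).then(results => {
    console.log(results); // [ 12, 12 ]
});
```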

Performance

It may not be obvious at first sight, but generators still run in the same thread as normal JavaScript. And they require additional resources, so they are actually less performant than plain functions. Their real value is the ability to pause execution, implement lazy evaluation and infinite sequences, and deal elegantly with asynchrony.
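
As a small sketch of the lazy evaluation and infinite sequences part (naturals() and take() are names made up for this illustration): the generator below never terminates on its own, yet nothing hangs, because each value is computed only when the consumer asks for it.

```javascript
"use strict";

// Infinite sequence: values are computed only when somebody asks for them
function* naturals() {
    let n = 1;
    while (true) {
        yield n++;
    }
}

// Hypothetical helper: take at most `count` values from any iterable
function* take(count, iterable) {
    if (count <= 0) return;
    let taken = 0;
    for (const value of iterable) {
        yield value;
        if (++taken >= count) return;
    }
}

const firstFive = [...take(5, naturals())];
console.log(firstFive); // [ 1, 2, 3, 4, 5 ]
```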

Study materials

Some of the code samples were actually taken from the resources below

Code samples – https://gitlab.com/DavidVotrubec/learn-es6-generators/
Iterator object – https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Iteration_protocols#iterator
MDN article on generators – https://developer.mozilla.org/en/docs/Web/JavaScript/Reference/Statements/function*
Introduction to ES6 generators – https://davidwalsh.name/es6-generators – read also the comments below the article, they are explanatory. Actually the whole series is great.
Diving deeper with ES6 generators – https://davidwalsh.name/es6-generators-dive
ES6 tools – https://github.com/addyosmani/es6-tools
How generators work – http://x-team.com/2015/04/generators-work/
Common misconceptions about generators – https://strongloop.com/strongblog/how-to-generators-node-js-yield-use-cases/
Javascript event loop and call stack explained – https://vimeo.com/96425312 – great video, you really want to see it
Javascript and state machines – http://www.skorks.com/2011/09/why-developers-never-use-state-machines/
Async/Await Proposal – https://tc39.github.io/ecmascript-asyncawait/
Sample code for async/await – http://rossboucher.com/await/#/
Ember-Concurrency – http://ember-concurrency.com/#/docs

What does a generator actually look like? Like this, yeah!

[Image: generator]