Series: JavaScript Iterators and Generators
Introduction
Here we are! It’s time for Asynchronous Generators! Are you excited? I am excited 😃.
This somewhat exotic type of function was added to the language at the same time as async iterators, and it answers the question: how should async functions and generators compose?
If you think about it, async generators are still generators, so there shouldn’t be many conceptual or practical differences from their sync counterpart. But async generators are async functions too, so the await keyword can be used inside them.
That’s why the previous question can be reworded as follows: how can the yield keyword work in conjunction with the await keyword?
Asynchronous Generators
The answer involves state machines and queues, plus a clever algorithm that is damn well explained here.
Trying to summarize and simplify: each time the next/throw/return method is called, a Promise is immediately returned and the call itself is enqueued. The async generator instance can be in two states: PAUSED or RUNNING. The PAUSED state means that the instance is waiting to start or is paused on a yield. In that case, the call just enqueued will resume the generator, and the next yielded value, or the next thrown error, will be used to fulfil, or reject, the Promise previously returned by that call.
But the generator could be in the RUNNING state, meaning that it is awaiting some async operation. In that case it is still handling a previous iteration method call, so the new one mustn’t have any side effect on the generator instance’s state: it should just be enqueued.
The circle is closed because each time a yield keyword is encountered, if there is at least one enqueued call, the generator won’t pause. The oldest enqueued call will be dequeued and handled, and the generator will continue with its flow.
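Here is a minimal sketch of that queuing behaviour (the sequence generator is purely illustrative):

async function* sequence() {
  yield await new Promise(res => setTimeout(res, 500, 'first'));
  yield 'second';
}

const ait = sequence();
// both calls immediately return a Promise and are enqueued,
// even though the generator is still awaiting the timeout
const p1 = ait.next();
const p2 = ait.next();
p1.then(({ value }) => console.log(value)); // 'first', after ~500ms
p2.then(({ value }) => console.log(value)); // 'second', right after 'first'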
Although it should be clear enough at this point, I want to stress that objects returned by async generators implement the AsyncIterator interface we saw in the last article. Therefore they are perfect for creating async iterables.
Moreover, there are further excellent reasons to prefer async generators over a manual implementation of the async iteration interfaces: the implicit queue of iteration method calls, which counteracts possible high consumer pressure; the fact that all the returned async iterators are async iterables as well (remember the return this convention); and, last but not least, the impossibility of returning a Promise as an iteration result’s value, following the spec’s hints.
Our first async generator
Let’s revisit the remotePostsAsyncIteratorsFactory we built in the previous article, this time written as an async generator. The code is self-explanatory enough:
async function* remotePostsAsyncGenerator() {
  let i = 1;
  while (true) {
    const res = await fetch(`https://jsonplaceholder.typicode.com/posts/${i++}`)
      .then(r => r.json());
    // when no more remote posts are available,
    // we break the infinite loop and the async iteration ends
    if (Object.keys(res).length === 0) { break; }
    yield res;
  }
}
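Since the returned object is a well-formed async iterable, consuming it is as simple as the following sketch (assuming a runtime where fetch is available):

(async () => {
  for await (const post of remotePostsAsyncGenerator()) {
    console.log(post.title);
  }
  // the loop ends gracefully when the generator breaks out of its while
})();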
The act of yielding something we do not yet have could sound weird to you, but remember that a temporally ordered set of Promises is involved, each of which will be fulfilled only when its data becomes available. Surely you will have noticed the qualitative leap with respect to the previous manual implementation of the async iteration interfaces. Although async generators seem scary and intimidating at first, they help us write very clear, compact and effective code, despite the considerable level of abstraction.
yield-on, await-off
Let’s become more confident with the behaviour of these two keywords.
As we did before, you can wait for the completion of a Promise, then yield its fulfilment value:
const asyncSource = {
  async *[Symbol.asyncIterator]() {
    const v = await new Promise(res => setTimeout(res, 1000, 1));
    yield v;
  }
};
You can do the same with only one line, like a true *pro*grammer:
const asyncSource = {
  async *[Symbol.asyncIterator]() {
    yield await new Promise(res => setTimeout(res, 1000, 1));
  }
};
In this case, you can avoid using the await keyword at all! That’s because, when it comes to async iteration, an IteratorResult’s value should never be a Promise. Therefore, async generators won’t let a Promise pass: each yielded Promise will be implicitly awaited:
const asyncSource = {
  async *[Symbol.asyncIterator]() {
    yield new Promise(res => setTimeout(res, 1000, 1));
  }
};
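Whichever of the three variants you choose, the consumer sees no difference. A quick check:

(async () => {
  for await (const value of asyncSource) {
    console.log(value); // 1, after about a second
  }
})();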
Conversely, if you plan to insert a Promise into the generator using the next method, you will have to await it manually:
const asyncSource = {
  async *[Symbol.asyncIterator]() {
    // the good, old operator precedence...
    const nextArgument = await (yield ...);
  }
};
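For completeness, here is a runnable sketch of that situation; the 'ready' value and the 42 are just illustrative:

const asyncSource = {
  async *[Symbol.asyncIterator]() {
    // the Promise passed to the second next() call is NOT implicitly awaited
    const nextArgument = await (yield 'ready');
    yield nextArgument;
  }
};

(async () => {
  const ait = asyncSource[Symbol.asyncIterator]();
  await ait.next(); // starts the generator, which pauses on the first yield
  const { value } = await ait.next(Promise.resolve(42));
  console.log(value); // 42, thanks to the manual await
})();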
The next script is a good test to check whether you have understood how await and yield work together:
const asyncSource = {
  async *[Symbol.asyncIterator]() {
    console.log(await new Promise(res => setTimeout(res, 1000, 1)));
    console.log(await new Promise(res => setTimeout(res, 2000, 2)));
    yield 42;
  }
};

const ait = asyncSource[Symbol.asyncIterator]();
ait.next().then(({ value }) => console.log(value));
What will it log?
// 1
// 2
// 42
You could be tempted to think that, since the 42 is already known, it will be returned almost immediately. But this is not how async generators work: a Promise is instantly returned by the next method, but its resolution is deferred until the first yield is encountered.
Every other consideration we made about sync generators, like the behaviour of the try-catch block, generator delegation and so on, also applies to the async variant. Therefore, I chose to follow a DRY approach and won’t repeat it all here. Just keep in mind that async generators work in the async realm, so fulfilled or rejected Promises take the place of spatial values and exceptions.
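As a quick reminder of how this translates, here is a sketch of the try-catch behaviour in the async variant (the 'boom' error is illustrative):

async function* guarded() {
  try {
    yield 1;
  } catch (e) {
    // the error injected by the throw method surfaces here as an exception
    yield `recovered from: ${e}`;
  }
}

(async () => {
  const ait = guarded();
  console.log((await ait.next()).value);        // 1
  console.log((await ait.throw('boom')).value); // 'recovered from: boom'
})();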
Be aware that the yield* keyword supports both async and sync iterables. It uses a conversion process similar to the one adopted by the for-await-of loop.
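For instance, the following sketch delegates to both a sync and an async iterable:

async function* merged() {
  // sync iterable: each value is converted, just as for-await-of would do
  yield* [1, 2];
  // async iterable: delegated as-is
  yield* (async function* () { yield 3; })();
}

(async () => {
  for await (const v of merged()) console.log(v); // 1, 2, 3
})();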
A more complex real-life example
The following is a possible approach for the case where a remote resource imposes a very strict limit on the number of requests: we have to fetch a lot of data with each request, but we want to handle it in smaller chunks:
// do you remember it?
function* chunkify(array, n) {
  yield array.slice(0, n);
  array.length > n && (yield* chunkify(array.slice(n), n));
}

async function* getRemoteData() {
  let hasMore = true;
  let page;
  while (hasMore) {
    // note: URL is a placeholder and the 'params' option is illustrative;
    // the native fetch would need the page encoded in the query string
    const { next_page, results } =
      await fetch(URL, { params: { page } }).then(r => r.json());
    // yield 5 elements with each iteration
    yield* chunkify(results, 5);
    hasMore = next_page != null;
    page = next_page;
  }
}
At each iteration, getRemoteData will return one chunk of the previously fetched array, thanks to the delegation to chunkify. Sometimes the generator will fetch another batch of data from the remote source, sometimes it will not. There will be no difference from the consumer’s point of view.
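In code, a possible consumer could look like this sketch (assuming URL points to a real paginated endpoint):

(async () => {
  for await (const chunk of getRemoteData()) {
    // each chunk is an array of at most 5 results,
    // regardless of how the data was fetched behind the scenes
    console.log(chunk);
  }
})();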
It’s noteworthy that each IteratorResult is strictly dependent on the previous ones: each iteration could be the last, either for the current array of results or for the whole sequence. Furthermore, to get the next batch of data, we need the previous next_page.
It would be impossible to maintain consistency without the queuing logic of async generators.
Generators Piping
There is a powerful pattern, conceptually similar to Unix piping, based on the fact that each generator produces an iterator-iterable that can be iterated.
The overall idea is to combine multiple generators sequentially, like a pipe combines multiple processes, feeding each generator with the values yielded by the one that precedes it. Because each generator is able to do whatever it wants with those values, many pretty interesting things can be done, both sync and async 😃. From simply logging them for debugging purposes, to completely transforming them. From yielding only those that satisfy a given condition, to yielding a composition of them. Two composed generators are not required to yield at the same frequency: the outer one is able to iterate over the inner one more than once before yielding something, but it is equally capable of producing more than one value from a single chunk of data received from its source.
Synchronous piping
To better understand the idea, let’s start with something easy and sync. Let’s say we have a collection of numbers and we want to multiply each one by 2, then subtract 1 and, finally, remove all non-multiples of 3:
function* multiplyByTwo(source) {
  for (const value of source) {
    yield value * 2;
  }
}

function* minusOne(source) {
  for (const value of source) {
    yield value - 1;
  }
}

function* isMultipleOfThree(source) {
  for (const value of source) {
    !(value % 3) && (yield value);
  }
}

const collection = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9];

// here the piping
[...isMultipleOfThree(minusOne(multiplyByTwo(collection)))]; // [3, 9, 15]
There are some considerations to make.
The first concerns the high reusability, maintainability and extensibility of this approach. Each generator expects an iterable, that is, an interface, so it knows nothing about its source. Furthermore, it produces another iterable, so it doesn’t need to know anything about its consumer. You can compose these functions however you want, reusing them whenever you need.
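For instance, the very same generators can be recomposed in a different order without touching their code:

// filter the multiples of 3 first, then subtract 1
[...minusOne(isMultipleOfThree(multiplyByTwo(collection)))]; // [-1, 5, 11, 17]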
The second is about memory efficiency: unlike array methods such as map, flatMap and filter, there isn’t a collection that is recreated from scratch at each step of the pipeline. Each value flows from the source to the end, or to the generator that will discard it. This bears a remarkable resemblance to transducers.
Asynchronous piping
This is also applicable to the async realm thanks to async generators. Let’s consider a fairly widespread example, that is, reading a file using Node.js streams, which are async iterables:
// IIAFE: Immediately Invoked Async Function Expression
(async function IIAFE() {
  const fileStream = new FileReaderByLines('test.txt');
  for await (const line of fileStream) {
    console.log('> ' + line);
  }
})();
We are creating an instance of the FileReaderByLines class, which we will examine shortly, and then asynchronously iterating over it thanks to the for-await-of loop.
Here is the FileReaderByLines class:
const fs = require('fs');

// transform chunks into lines
async function* chunksToLines(chunksSource) {
  let previous = '';
  for await (const chunk of chunksSource) {
    previous += chunk;
    let eolIndex;
    while ((eolIndex = previous.indexOf('\n')) >= 0) {
      const line = previous.slice(0, eolIndex + 1);
      yield line;
      previous = previous.slice(eolIndex + 1);
    }
  }
  if (previous.length > 0) {
    yield previous;
  }
}

class FileReaderByLines {
  constructor(file) {
    this.file = file;
  }

  [Symbol.asyncIterator]() {
    const readStream = fs.createReadStream(this.file, {
      encoding: 'utf8',
      highWaterMark: 1024
    });

    // PIPING:
    // readStream produces an async iterable over chunks of the file.
    // We feed 'chunksToLines' with it.
    // 'chunksToLines' produces an async iterable that we return.
    return chunksToLines(readStream);
  }
}
Let me explain what is happening.
The @@asyncIterator method takes care of handling the data source, the file, using a readable stream. The call to fs.createReadStream creates the stream, setting the file to read, the encoding used by the file and the maximum size of an internal buffer: each chunk of data returned by the stream will be at most 1024 bytes long. The stream, when iterated, will return the content of the asynchronously filled buffer as it becomes available.
If the following for-await-of were used, the outcome would be the printing of the whole file, 1024 bytes at a time:
for await (const buffer of readStream) {
  console.log(buffer);
}
Instead, we are using the piping pattern to transform one 1024-byte chunk, or more if necessary, into lines. We compose the async iterable returned by fs.createReadStream with chunksToLines, an async generator that takes an async iterable and returns another one. The latter, at each async iteration, will return a line, not a chunk. It’s noteworthy that most of the time the two async iterables produce values at a different frequency.
At the IIAFE level we could choose, for some weird reason, to capitalize each line. We can easily do that by inserting another async generator into the pipeline:
async function* capitalize(asyncSource) {
  for await (const string of asyncSource) {
    yield string[0].toUpperCase() + string.slice(1);
  }
}

// IIAFE
(async function IIAFE() {
  const fileStream = new FileReaderByLines('test.txt');
  for await (const line of capitalize(fileStream)) { // <-- here the PIPING
    console.log('> ' + line);
  }
})();
In this way, we are composing the async iterable fileStream with capitalize, another async generator that takes an async iterable and returns a fresh new one. The latter, at each async iteration, will return a capitalized line.
Conclusion
That’s all folks!
We have learnt why we need Asynchronous Generators, how they work and how they can be used to build amazing things! Thanks to them, creating well-formed async iterables is a breeze. We have also learnt about both sync and async generators piping, a powerful pattern that I’m sure will expand your horizons.
If you’ve enjoyed this journey through JavaScript Iterators and Generators, please help me share it with as many JS developers as possible 💪🏻😃. And, as usual, I hope to see you again 🙂 and on twitter!
Acknowledgements
I would like to thank Nicolò Ribaudo and Marco Iamonte for the time spent helping me to improve the quality of the article.
Bibliography
- ECMAScript 2019 specification
- Exploring JS series by Dr. Axel Rauschmayer
- Reading streams via async iteration in Node.js by Dr. Axel Rauschmayer
- General Theory of Reactivity by Kris Kowal
- Asynchronous Iterators for JavaScript by TC39