
Series: JavaScript Iterators and Generators

Introduction

Asynchronous Iterators and the corresponding Asynchronous Generators are a fairly recent addition to the JavaScript language (we are talking about ES2018). They landed to solve a subtle but important problem.

We have seen that each iteration step performed with a synchronous iterator returns a {done, value} object, called the iterator result, where the done field is a boolean flag indicating whether the end of the iteration has been reached. Therefore, synchronous iterators are perfect for synchronous data sources. What is meant by a synchronous source? It is a data source that, when it receives the request for the next element, is immediately able to determine whether the iteration has reached its end.

Pay attention to this detail: the source is synchronous, not the data, which can be synchronous or asynchronous. Quick demonstration with an array of numbers and Promises:

const array = [
    new Promise(res => setTimeout(res, 1000, 1)),
    new Promise(res => setTimeout(res, 2000, 2)),
    new Promise(res => setTimeout(res, 3000, 3)),
    new Promise(res => setTimeout(res, 4000, 4)),
    5,
    6,
    7,
    8,
]


;(async () => {

    for(const v of array) {
        console.log(await v); // 1 2 3 4 5 6 7 8
    }

})();

The synchronous iterator provided by the array is enough because, even if the returned value may be asynchronous, the collection is always able to determine its own state synchronously, at the time of each request. Each time the next method is implicitly called by the for-of loop, the source knows whether there is still an element to return, so it is able to set the done field immediately.
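To make this concrete, here is a minimal sketch that drives an array's synchronous iterator by hand. The names pending and it are only illustrative and are not part of the example above.

```javascript
// A sync source holding one async value and one sync value.
const pending = [Promise.resolve(1), 2];

// Ask the array for its synchronous iterator and step through it manually.
const it = pending[Symbol.iterator]();

const first = it.next();
console.log(first.done);                     // false, known immediately
console.log(first.value instanceof Promise); // true, only the value is async

const second = it.next();
console.log(second);                         // { value: 2, done: false }

const third = it.next();
console.log(third);                          // { value: undefined, done: true }
```

The done flag is available as soon as next returns, whether or not the value has settled.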

Note that this is possible only if the collection is completely present in memory. However, it is common to have to interface with external data sources. They are usually represented by an entity that exposes an asynchronous API based on the concept of event or, thanks to some layers of abstraction, on the concept of stream. Unfortunately, synchronous iterators cannot be used to interface with them, because this type of iterator forces us to determine the end of the iteration synchronously. Those entities do not contain all the data. Perhaps they contain a small part of it, but holding everything is often not physically possible, so they are not able to know whether the next piece of data will be the last when it is requested.

This is where asynchronous iteration comes into play, for which the resolution of the done flag is asynchronous.  

Asynchronous Iteration

This type of iteration is based on two new interfaces.  

The AsyncIterable

The AsyncIterable interface defines what an entity has to implement to be considered an async iterable. The specification requires the @@asyncIterator method, which returns an object that implements the AsyncIterator interface. What is @@asyncIterator? It is a specific Symbol, like @@iterator, and we can find it on the Symbol constructor: Symbol.asyncIterator.

const AsyncIterable = {
    [Symbol.asyncIterator]() {
        return AsyncIterator;
    }
}

 

The AsyncIterator

The big difference between an AsyncIterator and its sync counterpart is what the three methods (next, return, throw) return: a Promise. Their purpose remains basically the same. Both the next and return methods should return a promise that is going to fulfil with an IteratorResult object. In contrast, the promise returned by the throw method is typically a rejected one, with the value passed as the argument being the rejection reason.

const AsyncIterator = {
    next() {
        return Promise.resolve(IteratorResult);
    },
    return() {
        return Promise.resolve(IteratorResult);
    },
    throw(e) {
        return Promise.reject(e);
    }
}
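The two skeletons above can be combined into something runnable. What follows is only a sketch under stated assumptions: delay, createCounter and collect are hypothetical names introduced for illustration, and the timer merely simulates some asynchronous work.

```javascript
// Hypothetical helper: resolves with `value` after `ms` milliseconds.
const delay = (ms, value) => new Promise(res => setTimeout(res, ms, value));

// A hypothetical async iterable that counts from 1 up to `limit`.
function createCounter(limit) {
    return {
        [Symbol.asyncIterator]() {
            let current = 0;
            return {
                // fulfils with an IteratorResult once the simulated async work is done
                async next() {
                    await delay(10);
                    current++;
                    return current <= limit
                        ? { done: false, value: current }
                        : { done: true, value: undefined };
                },
                // lets a consumer stop early; later calls to next report done
                async return(value) {
                    current = limit;
                    return { done: true, value };
                },
            };
        },
    };
}

// Drains an async iterable by calling next manually.
async function collect(asyncIterable) {
    const ait = asyncIterable[Symbol.asyncIterator]();
    const values = [];
    let result = await ait.next();
    while (!result.done) {
        values.push(result.value);
        result = await ait.next();
    }
    return values;
}

collect(createCounter(3)).then(values => console.log(values)); // [ 1, 2, 3 ]
```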

 

The IteratorResult

There isn’t an async counterpart for this interface: the old IteratorResult has everything we need to identify each iteration result. We simply wrap it inside a Promise to be able to resolve the done flag asynchronously. The only thing to keep in mind is a limitation that concerns the value field: it should never be a promise or a thenable. Otherwise we would end up with something dangerously similar to a promise (the one returned by the AsyncIterator methods) of a promise (the one inside the value field of the IteratorResult), a concept that JavaScript has always kept well away from. On the other hand, always finding a concrete, already available value inside the fulfilled IteratorResult ensures greater temporal consistency between iterations.

Asynchronous Iterators

Let’s implement an async iterator factory to iterate over a remote API:

function remotePostsAsyncIteratorsFactory() {
    let i = 1;
    let done = false;

    const asyncIterableIterator = {
        // being 'async', the next method will always return a Promise
        async next() {

            // do nothing if the source is already exhausted
            if (done) {
                return { done: true, value: undefined };
            }

            const res = await fetch(`https://jsonplaceholder.typicode.com/posts/${i++}`)
                                .then(r => r.json());

            // an empty object means that the posts source has ended
            if (Object.keys(res).length === 0) {
                done = true;
                return { done: true, value: undefined };
            }

            return { done: false, value: res };
        },
        [Symbol.asyncIterator]() {
            return this;
        }
    }

    return asyncIterableIterator;
}

There should be nothing here that you are not able to understand. The next method always returns a Promise, as the interface requires. The Promise is fulfilled after the data has been fetched, which is what lets us know when the iteration is over. Note that I’ve added the @@asyncIterator method to the returned iterator. That’s because all async iterators should be async iterables, following the example of their sync counterparts.

Let’s use it:

;(async() => {

    const ait = remotePostsAsyncIteratorsFactory();

    await ait.next(); // { done:false, value:{id: 1, ...} }
    await ait.next(); // { done:false, value:{id: 2, ...} }
    await ait.next(); // { done:false, value:{id: 3, ...} }
    // ...
    await ait.next(); // { done:false, value:{id: 100, ...} }
    await ait.next(); // { done:true, value:undefined }

})();

I think the code is sufficiently self-explanatory.  

The for-await-of loop

The async counterpart of the for-of loop is the for-await-of loop, which helps us a lot when iterating async sources, because we need to manually handle neither each async IteratorResult nor the async end of the iteration. It can be used only inside async contexts, like an IIAFE (immediately invoked async function expression), and it is able to handle sync sources too. First of all, it will try to call the @@asyncIterator method to get an async iterator to iterate, but it will fall back on the @@iterator method when the source given to it is synchronous.

;(async function IIAFE() {

    for await (const v of source) {
        console.log(v);
    }
    
})();
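As a rough mental model, the loop can be approximated by a hand-written function. This is a simplified sketch, not what engines actually do: forAwaitOf is a hypothetical name, the sync fallback is inlined instead of going through the specification's adapter, and the cleanup performed on break or on errors (calling the iterator's return method) is left out.

```javascript
// Simplified, hypothetical desugaring of: for await (const v of source) body(v)
async function forAwaitOf(source, body) {
    // prefer the async iterator, fall back on the sync one
    const asyncMethod = source[Symbol.asyncIterator];
    const isSync = !asyncMethod;
    const iterator = isSync
        ? source[Symbol.iterator]()
        : asyncMethod.call(source);

    while (true) {
        // awaiting a plain (sync) IteratorResult is harmless
        const result = await iterator.next();
        if (result.done) return;
        // only values coming from a sync iterator are awaited as well
        const value = isSync ? await result.value : result.value;
        await body(value);
    }
}

const seen = [];
const demo = forAwaitOf([1, Promise.resolve(2), 3], v => { seen.push(v); })
    .then(() => console.log(seen)); // [ 1, 2, 3 ]
```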

 

Let’s see some examples to learn how this loop behaves:

    // sync source, sync values
    // each iteration will return '{ value:number|undefined, done:boolean }'
    for await (const v of [1, 2, 3, 4, ...]) {
        console.log(v); // 1 2 3 4 ...
    }

    // sync source, async values
    // each iteration will return '{ value:Promise<number>|undefined, done:boolean }'
    const array = [
        new Promise(res => setTimeout(res, 1000, 1)),
        new Promise(res => setTimeout(res, 2000, 2)),
        new Promise(res => setTimeout(res, 3000, 3)),
        new Promise(res => setTimeout(res, 4000, 4)),
        ...
    ]
    for await (const v of array) {
        console.log(v); // 1 2 3 4 ...
    }


    // async source, sync values
    // each iteration will return 'Promise<{ value:number|undefined, done:boolean }>'
    for await (const v of asyncSource) {
        console.log(v); // 1 2 3 4 ...
    }

    // async source, async values (BAD)
    // each iteration will return 'Promise<{ value:Promise<number|undefined>, done:boolean }>'
    for await (const v of asyncSource) {
        console.log(v); // series of Promises...
    }

Some of these results may look strange to you, so let’s try to make things clearer.

Async sources

For async sources, the loop will just await each Promise returned by the implicit calls to the next method. When the Promise is fulfilled, if the done flag is false, the loop will make the value available inside its body, whatever it is, and will proceed with the following iteration once the body has finished. No other operations will be performed on the value itself, and this explains why in the third example we see a series of numbers and in the fourth a series of Promises. Another good reason not to use Promises as values for async iterations! If the done flag is true instead, the loop will end.
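The difference between the third and the fourth example can be reproduced with a hand-written async source. This is a minimal sketch: asyncSourceOf and demoAsyncSources are hypothetical names, and the source simply hands out the items it was given, untouched.

```javascript
// Hypothetical factory: builds an async iterable that yields `items` as they are.
function asyncSourceOf(items) {
    return {
        [Symbol.asyncIterator]() {
            let i = 0;
            return {
                next() {
                    const result = i < items.length
                        ? { done: false, value: items[i++] }
                        : { done: true, value: undefined };
                    // only the IteratorResult is wrapped in a Promise
                    return Promise.resolve(result);
                },
            };
        },
    };
}

async function demoAsyncSources() {
    const plain = [];
    for await (const v of asyncSourceOf([1, 2])) {
        plain.push(v); // numbers
    }

    const wrapped = [];
    for await (const v of asyncSourceOf([Promise.resolve(1), Promise.resolve(2)])) {
        wrapped.push(v); // still Promises: the loop does not unwrap them
    }
    return { plain, wrapped };
}

demoAsyncSources().then(({ plain, wrapped }) => {
    console.log(plain);                                    // [ 1, 2 ]
    console.log(wrapped.every(v => v instanceof Promise)); // true
});
```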

Sync sources

The behaviour of the for-await-of loop for sync sources is slightly different from what one might expect. You could think that each IteratorResult object is adapted directly, being inserted into an immediately fulfilled Promise, to eliminate any difference between sync and async iteration results. But, if this were the case, the outcome of the second example would be the same as the fourth one.

That guess is not very far from the truth, but things are slightly different. It’s the sync Iterator itself that is adapted, thanks to the CreateAsyncFromSyncIterator abstract operation. What happens is that each iterated value is normalized into a Promise, via Promise.resolve, which is then “awaited” to produce the IteratorResult. We can outline what happens behind the scenes, at each iteration, in the following way, which I’ve derived from Dr. Axel’s outline:

// the for-await-of has just called 'adapter.next()' and is 'awaiting' the result

try {
    const syncIteratorResult = syncIterator.next();

    const nextIteratorResultPromise = Promise.resolve(syncIteratorResult.value)
        .then(value => ({ value, done: syncIteratorResult.done }));

    return nextIteratorResultPromise; // <-- this will be 'awaited' by the for-await-of
} catch(e) {
    // the loop is going to throw an exception if something goes wrong during the 'next' method call
    throw e;
}

Another great way to see what happens, which I’m going to borrow from Axel, is the following: Iterable<T> and Iterable<Promise<T>> become AsyncIterable<T>.  
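The outline above can be turned into a small runnable adapter. This is a simplified sketch and not the specification's algorithm: asyncFromSyncIterator and demoAdapter are hypothetical names, and only the next method is adapted.

```javascript
// Simplified, hypothetical stand-in for CreateAsyncFromSyncIterator.
function asyncFromSyncIterator(syncIterator) {
    return {
        next() {
            try {
                const { value, done } = syncIterator.next();
                // normalize the value into a Promise, then rebuild the IteratorResult
                return Promise.resolve(value).then(v => ({ value: v, done }));
            } catch (e) {
                // a failing sync 'next' surfaces as a rejected Promise
                return Promise.reject(e);
            }
        },
        [Symbol.asyncIterator]() {
            return this;
        },
    };
}

async function demoAdapter() {
    const source = [Promise.resolve("a"), "b"];
    const adapted = asyncFromSyncIterator(source[Symbol.iterator]());
    const results = [];
    results.push(await adapted.next()); // { value: 'a', done: false }
    results.push(await adapted.next()); // { value: 'b', done: false }
    results.push(await adapted.next()); // { value: undefined, done: true }
    return results;
}

demoAdapter().then(results => console.log(results));
```

Both the Promise and the plain string come out as already unwrapped values, which matches the Iterable<T> and Iterable<Promise<T>> to AsyncIterable<T> summary.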

Node.js Streams

Node.js Readable Streams are a more concrete example of async iterables. That is because they were built to support consumers that are slower than producers, so they are able to interrupt the data stream whenever necessary. Usually all this goes unnoticed, well hidden by the pipe method:

readableStream.pipe(writableStream);

But we can explicitly pause the stream too, requesting chunks of data only when we want them:

const readableStreamAsyncIter = readableStream[Symbol.asyncIterator]();

await readableStreamAsyncIter.next(); // first chunk
// other async stuff
await readableStreamAsyncIter.next(); // second chunk

Readable Streams cannot implement the synchronous iteration interfaces because they interact asynchronously with external resources like files. The point is that it’s not a single, long interaction: it is spread over time. That is because streams are not going to load the whole file, but only chunks of it, which flow to the consumer. Having limited knowledge of the file itself, they are almost never able to resolve the done flag synchronously.
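The same idea works with the for-await-of loop. Here is a minimal sketch that assumes a Node.js environment with CommonJS modules. To avoid touching the file system, the stream is built from an in-memory array with Readable.from, and readAll is a hypothetical helper name.

```javascript
const { Readable } = require("stream");

// Collects every chunk of a Readable stream using async iteration.
async function readAll(readable) {
    const chunks = [];
    // the loop asks for the next chunk only after the body has finished
    for await (const chunk of readable) {
        chunks.push(chunk);
    }
    return chunks;
}

// Readable.from builds a Readable stream out of an iterable
const readableStream = Readable.from(["first chunk", "second chunk"]);

readAll(readableStream).then(chunks => console.log(chunks));
// [ 'first chunk', 'second chunk' ]
```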

The consumer pressure problem

Let’s consider a generic async source:

const ait = asyncSource[Symbol.asyncIterator]();

What will happen if we do this?

ait.next().then(...);
ait.next().then(...);
ait.next().then(...);

Each call to the next method will cause the async source to start an async task to provide a result, but the main problem is that these tasks will run concurrently, not sequentially. That is because the async iterator was moved forward synchronously, without waiting for the previous result.

We could say that the consumer is putting too much pressure on the producer. Odds are that the latter is unable to deal with it because:

  1. Each async task could be the last, ending with a done:true. All the async tasks started after that shouldn’t do any work, ending as soon as possible with {value:undefined, done:true}. Unfortunately, if tasks were started concurrently, chances are that at a certain point some of them will be doing completely useless work, wasting resources and probably causing problems, because one of the others has completed the iteration. And most likely they will not even finish by correctly reporting the out-of-bounds status.
  2. Leaving aside the end-of-iteration problem, let’s focus on the results. What if the async source, to compute each async task result, needs the final value of the previous one? For example, think about cursor-based pagination. Since tasks can be started concurrently, it’s impossible to create well-formed async iterables for these cases.
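The second point can be reproduced with a small simulation of cursor-based pagination. Everything in this sketch is hypothetical: pages stands in for a remote API, fetchPage simulates the network latency with a timer, and naivePagesIterator is a deliberately naive async iterator.

```javascript
// Hypothetical in-memory "remote" pages linked by a cursor.
const pages = {
    start: { items: ["a"], next: "p2" },
    p2: { items: ["b"], next: "p3" },
    p3: { items: ["c"], next: null },
};

// Simulated async request; the delay stands in for network latency.
const fetchPage = cursor =>
    new Promise(res => setTimeout(res, 10, pages[cursor]));

function naivePagesIterator() {
    let cursor = "start";
    return {
        async next() {
            if (cursor === null) return { done: true, value: undefined };
            const page = await fetchPage(cursor);
            cursor = page.next; // only updated once the request completes
            return { done: false, value: page.items };
        },
    };
}

async function demoPressure() {
    const ait = naivePagesIterator();
    // three calls issued synchronously: all of them read the same cursor
    const results = await Promise.all([ait.next(), ait.next(), ait.next()]);
    return results.map(r => r.value);
}

demoPressure().then(values => console.log(values)); // [ [ 'a' ], [ 'a' ], [ 'a' ] ]
```

The first page is fetched three times, and the second and third pages are never reached.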

The truth is we need a way to force the iteration to be sequential, ensuring time consistency between both async and sync next, throw and return calls. Doing so, we’ll also avoid the unfortunate, conceptually wrong situation where a call to one of those iteration methods finishes before a previous one.

Since we will never be able to prevent a consumer from messing with the iteration’s methods, we have to enqueue the calls to them with their respective arguments, if any. In this way, the async source will be able to properly handle them one after another. At this point, things get quite complicated, but we don’t have to worry about it because we have async generators, which have this feature out-of-the-box.
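To give a rough idea of what enqueueing means, here is a deliberately minimal sketch. It is not how async generators are implemented: serialize is a hypothetical wrapper that only handles next, chaining each call onto the previous one, and naiveCounter is a hypothetical source that misbehaves under concurrent calls.

```javascript
// Hypothetical wrapper: chains every call to 'next' onto the previous one.
function serialize(asyncIterator) {
    let last = Promise.resolve();
    return {
        next(arg) {
            // start this call only after the previous one has settled
            const run = () => asyncIterator.next(arg);
            last = last.then(run, run);
            return last;
        },
        [Symbol.asyncIterator]() {
            return this;
        },
    };
}

// Hypothetical naive source: the counter is updated only after the async work.
function naiveCounter(limit) {
    let current = 0;
    return {
        async next() {
            if (current >= limit) return { done: true, value: undefined };
            const value = current + 1;
            await new Promise(res => setTimeout(res, 10));
            current = value;
            return { done: false, value };
        },
    };
}

async function demoSerialize() {
    const ait = serialize(naiveCounter(2));
    // the calls are still issued synchronously, but they now run one at a time
    const results = await Promise.all([ait.next(), ait.next(), ait.next()]);
    return results.map(r => (r.done ? "done" : r.value));
}

demoSerialize().then(out => console.log(out)); // [ 1, 2, 'done' ]
```

A complete solution would have to queue return and throw in the same way, which is part of why things get complicated.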

 

Conclusion

That’s all you should know about Asynchronous Iterators!

We have learnt why the ability to resolve the done flag asynchronously could be vital in some circumstances. The fresh async iteration interfaces are here to help us reach the goal, and now you know all their main features and best practices.

Then we have seen an example of a simple async iterator, how the for-await-of loop behaves and why Node.js Readable Streams support async iteration. We have also spent some words on a rarely considered but important problem that is very well solved by the next and last big topic: Asynchronous Generators.

I hope to see you there again 🙂 and on twitter!

 

Acknowledgements

I would like to thank Marco Iamonte for the time he spent helping me to identify a lot of grammatical errors.

 

Bibliography