Review:

  • JavaScript language-defined callbacks, promises, await, async
  • JavaScript/NodeJS engine (libuv)
  • OS IO multiplexing (epoll)

Understanding how async functions work and are implemented makes it easier to read and debug JavaScript code.

1. Introduction

For programmers accustomed to thread pools, multiple processes, and message queues, there are two groups of concepts to understand when using JavaScript/NodeJS:

  • Callbacks, Promises, Async Await
  • IO multiplexing and async access

2. Callbacks, Promises, Async Await

The YouTube video "Async JS Crash Course - Callbacks, Promises, Async Await" explains these concepts clearly in 24 minutes.

Callbacks are the most basic mechanism, but they have a problem: an async function takes its callback as a parameter. If that callback in turn calls another async function with its own callback, and this repeats for many layers, you get “callback hell” - deeply nested code that is very hard to read.
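
As a sketch of how the nesting grows, consider three dependent steps written in callback style. The helpers loadUser, loadOrders, and sumPrices are hypothetical and only simulate async work with setTimeout:

```javascript
// Hypothetical async helpers: each one finishes later and reports
// through a Node-style callback(err, result).
function loadUser(id, callback) {
  setTimeout(() => callback(null, { id, name: "alice" }), 10);
}
function loadOrders(user, callback) {
  setTimeout(() => callback(null, [{ price: 5 }, { price: 7 }]), 10);
}
function sumPrices(orders, callback) {
  setTimeout(() => callback(null, orders.reduce((sum, o) => sum + o.price, 0)), 10);
}

// Each step needs the previous result, so every callback nests
// inside the one before it.
function totalForUser(id, done) {
  loadUser(id, (err, user) => {
    if (err) return done(err);
    loadOrders(user, (err, orders) => {
      if (err) return done(err);
      sumPrices(orders, (err, total) => {
        if (err) return done(err);
        done(null, total);
      });
    });
  });
}

totalForUser(1, (err, total) => console.log("total:", total)); // total is 5 + 7
```

Three levels are still readable; real code with error handling and branching at each level degrades quickly.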

To solve this, Promise adds a layer of abstraction. An async function no longer takes the callback as a parameter. Instead, it returns a Promise object, and the caller can:

  • Use .then() to define callbacks for successful execution
  • Use .catch() to define callbacks for errors
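
A minimal sketch of the same idea with Promises. The helpers loadUser and loadOrders are hypothetical and simulate async work with setTimeout:

```javascript
// Hypothetical helpers that return Promises instead of taking callbacks.
function loadUser(id) {
  return new Promise((resolve) => setTimeout(() => resolve({ id, name: "alice" }), 10));
}
function loadOrders(user) {
  return new Promise((resolve, reject) => {
    if (!user) return reject(new Error("no user"));
    setTimeout(() => resolve([{ price: 5 }, { price: 7 }]), 10);
  });
}

// The chain stays flat: each .then() receives the previous step's result.
function totalForUser(id) {
  return loadUser(id)
    .then((user) => loadOrders(user))
    .then((orders) => orders.reduce((sum, o) => sum + o.price, 0));
}

totalForUser(1)
  .then((total) => console.log("total:", total))
  .catch((err) => console.error("failed:", err.message)); // one handler covers every step
```

A single .catch() at the end of the chain handles a rejection from any earlier step, which replaces the per-level error checks of callback style.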

Async Await is a further abstraction over Promise. It makes async functions look like synchronous functions (no more explicit .then() for callbacks).

Here’s an example from Google’s official documentation:

async function myFirstAsyncFunction() {
  try {
    const fulfilledValue = await promise;
  }
  catch (rejectedValue) {
    // …
  }
}

When you put the async keyword before a function definition, you can use await inside that function. When you await a Promise, the function pauses until the Promise settles, and this pause doesn’t block the main thread. If the Promise fulfills, await returns the fulfilled value. If it rejects, await throws the rejection value.
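
The quoted snippet leaves promise undefined, so it cannot run as is. Here is a self-contained sketch of the same behavior; the delay helper and the log array are assumptions added for illustration:

```javascript
// Hypothetical helper: a Promise that fulfills with `value` after `ms` milliseconds.
function delay(ms, value) {
  return new Promise((resolve) => setTimeout(() => resolve(value), ms));
}

const log = [];

async function demo() {
  log.push("before await");
  const fulfilledValue = await delay(20, 42); // demo() is suspended here; the thread is not blocked
  log.push("after await: " + fulfilledValue);
  try {
    await Promise.reject(new Error("boom")); // a rejection surfaces as a thrown exception
  } catch (rejectedValue) {
    log.push("caught: " + rejectedValue.message);
  }
  return log;
}

const result = demo(); // an async function always returns a Promise
log.push("sync code continues"); // runs while demo() is still waiting
result.then((entries) => console.log(entries.join("\n")));
// Prints:
// before await
// sync code continues
// after await: 42
// caught: boom
```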

When a JavaScript program executes, it first runs the top-level statements synchronously, one after another. When a statement calls an async function, the callback is only registered, and execution continues with the next statement. After the synchronous code has finished, the event loop runs each callback whose event has occurred. See the NodeJS Event Loop section below.

doA(function() {
  
  doB();
  
  doC(function() {
    doD();
  });
  
  doE();
});

doF();
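
The snippet above is schematic, since doA through doF are not defined. A runnable sketch follows, under the assumption that doA and doC are asynchronous (simulated here with setTimeout) while the other four functions are synchronous. The order array is added only to record the calls:

```javascript
const order = [];

// Assumption: doA and doC are asynchronous, i.e. they return first
// and invoke their callback later.
function doA(callback) { order.push("A"); setTimeout(callback, 0); }
function doC(callback) { order.push("C"); setTimeout(callback, 0); }
// doB, doD, doE and doF are plain synchronous functions.
function doB() { order.push("B"); }
function doD() { order.push("D"); }
function doE() { order.push("E"); }
function doF() { order.push("F"); }

doA(function () {
  doB();
  doC(function () {
    doD();
    console.log(order.join(", ")); // A, F, B, C, E, D
  });
  doE();
});
doF();
```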

The execution order is: A, F, B, C, E, D, assuming doA and doC are async functions that invoke their callbacks later. None of these functions waits on an external event. In general, though, a callback passed to an async function runs only once the corresponding event has occurred.

For example:

<p id="content"> Please wait three seconds!</p>
<script>
// Pass the function itself; a string argument would be evaluated like eval().
setTimeout(changeState, 3000);
function changeState() {
    let content = document.getElementById('content');
    content.innerHTML = "<div style='color:red'>I appear after 3 seconds!</div>";
}
</script>

The changeState function above only executes after the timer fires. See the timers phase in the event loop description below.

The callback is configured through setTimeout, an async function provided by the host environment (the browser or NodeJS) rather than by the JavaScript language itself. Calling it returns immediately, but the callback will be invoked when the specific event occurs.
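
A minimal sketch of "returns immediately, callback later" that runs in NodeJS or a browser console. The events array is an assumption added only to record the order:

```javascript
const events = [];
const start = Date.now();

setTimeout(() => {
  events.push("callback");
  // The delay is a minimum, not an exact value.
  console.log(events.join(" -> "), `(fired after ~${Date.now() - start} ms)`);
}, 100);

// setTimeout has already returned; this line runs long before the 100 ms elapse.
events.push("after setTimeout");
```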

A JavaScript program has various callback functions, triggered through timers or through the network/file access functions provided by the runtime. Internally, these callbacks are implemented with IO multiplexing and async mechanisms.

3. IO Multiplexing and Async Access

IO multiplexing (event-driven IO) is not unique to JavaScript/NodeJS - it’s a general IO access pattern used by most high-performance web servers (e.g., nginx).

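JavaScript is famous for async IO access because async is its default IO model and, in browsers, practically the only one. NodeJS does offer synchronous variants such as fs.readFileSync, but they block the event loop and are avoided in server code. This pushes programmers toward a less intuitive but high-performance design pattern, which is great for IO-intensive applications.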

Python also supports async IO, but due to historical reasons, most applications use multi-process/multi-thread approaches with synchronous IO processing within processes.

libevent, libev, and libuv are third-party libraries for building IO multiplexing applications. Their main purpose is to hide differences between OS APIs and provide a unified interface for applications.

These libraries can support large numbers of concurrent connections in network server applications because they delegate the scanning of thousands of IO states to the operating system with minimal memory consumption. For new network requests:

  • No new process/thread is needed, reducing resource consumption
  • The application does not need to poll each socket itself - it registers the socket with the OS once via epoll_ctl
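
From the JavaScript side, this model can be sketched as a single-threaded TCP echo server in NodeJS. The echoOnce client helper is hypothetical and exists only to exercise the server with two connections:

```javascript
const net = require("net");

// One process, one thread: each connection is just a socket
// registered with the event loop.
const server = net.createServer((socket) => {
  socket.on("data", (chunk) => socket.write(chunk)); // echo back whatever arrives
});

// Hypothetical client helper: sends `text`, resolves with the echoed reply.
function echoOnce(port, text) {
  return new Promise((resolve, reject) => {
    const client = net.connect(port, "127.0.0.1", () => client.write(text));
    client.once("data", (chunk) => {
      resolve(chunk.toString());
      client.end();
    });
    client.once("error", reject);
  });
}

// Port 0 asks the OS for any free port.
const replies = new Promise((resolve, reject) => {
  server.listen(0, "127.0.0.1", () => {
    const { port } = server.address();
    Promise.all([echoOnce(port, "first"), echoOnce(port, "second")])
      .then(resolve, reject)
      .finally(() => server.close());
  });
});

replies.then((r) => console.log(r)); // [ 'first', 'second' ]
```

Both connections are served by the same thread. The server code never creates a thread or polls a socket; it only registers callbacks.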

Both browser JS and backend NodeJS have async mechanisms. Fundamentally, they access OS-provided async IO APIs within the OS process (Chrome process or NodeJS process). Let’s analyze primarily with NodeJS as the example.

NodeJS uses libuv to implement IO multiplexing and async IO. libuv started as part of the NodeJS project but has been adopted by other projects. For example, Python’s uvloop event loop, which is commonly used with uvicorn to serve FastAPI applications, is built on libuv.

Note: Unix Network Programming (UNP) divides IO models into 5 categories:

  • blocking I/O
  • nonblocking I/O
  • I/O multiplexing (select and poll)
  • signal driven I/O (SIGIO)
  • asynchronous I/O (the POSIX aio_ functions)

On Linux, epoll belongs to the third category; it is an enhanced version of select/poll.

The fifth category in UNP refers to async IO where the OS handles data reading/writing and the application only cares about completion events (similar to DMA).

This differs from JavaScript’s async IO, which more precisely refers to non-blocking IO under IO multiplexing.

3.1. NodeJS Event Loop

picture 2

  • timers: this phase executes callbacks scheduled by setTimeout() and setInterval().
  • pending callbacks: executes I/O callbacks deferred to the next loop iteration.
  • idle, prepare: only used internally.
  • poll: retrieve new I/O events; execute I/O related callbacks (almost all with the exception of close callbacks, the ones scheduled by timers, and setImmediate()); node will block here when appropriate.
  • check: setImmediate() callbacks are invoked here.
  • close callbacks: some close callbacks, e.g. socket.on('close', …).

As shown in the NodeJS official documentation, the event loop runs on the main thread and repeatedly cycles through these phases. In the poll phase it asks the OS which events have occurred (on Linux, libuv does this by calling epoll_wait), then calls the corresponding user-registered callbacks.

3.2. libuv

The NodeJS documentation introduces libuv as:

Another important dependency is libuv, a C library that is used to abstract non-blocking I/O operations to a consistent interface across all supported platforms. It provides mechanisms to handle file system, DNS, network, child processes, pipes, signal handling, polling and streaming. It also includes a thread pool for offloading work for some things that can’t be done asynchronously at the operating system level.

picture 1

As the diagram shows, libuv handles the two kinds of IO differently:

  • Network IO: uses the OS-provided event notification mechanism (epoll on Linux)
  • File operations: uses a thread pool to present the same async, event-based interface
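
A small sketch of the thread pool side, using two NodeJS APIs that are documented as running on libuv's thread pool (fs operations and crypto.pbkdf2). The finished array and the parameter values are assumptions chosen for illustration:

```javascript
const crypto = require("crypto");
const fs = require("fs");

const finished = [];

// Both calls hand their work to libuv's thread pool (4 threads by default,
// configurable through UV_THREADPOOL_SIZE) and return immediately.
// Their callbacks run later on the main thread.
fs.stat(".", (err) => {
  if (err) throw err;
  finished.push("fs.stat");
  console.log("done: fs.stat");
});
crypto.pbkdf2("secret", "salt", 10000, 32, "sha256", (err, key) => {
  if (err) throw err;
  finished.push("pbkdf2");
  console.log("done: pbkdf2,", key.length, "bytes");
});

// Neither operation can have completed yet: callbacks only run after
// the current synchronous code has finished.
console.log("scheduled, finished so far:", finished.length);
```

From the caller's point of view the interface is the same as for network IO: register a callback and return.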

3.3. OS Calls in Network IO

For sending and receiving network data, libuv wraps each operating system’s event notification mechanism. On Linux, it uses epoll, an IO multiplexing interface made up of three system calls:

int epoll_create(int size);
int epoll_ctl(int epfd, int op, int fd, struct epoll_event *event);
int epoll_wait(int epfd, struct epoll_event *events, int maxevents, int timeout);

Applications configure which events to listen for via epoll_ctl, and query which events have occurred via epoll_wait.

Note:

  • epoll_ctl can configure events like file descriptor readability, writability, etc.
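  • epoll_wait does not have to block forever - the timeout parameter (in milliseconds) controls how long it waits: 0 returns immediately and -1 blocks until an event arrives. NodeJS (through libuv) chooses the timeout on each loop iteration, for example 0 when callbacks are already pending and otherwise the time until the nearest timer, so the event loop keeps turning.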

4. Summary

JavaScript uses async methods to access IO. For code readability, it has added several layers of abstraction on top of callbacks (Promise, then Async Await). Code written with Async Await now looks very similar to synchronous code.

JavaScript’s async events are supported by the underlying operating system. Events are triggered by the OS and read by JavaScript’s internal event loop, which then calls the corresponding event handler callbacks.

JavaScript code cannot generate IO events by itself. User-written async functions either supply the callbacks that the runtime’s APIs require or add further abstractions over those APIs.
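
The layering can be sketched in a few lines of NodeJS. The function name isDirectory is hypothetical; fs.stat and util.promisify are standard NodeJS APIs:

```javascript
const fs = require("fs");
const { promisify } = require("util");

// Layer 1: the runtime API (fs.stat) takes a callback.
// Layer 2: promisify wraps that callback API in a Promise.
const statAsync = promisify(fs.stat);

// Layer 3: async/await consumes the Promise like a synchronous call.
async function isDirectory(path) {
  const stats = await statAsync(path);
  return stats.isDirectory();
}

isDirectory(".").then((answer) => console.log("is directory:", answer)); // true for the current directory
```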