Architecture of Node.js’ Internal Codebase
First off, some words about JavaScript…
Jeff Atwood, co-founder of Stack Overflow, once wrote on his famous programming blog, Coding Horror: “any application that can be written in JavaScript, will eventually be written in JavaScript.”
The influence and reach of JavaScript have grown so much in the past few years that it has become one of the most popular programming languages. Indeed, in the 2016 Stack Overflow Developer Survey, JavaScript ranked #1 in both the Most Popular Technology and Top Tech on Stack Overflow categories, and placed well in several other results too.
Node.js is a server-side JavaScript environment that lays the foundation for critical server-side functionalities such as binary data manipulation, file system I/O, database access, and computer networking. It has unique features that make it stand out among tried-and-tested existing frameworks such as Django (Python), Laravel (PHP), and Ruby on Rails. Arguably, it was those features that led tech companies such as PayPal, Tinder, Medium (yes, this very blog system), LinkedIn, and Netflix to adopt it, some even before Node.js had reached version 1.0. I’ve recently answered a Stack Overflow question regarding the architecture of Node.js’ internal codebase, which inspired me to write this article.
The official docs aren’t very helpful at explaining what Node.js is: “a JavaScript runtime built on Chrome’s V8 JavaScript engine. Node.js uses an event-driven, non-blocking I/O model…” In order to understand this statement and the actual power behind it, let’s break down Node.js’ components, elaborate on some key terms, and then explain how the different pieces interact with one another to make Node.js the powerful runtime it is:
COMPONENTS/DEPENDENCIES
V8: the high-performance JavaScript engine open sourced by Google and implemented in C++. This is the same engine that resides in your Chrome browser. V8 takes the code you write in JavaScript, compiles it into machine code (therefore blazing fast), and executes it. Just how fast is V8? Check out this SO answer.
libuv: the C library that provides asynchronous features. It maintains event loops, a thread pool, file system I/O, DNS functionalities, and network I/O, among other critical functionalities.
Other C/C++ Components/Dependencies: such as c-ares, crypto (OpenSSL), http-parser, and zlib. These dependencies provide low-level capabilities that support important functionality such as networking, compression, and encryption.
Application/Modules: this is where all the JavaScript code lives: your application code, Node.js core modules, any modules you install from npm, and any modules that you write yourself. This is the part that you will be working on most of the time.
Bindings: you have probably noticed by now that Node.js is written in both JavaScript and C/C++. The reason there is so much C/C++ code is simple: it is fast. But how does the code you write in JavaScript end up communicating smoothly with code written in C/C++? Aren’t they different programming languages? Yes, they are, and normally code written in different languages cannot communicate with each other. Not without bindings. Bindings, as the name implies, are glue code that “binds” one language to another so that they can talk to each other. In the case of Node.js, bindings expose the core internal libraries written in C/C++ (c-ares, zlib, OpenSSL, http-parser, etc.) to JavaScript. One motivation behind writing bindings is code reuse: if a desired functionality is already implemented, why rewrite the entire thing just because it is in a different language? Why not just bridge the two? Another motivation is performance: systems programming languages such as C and C++ are generally much faster than high-level languages (e.g. Python, JavaScript, Ruby). It can therefore be wise to delegate CPU-intensive operations to code written in C/C++, for example.
C/C++ Addons: Bindings only provide glue code for Node.js’ core internal libraries, i.e. zlib, OpenSSL, c-ares, http-parser, etc. If you want to include a third-party or your own C/C++ library in your application, you have to write the glue code for that library yourself. The glue code you write is called an addon. Think of bindings and addons as bridges between your JavaScript code and Node.js’ C/C++ code.
TERMINOLOGY
I/O: shorthand for Input/Output. It denotes any computer operation handled primarily by the system’s I/O subsystem. I/O-bound operations usually involve interacting with disks/drives; examples include database access and file system operations. Related concepts include CPU-bound, memory-bound, etc. A good way to determine which category an operation belongs to is to ask which resource, when increased, would improve that operation’s performance. For example, if an operation runs noticeably faster when CPU power is increased, then it is CPU-bound.
Non-blocking/Asynchronous: Normally, when a request comes in, the application handles the request and halts all other operations until the request is processed. This immediately presents a problem: when a large number of requests come in at the same time, each request has to wait until the previous requests are processed. In other words, each operation blocks the ones following it. Worse, if the previous requests have long response times (e.g. calculating the first 1000 prime numbers, or reading 3 GB of data from a database), all other requests are blocked for a long time. To address this issue, one can resort to multiprocessing and/or multithreading, each with its own pros and cons. Node.js handles things differently. Instead of spawning a new thread for every new request, all requests are handled on one single main thread, and that’s pretty much all it does: handle requests. All (I/O) operations contained in a request, such as file system access or database reads/writes, are sent to the worker threads maintained by libuv (mentioned above) in the background. In other words, I/O operations in the requests are handled asynchronously, not on the main thread. This way the main thread is never blocked, as the heavy lifting is shipped elsewhere. You (and thus your application code) only ever get to work with the one and only main thread. All the worker threads in libuv’s thread pool are shielded from you; you never work directly with (or need to worry about) them. Node.js takes care of them for you. This architecture makes I/O operations especially efficient. However, it’s not without disadvantages. Operations include not only I/O-bound ones, but CPU-bound, memory-bound, etc. as well. Out of the box, Node.js only provides asynchronous functions for I/O tasks. There are ways to work around CPU-intensive operations; however, they are not the focus of this article.
Event-Driven: typically, in almost all modern systems, after the main application kicks off, processes are initiated by incoming requests. However, how things go from there differs, sometimes drastically, between technologies. Typical implementations handle a request procedurally: a thread is spawned for the request; operations are done one after another; if an operation is slow, all following operations halt on that thread; when all operations complete successfully, a response is returned. In Node.js, by contrast, operations are registered with Node.js as events, waiting to be triggered, either by the main application or by requests.
Runtime (System): Node.js runtime is the entire codebase (components mentioned above), both low-level and high-level, that together supports the execution of a Node.js application.
PUTTING EVERYTHING TOGETHER
Now that we have a high-level overview of Node.js’ components, we’ll investigate its workflow to get a better sense of its architecture and how the different components interact with one another.
When a Node.js application starts running, the V8 engine will run the application code you write. Objects in your application will keep a list of observers (functions registered to events). These observers will get notified when their respective events are emitted.
When an event is emitted, its callback function is enqueued into an event queue. As long as there are events remaining in the queue, the event loop keeps dequeuing them and putting them onto the call stack. Note that the event loop puts the next event onto the call stack only after the previous event has been processed (i.e. the call stack has cleared).
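This queue-then-stack behavior is easy to observe with a timer callback. In this sketch, even a zero-delay `setTimeout` callback must wait in the event queue until all synchronous code has run and the call stack is empty:

```javascript
// Callbacks wait in the event queue until the call stack is clear.
const order = [];

// Enqueued immediately, but it cannot run yet: the stack is not empty.
setTimeout(() => order.push('timer callback'), 0);

// Synchronous code runs first, while the timer callback waits in the queue.
order.push('synchronous work');

setTimeout(() => {
  order.push('second timer');
  // Eventually logs: [ 'synchronous work', 'timer callback', 'second timer' ]
  console.log(order);
}, 0);
```

Even with a delay of 0 ms, the timer callbacks run strictly after the synchronous code, and in the order they were enqueued.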
On the call stack, when an I/O operation is encountered, it is handed over to libuv for processing. By default, libuv maintains a thread pool of four worker threads, although the number can be altered to add more threads. If a request is file-system I/O or DNS-related, it is assigned to the thread pool for processing; other requests, such as networking, are handled by platform-specific mechanisms (see libuv’s design overview).
For I/O operations that make use of the thread pool (i.e. file I/O, DNS, etc.), the worker threads interact with Node.js’ low-level libraries to perform operations such as database transactions, file system access, etc. When the processing is over, libuv enqueues the resulting event back into the event queue for the main thread to work on. While libuv handles asynchronous I/O operations, the main thread does not wait for the outcome; it moves on instead. The event returned by libuv gets its chance to be handled by the main thread once the event loop puts it back onto the call stack. This completes the life cycle of an event in a Node.js application.
mbq once made an excellent analogy between Node.js and a restaurant. I will borrow and modify his example to make the Node.js cycle easier to understand: Think of a Node.js application as a Starbucks cafe. A highly efficient and well-trained waiter (the one and only main thread) takes orders. When a large number of customers visit the cafe at the same time, they wait in line (enqueued in the event queue) to be served by the waiter. Once a customer is served, the waiter passes the order to a manager (libuv), who assigns each order to a barista (a worker thread or platform-specific mechanism). The baristas use different ingredients and machines (low-level C/C++ components) to make different kinds of drinks, depending on the customers’ requests. Typically there are four baristas on duty (the thread pool) specifically making lattes (file I/O, DNS, etc.). However, when the peak hits, more baristas can be called in to work (though this should be done at the beginning of the day, NOT during the lunch rush, for example). Once the waiter passes an order to the manager, he does not wait for the coffee to be made before serving another customer. Instead, he calls the next customer (the next event dequeued by the event loop and pushed onto the call stack). You can think of an event currently on the call stack as a customer at the counter being served. When a coffee is done, it is sent to the end of the customer line, the waiter calls out the name when the coffee reaches the counter, and the customer gets their coffee. (That last part of the analogy sounds weird in real life; however, when you consider the process from a program’s perspective, it makes sense.)
This completes the high-level overview of Node.js’ internal codebase and the typical life cycle of its events. This overview, however, is very general and does not address many issues and details, such as CPU-bound operation handling and Node.js design patterns. More topics will be covered in other articles.
Edit: 11:32, Jun 13, 2016: revised the article to reflect more accurate design and workflow of libuv as pointed out by Andrew Johnston in his response made on Jun 11, 2016.